copyright (c) 1999 by R. I. 'Scibor-Marchocki
This is a collections of e-mail letters, which I wrote to my very good friend Deb, as a set of lessons in Linear Algebra. I have deleted each salutation, to protect her privacy. If anybody shows interest, someday, I might edit these lessons into a better organized tutorial. So, please write me at webmaster@rism.com.
Last time, I said that an eigen-vector is unique. I meant, in the sense of an *axial*-vector. We consider any non-zero real-multiple of an axial-vector to be the same axial-vector. Of course, multiplication by a negative number may lead to an improper set of eigen-vectors. The concept of proper _vs._ improper is not applicable to the individual eigen-vectors -- only to the whole set; because propriety is defined in terms of the determinant of the ortho-normal matrix.
Consider an n-th degree polynomial. Its leading coefficient -- that of x**n -- cannot be zero; otherwise, the polynomial would be of lower degree. Divide through by this coefficient, to obtain a new polynomial -- called "monic" -- whose leading coefficient is one.
The coefficient a(n) of the constant term is zero iff (= if and only if) the polynomial equation f(x)=0 has a zero as one of its roots; that is, iff f(0) = 0.
Let us consider only those monic polynomials for which f(0) is not zero; that is, those that do not have zero as a root.
For x a root of the polynomial, that is for an x such that f(x) = 0, let us write
g(x) = (1 - f(x) / a(n)) / x = 1 / x
The left-side g(x) is a new polynomial in x -- there is no term in 1/x. This equation provides a way of computing the reciprocal of x, without performing any divisions. Also, by taking the reciprocals of each side, we obtain
x = 1 / g(x)
It gets even better -- or worse -- depending upon your point of view.
Theorem For a symmetric (Hermitian) matrix A, the matrix satisfies its eigen-polynomial f(x); that is f(A) = 0.
Corollary The (inverse of A) = g(A) and A = I * (inverse of g(A))
Thus, we have a way of finding the inverse of a matrix, without performing any divisions. Admittedly, it is slower than some other algorithms.
However, the last equation gives us the original matrix A. So what, you may ask? Well, supposing we have forgotten its value; then we can recover A. During the 1960's, the drone aircraft, which flew over Vietnam, employed this scheme for both ECC (= Error Checking and Correcting) and cryptography. We employed a polynomial of degree 63, with binary coefficients. These coefficients would be set arbitrarily upon launch of each flight. We considered that with the contemporary technology, it would take a crypto-analyst at least a week to guess the polynomial, by brute force. By that time, we would not care that he could decode the encrypted message.
Definition A symmetric (Hermitian) matrix is said to be *positive-definite* iff each of its eigen-values is greater than zero. Hence, it is not singular.
Theorem The determinant of a positive-definite symmetric (Hermitian) matrix is the exponential of the trace of the logarithm of the matrix. If the matrix is of infinite order, we would employ a Lebesque-Stieltjes integral -- rather than a sum -- to compute the trace.
You may ask, "How does one compute the logarithm of a matrix?" Well, how do you find any logarithm -- say of a positive-definite real number -- do it the same way. In the older days, the student would say that he looks-up the logarithm in a table; but that tables do not list matrices. Well, that is just passing the buck. Someone had to compute the table. I will take away your table. Now, how do you compute the logarithm of a positive-definite real number? You say that you do not know? Well, *learn*! We already had shown how to compute logarithms, in our study of Trigonometry.
Nowadays, the student would say that he employs his calculator to find logarithms; but that it does not accept a matrix. The rejoinder would be to get a better calculator!
As an interlude, let us consider an equilateral triangle. Directly compute its area, the radius of the circum-circle, and the radius of the in-circle. Do so for each of the plane and spherical cases. Check our general formulae, in the Trigonometry textbook.
------------------
Draw a circle, with a radius equal to two. Draw six radii, each every pi/3. Connect the tips of one set of three alternate radii. Draw a concentric circle of radius one.
Now, you have an equilateral triangle, with sides a equal to 2*sqrt(3), a circum-scribing circle of radius R = 2, and an inscribing circle of radius r = 1. The area of the triangle is 3*sqrt(3).
How does it check with our general formulae in the Trigonometry textbook?
R**2 = a**6 / (4 * 3 * a**4 - (3 * a)**4) = (1 / 3) * a**2
R = a * sqrt(3) / 3
R = (2 * sqrt(3)) * sqrt(3) / 3 = 2 check
r = a**3 / (2 * R * 3 * a) = a**2 / (6 * R)
r = (2 * sqrt(3))**2 / (6 * 2) = 1 check
area**2 = s * (s - a)**3, where s = 3 * a / 2
s - a = a / 2
area**2 = (3 * a / 2) * (a / 2)**3
area**2 = (3 * (2 * sqrt(3)) / 2) * ((2 * sqrt(3)) / 2)**3
= 18
area = 3 * sqrt(3) check
Draw a sphere, with a radius equal to one. Draw the three great-circles defining an octant. This is an equilateral triangle. Draw the circum-circle and the in-circle.
It is conventional to express distances on a sphere in terms of the sine of the corresponding central angle. Hence, these sines are equal to the actual radii of the two circles, in their own planes. Therefore, all we need do, is find the straight-line distances between the vertices of the inscribed triangles.
The area of the spherical triangle is 4 * pi / 8 = pi / 2. The distance between a pair of the vertices of the triangle is sqrt(2). Hence, the circum-radius is
sqrt(2) * R / a = sqrt(2) * 2 / (2 * sqrt(3)) = sqrt(6) / 3.
By symmetry, the in-circle is tangent to the mid-points of the triangle. A pair of these points are at
(i + j) / sqrt(2) and (i + k) / sqrt(2)
The cosine of the central angle is their dot product = 1 / 2.
Hence, the central angle is pi / 3. Thus, the distance between these points is 1. The in-radius is
1 * R / a = 1 * 2 / (2 * sqrt(3)) = sqrt(3) / 3.
How does it check with our general spherical triangle formulae?
This example is worked-out and the circum-radius and in-radius checks. l'Huilier's Theorem for the area gives
s = 3 * side / 2 = 3 * (pi / 2) / 2 = 3 * pi / 4
s - side = 3 * pi / 4 - (pi / 2) = pi / 4
sin(pi/4) = cos(pi/4) = 1 / sqrt(2)
tan(pi/8) = (1 - cos(pi 4)) / sin(pi/4) = sqrt(2) - 1
tan(3 * pi/8) = tan(pi/4 + pi/8) =
(tan(pi/4) + tan(pi/8) / (1 - tan(pi/4) * tan(pi/8)) =
(1 + (sqrt(2) - 1)) / (1 - 1 * (sqrt(2) - 1)) =
sqrt(2) / (2 - sqrt(2)) = sqrt(2) + 1
tan(E/4) = sqrt((sqrt(2) + 1) * (sqrt(2) - 1)**3) = sqrt(2) - 1
E/4 = pi/8 Hence area = E = pi/2 check.
These checks neither prove the formulae, nor even verify them. However, these checks are gratifying. Also, the most likely error it the derivation of any formula is something like a missing factor of two. These checks would catch such an error. Thus, they increase our confidence in the correctness of the formulae.
------------------
My very own Web site, hosted by hostway, at the URL
http//www.rism.com
has made its appearance. Now, I will be able to configure my FTP and up-load the pages. I will look into it tomorrow. I will keep you posted as to these developments.
Let us consider a matrix A, which is more general than a symmetric (Hermitian) matrix.
Can we find eigen-values or eigen-vectors? Will they pair-up into eigen-pairs? Will the eigen-vectors be ortho-normal (unitary)? Will the matrix factor
A = O * D * Otr = O * D * (inverse of O) ?
What will happen if we run any of the several computer programs, which had been successful on a symmetric (Hermitian) matrix, on this new matrix A? Try several different matrices A, are some more amenable to the programs?
The answer is that, depending on the vagaries of the coding-implementation, a program may crash or yield something, sometimes. But, the results are inconsistent and do not provide a satisfactory factorization of A.
Some people bemoan our inability to write a successful algorithm. They admonish others to try harder.
It is akin to asking for an algorithm to solve x**2 - 2 = 0, in terms of rational numbers, or to solve x**2 + 1 = 0, in terms of real numbers. It is not our lack of skill or ingenuity -- it just *cannot* be done.
Remember, we had a theorem that says that for any given ortho-normal (unitary) matrix O and any given diagonal matrix D, the product
A = O * D * Otr = O * D * (inverse of O)
is a symmetric (Hermitian) matrix. The converse we proved from the Jacobi algorithm. Thus we are stuck. Something must give.
Depending upon how we relax our requirements (expectations), we might obtain *something* -- perhaps, something interesting. Certainly, depending on what we do, we would expect to obtain something *different*. Or, the possibility certainly exists that it might be impossible to obtain anything.
Some people have obtained some things, sometimes at least mildly interesting. Each person has named his/her specific results "eigen-value, eigen-vector, eigen-pairs". These concepts are not cross-compatible, however.
When other people write textbooks, they tend to mix and match, usually with disastrous results of incompatibility. Do not believe anything you see in print regarding non-symmetric (non-Hermitian) matrices, without exceedingly careful verification. Even anything regarding Hermitian matrices is suspect.
I will attempt to provide a consistent terminology.
P. A. M. Dirac (Paul Anton Maurice) had a profound knowledge of Mathematics; but he wrote only Physics books. At best, we can only infer the underlying Mathematics, from the Physics problems, which he discusses. Thus, I will admit his inspiration upon me; but it is impossible to tell how much of my results are original, how much are independently obtained, and how much are learned from him.
Dirac did introduce a notation, which I will employ. The () are parentheses, the [] are brackets, and the {} are braces. Direc introduced the <>, which he called pointed-brackets. He also gave each one its individual name The < is the *bra* and the > is the *ket*.
It never fails, the girls, who had been sleeping through class, all semester, suddenly wake up and become very attentive, when we start talking about exposing, finding, or removing the bra. And making a distinction between the external and the internal bra, or between the large and small bra! Did you, Debra, wake up? -)
Definition A permutation matrix is a permutation of the identity matrix.
Definition An elementary permutation matrix is a permutation matrix, with all except two of its elements corresponding to the identity matrix.
Theorem A *permutation* matrix is an ortho-normal matrix.
Theorem A product of permutation matrices is a permutation matrix.
Theorem Any permutation matrix may be factored into a product of elementary permutation matrices.
Remember the definition of an ortho-normal (unitary) matrix
Definition O * Otr = I
from which it follows that (1/O) = Otr. In particular, the determinant of O is not zero (actually has an absolute value of one) and O spans the space -- it is not a divisor of zero.
Theorem Given any real (complex) matrix A and any ortho-normal (unitary) matrix O. If the product O * A = 0 or the product A * O = 0, then the matrix A is null; that is, A = 0. Proof follows from the fact that O is not a divisor of zero.
Theorem Any real (complex) matrix A may be decomposed into the symmetric (Hermitian) and skew-symmetric (skew-Hermitian) parts, respectively
A = (A + Atr) / 2 + (A - Atr) / 2
Furthermore, these two parts added together reconstitute the original matrix A.
Theorem For any real (complex) matrix A, the product B = A * Atr is a symmetric (Hermitian) matrix. Reconstitution of A from B is not obvious.
Theorem For any symmetric (Hermitian) matrix B, the matrix C = B * Btr is positive; that is, each of its eigen-values is non-negative.
Corollary For any real (complex) matrix A, the matrix C = B * Btr is positive; that is, each of its eigen-values is non-negative. Where B = A * Atr.
Theorem For a non-singular real (complex) matrix A, its determinant is the exponential of one-fourth of the trace of the logarithm of C. If the matrix is of infinite order n, the trace has to be computed as a Lebesque-Stieltjes integral. Where C = B * Btr and B = A * Atr.
Theorem Given any real (complex) matrix A. It can be factored
A = <S>
The bra < and the ket > each is a related ortho-normal (unitary) matrix. The *spectrum* S is a diagonal matrix. Where the bra < = O, the spectrum S = sqrt(abs(D)), and the ket > = (1 / S) * Otr * A. And where the matrix B = A * Atr had been factored -- by any suitable algorithm, in particular, the Jacobi algorithm -- into B = O * D * Otr, with O being an ortho-normal (unitary) matrix and D being a diagonal -- of necessity real -- matrix. The absolute value is in the sense of real -- rather than complex -- numbers.
Proof
(1/<) * A * (1/>) = (Otr) * A * ((1/A) * O * S) = S
(1/<) * (<S>) * (1/>) = S
(1/<) * (A - <S>) * (1/>) = 0
A - <S> = 0
A = <S>
QED.
Corollary Iff (= if and only if) the matrix A is symmetric (Hermitian); then the ket is the transpose of the bra and the spectrum is the set of eigen-values. That is, we have our previous factorization, with det(A - xI) = 0 yielding the eigen-values and the O being the set of corresponding eigen-vectors.
Discussion When I first implemented this algorithm, I decided to compute the ket utilizing the same program as had provided the bra, but operating upon the transpose of A. It indeed yielded something equivalent to the ket; but it did not want to correspond to the bra. The ket might have to be multiplied by a proper permutation matrix. And some of its columns might have to be multiplied by minus one. And the columns corresponding to any degeneracy only span the same space as do the corresponding rows of the bra. The ket *could* be fixed-up. But there has to be a better way-- see the foregoing theorem.
It was only after considerable experience with the algorithm, that I realized the necessity of doing something about the occasional negative element in the diagonal matrix D. Negative values did not show-up during the prolonged early testing. When negative values did turn-up, at first I attempted to absorb them into the ket. The theorem provides a better solution.
This is a powerful theorem, with profound ramifications.
It becomes apparent how to invert an ill-conditioned matrix -- in either of two algorithms. Even a singular matrix may be inverted easily. The determinant of a non-square matrix may be computed -- to the amazement of the mathematics instructors at Mt. San Antonio Community College.
An algorithm can be devised for the factorization of the partitioned matrix of the three-channels-in-cascade problem of Information Theory. From which, among other things, in Statistics, the simultaneous principal-factors and regression can be calculated. Thus, resolving the anomaly related to the worsening of the correlation when additional factors are introduced into the mufti-factor regression expression.
The next lesson will be on the twice-inversion. This is your last chance to discover this algorithm.
webmaster@rism.com
Last modified on Thursday 22-nd July 1999
Copyright (c) 1999 by R. I. 'Scibor-Marchocki