copyright (c) 1999 by R. I. 'Scibor-Marchocki
This is a collections of e-mail letters, which I wrote to my very good friend Deb, as a set of lessons in Linear Algebra. I have deleted each salutation, to protect her privacy. If anybody shows interest, someday, I might edit these lessons into a better organized tutorial. So, please write me at webmaster@rism.com.
There are three (3) ways of introducing complex-number algebra. I list them in order of decreasing popularity.
Zero. Define i as the square-root of minus one. And proceed in the usual way with a+ib, where a and b are real-numbers. I call this method "zero"; because, it is flawed logically. It makes no sense to define anything in terms of something which you do not know. The initial statement, "Define i as the square-root of minus one." is meaningless. You do not know what the square-root of minus one is. Hence, strike this method.
One. Begin with a division-ring, for instance, the field of real-numbers. Remember, any field, of necessity, is a division-ring. Now, adjoin i as the root of the equation i**2 + 1 = 0. By saying, "adjoin", we mean that we do not claim to have even the faintest idea of what i might be. However, we are telling that whenever we encounter i**2, we are going to replace it by -1. Another way of saying it is, we are going to perform all calculations modulo the aforementioned equation.
In passing, the ECC (= Error Correcting Code) employed in disk-drives and the RAM of some computers is designed around arithmetic modulo a primitive polynomial; but typically of degree around 30.
Two. Begin with a division ring. Consider an ordered-pair (a b). Define addition as (a b) + (c d) = (a+c b+d). Define multiplication by a scalar u as u*(a b) = (u*a u*b). Define multiplication as (a b) * (c d) = (a*c-b*d b*c+a*d). It is easy to show that we have a new division-ring. Indeed, had be begun with a field, we would have obtained a field. It is easy to show that this algebra is isomorphic to that of method one.
Optionally, we may define the complex-number one as (1 0) and the complex number i as (0 1). Now, we have obtained the usual complex-algebra.
A disadvantage of this method is that we had to introduce an ad-hoc algebra for these ordered-pairs.
This method is obscure, however.
Three. Begin with a division-ring. Consider the matrix (a b -b a). It is easy to show that the usual matrix-algebra is isomorphic to each of the preceding two methods. Taking the transpose of this matrix yields the complex-conjugate of the transpose of each element -- just what we want. It is called a Hermitian-transpose.
Optionally, we may define the complex number one as (1 0 0 1) and the complex number i as (0 1 -1 0). Now, we have obtained the usual complex algebra. Furthermore, we did not need to introduce any ad-hoc algebra; thus did not need to prove that we have a division ring.
Again, had be begun with a field, we would have obtained a new field.
A disadvantage of this method is that it requires twice the storage, twice the amount of additions, and twice the amount of multiplications as method two. Unless, that is, one writes specific subroutines. The advantage of this method is that no specific subroutines are required. Thus saving the effort of writing, checking, distributing, storing, and maintaining the specific subroutines.
This method is well-known only among the *very* few people who know it.
At the level of mathematical-algorithms, any algorithm for a real-matrix immediately takes care of the complex situation. So does any computer subroutine. However, each subroutine could be specialized for complex, at a saving of storage and computing-time.
There are a set of names for complex-matrices. I give some, with their real counterparts
Hermitian symmetric
skew-Hermitian skew-symmetric
non-Hermitian non-symmetric
unitary ortho-normal
Hermitian-transpose transpose
Frobenious matrix norm Frobenious matrix norm
Hermitian-transpose transpose
and take the complex-conjugate
Frobenious matrix norm absolute value of a complex number
Electrical (including electronics) engineers -- and to a lesser extend physicists -- like to use complex-algebra. While, of course, mathematician *know* complex-algebra, they seldom use it. After all, the existence of the isomorphism between method three and the other methods subsumes complex-algebra into real-algebra.
Thus, for the most part, I will not mention complex-algebra any more.
Are you interested in these tutorials? Are they comprehensible to you? Are they at the right level -- would you like more or less detail? More background? Do you have any questions? Please be specific. Also, please take part in the implicit discussion of the subject.
The word "inverse" has various meanings, dependent upon the context. The inverse of a function -- usually written as the -1 exponent of the function -- is entirely different from an additive inverse -- the negative of a number -- or the multiplicative inverse -- the reciprocal of a number.
When we say "inverse of a matrix", we mean the *multiplicative* -- rather than the additive -- inverse. Thus, the inverse of a matrix is its reciprocal.
What is the reciprocal of the complex number 3+i4 ? Said another way, what is 1 / (3+i4) ?
You undoubtedly were taught the answer as an ad-hoc inscrutable (3-i4)/25 . Try it employing matrices. The given number is (3 4 -4 3) . Use my algorithm for inverting this matrix, to obtain (3 -4 4 3)/(3*3-(4*(-4))) = (3 -4 4 3)/25 . My algorithm -- or any other algorithm for finding the inverse of a matrix -- was blissfully unaware that it had been dealing with a complex number!
Was it not much easier? Nothing ad-hoc. Nothing new to learn or to have to remember.
Any two matrices A and B commute under addition; that is, B * A = A * B . Proof Under addition, any ring is Abelian (commutative). This property of matrices follows from the same property of the division-ring over which the matrices are defined.
However, only certain matrices commute under multiplication.
Theorem Given a matrix A and two rational algebraic functions f(x) and g(x), f(A) and g(A) commute; that is, g(A)*f(A) = f(A) * g(A).
Proof The identity matrix I and A commute. By Mathematical Induction, I and A**n commute. By Mathematical Induction, A**m and A**n commute. A matrix and its inverse commute. This property of matrices follows from the same property of the division-ring over which the matrices are defined. Alternatively, it is obvious from my algorithm for the inverse of a matrix.
In particular (I - A) and the inverse of (I+A) commute. Also, (I+A) and the inverse of (I-A) commute.
A *bilinear function* is the quotient of two linear functions of the same variable.
The bilinear function f(x)=(1-x)/(1+x) is *idem potent*; that is f(f(x)) = x, for any x. Proof (1-(1-x)/(1+x))/(1+(1-x)/(1+x)) simplifies to x. Likewise, the bilinear function g(x)=(1+x)/(1-x). The concept of idem potency and that these functions are idem potent are not widely known.
Performing the indicated division, f(x) = 1 - 2x/(1+x) and g(x) = 1 + 2x/(1-x) . If x is a matrix, we would replace each of the one's by a compatible identity matrix.
It is well known by the *very* few people who know, that if x is a skew-symmetric (skew-Hermitian) matrix, then f(x) and g(x) are ortho-normal (unitary). Also that if x is ortho-normal (unitary), then f(x) and g(x) are skew-symmetric (skew-Hermitian). Proofs are obvious.
Remember that a skew-symmetric (skew-Hermitian) matrix multiplied by a scalar still is skew-symmetric (skew-Hermitian).
Given any ortho-normal (unitary) matrix x, the function
h(phi) = f(f(x)*tan(phi/4))
is a family of ortho-normal (unitary) matrices. We have h(pi) is x.
This function h(phi) is a generalization (to n-dimensions for an nxn matrix) of the concept of a rotation around an axis through an angle phi. It also may be considered a generalization of the cross-product. The function g(x) works, also.
We will return to this function h(phi), when we have eigen-pairs. Then we will be able to find the axis of this rotation.
Most instructors/professors prepare several hours for each hour of lecture, even though they teach the same class repeatedly. Indeed, they specialize in only a few classes -- I would be bored to death! Also, they usually lecture from their own notes, or at least, from the textbook.
I have not seen a textbook of linear algebra, in many years. And I had not written this material previously -- so there are no notes for reference. Besides, I am too lazy ever to prepare for a lecture. -) Furthermore, most of what I am writing now is not available readily in any textbook. Indeed, its unavailability is my motivation for writing it now.
Please pardon the lack of polish -- consider these notes as a first draft. And please check for factual errors.
Also, any comments, criticisms, suggestions, or questions are most welcome. Suggestions for additional material which I should include, lilkewise are welcome.
Present-day textbooks of College Algebra have been dumbed-down.
No longer do they define a determinant. They present only a cursory description of how to evaluate one. Admittedly, it is not practical to evaluate, numerically, a determinant, from its definition; because it would require n! (the factorial of n) multiplications to evaluate the determinant of an n by n matrix.
Furthermore, Kramer's rule requires the evaluation of (n+1) n-size determinants, to solve a system of n simultaneous linear algebraic equations, for a total of (n+1)! multiplications. Entirely not practical, especially in view of the much smaller requirement of other algorithms -- mostly one to eight times the cube of n.
On the other hand, textbooks of beginning Linear Algebra assume that the student already knows determinants; thus they neither define determinants nor present algorithms for their calculation.
We have to turn to older textbooks. In the 1940's and early 1950's, the _College Algebra_ by Rider was the most prominent one -- we used it in our Highland Park Senior High School.
A determinant of a square matrix is the sum of the products of its elements, taken one from each row and from each column. A minus sign is to be prefixed to any term, whose factors are an odd permutation of the principal diagonal.
This definition immediately leads -- by Mathematical Induction -- to the recursive evaluation by minors of a row (or column). The minor of an element is the matrix obtained by removing the row and the column which contain the given element. The value of the determinant of a matrix is the sum of the products of each of the elements of a row (or column) by the determinant of its minor. Depending on the parity of the taxicab-metric distance from the left-upper corner of the original matrix to the given element, a minus sign is to be prefixed for an odd parity.
Now, employ either the Gauss-Jordan algorithm or the Gaussian elimination algorithm, to make all except the given element in its row (or column) zero. Then this algorithm requires only n cubed multiplications. Times 1.5, in the case of the easier Gauss-Jordan algorithm..
Now, employing the Kramer's rule, the solution of the equations requires one or 1.5 times the cube of (n+1) multiplications.
However either the Gaussian elimination or the Gauss-Jordan algorithm may be employed to solve the equations directly. Then only one or 1.5 times the cube of n multiplications are required.
In the previous e-mail, I employed the taxicab metric. In retrospect, I realize that you may not be familiar with it.
Let us take a ride in a taxicab!
As you may remember, in the later 1970's, I was a tutor at the Mt. San Antonio Community College -- where I am employed currently.
One day, a student had come to me, asking that I compute the distance between a pair of points for him. I responded that I would be glad to do so, if he were to tell me how to do it. He was indignant! He said that if he knew how, he would not have bothered to ask me. He departed in a huff!
He went to our supervisor, who must have concluded that I did not know how to solve the problem -- rightly so, that is precisely what I had told the student. Then the supervisor would have proceeded to solve the problem in the only way he knew -- employing the Euclidean metric.
There are various metrics. There are certain properties one expects a metric to posses; but not every metric possess all of them.
It is said to be a *pseudo-metric* iff (= if and only if) there are *null* (of zero length) lines. That is, if there exists a pair of distinct points, the distance between which is zero. Even worse, the square of the distance may be negative, in which case the metric is not positive. A metric is *positive-definite* iff the distance between any pair of distinct points is strictly greater than zero.
For instance, in Special Relativity, the distance between two points is the square-root of
(x2-x1)**2 + (y2-y1)**2 + (z2-z1)**2 - (t2-t1)**2
A *World-line*; that is, a line connecting the source and the sink of a light-beam, has a length of zero. And a line of a *tachichron* has a negative value for the foregoing expression -- its square-root would be purely-imaginary. Thus, Relativity is not positive-definite. And, if such particles exist -- they never have been observed and their theoretical possibility is dubious -- Relativity is not positive.
The triangle law. The sum of the lengths of two sides of a triangle is strictly greater than the length of the third size. Equality iff the triangle is degenerate -- the apex is on the third side. A metric may violate this law.
Reflectivity law. Given any pair of points, the distance from B to A is equal to that from A to B. A metric may violate this law.
Conservation law. The distance around any closed path is zero. Another way of saying this, is that the distance between two points is independent of the path. Sometimes, we may stipulate that the path not surround any *singular* points. There are some metrics that satisfy this law. For instance, by Maxwell's electromagnetism, the energy required to transport a charged particle is conservative. The energy to transport a puck on an air-table comes close. However, if friction is present, then we have a non-conservative system.
Another conservative system is Thermodynamics.
We define a**x as the exponential of x times the logarithm of a.
The p-metric is defined as the p-root (the exponent is the reciprocal of p) of the sum (or more generally, the Lebesque-Stieltjes integral) of the p-power (the exponent is p) of the absolute value of the difference of each coordinate.
With p=2, we have the usual Euclidean-metric. With p=1, we have the taxicab-metric. This is what the odometer of a taxicab shows, as it drives on a rectangular grid.
On the other hand, the taxicab might be concerned with the fuel consumption. Then we have the conservative law iff he has perfect regenerative-braking, and no frictional or air-resistance losses. Otherwise, he has a non-conservative metric.
.... How much longer do you want me to talk?
Like I said to that student, I will be glad to compute the distance, if he explains to me precisely how he wants me to do so!
Do you want to take another ride in a taxicab?
The following can be done in n-dimensional space; the examples will be shown in four-dimensional space, however.
Remember the p-metric? It was the p-root of the sum (or Lebesque-Stieltjes integral) of the p-power of the absolute value of the difference of each of the coordinates of a pair of vectors. the p may be any real-number, except zero. The p-metric is isotropic (the same in each direction), positive-definite, orthogonal (the coordinates are at right-angles to each other, in pairs), and homogeneous (the same at each point of the space).
With p equal to one, we have the taxicab-metric The sum of the absolute values of the difference of each of the coordinates.
In the limit, as p increases without bound -- for short, we may say, p is infinite -- we have the metric The largest of the absolute values of the difference of each of the coordinates, provided that there is only one largest one.
With p equal to two, we have the Euclidean-metric The square-root of the sum of the squares of the difference of each of the coordinates. This is the square-root of the product (b-a)*(b-a). Multiply it out, to obtain the square-root of the tetra-nomial (a*a - a*b - b*a + b*b).
Let us be lazy -- I *always* attempt to do so. If we are going to be studying the properties of the foregoing expression, why bother taking square-roots all of the time? We may as well consider each of the vectors a and b to be unit; that is, their squares to be one. By now, we are left with (2 - (a*b + b*a)). Finally, let us study just the inner-product a*b. Whenever we want b*a, we just interchange the vectors.
Tensors may have any integral rank, greater than or equal to zero. Zero corresponds to a scalar. One corresponds to a vector. Two corresponds to a matrix. There are no specific names above that; but, through seven is common.
Tensors of the whose rank has the same parity are related A generalization or a specialization always preserves the parity of the rank of the tensor.
Let us generalize the foregoing metric, as the product a*m*b, where m is a second-rank tensor. Since, before the generalization, the middle factor was a one -- a scalar -- the generalization has to yield a *second* rank tensor.
In the Euclidean case, our example is x1*x2+y1*y2+z1*z2+w1*w2. Its metric-tensor is the identity (1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1).
In the case of Special Relativity, we have x1*x2+y1*y2+z1*z2-t1*t2. Its metric-tensor is the diagonal-matrix (1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 -1). It is not isotropic; because the values along the diagonal are different (more generally, we would say that the eigen-values are not all the same). It is not positive-definite; because of the minus sign.
If the coordinates are not orthogonal, the metric-tensor will have a matrix which is not diagonal.
Finally, the metric-tensor may not be a constant function of position.
Now, we have arrived at Riemannian-geometry. If you chose to disembark from the taxicab, read the book _Gravitation_, written by three professors of the California Institute of Technology, in the early 1970's. It discuses Riemannian-geometry, which being mathematics, is timeless. It provides only a cursory look at Special-Relativity. It discusses General-Relativity, as implemented in Riemannian-geometry. Again, since it does not depart much from mathematics, it likewise is timeless. Of course, there is no such thing as gravity or a gravitational-force. All we have is the geometry of the space. Objects always travel in a straight-line (which we chose to call a "geodesic"). Thus, the only thing this book does *not* discuss it the of its title! Finally, this book discusses Cosmology (no, not cosmetology -- it is the *same* word; but spelled differently), which, being physics, has become somewhat dated.
The last man to whom I recommended this book, had bought it, looked at it, and he has not talked to me since. But, you are a *woman*, so it should be safe to have you look at it. -)
A vector is said to have been normalized or to be normal iff (= if and only if) its norm is one; that is a*a = 1.
A pair of vectors are said to be orthogonal iff their inner (dot) product is zero; that is a*b = 0. Just to confuse the terminology, we may say that the vectors are normal to each other.
Combining the two concepts, we say that a pair of vectors is ortho-normal iff a*a = 1, a*b = 0, and b*b = 1.
A set of n vectors in n-dimensional space may be written as a square nxn matrix, with the vectors being, say, the rows. Alternatively, they could be placed as the columns.
A matrix is said to be orthogonal iff O * Otr is a diagonal matrix. The "tr" indicates that we are taking the transpose of the matrix.
A matrix is said to be ortho-normal (unitary) iff O * Otr = I, where I is an identity matrix of the same size.
Theorem If O is an ortho-normal (unitary) matrix, then Otr * O = I. The proof is by Mathematical Induction. Because of our recursive definition of a matrix (Do you remember partitioned matrices? If not, then these tutorials are being wasted on you!), it suffices to prove for a 2x2 ortho-normal matrix. The most general skew-symmetric 2x2 matrix, with a determinant of one is (0 1 -1 0). We have constrained ourselves to an arbitrary fixed determinant; because, we are going to be multiplying by an arbitrary variable scalar. Employ a bilinear transformation (remember?), to obtain the most general 2x2 ortho-normal matrix O = (cos(phi/2) sin(phi/2) -sin(phi/2) cos(phi/2)). As a check, O * Otr = I. But, also Otr * O = 1. QED.
Corollary An ortho-normal (unitary) matrix commutes with its transpose. Proof By the previous theorem, both products are the identity matrix.
I had asserted -- without proof -- that matrices constitute a division-ring. The proof is straight-forward -- mostly Mathematical Induction --; but, tedious.
One of the properties of a division ring is that the inverse of a product (a * b) is (inverse of b) * (inverse of a). Notice that the two factors transpose.
Theorem The inverse of a product of elements of a division ring is the product -- in reverse order -- of their inverses. Proof is by Mathematical Induction upon the amount of factors. This theorem is true of matrices, in particular.
Theorem The determinant of a product of matrices is the product of their determinants. The proof, again, is straight-forward Mathematical Induction upon the size of the matrices for a pair of matrices. Employ the recursive (partitioning) of a matrix. Then Mathematical Induction upon the amount of factors.
Corollary The determinant of an ortho-normal (Hermitian) matrix is either plus or minus one. Proof Hint Multiply the matrix by its transpose and find the determinant.
Proposition The determinant of the transpose of a matrix is equal to that of the matrix.
Proposition The determinant of the inverse of a matrix is the reciprocal of the determinant of the matrix.
Theorem The product of ortho-normal (unitary) matrices is ortho-normal (unitary). The proof is by Mathematical Induction upon the amount of factors.
Corollary The transpose of a product of ortho-normal (unitary) matrices is the product -- in reverse order -- of their transposes,
Theorem Given an ortho-normal (unitary) matrix O and a symmetric (Hermitian) matrix S, the product O*S*Otr is symmetric (Hermitian). Proof is obvious. This theorem is a subtle generalization of the theorem we had last time, which said that O*Otr is symmetric (Hermitian).
Corollary The determinant of the aforementioned product O*S*Otr is equal to the determinant of S.
Corollary Given any scalar x, the determinant of the expression (O*S*Otr - x I) is equal to the determinant of (S - x I).
Proposition A diagonal matrix is symmetric (Hermitian).
Theorem A diagonal symmetric (Hermitian) matrix is real. For a diagonal symmetric matrix, it is trivial; because, a symmetric matrix is real, by definition. For a diagonal-Hermitian matrix, it is a surprise. It follows from the fact that a diagonal matrix has zero in the right-upper and left-lower partitioned-quadrants.
This theorem and the preceding proposition may not sound like much right now, but they will have an important consequence when we study eigen-values.
Theorem For a given n by n matrix A, the determinant of (A - x I) is of degree n.
Corollary The equation det(A - x I) = 0 has exactly n roots. They are called the eigen-values of the matrix A.
Is it feasible to compute with matrices?
Consider an nxn matrix.
Polynomial time is feasible. Neither exponential nor factorial times are feasible.
In 1845, Jacobi invented his algorithm for finding the eigen-pairs of a matrix. By 1847, he had worked-out an n=6 matrix, by hand.
In 1962, I planed that I never would need more than n=20, to demonstrate everything. I might need to do a few dozen matrices, all told. In those ancient days, computers could do a few thousand floating-point multiplications per second.
Addition takes n**2 additions; no multiplications. Certainly feasible.
Multiplication takes n**3 multiplications. You would not want to do it by hand; but fast on any computer.
Going by the definition of a determinant and Kramer's rule, inversion of matrices takes (n+1)! multiplications. Not feasible.
Several algorithms are available which require one to ten times n**3 multiplications, to invert a matrix. That is, no more than ten times as long as a multiplication. Certainly feasible.
Now, we come to the difficult thing finding eigen-pairs.
From more than a century of experience, the Jacobi algorithm takes between n**4 and n**5 multiplications -- when it works. But, it would be only around 1980 that mathematicians succeeded in establishing a proof of convergence. So, in the 1960's, there was no guarantee for any given matrix. Very good; but not dependable. Cannot bet the farm on it.
For a non-singular, non-degenerate matrix, the multiplication algorithm is understood well. Its convergence formula is available. It would require thousands, perhaps tens of thousands, of matrix multiplications. Not completely impossible; but not something you would want to do, even with a computer.
So, I invented my squaring algorithm. It would require only tens of matrix multiplications. Certainly feasible. Eventually, when I implemented it on a midi-computer, I found that on typical matrices, between six and sixteen matrix multiplications sufficed. Because of its close relationship to the multiplication algorithm; the convergence guarantee carries over. This could become my workhorse.
I studied matrices. I have worked-out the intricacies of the degenerate matrices. Between 1976 and 1981, I wrote all of the subroutines involved. The first matrix which I computed was n=4, on a Texas Instruments TI-49 hand-held calculator. It took 35 minutes to find all of its eigen-pairs.
In 1981, on a midi-computer, employing the Jacobi algorithm, it took from less than a second for n=12 to less than an hour for n=20. However, while it can handle degenerate matrices; it does not display the structure of the degeneracies. My squaring algorithm was only slightly slower. It displays all of the structure of the degeneracies.
All of the nasties could be demonstrated with just one n=12 matrix. Thus we could perform all of the desirable demonstrations. However, for practical, interesting problems, n=120 would be desirable. It was out of reach at that time.
Today, the Jacobi algorithm should be able to do n=120 in seconds. With a good user interface, computations would be faster than the time necessary to decide upon a particular analysis and to set it up.
In 1981, we were ahead of our time. We could not interest anyone in our work. Today? Well, I am beginning to re-implement everything. We will see if anyone is interested.
Will anyone be able to understand it. The Linear Algebra, perhaps. The Information Theory -- upon which it all rests -- doubtful.
Actually, all that I intend to accomplish, is to submit the result as a pair of theses for a master's degree and a pair of dissertations for a Ph.D. degree, each with a double-major mathematics and computer science. I will attempt to interest some graduate school. We will go a step at a time.
In 1990, I planed to be able to afford to commence this work in the Summer of 2000. I expected that it would take some three years. So, I am starting a year ahead of schedule. But transcribing some of my old notes may take much longer that I originally expected.
I will post the results, as they become available, on my very-own WWW site
http//www.rism.com
I already paid, in advance, just under $200 for the first year. It should be up and running in a matter of a week or two. We will wait and see if anyone becomes interested. I will be most astonished if I receive *any* inquiries!!!
Consider a polynomial, of degree n, in one variable. It is a conditional equation, when the polynomial is set equal to zero. Find the roots.
Gauss Fundamental Theorem of Algebra -- but it takes Calculus to prove -- tells that there is one root. As a corollary, there are exactly n roots. If the coefficients are real, these roots may come in complex-conjugate pairs. If n is odd, there thus must be at least one real root.
For n=1 -- a linear equation -- the root is obvious.
For n=2 -- a quadratic equation -- the have the quadratic formula.
For n=3 -- a cubic equation -- and n=4 -- a quartic equation -- the algorithms had been discovered before the time of Galois. How someone accomplished it, is beyond me. -)
It also had been shown that an alternating group of order five or higher is simple. "Simple" does not mean "easy", however. The decompositions of the alternating groups of lower order had been accomplished, as well.
Galois -- with his Galois Field Theory -- provided a systematic way of obtaining algorithms for the first four degrees. By the relationship to the alternating group, he showed that an algorithm for n greater-than-or-equal-to-five is impossible.
Consider the problem of finding eigen-values and eigen-vectors; that is, eigen-pairs.
Usually, a textbook of Linear Algebra will restrict itself to a symmetric matrix. Notice, that I am *not* placing Hermitian in parentheses.
Anything beyond symmetric becomes confusing and ambiguous. Not least because of inconsistencies in terminology. Furthermore, some people apparently are unaware of the theorem that the product O*S*Otr is symmetric. Thus they either attempt to find such a factorization of a non-symmetric (non-Hermitian) matrix or decry its unattainabilty.
For the purpose of our discussion, let us restrict ourselves to symmetric matrices, which are neither singular (no eigen-value equal to zero) nor degenerate (no repeated eigen-values). Of course, unless we had constructed the problem from known elements, we cannot know the structure of a matrix, before we analyze it. But, if we take an arbitrary, (pseudo) random matrix, the probability of it being either singular or degenerate is zero -- not impossible, but certainly unlikely. In passing, did you know that a probability of zero does not preclude the occurrence of an event?
Once we are thoroughly familiar with symmetric matrices, I will show what can be accomplished with other matrices.
It means that there is no (finite) algorithm possible for finding the eigen-values -- hence eigen-pairs -- of an arbitrary matrix of size greater than n=4. In practice, the analytic solution for n=3 or n=4 is more difficult than a numerical solution employing either the squaring or Jacobi algorithm.
It had been know for some time, that once the set of eigen-values is known, the corresponding eigen-vectors can be obtained. And that the eigen-vectors are orthogonal. They may be normalized. Thus, the set of eigen-values becomes an ortho-normal matrix.
The given matrix A then may be factored as
A = O * D * Otr
where D is a diagonal matrix -- a list of the eigen-values. From a theorem some time ago, it follows that the eigen-values of not only a symmetric, but also of a Hermitian, matrix are all real. Surprise, we have a polynomial equation
0 = determinant of (A - x I)
with complex coefficients; but with guaranteed *real* roots!
There are several proofs required for the foregoing factorization -- I do not have time for them tonight.
webmaster@rism.com
Last modified on Friday 23-rd July 1999
Copyright (c) 1999 by R. I. 'Scibor-Marchocki