copyright (c) 1999 by R. I. 'Scibor-Marchocki
This is a collections of e-mail letters, which I wrote to my very good friend Deb, as a set of lessons in Linear Algebra. I have deleted each salutation, to protect her privacy. If anybody shows interest, someday, I might edit these lessons into a better organized tutorial. So, please write me at webmaster@rism.com.
In better days, you may remember, at the URL
http//www.wp.com/rism
I *used* to have a few Web pages, which included my work-history. This site had been hosted by Four11. Last Autumn, Yahoo purchased all of Four11. Immediately, they announced to us that they would close-down the Web server. This Spring, they did so. Someday, when I will have a spare $200 (per year, payable yearly, in advance), I will re-establish those pages at my *officially* registered (at a cost of $35 per year, payable two years in advance) URL
http//www.rism.com
At present, you will obtain an error message to the effect that a DNS (= Domain Naming Service) entry does not exist.
If you ever looked at the old place, you might have observed that from 1962 to 1968, I worked as a Senior Scientist for the Nortronics division of Northrup Corporation. Actually, I was stationed at an outlying facility in Anaheim, California. It was an hour drive each way.
Northrup had a Research facility on the coast of the Pacific Ocean. Even if they had offered -- they never did offer -- me a position there, it would not have been feasible to drive two-and-a-half hours each way.
In 1996, mostly to brag about how good their Research facility is -- but also to solicit bushiness for himself -- a mathematician from the Research facility came to us in Anaheim to talk about inverting a matrix.
He described the LDU (= Lower Diagonal Upper) and the Gauss-Jordan algorithms. The latter is a simplified version of the original Chinese Gaussian-elimination algorithm. He said that each of these -- as well as most of the other standard good algorithms -- provides similarly good results and is about as fast. That is, as long as the matrix is well-conditioned.
The usual textbook technique for dealing with an ill-conditioned matrix of employing a pseudo-inverse does help -- but not much -- and only if the matrix is only slightly ill-conditioned.
If the matrix is badly ill-conditioned, he has a *secret* way of inverting it. Just bring your ill-conditioned matrix; I will give you its inverse. "Now,", he continued, "if you think that I am going to tell you how I do it; I am not as stupid as I look. This is how I make my living. All I will say is that I invert the matrix twice." He did not elaborate. Obviously, he considered that his cryptic remark would be of no practical value to anyone in his audience. And he did not reveal anything to me anytime thereafter.
I thought about what he said. In a few days, I devised an algorithm that I considered would do well with an ill-conditioned matrix. In 1980, I implemented it. It was fantastic. It worked much better than I had expected and upon extremely ill-conditioned matrices. Now, I have no way of knowing if I had re-discovered his algorithm.
My challenge to you -- or to anybody else, for that matter -- is, come up with an algorithm for inverting an ill-conditioned matrix. See if yours is the same as mine. If it is, then they both might be the same as his. On the other hand, yours may turn out to be even better than mine.
You might choose to post this challenge on our GeoCities site, if you so desire.
Let us all compare notes.
Continuation of our discussion of the inversion of a matrix.
To keep life simple, let us confine ourselves to symmetric matrices, for the purpose of this discussion.
A matrix is said to be singular iff (= if and only if) its determinant is zero.
As follows from the theory of determinants and, in particular, Kramer's rule, a matrix has an inverse iff its determinant is not zero; that is, iff it is non-singular. Hence, we further narrow our discussion to matrices which are non-singular.
As an aside, it certainly is possible to invert a matrix which is not symmetric. It even is possible to invert a singular matrix. However, as we said at the outset, we are trying to keep this discussion simple.
As you may remember, we defined R as the precision ratio. That is, R is the quotient of a typical number by the smallest change that can be distinguished in it. For double-precision floating-point numbers, R is about 10**14.
In the theory of eigen-values, it can be shown that a matrix is singular iff it has an eigen-value of zero. Consider the magnitudes of the eigen-values. It follows that the smallest one is greater than zero. Then, we define the ill-conditioning ratio ill as the ratio of the largest to the smallest absolute-value-of eigen-value. Qualitatively, we say that a matrix is ill-conditioned iff its ill-conditioning ratio exceeds the square-root of R.
The Frobenius matrix norm is defined as the square-root of the trace of the product of the given matrix by its transpose. The trace of a matrix, as you should remember, is the sum of the elements along its principal diagonal.
We define the precision of a computed (alleged) inverse of a matrix A as
Frobenius-matrix-norm(I - A * (computed inverse of A))
where I is an identity matrix of the same size as the matrix A. For an *exact* inverse, this precision is zero. However, the best that one can expect from a floating-point computation is 1 / R . This would be from a matrix whose ill-conditioning ratio is about one, which is as easy as it gets.
Then, for the usual algorithms for finding the inverse of a matrix, e.g., Gaussian-elimination, Gauss-Jordan, LDU (= Lower Diagonal Upper), the precision of the computed inverse is approximately ill / R .
With my "twice" algorithm, if ill / R < 0.1, then the precision that I observed was approximately 1 / R . That is, the best that can be expressed in the available precision. This algorithm requires 4 * n ** 3 multiplications -- four times as many as that required by the Gausian-elimination algorithm. This is a small price to pay for the improvement in precision of the computed inverse of an ill-conditioned matrix.
However, it will take some time until I build-up the necessary library of matrix subroutines, before it will be possible to re-implement this algorithm. Until I do re-implement the algorithm, you have time to see if you can figure out an algorithm of your own that does as well.
Yesterday, Sunday 20-th June 1999, was the last day of the Renaissance Pleasure Faire. We said good-bye to all of our friends, for another year.
I have given you more than that visiting mathematician had given us.
I have told you both now many multiplications my algorithms requires and a quantitative expression for the precision that my algorithm attains. He only stated that his algorithm works for ill-conditioned matrices and gives usable results. He did not provide any quantitative values for either.
Now that I have given quantitative performance values, you have a specific target at which to shoot.
My algorithm is a vast improvement over the best available in the linear-algebra mathematics or computer-science textbooks. The best that they have to offer is the pseudo-inverse, which is not significantly better then the standard algorithms Gausian-elimination, Gauss-Jordan, or LDU.
And hurry-up. It took me only three or four days to devise my algorithm.
The definition of a determinant provides an algorithm for its computation. However, for anything except small matrices, the amount of multiplications is unacceptably large n! That is, for an n x n matrix, it requires n factorial multiplications to compute the determinant.
Kramer's rule gives the necessary and sufficient condition for the existence of the inverse of a matrix, namely that its determinant be non-zero.
Then, Kramer's rule requires (n + 1) determinants to be evaluated, to find the inverse of a matrix. Together that is (n + 1)! multiplications.
This computation may be speeded-up. Each of the usual algorithms for inverting a matrix -- running in one to less than ten times n cubed multiplications -- yields the determinant of the matrix, as a by product. Then, employing Kramer's rule to invert a matrix, we need only one to ten times n to the fourth power multiplications.
However, the first computation of a determinant already gave us the inverse of the matrix. It is much faster to quit there, while we are ahead.
Thus, while both the definition of a determinant and Kramer's rule are vital for the development of the theory of matrix inversion, neither is practical for the actual computation.
So, let me know when you have found how to invert an ill=conditioned matrix.
The Spring semester 1999, at our Mt San Antonio Community College, we have a new instructor for FORTRAN. I know, it is an obsolescent language; but we cannot afford to purchase a modern C/C++ compiler in enough copies for the two or three laboratories where the computer-science students would need a compiler.
This instructor has his own problems, which he assigns to the students. Near the end of the semester, he asked the students to write multi-precision integer add and multiply subroutines.
Most of the students did not know how to add or multiply by hand. They always use calculators. They needed extensive help. And they did not even think about division or floating point. They made multi-precision arithmetic look *very* difficult!
Actually, multi-precision arithmetic is no different from any other multi-digit arithmetic. However, since it has to be implemented in software, it is *much* slower than the single-precision running in hardware.
What is the state of the art of ill-conditioned matrix inversion?
Ill-conditioning is in the eye of the beholder. Remember, we said that a matrix is ill-conditioned iff (= if and only if) its ill-conditioning ratio exceeds the square-root of the precision R. Furthermore, it may be shown that the worst possible ill-conditioning ratio is equal to the precision R.
It follows that if one doubles the precision (actually, squares R), the given matrix will not be ill-conditioned, any more.
A further complication is that multi-precision arithmetic is implemented upon integer hardware arithmetic. In most computers, the integer arithmetic has only half the precision of the double floating-point hardware arithmetic. Thus, to obtain twice the precision of double, one has to employ quadruple-precision of long. And implement carry and the exponent in software.
Hence, it takes 4x4=16 multiplications to accomplish one multiplication. And there is the overhead for the carry and the exponent.
As a result, it takes much more than sixteen times as long to invert an ill-conditioned matrix -- employing multi-precision arithmetic -- than it does to invert a well-conditioned matrix. And great complexity of coding.
In contrast, my algorithm is easy to code and requires only exactly four times as many multiplications. Thus, it is more than four times as fast as the multi-precision approach.
Probably, with modern computers, the speed or complexity of an algorithm is not important, any more? Just throw more hardware at it. By Moor's law, the speed of the CPU doubles every 18 months. Then, my algorithm is equivalent to some four to six years of progress in the hardware.
The foregoing is just my *modest* bragging. -)
webmaster@rism.com
Last modified on Thursday 22-nd July 1999
Copyright (c) 1999 by R. I. 'Scibor-Marchocki