Linear Algebra

Selected theorems

and

Documentation of the Shareware Library

 

=== algorithms for inversion of a real matrix ===

In the Complex chapter, we will show how to extend this Inversion chapter to be applicable to a complex matrix.  At this time, however, we consider each matrix to be real.

Theorem:  Given a non-singular matrix A and a horizontal vector B, the matrix equation A Xtr = Btr may be solved for the unknown horizontal vector X by pre-multiplication by the inverse of A, to obtain Xtr = Ainv Btr.  Kramer's rule states that the j-th element of X may be obtained as the determinant of a matrix obtained by the replacement of the j-th column of A by the Btr.  This determinant to be divided by the determinant of A.

Proof:  Take A = (a, b; c, d), B = (u, v), and X = (x, y).  We know that the inverse of A is Ainv = (d, -b; -c, a) / det(A).  Hence, Xtr = (d, -b; -c a) (u, v)tr / (a d - b c) = (d u - b v; a v - c u) / (a d - b c) = (x, y).  The Kramer's rule gives us x = det(u, b; v, d) / det(a, b; c, d) = (d u - b v) / (a d - b c) and y = det(a, u; c, v) / det(a, b; c, d) = (a v - c u) / (a d - b c).  Hence Xtr = (d u - b v; a v - c u) / (a d - b c), which is the same result.  Just to make sure, let us check this answer by substitution into the original equation:  Btr =? A Xtr = (a, b; c, d) ((d u - b v, a v - c u) / (a d - b c)) = (a d u - a b v + a b v - b c u; c d u - b c v + a d v - c d u) / (a d - b c) = (u; v) = Xtr.  QED.

Lemma:  For any two arbitrary ortho-normal matrices O and Q and a given non-singular matrix A, its inverse is Ainv = Q (O A Q)inv O.

Proof:  Ainv =? Q (O A Q)inv O = Q (Qtr Ainv Otr) O = (Q Qtr) Ainv (Otr O) = I Ainv I = Ainv.  QED.

Lemma:  For any arbitrary ortho-normal matrix R and a given non-singular matrix A with a given factorization A = U V W, the inverse of A is Ainv = Rtr ((R U Rtr) (R V Rtr) (R W Rtr))inv R = Rtr (R W Rtr)inv (R V Rtr)inv (R U Rtr)inv) R.

Proof:  Continue, to obtain = Rtr (R Winv Rtr) (R Vinv Rtr) (R Uinv Rtr)) R = Winv Vinv Uinv = (U V W)inv = Ainv.  QED.

Theorem:  Obviously, we may combine these two lemmata.  For the special cases where each of the ortho-normal matrices is a permutation matrix, these operations are known as pivoting.

Reputedly, pivoting improves the precision of the computation of an inverse of a matrix.  Forsooth, examples may be contrived of non-singular matrices which cannot be inverted without the use of pivoting.  For instance, the proper permutation matrix (0, 1; -1, 0) cannot be inverted, even with pivoting along the diagonal.  It requires pivoting along a row, column, or the whole remaining body.  However, in practice, for most matrices the improvement in precision is modest, at best; while the CPU resources required to pivot are comparable to those required to perform the multiplications of the inversion.

The tricky part of the LDU and Gauss-Jordan algorithms is to perform the steps in such a sequence that only one unknown occurs in each step and that whatever the value that is determined for this unknown overwrites is not needed for any future step.

== The LDU algorithm  ==

In the LDU algorithm, L means the left-lower triangle.  D means the diagonal.  U means the right-upper triangle.  Since the L and the D has each of its diagonal elements equal to one, they are not stored so, explicitly.  DU means the product of the D and the U.  Since its diagonal has non-trivial elements, it is stored explicitly.  Likewise, LDinv means the product of the Dinv and the Linv.  Again, since its diagonal has non-trivial elements, it is stored explicitly.

The strategy of the LDU algorithms consists of the five steps:  Factor the matrix A into A = L DU and, on the fly, invert the D.  Factor DU into DU = D U.  Invert the L and the U.  Multiply Dinv Linv = (Dinv Linv).  Multiply Uinv (Dinv Linv) = Ainv.  There is a quota of at most n divisions for the whole algorithm.  As with the following Gauss-Jordan algorithm, the challenge is to perform the inversion insitu.

Step one.  Factor the matrix A into A = L DU.  Proceed from the top left corner in a herring-bone pattern.  Indirectly pivot.  First the row, right from the diagonal.  Perform the usual matrix multiplication of the L by the DU, up to but not including the diagonal element of the L.  Subtract the result from the next element.  Second, on the fly, invert the diagonal element.  If a particular element of D had been zero, by the convention of this algorithm, its reciprocal is zero.  This is our “best effort” at the inversion of a singular matrix A.  Then third, the column, down from just under the diagonal.  Set a temporary variable to the original value of the element which you are calculating,  Subtract each of the partial products.  Multiply by the diagonal element, which you just have inverted.  Store the result into the element, which you hereby have finished calculating.  Continue down.  Repeat these three sub-steps, in the herring-bone

Step two.  Factor the DU = D U.  Each row is independent.  It is easiest to start from the top.  Take the element of the D and multiply each element to its right by the value of the element of the D.

Step three.  Invert the L and the U.  These inversions could be accomplished in either sequence.  However, we intersperse them into a single step.  Start from the top left corner for the L and the bottom right corner for the U.  Intersperse the corresponding elements – first that of the L then that of the U.  For the L, work down each column, from the diagonal.  Then, do the next column, to the right.  For the U, work up each column, from the diagonal.  Then, do the next column, to the left.

Step four.  Multiply Dinv Linv, to obtain (Dinv Linv) = LDinv.  Each row is independent.  It is easiest to start from the top.  For each element of the Dinv, multiply each element of Linv, which is to the left of the aforementioned element of the Dinv, by that element of the Dinv.

Step five.  Pre-multiply the foregoing product by Uinv, to obtain (Uinv LDinv) = (LD U)inv = Ainv.  Proceed from the top left corner in a herring-bone pattern.  First the column, down from the diagonal.  Then, the row, right from just the right of the diagonal.  Repeat these two sub-steps, in the herring-bone.  Perform the usual matrix multiplication of the Uinv by the LDinv, up to but not including the diagonal element of the U.  Then, the diagonal element, by implication, is one.

Step six.  Un-pivot by dropping the indirection.  There is no code involved to do so -- just exit the routine.

=== The QR algorithm ===

For a given matrix A, employ the Gramm-Schmidt ortho-normalization algorithm, to obtain the ortho-normal matrix O and the triangular matrix R.  Thus, A = O R.  Its inverse then is Ainv = Rinv Oinv = Rinv Otr.  We already have learned hjow to invert a triangular matrix, as a part of the foregoing LDU algorithm.  The inversion of an ortho-normal matrix is easy -- just take its transpose.

== The Gauss-Jordan algorithm ==

Conceptually (but, not actually), augment the given matrix with an identity matrix to its right.
Work down the principal-diagonal, from the left-top corner.
In each column:
            Indirectly pivot, to place the largest (in magnitude) remaining diagonal element here.
            Write the reciprocal of the diagonal element in its place.  If a particular diagonal element had been zero, by the convention of this algorithm, its reciprocal is zero.  This is our “best effort” at the inversion of a singular matrix A.
            In each row:
                        Under the diagonal element, write its product by the diagonal element.
                        In each position to its right, subtract the product of the previous line by the element to the right of the diagonal element, which is above the current element.

Now you have constructed the echelon.

Work up the principal-diagonal, from the right-bottom corner.
In each column:
            In each row:
                        Above the diagonal element, write its product by the diagonal element.
                        In each position to its left, subtract the product of the previous line by the element to the left of the diagonal element, which is below the current element.

Un-pivot by dropping the indirection.  There is no code involved to do so -- just exit the routine.

=== Inversion of an ill-conditioned matrix ===

Definition of the ill-conditioning ratio:  Given any matrix A.  Factor it as A = < D >.  The ill-conditioning ratio is defined as the ratio of the absolute value of the largest in absolute value element of D to the absolute value of the smallest in absolute value non-zero element of D.

The pivoting in each of the foregoing two algorithms is optional.  Intuitively, it is evident that because it delays the inversion of small in magnitude elements -- thus, isolating the inevitable round-off errors --; it causes less overall loss of precision.  Experimentally, we observe that pivoting usually improves the precision of the inversion of moderately ill-conditioned matrices by two orders of magnitude.  However, occasionally on badly ill-conditioned matrices, pivoting actually degrades the resulting precision.  Thus, the usual admonition:  When you are attempting to invert an ill-conditioned matrix, test the results.

Theorem:  Twice.  Given any two non-singular matrices A and B.  Let C = B A. Then obviously.  Ainv = Cinv B.

The only inverse required is that of the matrix C.  Thus, if we can arrange for C to be well-conditioned; any usual algorithm for its inversion will perform satisfactorily.  It suffices to take B as an approximate inverse of A.  When A is ill-conditioned, any usual inversion algorithm will give us such an approximate inverse.

[I discovered/devised this algorithm in 1967 and demonstrated it in around 1980.]

Observe that once the Jacobi algorithm provides the factorization A = < D >, the inverse of the matrix may be computed as Ainv = >tr Dinv <tr.  Since the elements of the D are isolated, the inverse of the D retains the full precision of the arithmetic employed.  Since the transposition of the bra or ket does not involve any inversion – only a rearrangement of the elements, there is no loss of precision.  Thus, for a symmetric (or Hermitian) matrix, this method yields excellent precision.  Unfortunately, for a general real (or complex) matrix, the square-root operation – in D = sqrt(E) – loses half of the available precision.  Someday, mayhap, I might investigate the combination of the Jacobi algorithm and the twice algorithm for the inversion of a general real (or complex) matrix.  Of course, since the Jacobi algorithm is computationally expensive (compared to the Gauss-Jordan or LDU algorithms), one would employ it only in desperation.

=== Basis ===

Each of the foregoing algorithms for the inversion of a matrix may be considered to be an application of the concepts discussed here.

Definition:  A space of vectors is a collection of vectors, the cardinality of elements of each of the vectors being exactly n.

Each subsequent vector is a member of this space.  Each subsequent set of vectors is a subset of this space.

Definition:  A vector V is said to be linearly-dependent upon the set B; if it can be represented as a linear combination of vectors in B.  That is, there exist a set of scalars {c}, such that V = co bo + c1 b1 + c2 b2 + ....  Otherwise, V is said to be linearly independent of the set B.  In Algebraic Topology, we would replace the preceding summation by a Lebesque-Stieltjes integral.

Definition:  A basis set of vectors B is a maximal set of linearly independent vectors.

Lemma:  Transitivity.  If V is linearly dependent upon B' and each vector of B' is linearly dependent upon B; then V is linearly dependent upon B.

Proof is obvious.

Theorem:  Given a set B of vectors, of cardinality at least n, with n being finite.  The set B is a basis iff (= if and only if) the determinant of B is not zero.

Proof:  Employ either the LDU, QR, of Gauss-Jordan algorithm to compute the determinant.  In each of these algorithms, the computation consists of a sequence of linear combinations.  The conclusion follows by Mathematical Induction. QED.  The proof for infinite n is beyond the scope of Linear Algebra.

This theorem could have been taken as the definition of a basis.  Then, the previous definition would become a theorem.

Corollary:  The cardinality of a basis is exactly n.  Hence, a basis exists.

Proof:  A non-square matrix is singular.

Theorem:  Given that B is a basis.  For any non-singular matrix A, each of the products A B or B A is a basis.

Proof is obvious.

Corollary:  Given that B is a basis.  Subject it to a Gramm-Schmidt ortho-normalization, to obtain the ortho-normal (or unitary) matrix O. Then O is a basis.

Proof:  Employ the QR matrix inversion algorithm.  We obtain B = Q R. Post-multiply by Rinv, to obtain Q = B Rinv.

Corollary:  Given that B is an ortho-normal (or unitary) basis.  For any ortho-normal (or unitary) matrix O, each of the products O B or B O is an ortho-normal (or unitary) basis.  Furthermore, if both B and O are proper; then, so is each of the aforementioned products.

Proof follows from the theorem that states that the product of orth-normal matrices is an ortho-normal matrix.

Converse:  If B and B' are two bases; then, each of the equations of the theorem may be solved for A.  Of course, these two values for A are distinct; unless the commutator of one of the A and the B is zero.  Moreover, if both B and B' are ortho-normal (unitary); then so is A.

Proof:  Hint:  Each basis is an n by n non-singular matrix.

 

 

 

UP        NEXT

Copyright (c) 2003, 4 by R.I. ‘Scibor-Marchocki.  Last revised Thursday 22-nd September 2005.