and
In this Division Ring chapter, we state each definition and theorem for both the real and complex cases; but, prove each of the theorems only for the real case. In the Complex chapter, we will show how to extend the proofs contained in this Division Ring chapter to be applicable to a complex matrix.
Definition: A matrix is a rectangular array over a given division-ring. That is, the elements are taken from the division-ring.
Theorem: Matrices constitute a division-ring. Hence, the definition may be applied recursively. Thus, we need prove each theorem only for a two-by-two matrix.
Proof: Please read the whole page, to understand the properties of a division-ring. We will employ these properties, henceforth, without explicit reference. In particular, when we say “inverse of a matrix”, we mean the multiplicative-inverse. Conventionally, the inverse of a matrix is indicated by a post-superscripted minus-one. However, for the lack of such a glyph, we employ the orthographic of a suffixed "inv".
What commutes? The additive-group of a division-ring is Abelian (that is, commutative). Hence, matrices commute under addition. What about multiplication? The multiplicative-group of a division-ring is non-commutative. However, certain elements of a non-commutative group do commute: The identity element commutes with each element of the group. The inverse of a given element commutes with that element. There are certain matrices which do commute. We will study some such.
Definition: The commutator of the matrices A and B is defined as the expression A B - B A.
Theorem: The matrices A and B commute iff (= if and only if) their commutator is zero.
Proof is obvious.
Since the following theorem regarding scalar-multiplication is not part of the division-ring, we will state it explicitly.
Definition: A scalar is either a real or complex number.
Definition: Scalar multiplication. The product of a scalar z and a matrix A is a matrix each of whose elements is the product of z and the corresponding element of A.
A purist may prefer to call each of the preceding two definitions, "axiom".
Theorem: The scalar multiplication commutes: z A = A z.
Proof is obvious; because, at the lowest level, we consider each element of the matrix to belong to a field.
Lemma: Given a matrix A, its additive inverse – call it A’ -- is A’ = (- 1) A.
Proof is obvious.
Theorem: A matrix and its additive inverse commute, under multiplication.
Proof is obvious.
Did you read the whole page of the foregoing hyperlink reference? Wondrous! So, now you know what a vector is? Very well. Consider only the additive group operations of matrices. (That is, ignore matrix multiplication.) Augment with the scalar multiplication. Now you will observe that the set of n1 by n2 matrices are vectors within an n1*n2 dimensional space. Nice, but, not very fruitful.
It is more interesting to consider each row of a given matrix to be a vector. There are several places where either an alternative proof may be constructed or greater insight may be provided thereby. We will point out such occurrences.
Definition: The principal-diagonal of a matrix begins at the upper-left corner and proceeds down and to the right, with a slope of minus one.
Definition: A diagonal matrix D is one each of whose elements is zero, except for those on the principal diagonal.
Lemma: If b a = a b, d c = c d, d a = a d, and c b = b c; then the inverse of a two-by-two matrix A = (a, b; c, d) is Ainv = (d, -b; -c, a) / (a d - b c). Also, the determinant of A is equal to the determinant of the denominator. That is, det(A) = det(a d – b c). Caution: Each of the elements {a, b, c, d} of the matrix A is a matrix in its own right. Being drawn from only a division ring, their multiplication is not necessarily commutative.
Proof: Verify it by multiplication. I =? (a, b; c, d) * ((d, -b; -c, a) / (a d – b c)) = ((a, b; c, d) (d, -b; -c, a)) / (a d – b c) = (a d – b c, - a b + b a; c d – d c, - c b + d a) / (a d – b c) = (1, (b a – a b) / (a d – b c); (c d – d c) / (a d – b c), (d a – c b) / (a d – b c)). Now, invoke the hypothesis. = (1, 0; 0; 1) = I. QED.
Theorem/algorithm: The partition and conquer matrix inversion. Apply it recursively. At any stage where we have an odd n, augment the matrix with a single column of zero at the right and a single row of zero at the bottom. At the principal-diagonal, place a one.
[The partition and conquer was discovered/devised by me on 03-rd June 1999.] It is valid only if at each level, each of the pairs (a, b), (a, d), (b, c), and (c, d) commutes. Except for a two-by-two matrix over a field (e.g., either the real or complex numbers) or very specialized matrices, the hypothesis is too restrictive to make this algorithm useful. However, there actually is an important matrix for which this algorithm is applicable, namely, the fifth matrix (of the nine matrices) of the large-style representation of a channel. This matrix is all zero, except for the principal diagonal and the diagonals of each of the upper-right and lower-left quadrants.
Definition: The transpose of a matrix is its reflection in the principal diagonal. Another way of saying it is to interchange the corresponding rows and columns. Conventionally, the transpose of a matrix is indicated by a post-superscripted dagger. However, for the lack of such a glyph, we employ the orthographic of a suffixed "tr".
Theorem: The transpose of the transpose of a matrix is the original matrix itself.
Proof is obvious.
Theorem: The transpose of a product of two matrices is the product of their transposes, taken in reverse order: (A B)tr = Btr Atr.
Proof: Work it out for a pair of two-by-two matrices.
Theorem: The inverse of a product of two matrices is the product of their inverses, taken in reverse order: (A B)inv = Binv Ainv.
Proof: Let C = A B and D = Binv Ainv. It suffices to show that the product C D is the identity-matrix. I =? C D = (A B) (Binv Ainv) = A (B Binv) Ainv = A I Ainv = A Ainv = I. QED.
Lemma: Given a pair of same-size square matrices a and b and another like pair c and d. Let the matrices A and B be A = (a, 0; 0, c) and B = (b, 0; 0. d). Then, the sum A + B = (a + b, 0; 0, c + d), the product A * B = (a * b, 0; 0, c * d), and the determinant det(A) = det(a) * dec(c), are as shown. Hence, the difference A - B = (a - b, 0; 0, c - d) and the inverse Ainv = (ainv, 0; 0, cinv) are as shown, likewise.
Proofs are obvious.
Theorem: Non-interaction of operations upon diagonal partitioned matrices. The foregoing lemma generalizes to matrices with any amount of diagonal partitions.
Proof is by Mathematical Induction.
Definition: A trace of a matrix A is the sum of the elements upon its principal-diagonal.
Definition: A matrix S is said to be symmetric, if S - Str = 0, the zero matrix. The corresponding complex matrix is called Hermitian.
Theorem: The product of two symmetric (or Hermitian) matrices A and B, whose commutator is zero, is symmetric (or Hermitian)..
Proof: (A B)tr = Btr Atr = B A = A B. QED.
Definition: A matrix K is said to be skew-symmetric, if K + Ktr = 0, the zero matrix. The corresponding complex matrix is called skew-Hermitian.
Lemma: Each of the elements on the principal diagonal of a skew-symmetric matrix is zero.
Proof is obvious.
Theorem: The trace of a skew-symmetric matrix K is zero. That is, trace(K) = 0.
Proof: Version #1. Follows immediately from the lemma. Version #2. From the definition of a skew-symmetric matrix, we have 0 = K + Ktr. Take the trace of each side of this equation and simplify, to obtain 0 = trace(K + Ktr) = trace(K) + trace(Ktr) = trace(K) + trace(K) = 2 trace(K). Divide by two, to obtain the equation of the conclusion QED.
Neither the lemma nor the theorem generalizes to complex matrices. However, pray, see some related theorems at the end of the Complex chapter.
Definition: A matrix O is said to be ortho-normal, if O Otr = I, the identity matrix. An ortho-normal matrix O is said to be proper if its determinant (defined below) is positive. The complex counterpart of an ortho-normal matrix is called unitary and is symbolized as U. Thus, we have U Utr = I. This forward reference does not constitute a circular logical argument; because, the referenced definition does not employ any properties of an ortho-normal (unitary) matrix.
Lemma: The product of two ortho-normal (or unitary) matrices is an ortho-normal (unitary) matrix.
Proof: Let O = O1 O2, where O1 O1tr = O2 O2tr = I. Substitute, to obtain O = (O1 O1tr) (O2 O2tr). Then, we need to show that I =? O Otr = ((O1 O1tr) (O2 O2tr)) ((O1 O1tr) (O2 O2tr))tr = O1 O1tr O2 O2tr O2 O2tr O1 O1tr = I. QED.
Theorem: The product of ortho-normal (or unitary) matrices is an ortho-normal (unitary) matrix.
Proof by Mathematical Induction.
Corollary: The product of proper ortho-normal (or unitary) matrices is a proper ortho-normal (unitary) matrix.
Proof: The product of positive numbers is positive. Hence, the result follows from the theorem for the determinant of a product of matrices, below. QED. This forward reference does not constitute a circular logical argument; because, the corollary is not employed until further in this presentation of Linear Algebra.
Lemma: The inverse of an ortho-normal (or unitary) matrix O is an ortho-normal (unitary) matrix.
Proof: Its inverse is its transpose. Thus, Oinv = Otr, which, obviously, is an ortho-normal matrix.
Theorem: The set of ortho-normal (or unitary) matrices constitutes a subgroup of the multiplicative group matrices.
Proof is obvious from the properties of subgroups.
Theorem: If O is a skew-symmetric ortho-normal (or skew-Hermitian unitary) matrix; then Oinv = - O, Otr = - O, O O = - I, (I+ O) (I - O) = 2 I, (I + O) (I + O) = 2 O, and (I - O) (I - O) = - 2 O.
Proof: The definition of ortho-normal is O Otr = I. Pre-multiply by Oinv to obtain the first equation Otr = Oinv. The definition of skew-symmetric is O + Otr = 0. Subtract O, to obtain the second equation Otr = - O. Substitute this equation into the defining equation for ortho-normal, to obtain the third equation O O = - I. Finally, the remaining three equations are alike, namely, multiply-out the product (I+ O) (I - O) = I + (O - O) - (O O) = I + 0 + I = 2 I, which is the fourth equation. (I + O) (I + O) = I + (O + O) + O O = I + 2 O - I = 2 O, which is the fifth equation. And, (I - O) (I - O) = I - (O + O) + O O = I - 2 O - I = - 2 O. QED.
Do such matrices exist? The next theorem provides a constructive proof of existence.
Lemma: If o is an n by n ortho-normal (or unitary) matrix; then the partitioned matrix O = (0, o; - otr, 0) is an 2 n by 2 n positive skew-symmetric ortho-normal (or skew-Hermitian unitary, respectively) matrix.
Proof: We have to establish each of the four parts of the conclusion.
1) Size. It is obvious that the size of the matrix O is 2n by 2n.
2) Positive. By the partition and conquer theorem, the determinant of the matrix O is equal to the
determinant of the matrix(0 - (o (- otr)) = o otr. Since this matrix is the result of folding of a matrix, it is positive. OK. The forward reference to the property of a
folded matrix does not constitute a circular logical argument; because, we do not make any use of this theorem until much later -- in the proofs of
the properties of bilinear transformations.
3) Skew-symmetric. The transpose of the matrix O is Otr = (0, - o; otr, 0). To show that the matrix O is skew-symmetric, we need to show that 0 =? O + Otr = (0, o; - otr, 0) + (0, - o; otr, 0) = (0, 0; 0, 0) = 0. OK.
4) Ortho-normal. By the definition of ortho-normal and from the hypothesis that o is ortho-normal, we have
o otr = I. To show that the matrix O is ortho-normal, we need to show that I =? O Otr = (0, o; - otr, 0) (0, - o; otr, 0) = (o otr, 0; 0, (-
otr) (- o)0 = (I, 0; 0, I) = I. OK. QED.
Lemma: If o1 is a positive skew-symmetric ortho-normal (or skew-Hermitian unitary) matrix of order 2 n1 (that is, its size is 2 n1 by 2 n1) and o2 is a like matrix of order 2 n2; then the partitioned matrix O = (o1, 0; 0, o2) is a like matrix of order 2 (n1 + n2).
Proof: Hint: The non-null sub-matrices of O lie on its diagonal.
Theorem: Arrange an arbitrary amount of matrices like the matrix O of the first lemma along the diagonal of the partitioned matrix. The resulting matrix has the same properties as the matrix O and its order will be the sum of the orders of the individual matrices.
Proof by Mathematical Induction.
Theorem: The order of a skew-symmetric ortho-normal (or skew-Hermitian unitary) matrix is even.
Proof: An ortho-normal matrix of odd order of necessity has to have a non-zero element somewhere on the principal diagonal. Thus, is cannot be skew-symmetric.
Theorem: The construction in the penultimate theorem exhausts the possibilities for a skew-symmetric ortho-normal (or skew-Hermitian unitary) matrix.
Proof: The partitions have to be isolated.
Analogous theorems could be proven for symmetric ortho-normal (or Hermitian unitary) matrices. However, we leave the effort to do so to the interested reader.
Definition: A permutation-matrix P has all of its elements zero, except for a single one or minus one in each row and column. A complex permutation-matrix has its non-zero elements drawn, with replacement, from the set {-1, -i, i, 1}.
Theorem: A permutation-matrix is ortho-normal (or unitary).
Proof is obvious.
Converse: An ortho-normal (or unitary) matrix, each of whose elements is drawn, with replacement, from the set {-1, 0, 1} xor {-1, -i, 0, i, 1}, is a permutation matrix.
Proof is obvious.
Lemma: The product of two permutation matrices is a permutation matrix.
Proof is obvious.
Theorem: The product of permutation matrices is a permutation matrix.
Proof by Mathematical Induction.
Lemma: The inverse of a permutation matrix P is a permutation matrix.
Proof: Since a permutation matrix is an ortho-normal matrix, its inverse is its transpose. Thus, Pinv = Ptr, which, obviously, is a permutation matrix.
Theorem: The set of permutation matrices constitutes a subgroup of the multiplicative group of ortho-normal (or unitary) matrices.
Proof is obvious from the properties of subgroups.
Corollary: The set of permutation matrices constitutes a subgroup of the multiplicative group of matrices.
Proof follows from the transitivity of the subgroup property.
Lemma: The amount of permutations of n things, taken n at a time, is n!
Proof: The amount of permutations of n things, taken m at a time, is P = P(n, m) = n! / (n - m)! Take m = n. QED.
Lemma: There are 2^n n! real (4^n n! complex) permutation matrices.
Proof is obvious.
Lemma: Of these, there are n! equivalence classes of distinct permutations. Checks with the penultimate lemma.
Proof: Hint: Consider that each vector of the permutation matrix is an axial vector (a forward reference).
Theorem: Many real permutation matrices specify the same permutation.
Proof: It is obvious that 2^n n! (or, in the complex case, 4^n n!) is greater than n!.
Various additional properties accrue to the permutation matrices by virtue of the permutation matrix being a special case of an ortho-normal (or unitary) matrix.
Lemma: Given an no by no permutation matrix p. Let n = 2 no. Construct the matrix P = (0, p; - ptr, 0). The matrix P is an n by n positive skew-symmetric permutation matrix. There are no! equivalence classes of distinct such permutation matrices.
Proof: The first four properties accrue as a virtue of a permutation matrix being a special case of an ortho-normal matrix.. The fifth property, namely, the amount of such matrices, is obvious.
Theorem: Given a set of permutation matrices {p1, p2, p3, ... } of respective orders (that is sizes) {n1, n2, n3, ....}. Let n = 2 (n1 + n2 + n3 + ....). Take a null n by n matrix. Place these permutation matrices flanking the principal diagonal. The resulting matrix -- call it P -- is an n by n skew-symmetric permutation matrix. There are the product (n1! * n2! * n3! * ....) equivalence classes of distinct such permutation matrices.
Proof: The first four properties accrue as a virtue of a permutation matrix being a special case of an ortho-normal matrix.. The fifth property, namely, the amount of such matrices, may be established by means of Mathematical Induction.
Any ordered set of strictly-positive integers that adds up to half of n yields an n by n skew-symmetric permutation matrix. A total of how many such equivalence classes of distinct permutation matrices are there? I am not good enough at Combinatronics to provide you with a quantitative formula. Sorry, :-) Here are some that I counted up for small n.
| n | |||
| 2 | 2 (1) | 1! = 1 | 1 |
| 4 | 2 (2) | 2! = 2 | 2 |
| 6 | 2 (3) | 3! = 6 | 11 |
| 2 (2 + 1), 2 (1 + 2) | 2 (2! * 1!) = 4 | ||
| 2 (1 + 1+ 1) | 1! * 1! * 1! = 1 | ||
| 8 | 2 (4) | 4! = 24 | 47 |
| 2 (3 + 1), 2 (1 + 3) | 2 (3! * 1!) = 12 | ||
| 2 (2 + 2) | 2! * 2! = 4 | ||
| 2 ( 2 + 1 + 1), 2 (1 + 2 + 1), 2 (1 + 1 + 2) | 3 (2! * 1! * 1!) = 6 | ||
| 2 (1 + 1 + 1 + 1) | 1! * 1! * 1! * 1! = 1 | ||
| 10 | 2 (5) | 5! = 120 | 231 |
| 2 (4 + 1), 2 (1 + 4) | 2 (4! * 1!) = 48 | ||
| 2 (3 + 2), 2 (2 + 3) | 2 (3! * 2!) = 24 | ||
| 2 (3 + 1 + 1), 2 (1 + 3 + 1), 2 (1 + 1 + 3) | 3 (3! * 1! * 1!) = 18 | ||
| 2 (2 + 2 + 1), 2 (2 + 1 + 2), 2 (1 + 2 + 2) | 3 (2! * 2! * 1!) = 12 | ||
| 2 (2 + 1 + 1 + 1), 2 (1 + 2 + 1 + 1), 2 (1 + 1 + 2 + 1), 2 (1 + 1 + 1 + 2) | 4 (2! * 1! * 1! * 1!) = 8 | ||
| 2 (1 + 1 + 1 + 1 + 1) | 1! * 1! * 1!* 1! * 1!) = 1 |
Theorem: Any permutation of n elements may be decomposed (often, we say "factored") into at most (n - 1)! transpositions.
Proof is by Mathematical Induction.
Corollary: A n by n permutation matrix may be factored into at most (n - 1)! transposition matrices.
Proof is obvious. This is just another way of saying the same thing as was said in the theorem.
Definition: A horizontal vector is a one by n matrix. A vertical vector is an n by one matrix.
Definition of a determinant det of a matrix A: The determinant is the sum of the products over all of the permutations of one element from each row and column. Odd permutations of the principal diagonal are to be pre-pended with a minus sign.
A calculation of a determinant directly from the definition requires n! multiplications. Hence, it is not practical. Instead, the determinant is computed as a by-product of either the inversion or the factorization of a matrix.
Lemma: The determinant of the product of two matrices is the product of their determinants. That is, given that C = B A; we have det(C) = det(B) det(A).
Proof is obvious (though, tedious) for a pair of 2x2 matrices.
Theorem: The determinant of the product of matrices is the product of their determinants.
Proof by Mathematical Induction.
Definition: A minor of an element of a matrix is the matrix obtained by removing the row and the column which pass through the given element.
Lemma: Evaluation by minors. The value of the determinant of a matrix is equal to the sum of the products of the elements in a given row (or column) times the determinant of each respective minor, with a sign pre-pended depending upon the parity of the permutation necessary to reach the principal diagonal from each of these elements.
Proof is obvious.
Lemma: If a row (or column) of a determinant is multiplied by a constant; the value of the resulting determinant is this constant times the value of the original determinant.
Proof by Mathematical Induction.
Lemma: If a product of a row (or column) by a constant is added to another row (or column, respectively); the value of the determinant is unchanged.
Proof by Mathematical Induction.
Lemma. A permutation of rows (or columns) of a determinant yields a new determinant whose value is equal to either that of the original determinant or its negative, depending upon the parity of the permutation being even or odd, respectively.
Proof by Mathematical Induction first on the size of the determinant and then on multiplicity of elementary permutations.
Theorem/algorithm: Gauss-Jordan algorithm for the insitu inversion of a matrix. Also, a determinant may be evaluated by Gaussian elimination. Since as yet we have not learned how to perform pivoting, we will have to skip the pivoting step. However, as a result, we will be unable to invert certain matrices. Pray, refer to the formal description of this algorithm, which is provided several chapters hence.
| an informal simplified description of the Gauss-Jordan algorithm | |
| initialization | set det=1 and augment (to the right) the given matrix with an identity matrix. |
| first pass | outer loop: for each row, from the top down
optionally, pivot -- see description in the next box set u=value of the diagonal element; multiply det by u; give up and quit if u is zero. set v=1/u. multiply each element of the row by v. inner loop: for each row below, going down set u=value of the element in this row directly below the aforementioned diagonal element of the outer loop. set v=-u. in a scratch location, multiply each element of the row in the outer loop by v. add the result to the current row of the inner loop. observe that its element directly below the aforementioned diagonal element of the outer loop becomes zero. at the completion of this pass, we obtain (in the left-side; i.e., the original un-augmented matrix) what is known as the "echelon form". |
| second pass | outer loop: for each row, from the bottom up
observe that its diagonal element is one inner loop: for each row above, going up set u=value of the element in this row directly above the aforementioned diagonal element of the outer loop. set v=-u. in a scratch location, multiply each element of the row in the outer loop by v add the result to the current row of the inner loop. observe that its element directly above the aforementioned diagonal element of the outer loop becomes zero. at the completion of this pass, we obtain (in the left-side; i.e., the original un-augmented matrix) an identity matrix. the det becomes equal to the determinant of the given matrix. |
| an informal description of pivoting |
| in the first outer loop, one may select any remaining element, instead of the next diagonal element.
if the next diagonal element is zero; one has to pivot to a non-zero element -- provided that one such exists that is, in this situation, pivoting is required |
| when performing the calculations in fixed-precision floating-point, pivoting to the largest in magnitude remaining element tends to improve the precision of the calculation |
| an attempt to pivot so as to preserve any existing zero-value elements saves computational steps in the inner loop |
| when performing an exact calculation by hand, pivoting to the smallest in magnitude element makes the calculation easier |
Proof: Apply the previous lemmata to evaluate the value of the determinant.
Corollary: Let A be a given matrix of coefficients, C be a row-vector of constants, and X be a row-vector of unknowns. The matrix equation A Xtr = Ctr may be solved by pre-multiplication by the inverse of A, to obtain (Ainv (A Xtr)) = (Ainv A) Xtr = I Xtr = Xtr = Ainv Ctr, where the Ainv has been obtained by, for instance, the Gauss-Jordan algorithm.
Proof is obvious.
Theorem: The same problem may be solved by Kramer’s rule: The denominator of each quotient is the determinant of A. The numerator for x(j) is the determinant of a matrix obtained by the replacement of the j-th column of A by the transpose of C. An alternative spelling of the name is "Cramer". It is named after Gabriel Cramer (31st July 1704 - 04th January 1752).
Proof is like that of the previous theorem.
Theorem: The rows of a matrix constitute a set of linearly independent vectors; iff (= if and only if) the matrix is non-singular.
Proof is obvious.
Theorem: The trace of the sum of two matrices is the sum of their traces. trace(A + B) = trace(A) + trace(B).
Proof is obvious.
Definition: An elementary rotation is given by the ortho-normal matrix O = O(x) = (cos(x), sin(x); -sin(x), cos(x)). Sometimes, it is convenient to employ cos(x/2) and sin(x/2), instead. In higher-dimensional space, begin with an identity matrix. Place these four elements in a square pattern, with the cosines on the principal diagonal.
Theorem: Addition theorem of elementary rotations. O(x + y) = O(x) O(y). In higher-dimenional space, each of these elementary rotations must occupy the same position in the matrix. This theorem does not generalize to ortho-normal matrices.
Proof follows for the addition theorem for the sine and cosine functions.
Definition of a similarity. For any matrix A and ortho-normal (or unitary) matrix O, the matrix B = O A Otr is said to be similar.
Lemma: The trace of a matrix is invariant under a similarity transformation, by an elementary rotation.
Proof: Let O = (cos(x), sin(x); -sin(x), cos(x)) and A = (a, b; c, d). The product O A = (a cos(x) + c sin(x), b cos(x) + d sin(x); -a sin(x) + c cos(x), -b sin(x) + d cos(x)). The transpose of O is Otr = (cos(x), -sin(x); sin(x), cos(x)). The product (O A) Otr = (a (cos(x))^2 + (b + c) sin(x) cos(x) + d (sin(x))^2, - a sin(x) cos(x) + b (cos(x))^2 - c (sin(x))^2 + d sin(x) cos(x); -a sin(x) cos(x) - b (sin(x))^2 + c (cos(x))^2 + d sin(x) cos(x). a (sin(x))^2 - (b + c) sin(x) cos(x) + d (cos(x))^2). Its trace is trace(O A Otr) = a + c = trace(A). QED.
Theorem: The trace of a matrix is invariant under a similarity transformation.
Proof follows, by Mathematical Induction, from the (forward reference) theorem which states that any ortho-normal matrix may be approximated as a product of elementary rotations. This forward reference does not constitute a circular logical argument; because, neither the theorem nor its corollary is employed until further in this presentation of Linear Algebra. The foregoing lemma, however, is employed in the development of the Jacobi theorem/algorithm.
Corollary: also under B = O A Oinv.
Proof: Since O is ortho-normal, we have Oinv = Otr. Then, apply the theorem. QED.
This corollary generalizes.
Lemma: The determinant of the transpose of a real matrix is equal to the determinant of the matrix. The determinant of the transpose of a complex matrix is equal to the complex-conjugate of the determinant of the matrix.
Proof is obvious.
Theorem: Any ortho-normal (or unitary) matrix is non-singular.
Proof follows immediately from the definition of an ortho-normal matrix and the observation that the real-numbers constitute a field under the usual addition and multiplication.
Theorem: The determinant of an ortho-normal matrix is either one xor negative one. All that we are able to say regarding the determinant of an unitary matrix is that its magnitude is one.
Proof (for an ortho-normal, that is, real, matrix): Since, by definition of ortho-normal, we have O Otr = I, take determinants: det(O) det(Otr) = det(I) = 1. Then, det(O) = sqrt(1). QED.
Proof (for an unitary, that is, complex, matrix): For an unitary matrix, we have detU) det(Utr) = 1. Since the determinant of the transpose of an unitary matrix is the complex-conjugate of the original unitary matrix, the phase-angles cancel in this product. Hence, there is no constraint on the phase-angle of the determinant of an unitary matrix. The magnitude of the determinant is constrained to be the square-root of one. The square-root always has two values. In this case, since the magnitude (of anything) has to be non-negative, the negative result is extraneous. Thus, we have obtained the conclusion. QED.
Theorem: The determinant is invariant under a similarity transformation. That is for any matrix A and an arbitrary ortho-normal (or unitary) matrix O, given that B = O A Otr; we have det(B) = det(A).
Proof: det(B) = det(O A Otr) = det(O) det(A) det(Otr) = det(A). QED.
Definition: The Frobenious matrix norm of a matrix A is defined as norm(A) = sqrt(trace(A Atr)). The "sqrt" indicates that a square-root is to be taken of its argument.
Corollary: The norm of a matrix is equal to the square-root of the sum of the squares of its elements.
The actual subroutine for the calculation of the Frobenious matrix norm, matrix_norm, provides an option to forego the extraction of the square-root. There are two advantages of this option: 1) In principle, doing so saves some CPU time. However, on the Pentium-class microprocessors, this time saving is negligible. 2) The extraction of a square-root inherently loses half of the available precision. Hence, the comparison of norms is better performed without the extraction of square-roots. The following theorem legitimizes doing so.
Theorem: Given two matrices A and B. norm(A) > norm(B) iff (=if and only if) (norm(A))^2 > (norm(B))^2.
Proof is obvious.
Definition: The inner (also known as dot) product of two vectors is the sum of the products of their corresponding components.
Definition: The norm of a vector is the square-root of the inner product of the vector by itself. Obviously, it is equal to the Frobenious matrix norm of the vector.
Theorem/algorithm: Gramm-Schmidt
ortho-normalization. Given a set of imx linearly-independent vectors, each of which has jmx components. Hence, imx <=
jmx. To obtain a corresponding set of ortho-normal vectors, proceed as follows:
for(i1=0; i1<imx; i1++) {
x = 0.0;
a = 1.0 / norm(vect[i1]);
vect[i1] *= a; // hence, the norm of the resulting vect[i1]
becomes one.
for(i2=i1+1; i2<imx; i2++) {
for(i3=i1; i3<i2; i3++) {
a = dot(vect[i3], vect[i2]); // this is the projection of the vect[i2] upon the vect[i3].
vect[i2] -= a vect[i3]; }; }; }; // hence, the projection of the resulting vect[i2] upon the vect[i3] becomes zero.
Proof is obvious.
In this description of the algorithm, we consider each row to be a vector. We make use of the Frobenious matrix norm of such a vector and the inner-product of a pair of such vectors.
If we are given an insufficient quantity of linearly independent vectors; we may adjoin a q.s. (quantum sufficit = sufficient quantity) of random vectors. In a latter chapter, we will learn that this quantity is called the nullity of the given matrix and that these additional vectors comprise the null-space of the given matrix.
While we are ortho-normalizing the matrix A to obtain the ortho-normal (or unitary) matrix Q, optionally, we may record the transformation employed into a triangular matrix R. Along the diagonal, we place the normalizing factor for each vector. In the triangle, we place the inner-products. If we proceed by columns, from left to right; we obtain the right triangular matrix. Then, we have A = Q R, from which it follows that R = Qtr A. This result is known as the QR-factorization. On the other hand, if we proceed by rows, from top to bottom, we obtain the left triangular matrix, with A = L O and L = A Otr.
Theorem: Stability. The ortho-normal (or unitary) matrix is invariant under the Gramm-Schmidt ortho-normalization algorithm. The resulting R matrix will be the identity matrix.
Proof is obvious.
Copyright (c) 2003, 4 by R.I. ‘Scibor-Marchocki. Last revised Saturday 24-th September 2005. Broken link removed Monday 20-th February 2006.