Linear Algebra and Its Applications, Exercise 1.6.21

Exercise 1.6.21. Factor the following symmetric matrices

A = \begin{bmatrix} 1&3&5 \\ 3&12&18 \\ 5&18&30 \end{bmatrix} \qquad A = \begin{bmatrix} a&b \\ b&d \end{bmatrix}

into the form LDL^T.

Answer: We perform Gaussian elimination on the first matrix A and start by multiplying the first row by the multiplier l_{21} = 3 and subtracting it from the second row:

\begin{bmatrix} 1&3&5 \\ 3&12&18 \\ 5&18&30 \end{bmatrix} \rightarrow \begin{bmatrix} 1&3&5 \\ 0&3&3 \\ 5&18&30 \end{bmatrix}

We then multiply the first row by the multiplier l_{31} = 5 and subtract it from the third row:

\begin{bmatrix} 1&3&5 \\ 0&3&3 \\ 5&18&30 \end{bmatrix} \rightarrow \begin{bmatrix} 1&3&5 \\ 0&3&3 \\ 0&3&5 \end{bmatrix}

Finally we multiply the second row by the multiplier l_{32} = 1 and subtract it from the third row:

\begin{bmatrix} 1&3&5 \\ 0&3&3 \\ 0&3&5 \end{bmatrix} \rightarrow \begin{bmatrix} 1&3&5 \\ 0&3&3 \\ 0&0&2 \end{bmatrix}

We then have

L = \begin{bmatrix} 1&0&0 \\ l_{21}&1&0 \\ l_{31}&l_{32}&1 \end{bmatrix} = \begin{bmatrix} 1&0&0 \\ 3&1&0 \\ 5&1&1 \end{bmatrix}

(substituting for the l_{ij} entries) and

D = \begin{bmatrix} 1&0&0 \\ 0&3&0 \\ 0&0&2 \end{bmatrix}

(taking the diagonal values from the matrix after the final elimination step).

The factorization of A is then

A = LDL^T = \begin{bmatrix} 1&0&0 \\ 3&1&0 \\ 5&1&1 \end{bmatrix} \begin{bmatrix} 1&0&0 \\ 0&3&0 \\ 0&0&2 \end{bmatrix} \begin{bmatrix} 1&3&5 \\ 0&1&1 \\ 0&0&1 \end{bmatrix}
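
As a quick numerical check (not part of the exercise itself), the factorization can be verified by multiplying the factors back together. Here is a minimal NumPy sketch:

    import numpy as np

    # the matrix A from the exercise and the factors L and D found above
    A = np.array([[1.,  3.,  5.],
                  [3., 12., 18.],
                  [5., 18., 30.]])
    L = np.array([[1., 0., 0.],
                  [3., 1., 0.],
                  [5., 1., 1.]])
    D = np.diag([1., 3., 2.])

    print(np.allclose(L @ D @ L.T, A))  # True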

For the second matrix we also do Gaussian elimination, assuming a \ne 0 so that a can serve as the pivot. We multiply the first row by the multiplier l_{21} = b/a and subtract it from the second row:

\begin{bmatrix} a&b \\ b&d \end{bmatrix} \rightarrow \begin{bmatrix} a&b \\ 0&d-b^2/a \end{bmatrix}

This completes elimination. We then have

L = \begin{bmatrix} 1&0 \\ l_{21}&1 \end{bmatrix} = \begin{bmatrix} 1&0 \\ b/a&1 \end{bmatrix}

(substituting for the l_{ij} entries) and

D = \begin{bmatrix} a&0 \\ 0&d-b^2/a \end{bmatrix}

(taking the diagonal values from the matrix after the final elimination step).

The factorization of A is then

A = LDL^T = \begin{bmatrix} 1&0 \\ b/a&1 \end{bmatrix} \begin{bmatrix} a&0 \\ 0&d-b^2/a \end{bmatrix} \begin{bmatrix} 1&b/a \\ 0&1 \end{bmatrix}
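
As a similar optional check for the general 2 by 2 case, here is a short SymPy sketch that multiplies the factors back together symbolically; it assumes a \ne 0 so that a can serve as the pivot:

    import sympy as sp

    a = sp.Symbol('a', nonzero=True)  # the pivot a is assumed nonzero
    b, d = sp.symbols('b d')

    L = sp.Matrix([[1, 0], [b/a, 1]])
    D = sp.diag(a, d - b**2/a)

    print(sp.simplify(L * D * L.T))  # Matrix([[a, b], [b, d]])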

NOTE: This continues a series of posts containing worked out exercises from the (out of print) book Linear Algebra and Its Applications, Third Edition by Gilbert Strang.

If you find these posts useful I encourage you to also check out the more current Linear Algebra and Its Applications, Fourth Edition, Dr Strang’s introductory textbook Introduction to Linear Algebra, Fourth Edition and the accompanying free online course, and Dr Strang’s other books.


Linear Algebra and Its Applications, Exercise 1.6.20

Exercise 1.6.20. Suppose A is a 3 by 3 matrix for which the third row is the sum of the first and second rows. Show that there is no solution to the equation Ax = \begin{bmatrix} 1&2&4 \end{bmatrix}^T, and determine whether A has an inverse or not.

Answer: For x = \begin{bmatrix} u \\ v \\ w \end{bmatrix} we have the following system of three equations:

Ax = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix} \rightarrow \begin{bmatrix} a_{11}&a_{12}&a_{13} \\ a_{21}&a_{22}&a_{23} \\ a_{31}&a_{32}&a_{33} \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix}

\rightarrow \setlength\arraycolsep{0.2em}\begin{array}{rcrcrcr}a_{11}u&+&a_{12}v&+&a_{13}w&=&1 \\ a_{21}u&+&a_{22}v&+&a_{23}w&=&2 \\ a_{31}u&+&a_{32}v&+&a_{33}w&=&4 \end{array}

If we subtract the sum of the first two equations from the third equation, we obtain the following:

[a_{31} - (a_{11}+a_{21})]u + [a_{32} - (a_{12}+a_{22})]v + [a_{33} - (a_{13}+a_{23})]w = 4 - (1+2)

\rightarrow 0\cdot u + 0 \cdot v + 0 \cdot w = 1 \rightarrow 0 = 1

Since we have reached a contradiction we conclude that the system of equations has no solution. It follows that A has no inverse: if A^{-1} existed then x = A^{-1}\begin{bmatrix} 1&2&4 \end{bmatrix}^T would be a solution. The matrix A is therefore singular.
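
To make this concrete, here is a small NumPy sketch with a hypothetical matrix whose third row is the sum of the first two (the particular entries are illustrative only):

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [5., 7., 9.]])   # third row = first row + second row
    b = np.array([1., 2., 4.])

    print(np.linalg.matrix_rank(A))           # 2 < 3, so A is singular
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # best least-squares attempt
    print(A @ x - b)                          # nonzero residual: Ax = b has no exact solution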



Linear Algebra and Its Applications, Exercise 1.6.19

Exercise 1.6.19. Given the matrix A where

A = \begin{bmatrix} a&b&c \\ d&e&0 \\ f&0&0 \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} a&b&0 \\ c&d&0 \\ 0&0&e \end{bmatrix}

what values must the entries of A have in order for A to be invertible?

Answer: We know that a (square) matrix A is invertible if and only if it is nonsingular. We therefore perform Gaussian elimination on the first matrix A to see under which conditions it is nonsingular. We start by noting that f must be nonzero; otherwise there will be no pivot in the last row and elimination will fail.

Since we know for certain that f is nonzero, we can exchange the first and third rows to obtain the following matrix:

\begin{bmatrix} f&0&0 \\ d&e&0 \\ a&b&c \end{bmatrix}

For the next elimination step we multiply the first row by d/f; recall that we know from above that f is nonzero, so dividing d by f is possible. We then subtract the result from the second row, obtaining the following matrix:

\begin{bmatrix} f&0&0 \\ 0&e&0 \\ a&b&c \end{bmatrix}

We next multiply the first row by a/f and subtract the result from the third row, obtaining the following matrix:

\begin{bmatrix} f&0&0 \\ 0&e&0 \\ 0&b&c \end{bmatrix}

We note at this point that e must also be nonzero, for the same reason that f must be nonzero: If e were zero then we would have at least one row without a pivot. Since e is nonzero we can divide b by e, multiply the second row by b/e, and then subtract the result from the third row. This elimination step produces the following matrix:

\begin{bmatrix} f&0&0 \\ 0&e&0 \\ 0&0&c \end{bmatrix}

Finally, we note that c must also be nonzero, or else its row will not have a pivot.

Combining this with our previous conclusions, we see that the first matrix A will be nonsingular, and thus invertible, if and only if c, e, and f are all nonzero.

We now consider the second matrix A. Note that this is a block diagonal matrix consisting of two submatrices: a 2 by 2 matrix with elements a, b, c, and d, and a 1 by 1 submatrix with the single element e:

\begin{bmatrix} a&b&\vline&0 \\ c&d&\vline&0 \\ \hline 0&0&\vline&e \end{bmatrix}

As I discussed in the previous post, a block diagonal matrix is invertible if and only if all of its diagonal submatrices are invertible. From the formula for the inverse of a 2 by 2 matrix we know that the first submatrix is invertible if and only if ad - bc \ne 0, and since the inverse of a 1 by 1 matrix is simply the reciprocal of its (sole) entry, the second submatrix is invertible if and only if e \ne 0. Combining these results, this matrix A is invertible if and only if ad - bc \ne 0 and e \ne 0.
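
Although determinants are not used in the answer above, they give a quick independent check of both conditions. Here is an optional SymPy sketch:

    import sympy as sp

    a, b, c, d, e, f = sp.symbols('a b c d e f')

    A1 = sp.Matrix([[a, b, c],
                    [d, e, 0],
                    [f, 0, 0]])
    A2 = sp.Matrix([[a, b, 0],
                    [c, d, 0],
                    [0, 0, e]])

    print(sp.factor(A1.det()))  # -c*e*f: invertible iff c, e, f are all nonzero
    print(sp.factor(A2.det()))  # e*(a*d - b*c): invertible iff ad - bc != 0 and e != 0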



The inverse of a block diagonal matrix

In the previous post I discussed multiplying block diagonal matrices as part of my series on defining block diagonal matrices and partitioning arbitrary square matrices uniquely and maximally into block diagonal form (part 1, part 2, part 3, part 4, and part 5). In this final post in the series I discuss the inverse of a block diagonal matrix. In particular I want to prove the following claim:

If A is a block diagonal matrix with q submatrices on the diagonal then A is invertible if and only if A_{\alpha} is invertible for 1 \le \alpha \le q. In this case A^{-1} is also a block diagonal matrix, identically partitioned to A, with (A^{-1})_{\alpha} = (A_{\alpha})^{-1} so that

A^{-1} = \begin{bmatrix} A^{-1}_{1}&0&\cdots&0 \\ 0&A^{-1}_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A^{-1}_{q} \end{bmatrix}

Proof: This is an if and only if statement, so I have to prove two separate things:

  • If A is invertible then each A_{\alpha} is invertible and A^{-1} is the block diagonal matrix described above.
  • If each A_{\alpha} is invertible then A is invertible and its inverse A^{-1} is the block diagonal matrix described above.

a) Let A be an n by n square matrix partitioned into block diagonal form with q row and column partitions:

A = \begin{bmatrix} A_{1}&0&\cdots&0 \\ 0&A_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A_{q} \end{bmatrix}

and assume that A is invertible. Then a unique n by n square matrix B exists such that AB = BA = I.

We partition both B and I into block matrices in a manner identical to that of A. In our framework identically partitioned means that the q partitions of A can be described by a partition vector u of length q+1, with A_{\alpha} containing u_{\alpha+1} - u_{\alpha} rows and columns. We can then take that partition vector u and use it to partition B and I in an identical manner. (This works because B and I are also n by n square matrices.)

We then have

B = \begin{bmatrix} B_{11}&B_{12}&\cdots&B_{1q} \\ B_{21}&B_{22}&\cdots&B_{2q} \\ \vdots&\vdots&\ddots&\vdots \\ B_{q1}&B_{q2}&\cdots&B_{qq} \end{bmatrix} \qquad I = \begin{bmatrix} I_{11}&I_{12}&\cdots&I_{1q} \\ I_{21}&I_{22}&\cdots&I_{2q} \\ \vdots&\vdots&\ddots&\vdots \\ I_{q1}&I_{q2}&\cdots&I_{qq} \end{bmatrix}

Since I = AB, from the previous post on multiplying block matrices we have

I_{\alpha \beta} = \sum_{\gamma = 1}^{q} A_{\alpha \gamma} B_{\gamma \beta}

We can rewrite the above sum as follows:

I_{\alpha \beta} = \sum_{\gamma = 1}^{\alpha-1} A_{\alpha \gamma} B_{\gamma \beta} + A_{\alpha \alpha}B_{\alpha \beta} + \sum_{\gamma = \alpha+1}^{q} A_{\alpha \gamma} B_{\gamma \beta}

For both sums we have \gamma \ne \alpha for all terms in the sums, and since A is in block diagonal form we have A_{\alpha \gamma} = 0 for all terms in the sums, so that

I_{\alpha \beta} = \sum_{\gamma = 1}^{\alpha-1} 0 \cdot B_{\gamma \beta} + A_{\alpha \alpha}B_{\alpha \beta} + \sum_{\gamma = \alpha+1}^{q} 0 \cdot B_{\gamma \beta} = A_{\alpha \alpha}B_{\alpha \beta}

But I is the identity matrix, with 1 on the diagonal and zero for all other entries. If \alpha \ne \beta then the submatrix I_{\alpha \beta} contains only off-diagonal entries of I, so that I_{\alpha \beta} = 0, and therefore A_{\alpha \alpha}B_{\alpha \beta} = 0 for \alpha \ne \beta. We will see below (from the diagonal blocks of AB = I and BA = I) that each A_{\alpha \alpha} is invertible with inverse B_{\alpha \alpha}; multiplying A_{\alpha \alpha}B_{\alpha \beta} = 0 on the left by (A_{\alpha \alpha})^{-1} then gives B_{\alpha \beta} = 0 when \alpha \ne \beta. Thus B is in block diagonal form when partitioned identically to A.

When \alpha = \beta we have I_{\alpha \beta} = I_{\alpha \alpha} = A_{\alpha \alpha} B_{\alpha \alpha}. But I_{\alpha \alpha} has 1 for all diagonal entries and 0 for all off-diagonal entries; it is simply a version of the identity matrix with u_{\alpha+1} - u_{\alpha} rows and columns. Since the product A_{\alpha \alpha}B_{\alpha \alpha} is equal to the identity matrix, B_{\alpha \alpha} is a right inverse of A_{\alpha \alpha}.

We also have I = BA, so that

I_{\alpha \beta} = \sum_{\gamma = 1}^{q} B_{\alpha \gamma} A_{\gamma \beta}

We can rewrite the above sum as follows:

I_{\alpha \beta} = \sum_{\gamma = 1}^{\beta-1} B_{\alpha \gamma} A_{\gamma \beta} + B_{\alpha \beta}A_{\beta \beta} + \sum_{\gamma = \beta+1}^{q} B_{\alpha \gamma} A_{\gamma \beta}

For both sums we have \gamma \ne \beta for all terms in the sums, and since A is in block diagonal form we have A_{\gamma \beta} = 0 for all terms in the sums, so that I_{\alpha \beta} = B_{\alpha \beta}A_{\beta \beta}. For \alpha \ne \beta both sides of the equation are zero (since both I and B are in block diagonal form), and for \alpha = \beta we have I_{\alpha \beta} = I_{\beta \beta} = B_{\beta \beta} A_{\beta \beta}. But I_{\beta \beta} is the identity matrix, and thus B_{\beta \beta} is a left inverse of A_{\beta \beta} for 1 \le \beta \le q.

Since B_{\alpha \alpha} is both a right and left inverse of A_{\alpha \alpha} for 1 \le \alpha \le q, we conclude that A_{\alpha \alpha} is invertible for 1 \le \alpha \le q and has inverse (A_{\alpha \alpha})^{-1} = B_{\alpha \alpha}. We also know that B = A^{-1} is partitioned into block diagonal form, so we conclude that

A^{-1} = \begin{bmatrix} B_{11}&0&\cdots&0 \\ 0&B_{22}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&B_{qq} \end{bmatrix} = \begin{bmatrix} A^{-1}_{1}&0&\cdots&0 \\ 0&A^{-1}_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A^{-1}_{q} \end{bmatrix}

b) Let A be an n by n square matrix partitioned into block diagonal form with q row and column partitions:

A = \begin{bmatrix} A_{1}&0&\cdots&0 \\ 0&A_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A_{q} \end{bmatrix}

and assume that A_{\alpha} is invertible for 1 \le \alpha \le q. Then for 1 \le \alpha \le q a unique u_{\alpha+1} - u_\alpha by u_{\alpha+1} - u_\alpha square matrix B_\alpha exists such that A_{\alpha}B_{\alpha} = B_{\alpha}A_{\alpha} = I.

We now construct block diagonal matrix B with the matrices B_\alpha as its diagonal submatrices:

B = \begin{bmatrix} B_{1}&0&\cdots&0 \\ 0&B_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&B_{q} \end{bmatrix}

Since each B_{\alpha} is a square matrix with the same number of rows and columns as the corresponding submatrix A_{\alpha} of A, the matrix B will also be a square matrix of size n by n, and as a block diagonal matrix B is partitioned identically to A.

Now form the product matrix C = AB, which is also an n by n matrix. Since A and B are identically partitioned block diagonal matrices, per the previous post on multiplying block diagonal matrices we know that C is also a block diagonal matrix, identically partitioned to A and B, with each C_\alpha = A_{\alpha}B_{\alpha}:

C = \begin{bmatrix} C_{1}&0&\cdots&0 \\ 0&C_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&C_{q} \end{bmatrix} = \begin{bmatrix} A_{1}B_{1}&0&\cdots&0 \\ 0&A_{2}B_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A_{q}B_{q} \end{bmatrix}

But we have A_{\alpha}B_{\alpha} = I, 1 \le \alpha \le q, and therefore C_{\alpha} = I, 1 \le \alpha \le q. Since every submatrix C_{\alpha} has 1 on the diagonal and zero otherwise, the matrix C itself has 1 on the diagonal and zero otherwise, so that C = AB = I. The matrix B is therefore a right inverse for A.

Next form the product matrix D = BA, which is also an n by n block diagonal matrix, identically partitioned to A and B, with each D_\alpha = B_{\alpha}A_{\alpha}:

D = \begin{bmatrix} D_{1}&0&\cdots&0 \\ 0&D_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&D_{q} \end{bmatrix} = \begin{bmatrix} B_{1}A_{1}&0&\cdots&0 \\ 0&B_{2}A_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&B_{q}A_{q} \end{bmatrix}

But we have B_{\alpha}A_{\alpha} = I, 1 \le \alpha \le q, and therefore D_{\alpha} = I, 1 \le \alpha \le q. Since every submatrix D_{\alpha} has 1 on the diagonal and zero otherwise, the matrix D itself has 1 on the diagonal and zero otherwise, so that D = BA = I. The matrix B is therefore a left inverse for A.

Since B is both a left and a right inverse for A, B is therefore the inverse A^{-1} of A. From the way B was constructed we then have

A^{-1} = B = \begin{bmatrix} B_{1}&0&\cdots&0 \\ 0&B_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&B_{q} \end{bmatrix} = \begin{bmatrix} A^{-1}_{1}&0&\cdots&0 \\ 0&A^{-1}_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A^{-1}_{q} \end{bmatrix}

Combining the results of (a) and (b) above, we conclude that if A is a block diagonal matrix with q submatrices on the diagonal then A is invertible if and only if A_{\alpha} is invertible for 1 \le \alpha \le q. In this case A^{-1} is also a block diagonal matrix, identically partitioned to A, with (A^{-1})_{\alpha} = (A_{\alpha})^{-1}.
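
As an optional numerical illustration of this result, here is a short NumPy/SciPy sketch using two small hypothetical diagonal blocks (scipy.linalg.block_diag simply assembles a block diagonal matrix from the given blocks):

    import numpy as np
    from scipy.linalg import block_diag

    A1 = np.array([[2., 1.],
                   [1., 1.]])   # invertible 2 by 2 block
    A2 = np.array([[3.]])       # invertible 1 by 1 block
    A = block_diag(A1, A2)

    # the inverse is block diagonal with the inverses of the blocks
    B = block_diag(np.linalg.inv(A1), np.linalg.inv(A2))
    print(np.allclose(A @ B, np.eye(3)))     # True
    print(np.allclose(np.linalg.inv(A), B))  # True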

UPDATE: Corrected two instances where I referred to the matrix B as a left inverse of A instead of a right inverse, and vice versa.


Multiplying block diagonal matrices

In a previous post I discussed the general problem of multiplying block matrices (i.e., matrices partitioned into multiple submatrices). I then discussed block diagonal matrices (i.e., block matrices in which the off-diagonal submatrices are zero) and in a multipart series of posts showed that we can uniquely and maximally partition any square matrix into block diagonal form. (See part 1, part 2, part 3, part 4, and part 5.) With this as background I now discuss the general problem of multiplying two block diagonal matrices. In particular I want to prove the following claim:

If A and B are n by n square matrices identically partitioned into block diagonal form:

A = \begin{bmatrix} A_{1}&0&\cdots&0 \\ 0&A_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A_{q} \end{bmatrix} \qquad B = \begin{bmatrix} B_{1}&0&\cdots&0 \\ 0&B_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&B_{q} \end{bmatrix}

then their product C = AB is also a block diagonal matrix, identically partitioned to A and B:

C = \begin{bmatrix} C_{1}&0&\cdots&0 \\ 0&C_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&C_{q} \end{bmatrix}

with C_{\alpha} = A_{\alpha}B_{\alpha}.

Proof: Let A and B be n by n square matrices identically partitioned into block diagonal form with q row and column partitions. In our framework identically partitioned means that the q partitions of A and B can be described by a partition vector u of length q+1, with both A_{\alpha} and B_{\alpha} containing u_{\alpha+1} - u_{\alpha} rows and columns.

From the previous discussion on multiplying block matrices we know that the n by n matrix product C = AB can be described as a block matrix with q row partitions and q column partitions:

C = \begin{bmatrix} C_{11}&C_{12}&\cdots&C_{1q} \\ C_{21}&C_{22}&\cdots&C_{2q} \\ \vdots&\vdots&\ddots&\vdots \\ C_{q1}&C_{q2}&\cdots&C_{qq} \end{bmatrix}

with submatrices computed as follows:

C_{\alpha \beta} = \sum_{\gamma = 1}^{q} A_{\alpha \gamma} B_{\gamma \beta}

Note that since A_{\alpha \gamma} contains u_{\alpha+1} - u_\alpha rows and u_{\gamma+1} - u_\gamma columns, and B_{\gamma \beta} contains u_{\gamma+1} - u_\gamma rows and u_{\beta+1} - u_\beta columns, C_{\alpha \beta} contains u_{\alpha+1} - u_\alpha rows and u_{\beta+1} - u_\beta columns.

We can rewrite the above expression for  C_{\alpha \beta} as follows:

C_{\alpha \beta} = \sum_{\gamma = 1}^{\alpha-1} A_{\alpha \gamma} B_{\gamma \beta} + A_{\alpha \alpha}B_{\alpha \beta} + \sum_{\gamma = \alpha+1}^{q} A_{\alpha \gamma} B_{\gamma \beta}

For both sums we have \gamma \ne \alpha for all terms in the sums, and since A is in block diagonal form we have A_{\alpha \gamma} = 0 for all terms in the sums, so that

C_{\alpha \beta} = \sum_{\gamma = 1}^{\alpha-1} 0 \cdot B_{\gamma \beta} + A_{\alpha \alpha}B_{\alpha \beta} + \sum_{\gamma = \alpha+1}^{q} 0 \cdot B_{\gamma \beta} = A_{\alpha \alpha}B_{\alpha \beta}

Since B is also in block diagonal form, if \alpha \ne \beta we have B_{\alpha \beta} = 0 and

C_{\alpha \beta} = A_{\alpha \alpha}B_{\alpha \beta} = A_{\alpha \alpha} \cdot 0 = 0

Since C_{\alpha \beta} = 0 if \alpha \ne \beta, C is also in block diagonal form.

We then have C_{\alpha \alpha} = A_{\alpha \alpha}B_{\alpha \alpha} or in our shorthand notation C_{\alpha} = A_{\alpha}B_{\alpha} so that

C = AB = \begin{bmatrix} C_{1}&0&\cdots&0 \\ 0&C_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&C_{q} \end{bmatrix} = \begin{bmatrix} A_{1}B_{1}&0&\cdots&0 \\ 0&A_{2}B_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A_{q}B_{q} \end{bmatrix}

Note that if A and B are in maximal block diagonal form with only one partition then A = A_1 and B = B_1 so that this reduces to C_1 = A_1B_1 = AB = C.

On the other hand, if A and B are in maximal block diagonal form with n partitions, such that

A = \begin{bmatrix} a_1&0&\cdots&0 \\ 0&a_2&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&a_n \end{bmatrix} \qquad B = \begin{bmatrix} b_1&0&\cdots&0 \\ 0&b_2&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&b_n \end{bmatrix}

then A_\alpha = \begin{bmatrix} a_{\alpha} \end{bmatrix} and B_\alpha = \begin{bmatrix} b_{\alpha} \end{bmatrix} so that

C = AB = \begin{bmatrix} C_{1}&0&\cdots&0 \\ 0&C_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&C_{n} \end{bmatrix} = \begin{bmatrix} a_1b_1&0&\cdots&0 \\ 0&a_2b_2&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&a_nb_n \end{bmatrix}
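
As an optional numerical illustration of the claim, here is a short NumPy/SciPy sketch with small hypothetical blocks:

    import numpy as np
    from scipy.linalg import block_diag

    A1, A2 = np.array([[1., 2.], [3., 4.]]), np.array([[5.]])
    B1, B2 = np.array([[0., 1.], [1., 0.]]), np.array([[2.]])

    A = block_diag(A1, A2)   # identically partitioned block diagonal matrices
    B = block_diag(B1, B2)

    # the product is block diagonal with diagonal blocks A_alpha B_alpha
    print(np.allclose(A @ B, block_diag(A1 @ B1, A2 @ B2)))  # True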

In my next post I discuss inverting block diagonal matrices.


Partitioning a matrix into block diagonal form, part 5

In previous posts in this series I informally explored how to partition a matrix into block diagonal form (part 1), described a method for constructing a partitioning vector v for an n by n square matrix A (part 2), showed that the vector thus constructed partitions A into block diagonal form (part 3), and proved a preliminary result about the relationship between that vector and any other vector partitioning A into block diagonal form (part 4). In this post I complete the task of showing that the vector v is maximal and unique: A cannot also be partitioned as a block diagonal matrix with s partitions, where s > r, and if we partition A into block diagonal form with r partitions using a vector w then we have w = v.

In the previous posts I showed that if we have a vector w that also partitions A into block diagonal form, using s partitions:

A = \begin{bmatrix} A'_{1}&0&\cdots&0 \\ 0&A'_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A'_{s} \end{bmatrix}

then for all \alpha where 1 \le \alpha \le s+1 we have w_{\alpha} = v_{\beta} for some \beta where 1 \le \beta \le r+1. In other words, any partition boundary specified by w is also a partition boundary specified by v.

Note in addition that if w_{\alpha} = v_{\beta} for some \alpha and \beta, and also w_{\gamma} = v_{\delta} for some \gamma \ne \alpha, then \delta \ne \beta. This follows from the fact that w and v are partition vectors: Assume that \alpha < \gamma; then we have w_{\alpha} < w_{\gamma} and thus v_{\beta} < v_{\delta}, which in turn implies that \beta < \delta. (The proof of this last statement is by contradiction: If we instead had \beta \ge \delta then because v is a partition vector we would have v_{\beta} \ge v_{\delta}.) Similarly \alpha > \gamma implies that \beta > \delta. So each entry w_{\alpha} of w corresponds to a unique entry v_{\beta} of v.

I’m now ready to prove that v is a maximal partitioning of A into block diagonal form with r partitions. Assume that w also partitions A into block diagonal form with s partitions. We know from above that each entry w_{\alpha} of w corresponds to a unique entry v_{\beta} of v. But there are only r+1 entries in v, so that the number of entries s+1 in w must be less than or equal to r+1. (Otherwise two or more entries of w would be equal to the same entry of v.) We thus have s \le r. The vector v is therefore a maximal partitioning of A into block diagonal form.

My final task is to prove that v is a unique partitioning of A into block diagonal form with r partitions. Suppose that w also partitions A into block diagonal form with r partitions:

A = \begin{bmatrix} A'_{1}&0&\cdots&0 \\ 0&A'_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A'_{r} \end{bmatrix}

Since w has r+1 entries, each equal to a unique entry of v, and v also has r+1 entries, each of the entries of v is in turn equal to an entry of w. In other words, for all \delta, where 1 \le \delta \le r+1, there exists a \gamma such that v_{\delta} = w_{\gamma}. (The proof of this is by contradiction: Suppose that for some \delta there were no \gamma such that v_{\delta} = w_{\gamma}. Then since w has r+1 entries, the same number as v, at least two of the entries of w would have to be equal to the same entry of v, and hence equal to each other. But this is impossible since w is a partition vector.)

Moreover, if v_{\delta} = w_{\gamma} for some \delta and \gamma, and v_{\beta} = w_{\alpha} for some \beta \ne \delta, then \gamma \ne \alpha. (The proof of this again follows directly from the fact that v and w are partition vectors: Assume without loss of generality that \delta < \beta; then we have v_{\delta} < v_{\beta} and thus w_{\gamma} < w_{\alpha}, which in turn implies that \gamma < \alpha.) So each entry v_{\delta} of v corresponds to a unique entry w_{\gamma} of w.

I now claim that w_{\alpha} = v_{\alpha} for 1 \le \alpha \le r+1. We already have w_1 = v_1 by the definition of a partition vector. Now suppose w_{\alpha} = v_{\alpha} for some \alpha, and consider w_{\alpha +1} and v_{\alpha +1} (assuming they exist, that is, \alpha \le r).

We know that w_{\alpha +1} = v_{\beta} for some \beta where 1 \le \beta \le r+1. Suppose \beta < \alpha+1. Then we would have \beta \le \alpha and thus w_{\alpha +1} = v_{\beta} \le v_{\alpha} = w_{\alpha}. But this cannot be, since w is a partition vector and we are guaranteed that w_{\alpha +1} > w_{\alpha}. So we must have \beta \ge \alpha+1. We also know that v_{\alpha+1} = w_{\gamma} for some \gamma where 1 \le \gamma \le r+1. By a similar argument we must also have \gamma \ge \alpha+1.

Now since \gamma \ge \alpha+1 we have w_{\gamma} \ge w_{\alpha+1}, and since \beta \ge \alpha+1 we have v_{\beta} \ge v_{\alpha+1}. But we also have v_{\beta} = w_{\alpha+1} \le w_{\gamma} = v_{\alpha+1}. Since we have both  v_{\beta} \ge v_{\alpha+1} and v_{\beta} \le v_{\alpha+1} we must have v_{\beta} = v_{\alpha+1}. This in turn implies that \beta = \alpha+1 and thus w_{\alpha+1} = v_{\beta} = v_{\alpha+1}.

We thus have w_1 = v_1, and given w_{\alpha} = v_{\alpha} for 1 \le \alpha \le r we have w_{\alpha+1} = v_{\alpha+1}. By induction we have w_{\alpha} = v_{\alpha} for 1 \le \alpha \le r+1. Since both w and v have only r+1 entries we therefore have w = v.

Since w partitions A into block diagonal form with r partitions, and since w = v as just shown, for any diagonal submatrix A'_{\alpha} where 1 \le \alpha \le r we have

A'_{\alpha} = \begin{bmatrix} a_{w_\alpha+1, w_\alpha+1}&a_{w_\alpha+1,w_\alpha + 2}&\cdots&a_{w_\alpha+1,w_{\alpha+1}} \\ a_{w_\alpha+2, w_\alpha+1}&a_{w_\alpha+2,w_\alpha+2}&\cdots&a_{w_\alpha+2,w_{\alpha+1}} \\ \vdots&\vdots&\ddots&\vdots \\ a_{w_{\alpha+1}, w_\alpha+1}&a_{w_{\alpha+1},w_\alpha+2}&\cdots&a_{w_{\alpha+1},w_{\alpha+1}} \end{bmatrix}

= \begin{bmatrix} a_{v_\alpha+1, v_\alpha+1}&a_{v_\alpha+1,v_\alpha + 2}&\cdots&a_{v_\alpha+1,v_{\alpha+1}} \\ a_{v_\alpha+2, v_\alpha+1}&a_{v_\alpha+2,v_\alpha+2}&\cdots&a_{v_\alpha+2,v_{\alpha+1}} \\ \vdots&\vdots&\ddots&\vdots \\ a_{v_{\alpha+1}, v_\alpha+1}&a_{v_{\alpha+1},v_\alpha+2}&\cdots&a_{v_{\alpha+1},v_{\alpha+1}} \end{bmatrix} = A_{\alpha}

Since A'_{\alpha} = A_{\alpha} for 1 \le \alpha \le r, the vector v uniquely partitions A into block diagonal form with r partitions. Combined with the result above this concludes the proof: For any n by n square matrix A we can uniquely and maximally partition A into block diagonal form with one or more diagonal submatrices, using a vector v constructed according to the method outlined in part 2 of this series.


Partitioning a matrix into block diagonal form, part 4

In my previous two posts I described a method for constructing a partitioning vector v for an n by n square matrix A (part 2 of this series) and showed that the vector thus constructed partitions A into block diagonal form (part 3). In this post I begin the task of showing that the vector v is maximal and unique: A cannot also be partitioned as a block diagonal matrix with s partitions, where s > r, and if we partition A into block diagonal form with r partitions using a vector w then we have w = v.

In the previous posts I showed that we can construct a vector v that will partition A into block diagonal form with r partitions (where r is one less than the length of v), such that

A = \begin{bmatrix} A_{1}&0&\cdots&0 \\ 0&A_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A_{r} \end{bmatrix}

Now suppose that we have a vector w that also partitions A into block diagonal form, using s partitions:

A = \begin{bmatrix} A'_{1}&0&\cdots&0 \\ 0&A'_{2}&\cdots&0 \\ \vdots&\vdots&\ddots&\vdots \\ 0&0&\cdots&A'_{s} \end{bmatrix}

My approach in this post is to try to find some sort of connection between w and v. In particular, I claim that for all \alpha where 1 \le \alpha \le s+1 we have w_{\alpha} = v_{\beta} for some \beta where 1 \le \beta \le r+1. In other words, any partition boundary specified by w is also a partition boundary specified by v.

My proof is by induction. Since w and v are both partition vectors we already have w_1 = v_1 = 0 by definition. The claim therefore holds true for \alpha = 1 (with \beta = 1 in this case). Suppose that for some \alpha we have w_{\alpha} = v_{\beta} for some \beta (with 1 \le \beta \le r+1), and consider w_{\alpha+1}. We want to show that w_{\alpha+1} = v_{\gamma} for some \gamma where 1 \le \gamma \le r+1.

We can select \gamma > \beta such that 1 \le \gamma \le r+1 and w_{\alpha} \le v_{\gamma-1} < w_{\alpha+1} \le v_{\gamma}. This is done as follows: We have w_{\alpha+1} \le n = v_{r+1} so we know that w_{\alpha+1} \le v_k for at least one k (i.e., k = r+1). We select \gamma to be the smallest k less than or equal to r+1 for which w_{\alpha+1} \le v_k. By the definition of \gamma and the fact that v_{\gamma-1} < v_{\gamma} we then conclude that w_{\alpha+1} > v_{\gamma-1}. (If we had w_{\alpha+1} \le v_{\gamma-1} then \gamma would not be the smallest k less than or equal to r+1 for which w_{\alpha+1} \le v_k.) We also have \gamma > \beta (since v_{\beta} = w_{\alpha} < w_{\alpha+1} and v is increasing, no index k \le \beta can satisfy w_{\alpha+1} \le v_k), which means that \gamma - 1 \ge \beta. We then have v_{\gamma-1} \ge v_\beta = w_{\alpha}. Finally, \gamma - 1 \ge \beta and \beta \ge 1 imply that \gamma \ge 2, so that \gamma - 1 \ge 1 and v_{\gamma-1} is a valid entry of v. The result is that \gamma has the properties noted above.

From the above we see that w_{\alpha+1} \le v_{\gamma}. I now claim that w_{\alpha+1} = v_{\gamma}. The proof is by contradiction. Suppose instead that w_{\alpha+1} < v_{\gamma}. The submatrix A'_{\alpha} includes entries from rows w_\alpha + 1 through w_{\alpha+1} of A and columns w_\alpha + 1 through w_{\alpha+1} of A. Since w partitions A into block diagonal form we know that any submatrices to the right of and below A'_{\alpha} are zero (if such submatrices exist).

The requirement that all submatrices to the right of A'_{\alpha} be zero then means that we must have a_{ij} = 0 if w_{\alpha} + 1 \le i \le w_{\alpha+1} and j > w_{\alpha+1}, and the requirement that all submatrices below A'_{\alpha} be zero means that we must also have a_{ij} = 0 if i > w_{\alpha+1} and w_{\alpha} + 1 \le j \le w_{\alpha+1}.

From the way we selected \gamma we have w_{\alpha} \le v_{\gamma-1} < w_{\alpha+1}, which means that we also have a_{ij} = 0 if v_{\gamma-1} + 1 \le i \le w_{\alpha+1} and j > w_{\alpha+1}, and a_{ij} = 0 if i > w_{\alpha+1} and v_{\gamma-1} + 1 \le j \le w_{\alpha+1}.

But recall from the definition of v that v_{\gamma} was chosen to be the smallest k such that a_{ij} = 0 if either v_{\gamma-1} + 1 \le i \le k and j > k, or if i > k and v_{\gamma-1} + 1 \le j \le k; alternatively v_{\gamma} was set to n if no such k existed. In the former case we must have v_{\gamma} \le w_{\alpha+1}, which contradicts our assumption that v_{\gamma} > w_{\alpha+1}. In the latter case our assumption that v_{\gamma} > w_{\alpha+1} implies that there is indeed a k fulfilling the criteria above, namely k = w_{\alpha+1}, which contradicts our assumption about how v was defined.

We therefore conclude that given w_{\alpha} = v_{\beta} for some \beta (where 1 \le \beta \le r+1), there exists \gamma such that w_{\alpha+1} = v_{\gamma} (where 1 \le \gamma \le r+1). Combined with the fact that this is true for \alpha = 1, we conclude that for any \alpha (where 1 \le \alpha \le s+1) there exists \beta such that 1 \le \beta \le r+1 and w_{\alpha} = v_{\beta}, so that any partition boundary specified by w is also a partition boundary specified by v.

In part 5 of this series I use this result to complete the task of showing that the vector v is maximal and unique.


Partitioning a matrix into block diagonal form, part 3

In part 2 of this series I described a method to construct a vector v that partitions an n by n square matrix A into r row and column partitions. In this post I show that the vector v thus constructed partitions A into block diagonal form.

For the case where r = 1 and v = \begin{bmatrix} 0&n \end{bmatrix} we have A = A_{11} (as previously discussed) and A is in block diagonal form by the convention we’ve adopted.

Suppose r > 1, i.e., A has been partitioned into at least two partitions by the vector v. We consider A_{\alpha \beta} where \alpha \ne \beta, beginning with the case where \alpha < \beta (i.e., we are in the upper right part of the matrix, and the submatrix A_{\alpha \beta} is to the right of submatrix A_{\alpha \alpha} on the diagonal).

Since v defines a valid partitioning of A we have

A_{\alpha \beta} = \begin{bmatrix} a_{v_\alpha+1, v_\beta+1}&a_{v_\alpha+1,v_\beta + 2}&\cdots&a_{v_\alpha+1,v_{\beta+1}} \\ a_{v_\alpha+2, v_\beta+1}&a_{v_\alpha+2,v_\beta+2}&\cdots&a_{v_\alpha+2,v_{\beta+1}} \\ \vdots&\vdots&\ddots&\vdots \\ a_{v_{\alpha+1}, v_\beta+1}&a_{v_{\alpha+1},v_\beta+2}&\cdots&a_{v_{\alpha+1},v_{\beta+1}} \end{bmatrix}

Since \alpha < \beta (as assumed) and \beta \le r (because \beta references a partition of the columns of A, and the columns of A have been divided into r partitions), we have \alpha < r. That means that v_{\alpha+1} was not assigned on the last step of the process described in the previous post (that step being the r^{th} step, and the value assigned being v_{r+1}), and that in turn means that v_{\alpha+1} was chosen so that a_{ij} = 0 if v_{\alpha} + 1 \le i \le v_{\alpha+1} and j > v_{\alpha+1}.

As noted above, the entries in the rows of A_{\alpha \beta} are drawn from  rows v_{\alpha} + 1 through v_{\alpha+1} of A. Since \alpha < \beta we also have

\beta \ge \alpha+1 \rightarrow v_{\beta} \ge v_{\alpha+1} \rightarrow v_{\beta} + 1 > v_{\alpha+1}

All entries of A_{\alpha \beta} are thus of the form a_{ij} where v_{\alpha} + 1 \le i \le v_{\alpha+1} (from the definition of A_{\alpha \beta} above) and j > v_{\alpha+1} (since the first column of A_{\alpha \beta} is taken from column v_{\beta} + 1 of A, and v_{\beta} + 1 > v_{\alpha+1}). As noted above all such entries are zero by the definition of v_{\alpha+1}. We therefore have A_{\alpha \beta} = 0 if \alpha < \beta.

We next consider the case where \alpha > \beta (i.e., we are in the lower left part of the matrix, and the submatrix A_{\alpha \beta} is below the submatrix A_{\beta \beta} on the diagonal). Since \beta < \alpha (as assumed) and \alpha \le r (because \alpha references a partition of the rows of A, and the rows of A have been divided into r partitions), we have \beta < r. That means that v_{\beta+1} was not assigned on the last step of the process described above, and that in turn means that v_{\beta}  and v_{\beta+1} were chosen so that a_{ij} = 0 if i > v_{\beta+1} and v_{\beta} + 1 \le j \le v_{\beta+1}.

As noted above, the entries in the columns of A_{\alpha \beta} are drawn from  columns v_{\beta} + 1 through v_{\beta+1} of A. Since \alpha > \beta we also have

\alpha \ge \beta+1 \rightarrow v_{\alpha} \ge v_{\beta+1} \rightarrow v_{\alpha} + 1 > v_{\beta+1}

All entries of A_{\alpha \beta} are thus of the form a_{ij} where v_{\beta} + 1 \le j \le v_{\beta+1} (from the definition of A_{\alpha \beta} above) and i > v_{\beta+1} (since the first row of A_{\alpha \beta} is taken from row v_{\alpha} + 1 of A, and v_{\alpha} + 1 > v_{\beta+1}). As noted above all such entries are zero by the definition of v_{\beta+1}. We therefore have A_{\alpha \beta} = 0 if \alpha > \beta.

Since A_{\alpha \beta} = 0 if \alpha < \beta and also if \alpha > \beta we then have A_{\alpha \beta} = 0 if \alpha \ne \beta, so that A as partitioned by v is in block diagonal form if r > 1.

Since by convention A is also in block diagonal form when r = 1, the vector v as constructed in the previous post always partitions A into block diagonal form. (For r = 1 the vector v would contain only the entries 0 and n.)

In part 4 of this series I take up the problem of showing that v uniquely and maximally partitions A into block diagonal form with r partitions.


Partitioning a matrix into block diagonal form, part 2

In part 1 of this series I outlined an informal method to partition a square matrix into block diagonal form. In this post I provide a formal description of such a method, applicable to any n by n square matrix. In doing this I use the concept of a partitioning vector introduced in my original post on multiplying block matrices.

(Note that it’s possible that a better approach to partitioning exists, not to mention better notation. However this is the best I’ve been able to come up with.)

Let A be an n by n matrix. Following the discussion in my original post, we can divide the n rows and n columns of A into q partitions, where 1 \le q \le n, using a vector u of nonnegative integer values defined as follows:

u = \begin{bmatrix} u_1&u_2&\cdots&u_{q+1} \end{bmatrix}

u_1 = 0, u_{q+1} = n

u_i < u_{i+1} \text{ for } 1 \le i \le q

where u_i + 1 is the first row and column in the i^{th} partition, u_{i+1} is the last row and column in the i^{th} partition, and u_{i+1} - u_i \ge 1 is the number of rows and columns in the i^{th} partition.

We then define q^2 submatrices of A, with a given submatrix defined in terms of the entries of A as follows:

A_{\alpha \beta} = \begin{bmatrix} a_{u_\alpha+1, u_\beta+1}&a_{u_\alpha+1,u_\beta + 2}&\cdots&a_{u_\alpha+1,u_{\beta+1}} \\ a_{u_\alpha+2, u_\beta+1}&a_{u_\alpha+2,u_\beta+2}&\cdots&a_{u_\alpha+2,u_{\beta+1}} \\ \vdots&\vdots&\ddots&\vdots \\ a_{u_{\alpha+1}, u_\beta+1}&a_{u_{\alpha+1},u_\beta+2}&\cdots&a_{u_{\alpha+1},u_{\beta+1}} \end{bmatrix}

so that the (i, j) entry of A_{\alpha \beta} has the value (A_{\alpha \beta})_{ij} = a_{u_\alpha+i,u_\beta+j}.

When q = 1 we have u = \begin{bmatrix} 0&n \end{bmatrix} and A = A_{11}. When q = n we have u = \begin{bmatrix} 0&1&\cdots&n \end{bmatrix} so that u_i = i-1, u_j = j-1 and A_{\alpha \beta} = \begin{bmatrix} a_{\alpha \beta} \end{bmatrix}. (For a more detailed discussion of why this is so please see the previous post on multiplying block matrices.)
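
As an aside, the indexing above translates directly into 0-based array slicing. Here is a minimal Python sketch (the function name is my own, not from the text) that reads the submatrix A_{\alpha \beta} off from A and the partition vector u:

    import numpy as np

    def block(A, u, alpha, beta):
        """Return A_{alpha, beta} given the partition vector u (alpha, beta are 1-based)."""
        return A[u[alpha - 1]:u[alpha], u[beta - 1]:u[beta]]

    A = np.arange(16).reshape(4, 4)
    u = [0, 1, 3, 4]           # q = 3 partitions of a 4 by 4 matrix
    print(block(A, u, 2, 2))   # rows 2-3 and columns 2-3 of A: [[5, 6], [9, 10]]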

We now attempt to partition A into block diagonal form by constructing a partitioning vector v as follows:

We first set v_1 = 0, so that the first submatrix on the diagonal, A_1, will start with row 1 and column 1. Then for \alpha = 1, 2, 3, \dotsc, in turn we calculate v_{\alpha + 1} by looking for the smallest k, with v_{\alpha} < k < n, such that

a_{ij} = 0 \text{ if } v_{\alpha} + 1 \le i \le k \text{ and } j > k

and

a_{ij} = 0 \text{ if } i > k \text{ and } v_{\alpha} + 1 \le j \le k

Note that the first property corresponds to having all zeroes to the right of the submatrix A_{\alpha} on the diagonal while the second property corresponds to having all zeroes below A_{\alpha}.

If no such k exists then we set v_{\alpha+1} = n and have completed partitioning the matrix A, with the number of partitions produced by the partitioning vector v then being r = \alpha.

Otherwise (i.e., if such a k does exist) we set v_{\alpha+1} = k, and repeat the process to find v_{\alpha+2}, and so on until the process terminates.

Note that we are guaranteed that the process described above terminates in no more than n steps. As described above, we start by setting v_1 = 0, and then for our first step we set \alpha = 1 and compute v_2. If in computing v_2 we cannot find a suitable k then the process terminates in one step as we set v_2 = n, with the resulting vector v = \begin{bmatrix} 0&n \end{bmatrix}.

If, on the other hand, we are able to find a suitable k then by the definition of k we have v_2 = k > v_1 and therefore v_2 \ge v_1 + 1 = 1. We then repeat the process with \alpha = 2 to find v_3 and by similar reasoning have v_3 \ge v_2 + 1 and thus v_3 \ge 2. In general we see that v_{\alpha} \ge \alpha - 1.

Suppose that the process has not terminated when we come to \alpha = n (i.e., the n^{th} step in the process). From the previous paragraph we know that v_{n} \ge n-1. Substituting into the above conditions relating to k, in assigning v_{n+1} we are looking for k with n - 1 \le v_n < k < n where

a_{ij} = 0 \text{ if } v_n + 1 \le i \le k \text{ and } j > k

and

a_{ij} = 0 \text{ if } i > k \text{ and } v_n + 1 \le j \le k

Since k > v_n \ge n-1 we conclude that k \ge n, while the search requires k < n, so no suitable k can exist. (The two conditions above also impose no real constraint once k \ge n: since A is an n by n matrix there are no entries a_{ij} with i > n or j > n.) We therefore terminate the process and set v_{n+1} = n. We are thus guaranteed that the process of constructing the vector v will terminate in at most n steps.
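
As an optional illustration, here is a minimal Python sketch of the construction just described (the function name is my own, and A is assumed to be given as a NumPy array):

    import numpy as np

    def partition_vector(A):
        """Construct the partitioning vector v = [0, v_2, ..., n] described above."""
        n = A.shape[0]
        v = [0]
        while v[-1] < n:
            prev = v[-1]
            nxt = n  # if no suitable k < n is found, the current block extends to n
            for k in range(prev + 1, n):
                # zeroes to the right of rows prev+1..k and below columns prev+1..k?
                if np.all(A[prev:k, k:] == 0) and np.all(A[k:, prev:k] == 0):
                    nxt = k
                    break
            v.append(nxt)
        return v

    x = 7.0  # any nonzero value; this is the 5 by 5 example from part 1 of this series
    A = np.array([[0, x, 0, 0, 0],
                  [x, 0, 0, 0, 0],
                  [0, x, 0, 0, 0],
                  [0, 0, 0, 0, 0],
                  [0, 0, 0, 0, x]])
    print(partition_vector(A))  # [0, 3, 4, 5]: three diagonal blocks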

Also note that the vector v thus defined is a valid partition of A according to our definition above:

  • The first element v_1 = 0.
  • All elements of v are nonnegative integers.  (v_1 is zero as noted, and all the other elements v_i are chosen to be row and column indices of A, and hence are positive integers.)
  • We have v_i < v_{i+1} for all i. (This follows from the way v_{\alpha+1} was defined relative to v_{\alpha}.)
  • Since there are at most n steps after the initial assignment of v_1 = 0, the maximum length of v is n+1, and for the number of partitions r (which is one less than the length of v) we thus have 1 \le r \le n.

Having constructed the partitioning vector v, in my next post I show that v partitions A into block diagonal form.


Partitioning a matrix into block diagonal form, part 1

In my previous post I offered a more formal definition of a block diagonal matrix, and claimed that we can partition an arbitrary n by n square matrix A into block diagonal form in a way that is both maximal (i.e., no partitioning with more partitions exists) and unique (only one partitioning exists with that exact number of partitions). In this post I lay the groundwork for a formal proof of this by working through an example of partitioning such a matrix into block diagonal form.

Let’s start with an example matrix:

A = \begin{bmatrix} 0&x&0&0&0 \\ x&0&0&0&0 \\ 0&x&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&x \end{bmatrix}

and then consider alternative choices for the first partition:

\begin{bmatrix} 0&\vline&x&0&0&0 \\ \hline x&\vline&0&0&0&0 \\ 0&\vline&x&0&0&0 \\ 0&\vline&0&0&0&0 \\ 0&\vline&0&0&0&x \end{bmatrix} \quad \begin{bmatrix} 0&x&\vline&0&0&0 \\ x&0&\vline&0&0&0 \\ \hline 0&x&\vline&0&0&0 \\ 0&0&\vline&0&0&0 \\ 0&0&\vline&0&0&x \end{bmatrix} \quad \begin{bmatrix} 0&x&0&\vline&0&0 \\ x&0&0&\vline&0&0 \\ 0&x&0&\vline&0&0 \\ \hline 0&0&0&\vline&0&0 \\ 0&0&0&\vline&0&x \end{bmatrix}

In each alternative partition we move the partitioning lines down one row and right one column, each time considering a candidate for the first diagonal submatrix A_1. We stop at the point when there are only zeroes below and to the right of the candidate submatrix. Our first submatrix A_1 is then the 3 by 3 matrix in the upper left.

We then repeat the process, drawing new partitioning lines down one row and right one column:

\begin{bmatrix} 0&x&0&\vline&0&\vline&0 \\ x&0&0&\vline&0&\vline&0 \\ 0&x&0&\vline&0&\vline&0 \\ \hline 0&0&0&\vline&0&\vline&0 \\ \hline 0&0&0&\vline&0&\vline&x \end{bmatrix}

Since the 1 by 1 matrix containing zero as its only entry has only zeroes below it and to the right of it, we choose it as our second submatrix A_2.

At this point we cannot draw any further partitioning lines, so we choose as our last submatrix A_3 the 1 by 1 matrix in the lower right. We have thus partitioned A into block diagonal form with three submatrices on the diagonal.

Based on the process we followed it is apparent (or at least very plausible) that this is the only way to partition A into block diagonal form with three submatrices on the diagonal, and also that we could not have partitioned A into more than three submatrices on the diagonal and still have it be block diagonal according to the definition above.

Next we look at two extreme examples. In our first matrix the process of partitioning fails on the first step, in the sense that we cannot find suitable lines of partition for the first submatrix. No matter the choice of partition lines, the candidate submatrix never has only zeroes below it or to the right of it:

\begin{bmatrix} 0&\vline&x&0&0&0 \\ \hline x&\vline&0&0&0&0 \\ 0&\vline&x&0&x&0 \\ 0&\vline&0&0&0&0 \\ 0&\vline&0&0&x&x \end{bmatrix} \quad \begin{bmatrix} 0&x&\vline&0&0&0 \\ x&0&\vline&0&0&0 \\ \hline 0&x&\vline&0&x&0 \\ 0&0&\vline&0&0&0 \\ 0&0&\vline&0&x&x \end{bmatrix} \quad \begin{bmatrix} 0&x&0&\vline&0&0 \\ x&0&0&\vline&0&0 \\ 0&x&0&\vline&x&0 \\ \hline 0&0&0&\vline&0&0 \\ 0&0&0&\vline&x&x \end{bmatrix} \quad \begin{bmatrix} 0&x&0&0&\vline&0 \\ x&0&0&0&\vline&0 \\ 0&x&0&x&\vline&0 \\ 0&0&0&0&\vline&0 \\ \hline 0&0&0&x&\vline&x \end{bmatrix}

We conclude that this matrix can be considered to be in block diagonal form, but with only a single partition; in other words, r = 1 and A = A_1. As with the previous example this partitioning appears to be maximal, in that we can see no way to partition the matrix into two submatrices and have it be in block diagonal form. The partitioning is also obviously unique.

With the next example matrix the very first pair of partitioning lines we try produces a suitable submatrix with zeroes to the right and below, and each subsequent pair of lines does as well:

\begin{bmatrix} 0&\vline&0&0&0&0 \\ \hline 0&\vline&x&0&0&0 \\ 0&\vline&0&x&0&0 \\ 0&\vline&0&0&0&0 \\ 0&\vline&0&0&0&x \end{bmatrix} \quad \begin{bmatrix} 0&\vline&0&\vline&0&0&0 \\ \hline 0&\vline&x&\vline&0&0&0 \\ \hline 0&\vline&0&\vline&x&0&0 \\ 0&\vline&0&\vline&0&0&0 \\ 0&\vline&0&\vline&0&0&x \end{bmatrix} \quad \begin{bmatrix} 0&\vline&0&\vline&0&\vline&0&0 \\ \hline 0&\vline&x&\vline&0&\vline&0&0 \\ \hline 0&\vline&0&\vline&x&\vline&0&0 \\ \hline 0&\vline&0&\vline&0&\vline&0&0 \\ 0&\vline&0&\vline&0&\vline&0&x \end{bmatrix} \quad \begin{bmatrix} 0&\vline&0&\vline&0&\vline&0&\vline&0 \\ \hline 0&\vline&x&\vline&0&\vline&0&\vline&0 \\ \hline 0&\vline&0&\vline&x&\vline&0&\vline&0 \\ \hline 0&\vline&0&\vline&0&\vline&0&\vline&0 \\ \hline 0&\vline&0&\vline&0&\vline&0&\vline&x \end{bmatrix}

This matrix is a diagonal matrix with all entries off the diagonal being zero. We conclude that every diagonal matrix is also a block diagonal matrix with the number of partitions r = n, and with each diagonal submatrix A_{\alpha} = \begin{bmatrix} a_{\alpha \alpha} \end{bmatrix}. The partitioning is clearly both maximal and unique, since we cannot have more than n partitions of the rows and columns of an n by n matrix, and only one way exists to partition an n by n matrix into n partitions.

In part 2 of this series I formalize the method used above to partition the example matrices into block diagonal form.
