Multiplying block matrices

While doing exercise 1.6.10 in Linear Algebra and Its Applications I was reminded of the general issue of multiplying block matrices, including block diagonal matrices. The same issue came up in exercise 1.4.24, which I answered without necessarily fully understanding the problem. I therefore wanted to go back and try to prove various claims about block matrices, starting with the following:

Let A and B be an m by p matrix and a p by n matrix, respectively. Suppose we partition A into submatrices with q row partitions and s column partitions:

A = \begin{bmatrix} A_{11}&A_{12}&\cdots&A_{1s} \\ A_{21}&A_{22}&\cdots&A_{2s} \\ \vdots&\vdots&\ddots&\vdots \\ A_{q1}&A_{q2}&\cdots&A_{qs} \end{bmatrix}

and we partition B into submatrices with s row partitions and r column partitions:

B = \begin{bmatrix} B_{11}&B_{12}&\cdots&B_{1r} \\ B_{21}&B_{22}&\cdots&B_{2r} \\ \vdots&\vdots&\ddots&\vdots \\ B_{s1}&B_{s2}&\cdots&B_{sr} \end{bmatrix}

Show that the m by n matrix product C = AB can be described as a block matrix with q row partitions and r column partitions:

C = \begin{bmatrix} C_{11}&C_{12}&\cdots&C_{1r} \\ C_{21}&C_{22}&\cdots&C_{2r} \\ \vdots&\vdots&\ddots&\vdots \\ C_{q1}&C_{q2}&\cdots&C_{qr} \end{bmatrix}

with submatrices computed as follows:

C_{\alpha \beta} = \sum_{\gamma = 1}^{s} A_{\alpha \gamma} B_{\gamma \beta}
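
Before proceeding to the proof, here is a quick numerical sanity check of the claim. This is my own NumPy sketch, with arbitrary sizes and partition boundaries; the vectors u, w, and v hold the partition boundaries in the same way as in the proof below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions and random test matrices (arbitrary choices).
m, p, n = 5, 7, 4
A = rng.standard_normal((m, p))
B = rng.standard_normal((p, n))

# Partition boundaries: u splits the m rows of A, w splits the shared
# dimension p, and v splits the n columns of B.
u = [0, 2, 5]      # q = 2 row partitions
w = [0, 3, 4, 7]   # s = 3 partitions of the shared dimension
v = [0, 1, 4]      # r = 2 column partitions

q, s, r = len(u) - 1, len(w) - 1, len(v) - 1
C = A @ B

for alpha in range(q):
    for beta in range(r):
        # C_{alpha beta} should equal the sum over gamma of
        # A_{alpha gamma} B_{gamma beta}.
        block = sum(A[u[alpha]:u[alpha + 1], w[g]:w[g + 1]] @
                    B[w[g]:w[g + 1], v[beta]:v[beta + 1]]
                    for g in range(s))
        assert np.allclose(block, C[u[alpha]:u[alpha + 1], v[beta]:v[beta + 1]])
```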

Proof: (Note that I’m sure a better proof of this exists, not to mention better notation. However, this is the best I can come up with.)

We define a block matrix within A as follows: First, we divide the m rows of A into q partitions, where 1 \le q \le m. The partition is done using a vector u of nonnegative integer values defined as follows:

u = \begin{bmatrix} u_1&u_2&\cdots&u_{q+1} \end{bmatrix}

u_1 = 0, u_{q+1} = m

u_i < u_{i+1}, 1 \le i \le q

This can be interpreted as follows: u_i + 1 is the first row in the i^{th} partition, u_{i+1} is the last row in the i^{th} partition, and u_{i+1} - u_i \ge 1 is the number of rows in the i^{th} partition.
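
For example (numbers of my own choosing), with m = 5 and q = 2 we could take u = \begin{bmatrix} 0&2&5 \end{bmatrix}, putting rows 1 and 2 in the first partition and rows 3 through 5 in the second. A minimal Python sketch of the conditions on u:

```python
def is_partition_vector(u, total):
    """Check that u_1 = 0, u_{q+1} = total, and u_i < u_{i+1}."""
    return (u[0] == 0 and u[-1] == total
            and all(u[i] < u[i + 1] for i in range(len(u) - 1)))

u = [0, 2, 5]   # q = 2 partitions of m = 5 rows
assert is_partition_vector(u, 5)

# Partition i (1-based) has u[i] - u[i-1] rows; here the sizes are [2, 3].
sizes = [u[i] - u[i - 1] for i in range(1, len(u))]
```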

We then divide the p columns of A into s partitions, where 1 \le s \le p, using a vector w of nonnegative integer values defined as follows:

w = \begin{bmatrix} w_1&w_2&\cdots&w_{s+1} \end{bmatrix}

w_1 = 0, w_{s+1} = p

w_i < w_{i+1}, 1 \le i \le s

The vector w has a similar interpretation to the vector u: w_i + 1 is the first column in the i^{th} partition, w_{i+1} is the last column in the i^{th} partition, and w_{i+1} - w_i \ge 1 is the number of columns in the i^{th} partition.

We then define q times s submatrices of A, with a given submatrix defined in terms of the entries of A as follows:

A_{\alpha \beta} = \begin{bmatrix} a_{u_\alpha+1, w_\beta+1}&a_{u_\alpha+1,w_\beta + 2}&\cdots&a_{u_\alpha+1,w_{\beta+1}} \\ a_{u_\alpha+2, w_\beta+1}&a_{u_\alpha+2,w_\beta+2}&\cdots&a_{u_\alpha+2,w_{\beta+1}} \\ \vdots&\vdots&\ddots&\vdots \\ a_{u_{\alpha+1}, w_\beta+1}&a_{u_{\alpha+1},w_\beta+2}&\cdots&a_{u_{\alpha+1},w_{\beta+1}} \end{bmatrix}

so that the (i, j) entry of A_{\alpha \beta} has the value

(A_{\alpha \beta})_{ij} = a_{u_\alpha+i,w_\beta+j}
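
In zero-based array terms, A_{\alpha \beta} is just a contiguous slice of A. The following sketch (the function name is my own) takes \alpha and \beta as 1-based indices to match the notation above:

```python
import numpy as np

def submatrix(A, u, w, alpha, beta):
    """Return A_{alpha beta}: rows u_alpha+1 .. u_{alpha+1} and columns
    w_beta+1 .. w_{beta+1} in 1-based terms, i.e. a zero-based slice."""
    return A[u[alpha - 1]:u[alpha], w[beta - 1]:w[beta]]

A = np.arange(12).reshape(3, 4)
# Rows 2..3 and columns 1..2 of A (1-based): [[4, 5], [8, 9]].
print(submatrix(A, [0, 1, 3], [0, 2, 4], 2, 1))

# The 1-based (i, j) entry of the block is A[u[alpha-1] + i - 1, w[beta-1] + j - 1],
# matching (A_{alpha beta})_{ij} = a_{u_alpha + i, w_beta + j}.
```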

Note that when q = s = 1 we have

u = \begin{bmatrix} 0&m \end{bmatrix}, w = \begin{bmatrix} 0&p \end{bmatrix}

so that

A_{11} = \begin{bmatrix} a_{u_1+1,w_1+1}&a_{u_1+1,w_1 + 2}&\cdots&a_{u_1+1,w_2 } \\ a_{u_1 + 2, w_1+1}&a_{u_1 + 2,w_1 + 2}&\cdots&a_{u_1 + 2,w_2} \\ \vdots&\vdots&\ddots&\vdots \\ a_{u_2, w_1+1}&a_{u_2,w_1+2}&\cdots&a_{u_2,w_2} \end{bmatrix}

= \begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1p} \\ a_{21}&a_{22}&\cdots&a_{2p} \\ \vdots&\vdots&\ddots&\vdots \\ a_{m1}&a_{m2}&\cdots&a_{mp} \end{bmatrix} = A

In other words, in this case A has only a single submatrix, namely A itself.

Also note that when q = m and s = p we have

u = \begin{bmatrix} 0&1&\cdots&m \end{bmatrix}, w = \begin{bmatrix} 0&1&\cdots&p \end{bmatrix}

so that

u_i = i-1, w_j = j-1

We then have

A_{\alpha \beta} = \begin{bmatrix} a_{u_\alpha+1, w_\beta+1}&\cdots&a_{u_\alpha+1,w_{\beta+1}} \\ \vdots&\ddots&\vdots \\ a_{u_{\alpha+1}, w_\beta+1}&\cdots&a_{u_{\alpha+1},w_{\beta+1}} \end{bmatrix}

= \begin{bmatrix} a_{(\alpha-1)+1,(\beta-1)+1}&\cdots&a_{(\alpha-1)+1,(\beta+1)-1} \\ \vdots&\ddots&\vdots \\ a_{(\alpha+1)-1,(\beta-1)+1}&\cdots&a_{(\alpha+1)-1,(\beta+1)-1} \end{bmatrix} = \begin{bmatrix} a_{\alpha \beta} \end{bmatrix}

In other words, in this case A has m times p submatrices, each containing a single entry of A.

We next define a block matrix within B as follows: First, we divide the p rows of B into s partitions, using the same vector w used to partition the columns of A.

(Note: It is important that the rows of B be partitioned in the exact same manner as the columns of A, and not just that the number of partitions s be the same. Otherwise we won’t be able to multiply submatrices of A by the corresponding submatrices of B, because the number of columns in a submatrix of A won’t necessarily match the number of rows in the corresponding submatrix of B. Unfortunately some references don’t make this clear, including at one point the block matrix article at Wikipedia.)
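
To make the failure mode concrete, here is a small hypothetical example of my own: the columns of A and the rows of B are each split into s = 2 pieces, but at different boundaries, so the corresponding sub-products are not even defined.

```python
import numpy as np

A = np.ones((4, 6))
B = np.ones((6, 3))

cols_A = [0, 2, 6]   # columns of A split as 2 + 4
rows_B = [0, 4, 6]   # rows of B split as 4 + 2 -- same s, different boundaries

A11 = A[:, cols_A[0]:cols_A[1]]   # 4 x 2
B11 = B[rows_B[0]:rows_B[1], :]   # 4 x 3
try:
    A11 @ B11                     # inner dimensions 2 and 4 do not match
except ValueError as e:
    print("shape mismatch:", e)
```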

We then divide the n columns of B into r partitions, where 1 \le r \le n, using a vector v of nonnegative integer values defined as follows:

v = \begin{bmatrix} v_1&v_2&\cdots&v_{r+1} \end{bmatrix}

v_1 = 0, v_{r+1} = n

v_i < v_{i+1}, 1 \le i \le r

Again, the vector v has the same interpretation as the vectors u and w: v_i + 1 is the first column in the i^{th} partition, v_{i+1} is the last column in the i^{th} partition, and v_{i+1} - v_i \ge 1 is the number of columns in the i^{th} partition.

We then define s times r submatrices of B, with a given submatrix defined in terms of the entries of B as follows:

B_{\alpha \beta} = \begin{bmatrix} b_{w_\alpha+1 ,v_\beta+1}&b_{w_\alpha+1,v_\beta+2}&\cdots&b_{w_\alpha+1,v_{\beta+1}} \\ b_{w_\alpha+2, v_\beta+1}&b_{w_\alpha+2,v_\beta+2}&\cdots&b_{w_\alpha+2,v_{\beta+1}} \\ \vdots&\vdots&\ddots&\vdots \\ b_{w_{\alpha+1}, v_\beta+1}&b_{w_{\alpha+1},v_\beta+2}&\cdots&b_{w_{\alpha+1},v_{\beta+1}} \end{bmatrix}

As in the case of A, if s = r = 1 then there is only a single submatrix of B, namely B itself, and if s = p and r = n then each submatrix contains a single entry of B. The (i, j) entry of B_{\alpha \beta} has the value

(B_{\alpha \beta})_{ij} = b_{w_\alpha+i,v_\beta+j}

Finally, consider the m by n matrix C = AB with entries

c_{ij} = \sum_{k=1}^p a_{ik}b_{kj}

We can partition the rows of C into q partitions according to the same vector u used to partition the rows of A, and partition the columns of C into r partitions according to the same vector v used to partition the columns of B. We then have

C_{\alpha \beta} = \begin{bmatrix} c_{u_\alpha+1, v_\beta+1}&c_{u_\alpha+1,v_\beta + 2}&\cdots&c_{u_\alpha+1,v_{\beta+1}} \\ c_{u_\alpha+2, v_\beta+1}&c_{u_\alpha+2,v_\beta+2}&\cdots&c_{u_\alpha+2,v_{\beta+1}} \\ \vdots&\vdots&\ddots&\vdots \\ c_{u_{\alpha+1}, v_\beta+1}&c_{u_{\alpha+1},v_\beta+2}&\cdots&c_{u_{\alpha+1},v_{\beta+1}} \end{bmatrix}

with the (i, j) entry of C_{\alpha \beta} having the value

(C_{\alpha \beta})_{ij} = c_{u_\alpha+i,v_\beta+j}

Now consider the following matrix:

D_{\alpha \beta} = \sum_{\gamma = 1}^{s} A_{\alpha \gamma} B_{\gamma \beta}

for some \alpha and \beta where 1 \le \alpha \le q and 1 \le \beta \le r.

First, note that all the submatrices A_{\alpha \gamma} have u_{\alpha+1} - u_\alpha rows and w_{\gamma+1} - w_\gamma columns, and all the submatrices B_{\gamma \beta} have w_{\gamma+1} - w_\gamma rows and v_{\beta+1} - v_\beta columns. (This follows from the definitions of u, w, and v.)

Since the number of columns in the submatrices A_{\alpha \gamma} is the same as the number of rows in the submatrices B_{\gamma \beta}, matrix multiplication is possible, and the resulting product matrices will have u_{\alpha+1} - u_\alpha rows and v_{\beta+1} - v_\beta columns. Since D_{\alpha \beta} is the sum of the product matrices, it too will have u_{\alpha+1} - u_\alpha rows and v_{\beta+1} - v_\beta columns, the same number of rows and columns as C_{\alpha \beta} (based on the partitioning of C as described above).

Now consider the (i, j) entry in D_{\alpha \beta}. It is the sum of the (i, j) entries of the individual product matrices. More formally,

(D_{\alpha \beta})_{ij} = \sum_{\gamma = 1}^{s} (A_{\alpha \gamma} B_{\gamma \beta})_{ij}, 1 \le i \le u_{\alpha+1} - u_\alpha, 1 \le j \le v_{\beta+1} - v_\beta

Since the number of columns in the submatrices A_{\alpha \gamma} and the number of rows in the submatrices B_{\gamma \beta} are both equal to w_{\gamma+1} - w_\gamma, we then have

(A_{\alpha \gamma} B_{\gamma \beta})_{ij} = \sum_{l=1}^{w_{\gamma+1} - w_\gamma} (A_{\alpha \gamma})_{il} (B_{\gamma \beta})_{lj}

= \sum_{l=1}^{w_{\gamma+1} - w_\gamma} a_{u_\alpha+i,w_\gamma+l}b_{w_\gamma+l,v_\beta+j} = \sum_{k=w_\gamma+1}^{w_{\gamma+1}} a_{u_\alpha+i,k}b_{k,v_\beta+j}

so that

(D_{\alpha \beta})_{ij} = \sum_{\gamma = 1}^{s} \sum_{k=w_\gamma+1}^{w_{\gamma+1}} a_{u_\alpha+i,k}b_{k,v_\beta+j}

Note that the first inner sum (when \gamma = 1) starts at k = 1 (because w_1 = 0 by definition), that each subsequent inner sum starts at a value of k one greater than the value at which the previous sum ended, and that the last inner sum (when \gamma = s) ends at k = p (because w_{s+1} = p). We thus conclude that k takes on all values from 1 to p, and only those values, so that we can rewrite the above as

(D_{\alpha \beta})_{ij} = \sum_{k=1}^{p} a_{u_\alpha+i,k}b_{k,v_\beta+j} = c_{u_\alpha+i,v_\beta+j} = (C_{\alpha \beta})_{ij}
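
To make the regrouping of the index k concrete, consider a small example (numbers of my own choosing): if p = 5, s = 2, and w = \begin{bmatrix} 0&2&5 \end{bmatrix}, then

\sum_{\gamma = 1}^{2} \sum_{k=w_\gamma+1}^{w_{\gamma+1}} a_{u_\alpha+i,k}b_{k,v_\beta+j} = \sum_{k=1}^{2} a_{u_\alpha+i,k}b_{k,v_\beta+j} + \sum_{k=3}^{5} a_{u_\alpha+i,k}b_{k,v_\beta+j} = \sum_{k=1}^{5} a_{u_\alpha+i,k}b_{k,v_\beta+j}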

Since D_{\alpha \beta} and C_{\alpha \beta} have the same number of rows and columns (as discussed above), and since the (i, j) entry of D_{\alpha \beta} is equal to the (i, j) entry of C_{\alpha \beta} for all i and j, we conclude that C_{\alpha \beta} = D_{\alpha \beta}, and from the definition of D that

C_{\alpha \beta} = \sum_{\gamma = 1}^{s} A_{\alpha \gamma} B_{\gamma \beta}

which was what we set out to prove.

Note that if q = s = r = 1 then

A_{11} = A, B_{11} = B, C_{11} = C

so that the above equation reduces to C = AB.

On the other hand, if q = m, s = p, and r = n then we have

A_{\alpha \beta} = \begin{bmatrix} a_{\alpha \beta} \end{bmatrix}

B_{\alpha \beta} = \begin{bmatrix} b_{\alpha \beta} \end{bmatrix}

C_{\alpha \beta} = \begin{bmatrix} c_{\alpha \beta} \end{bmatrix}

so that the equation

C_{\alpha \beta} = \sum_{\gamma = 1}^{s} A_{\alpha \gamma} B_{\gamma \beta}

is essentially equivalent (since s = p in this case) to

c_{\alpha \beta} = \sum_{\gamma = 1}^{p} a_{\alpha \gamma} b_{\gamma \beta}

which is the usual formula for the entries of the matrix product C = AB.

UPDATE: Since originally writing this post I have slightly changed the definition of a partitioning vector v. First, I clarified that v contains nonnegative integer entries. Second, I used the condition v_i < v_{i+1} instead of v_i < v_j for i < j, because I think it corresponds better to the intuitive idea of how a partition would be defined.

Note that we can derive the old condition from the new one:

Suppose v_i < v_{i+1} for 1 \le i \le q, and consider some v_j with j > i. Let l = j - i \ge 1. We have v_i < v_{i+k} when k = 1; this is just the new condition, which we are assuming to be true. Now suppose v_i < v_{i+k} for some k \ge 1. Based on the new condition we have v_{i+k} < v_{(i+k)+1} (assuming i + k \le q, so that v_{(i+k)+1} has a value), so v_i < v_{i+k} implies that v_i < v_{i+(k+1)}.

By induction we then have v_i < v_{i+k} for all k \ge 1 for which v_{i+k} has a value. Since l = j - i \ge 1 and v_{i+l} has a value (since v_j = v_{i+l} does), we then have v_i < v_{i+l} = v_j. Thus v_i < v_{i+1} for 1 \le i \le q implies that for any v_i and v_j we have v_i < v_j if j > i.

Also, since i+1 > i, the old condition that v_i < v_j for j > i implies the new condition that v_i < v_{i+1}. Thus the old and new conditions are equivalent, and the choice between them in defining a partitioning vector is simply a matter of convenience.
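
For what it's worth, the equivalence of the two conditions is also easy to confirm by brute force over small integer vectors; here is a quick sketch of mine:

```python
from itertools import combinations, product

def new_cond(v):
    """v_i < v_{i+1} for all consecutive pairs."""
    return all(v[i] < v[i + 1] for i in range(len(v) - 1))

def old_cond(v):
    """v_i < v_j for every pair with i < j."""
    return all(v[i] < v[j] for i, j in combinations(range(len(v)), 2))

# The two conditions agree on every length-4 vector with entries in 0..3.
for v in product(range(4), repeat=4):
    assert new_cond(v) == old_cond(v)
```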


5 Responses to Multiplying block matrices

  1. Alex says:

    Beautiful, clear explanation. In many books the block matrix multiplication is given without a proof or as an exercise. This is what I was searching for.
    Thanks.

    • hecker says:

      I’m glad you found this useful. I was *very* frustrated that I couldn’t find a good proof online of how to multiply block matrices. Similar experiences prompted me to write the posts on multiplying block diagonal matrices and partitioning matrices into block diagonal form.

  2. Vadim says:

    Thank you! It was useful to see this technical detail at least somewhere. You saved me a lot of time.

  3. Martin says:

    This is awesome. Everything makes sense line by line. It just can’t be better.

  4. Harsh Bhardwaj says:

    Thank you very much. I was stuck on proving this.
