## All length-preserving matrices are unitary

I recently read the (excellent) online resource Quantum Computing for the Very Curious by Andy Matuschak and Michael Nielsen. Upon reading the proof that all length-preserving matrices are unitary and trying it out myself, I came to believe that there is an error in the proof as written, specifically with trying to show that off-diagonal entries in $M^\dagger M$ are zero if $M$ is length-preserving.

Using the identity $|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$, a suitable choice of $\left|\psi\right> = \left|e_j\right> + \left|e_k\right>$ with $j \ne k$, and the fact that $M$ is length-preserving, Nielsen first shows that $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$ for $j \ne k$.

He then goes on to write “But what if we’d done something slightly different, and instead of using $\left|\psi\right> = \left|e_j\right> + \left|e_k\right>$ we’d used $\left|\psi\right> = \left|e_j\right> - \left|e_k\right>$? … I won’t explicitly go through the steps – you can do that yourself – but if you do go through them you end up with the equation: $(M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0$.”

I was an undergraduate physics and math major, but either I never worked with bra-ket notation and Hermitian conjugates or I’ve forgotten whatever I knew about them. In any case in working through this I could not get the same result as Nielsen; I simply ended up once again proving that $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$.
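
To make the difficulty concrete, here is a sketch of the expansion for $\left|\psi\right> = \left|e_j\right> - \left|e_k\right>$, following the same steps as the proof below and assuming (as shown there) that the diagonal entries of $M^\dagger M$ are 1. Since the entries of $\left|\psi\right>$ are real we have $\left<\psi\right| = \left<e_j\right| - \left<e_k\right|$, so that

$\left<\psi\right| M^\dagger M \left|\psi\right> = (\left<e_j\right| - \left<e_k\right|) M^\dagger M (\left|e_j\right> - \left|e_k\right>)$

$= (M^\dagger M)_{jj} - (M^\dagger M)_{jk} - (M^\dagger M)_{kj} + (M^\dagger M)_{kk}$

$= 2 - \left((M^\dagger M)_{jk} + (M^\dagger M)_{kj}\right)$

Setting this equal to $|| \left|\psi\right> ||^2 = 2$ gives $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$ once more, rather than an equation for the difference of the two entries.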

After some thought and experimentation I concluded that the key is to choose $\left|\psi\right> = \left|e_j\right> + i\left|e_k\right>$. Below is my (possibly mistaken!) attempt at a correct proof that all length-preserving matrices are unitary.

Proof: Let $M$ be a length-preserving matrix, i.e., one such that for any vector $\left|\psi\right>$ we have $|| M \left|\psi\right> || = || \left|\psi\right> ||$. We wish to show that $M$ is unitary, i.e., $M^\dagger M = I$.

We first show that the diagonal elements of $M^\dagger M$, or $(M^\dagger M)_{jj}$, are equal to 1.

To do this we start with the unit vectors $\left|e_j\right>$ and $\left|e_k\right>$ with 1 in positions $j$ and $k$ respectively, and 0 otherwise. The product $M^\dagger M \left|e_k\right>$ is then the $k$th column of $M^\dagger M$, and $\left<e_j\right| M^\dagger M \left|e_k\right>$ is the $jk$th entry of $M^\dagger M$ or $(M^\dagger M)_{jk}$.

From the general identity $\left<\psi\right| M^\dagger M \left|\psi\right> = || M \left|\psi\right> ||^2$ we also have $\left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2$. But since $M$ is length-preserving we have $|| M \left|e_j\right> ||^2 = || \left|e_j\right> ||^2 = 1^2 = 1$ since $\left|e_j\right>$ is a unit vector.

We thus have $(M^\dagger M)_{jj} = \left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2 = 1$. So all diagonal entries of $M^\dagger M$ are 1.

We next show that the non-diagonal elements of $M^\dagger M$, or $(M^\dagger M)_{jk}$ with $j \ne k$, are equal to zero.

Let $\left|\psi\right> = \left|e_j\right> + \left|e_k\right>$ with $j \ne k$. Since $M$ is length-preserving we have

$|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + \left|e_k\right> ||^2 = 1^2 + 1^2 = 2$

We also have $|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$ where $\left<\psi\right| = \left|\psi\right>^\dagger = (\left|e_j\right> + \left|e_k\right>)^\dagger$. From the definition of the dagger operation and the fact that the nonzero entries of $\left|e_j\right>$ and $\left|e_k\right>$ have no imaginary parts we have $(\left|e_j\right> + \left|e_k\right>)^\dagger = \left<e_j\right| + \left<e_k\right|$.

We then have

$|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$

$= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>$

$= (\left|e_j\right> + \left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + \left|e_k\right>)$

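$= (\left<e_j\right| + \left<e_k\right|) M^\dagger M (\left|e_j\right> + \left|e_k\right>)$

$= \left<e_j\right| M^\dagger M \left|e_j\right> + \left<e_j\right| M^\dagger M \left|e_k\right> + \left<e_k\right| M^\dagger M \left|e_j\right> + \left<e_k\right| M^\dagger M \left|e_k\right>$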

$= (M^\dagger M)_{jj} + (M^\dagger M)_{jk} + (M^\dagger M)_{kj} + (M^\dagger M)_{kk}$

$= 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj}$

since we previously showed that all diagonal entries of $M^\dagger M$ are 1.

Since $|| M \left|\psi\right> ||^2 = 2$ and also $|| M \left|\psi\right> ||^2 = 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj}$ we thus have $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$ for $j \ne k$.

Now let $\left|\psi\right> = \left|e_j\right> + i\left|e_k\right>$ with $j \ne k$. Again we have $|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2$ since $M$ is length-preserving, so that

$|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + i\left|e_k\right> ||^2$

$= (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)$

Since $i\left|e_k\right>$ has an imaginary part for its (single) nonzero entry, in performing the dagger operation and taking complex conjugates we obtain $(\left|e_j\right> + i\left|e_k\right>)^\dagger = \left<e_j\right| - i\left<e_k\right|$. We thus have

$|| M \left|\psi\right> ||^2 = (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)$

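$= (\left<e_j\right| - i\left<e_k\right|)(\left|e_j\right> + i\left|e_k\right>)$

$= \left<e_j|e_j\right> + \left<e_j\right| i \left|e_k\right> - i \left<e_k|e_j\right> - i \left<e_k\right| i \left|e_k\right>$

$= \left<e_j|e_j\right> + i\left<e_j|e_k\right> - i \left<e_k|e_j\right> - i^2\left<e_k|e_k\right>$

$= \left<e_j|e_j\right> + i\left<e_j|e_k\right> - i\left<e_k|e_j\right> + \left<e_k|e_k\right>$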

$= 1^2 + i\cdot 0 - i\cdot 0 + 1^2 = 2$

We also have

$|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$

$= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>$

$= (\left|e_j\right> + i\left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + i\left|e_k\right>)$

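$= (\left<e_j\right| - i\left<e_k\right|) M^\dagger M (\left|e_j\right> + i\left|e_k\right>)$

$= \left<e_j\right| M^\dagger M \left|e_j\right> + \left<e_j\right| M^\dagger M \, i\left|e_k\right> - i\left<e_k\right| M^\dagger M \left|e_j\right> - i\left<e_k\right| M^\dagger M \, i\left|e_k\right>$

$= \left<e_j\right| M^\dagger M \left|e_j\right> + i\left<e_j\right| M^\dagger M \left|e_k\right> - i\left<e_k\right| M^\dagger M \left|e_j\right> - i^2\left<e_k\right| M^\dagger M \left|e_k\right>$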

$= (M^\dagger M)_{jj} + i(M^\dagger M)_{jk} - i(M^\dagger M)_{kj} + (M^\dagger M)_{kk}$

$= 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)$

Since $|| M \left|\psi\right> ||^2 = 2$ we have $2 = 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)$ or $0 = i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)$ so that $(M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0$.

But we showed above that $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$. Adding the two equations, the terms for $(M^\dagger M)_{kj}$ cancel and we get $2(M^\dagger M)_{jk} = 0$, so that $(M^\dagger M)_{jk} = 0$ for $j \ne k$. So all nondiagonal entries of $M^\dagger M$ are equal to zero.

Since all diagonal entries of $M^\dagger M$ are equal to 1 and all nondiagonal entries of $M^\dagger M$ are equal to zero, we have $M^\dagger M = I$ and thus the matrix $M$ is unitary.

Since we assumed $M$ was a length-preserving matrix we have thus shown that all length-preserving matrices are unitary.
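
As a sanity check on the algebra (not part of the proof), here is a minimal Python sketch that evaluates $\left<\psi\right| A \left|\psi\right>$ with $A = M^\dagger M$ for the three choices of $\left|\psi\right>$ discussed above. The matrix `M` is an arbitrary non-unitary example chosen only for illustration, and the helper names `dagger`, `matmul`, and `quad_form` are my own. It suggests that $\left|e_j\right> + \left|e_k\right>$ and $\left|e_j\right> - \left|e_k\right>$ both expose only the sum $A_{jk} + A_{kj}$, while $\left|e_j\right> + i\left|e_k\right>$ exposes the difference.

```python
def dagger(m):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(m)
    return [[m[r][c].conjugate() for r in range(n)] for c in range(n)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][t] * b[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]


def quad_form(a, psi):
    """<psi| A |psi>: conjugate the entries of the bra, then sum."""
    n = len(psi)
    return sum(psi[r].conjugate() * a[r][c] * psi[c]
               for r in range(n) for c in range(n))


# Arbitrary complex matrix, deliberately not unitary (illustration only).
M = [[1 + 2j, 3 - 1j],
     [0 + 1j, 2 + 2j]]
A = matmul(dagger(M), M)  # A = M^dagger M

diag = A[0][0] + A[1][1]

plus = quad_form(A, [1 + 0j, 1 + 0j])     # psi = e_0 + e_1
minus = quad_form(A, [1 + 0j, -1 + 0j])   # psi = e_0 - e_1
with_i = quad_form(A, [1 + 0j, 0 + 1j])   # psi = e_0 + i e_1

# Both real choices only involve the sum of the off-diagonal entries ...
print(plus - diag, diag - minus, A[0][1] + A[1][0])
# ... while the complex choice involves i times their difference.
print(with_i - diag, 1j * (A[0][1] - A[1][0]))
```

Each printed line shows the same value repeated, which is the numerical counterpart of the two equations combined at the end of the proof.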

## Linear Algebra and Its Applications, Exercise 3.4.28

Exercise 3.4.28. Given the plane $x_1 + x_2 + x_3 = 0$ and the following vectors

$\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} \qquad \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \qquad \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$

in the plane, find an orthonormal basis for the subspace represented by the plane. Report the dimension of the subspace and the number of nonzero vectors produced by Gram-Schmidt orthogonalization.

Answer: We start with the vector $a_1 = (1, -1, 0)$ and normalize it to create $q_1$:

$\|a_1\|^2 = 1^2 + (-1)^2 + 0^2 = 1 + 1 = 2$

$q_1 = a_1/\|a_1\| = \frac{1}{\sqrt{2}} a_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix}$

We then take the second vector $a_2 = (0, 1, -1)$ and create a second orthogonal vector $a_2'$ by subtracting from $a_2$ its projection on $q_1$:

$a_2' = a_2 - (q_1^Ta_2)q_1$

$= a_2 - \left[ \frac{1}{\sqrt{2}} \cdot 0 + (-\frac{1}{\sqrt{2}}) \cdot 1 + 0 \cdot (-1) \right]q_1 = a_2 - (-\frac{1}{\sqrt{2}})q_1 = a_2 + \frac{1}{\sqrt{2}}q_1$

$= \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + \frac{1}{\sqrt{2}} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \end{bmatrix}$

We then normalize $a_2'$ to create $q_2$:

$\|a_2'\|^2 = (\frac{1}{2})^2 + (\frac{1}{2})^2 + (-1)^2 = \frac{1}{4} + \frac{1}{4} + 1 = \frac{3}{2}$

$q_2 = a_2'/\|a_2'\| = a_2'/\sqrt{\frac{3}{2}} = \frac{\sqrt{2}}{\sqrt{3}} \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2\sqrt{3}} \\ \frac{\sqrt{2}}{2\sqrt{3}} \\ -\frac{\sqrt{2}}{\sqrt{3}} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \end{bmatrix}$

Finally, we take the third vector $a_3 = (1, 0, -1)$ and attempt to create another orthogonal vector $a_3'$ by subtracting from $a_3$ its projections on $q_1$ and $q_2$:

$a_3' = a_3 - (q_1^Ta_3)q_1 - (q_2^Ta_3)q_2$

$= a_3 - \left[ \frac{1}{\sqrt{2}} \cdot 1 + (-\frac{1}{\sqrt{2}}) \cdot 0 + 0 \cdot (-1) \right]q_1- \left[ \frac{1}{\sqrt{6}} \cdot 1 + \frac{1}{\sqrt{6}} \cdot 0 + (-\frac{2}{\sqrt{6}}) \cdot (-1) \right] q_2$

$= a_3 - \frac{1}{\sqrt{2}}q_1 - \frac{3}{\sqrt{6}}q_2 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} - \frac{1}{\sqrt{2}} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} - \frac{3}{\sqrt{6}} \begin{bmatrix} \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \end{bmatrix}$

$= \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} - \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix} - \begin{bmatrix} \frac{3}{6} \\ \frac{3}{6} \\ -\frac{6}{6} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} - \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix} - \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

Since $a_3' = 0$ we cannot create a third orthogonal vector to $q_1$ and $q_2$. The vectors

$q_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} \qquad q_2 = \begin{bmatrix} \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \end{bmatrix}$

are an orthonormal basis for the subspace, and the dimension of the subspace is 2.

(In hindsight we could have predicted this result by inspecting the original vectors $a_1$, $a_2$, and $a_3$ and noticing that $a_3 = a_1 + a_2$. Thus only $a_1$ and $a_2$ were linearly independent, $a_3$ being linearly dependent on the first two vectors, so that only two orthonormal basis vectors could be created from the three vectors given.)
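
The calculation above can also be sketched in code. The following is a minimal Python illustration of classical Gram-Schmidt, not anything taken from the book; the function name `gram_schmidt` and the tolerance `tol` used to decide when a remainder counts as zero are assumptions made for this sketch.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same subspace as `vectors`.

    A vector whose remainder after subtracting its projections is
    (numerically) zero is skipped, since it depends on the earlier ones.
    """
    basis = []
    for a in vectors:
        remainder = list(a)
        for q in basis:
            coeff = dot(q, a)  # q^T a, as in a' = a - (q^T a) q
            remainder = [r - coeff * qi for r, qi in zip(remainder, q)]
        norm = math.sqrt(dot(remainder, remainder))
        if norm > tol:
            basis.append([r / norm for r in remainder])
    return basis


qs = gram_schmidt([[1, -1, 0], [0, 1, -1], [1, 0, -1]])
print(len(qs))  # number of nonzero vectors produced
for q in qs:
    print(q)
```

The third input vector leaves a remainder that is zero up to rounding, so only two vectors are returned, matching the dimension found above.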

NOTE: This continues a series of posts containing worked out exercises from the (out of print) book Linear Algebra and Its Applications, Third Edition by Gilbert Strang.

If you find these posts useful I encourage you to also check out the more current Linear Algebra and Its Applications, Fourth Edition, Dr Strang’s introductory textbook Introduction to Linear Algebra, Fifth Edition and the accompanying free online course, and Dr Strang’s other books.

## Linear Algebra and Its Applications, Exercise 3.4.27

Exercise 3.4.27. Given the subspace spanned by the three vectors

$a_1 = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \end{bmatrix} \qquad a_2 = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} \qquad a_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix}$

find vectors $q_1$, $q_2$, and $q_3$ that form an orthonormal basis for the subspace.

Answer: We can save some time by noting that $a_1$ and $a_3$ are already orthogonal. We can normalize these two vectors to create $q_1$ and $q_3$:

$\|a_1\|^2 = 1^2 + (-1)^2 + 0^2 + 0^2 = 1 + 1 = 2$

$q_1 = a_1/\|a_1\| = \frac{1}{\sqrt{2}} a_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{bmatrix}$

$\|a_3\|^2 = 0^2 + 0^2 + 1^2 + (-1)^2 = 1 + 1 = 2$

$q_3 = a_3/\|a_3\| = \frac{1}{\sqrt{2}} a_3 = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix}$

We can then compute a third orthogonal vector $a_2'$ by subtracting from $a_2$ its projections on $q_1$ and $q_3$:

$a_2' = a_2 - (q_1^Ta_2)q_1 - (q_3^Ta_2)q_3$

$= a_2 - \left[ \frac{1}{\sqrt{2}} \cdot 0 + (-\frac{1}{\sqrt{2}}) \cdot 1 + 0 \cdot (-1) + 0 \cdot 0 \right]q_1 - \left[ 0 \cdot 0 + 0 \cdot 1 + \frac{1}{\sqrt{2}} \cdot (-1) + (-\frac{1}{\sqrt{2}}) \cdot 0 \right]q_3$

$= a_2 - (-\frac{1}{\sqrt{2}})q_1 - (-\frac{1}{\sqrt{2}})q_3 = a_2 + \frac{1}{\sqrt{2}}q_1 + \frac{1}{\sqrt{2}}q_3$

$= \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} + \frac{1}{\sqrt{2}} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{bmatrix} + \frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} + \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \frac{1}{2} \\ -\frac{1}{2} \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}$

Finally, we normalize $a_2'$ to create $q_2$:

$\|a_2'\|^2 = (\frac{1}{2})^2 + (\frac{1}{2})^2 + (-\frac{1}{2})^2 + (-\frac{1}{2})^2 = \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = 1$

$q_2 = a_2'/\|a_2'\| = a_2' = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}$

An orthonormal basis for the space is therefore

$q_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{bmatrix} \qquad q_2 = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix} \qquad q_3 = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix}$

(It’s worth noting that the solution for this exercise on page 480 is different than the solution given above. That’s presumably because we computed the orthonormal vectors in the order $q_1$, $q_3$, $q_2$ rather than the standard order $q_1$, $q_2$, $q_3$, taking advantage of the fact that the original vectors $a_1$ and $a_3$ were already orthogonal. Recall that a basis set is not unique, so it is possible to have different orthonormal bases for the same subspace.)
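
Since the basis found here differs from the one in the book, it is reasonable to double-check it. The sketch below (plain Python, with helper names `dot` and `project` of my own choosing) checks two things: that the three vectors are orthonormal, and that each original vector $a_i$ equals its own projection onto their span, so that the span is unchanged.

```python
import math

s = 1 / math.sqrt(2)
q1 = [s, -s, 0, 0]
q2 = [0.5, 0.5, -0.5, -0.5]
q3 = [0, 0, s, -s]
basis = [q1, q2, q3]
originals = [[1, -1, 0, 0], [0, 1, -1, 0], [0, 0, 1, -1]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project(a, orthonormal):
    """Projection of `a` onto the span of an orthonormal list of vectors."""
    p = [0.0] * len(a)
    for q in orthonormal:
        coeff = dot(q, a)
        p = [pi + coeff * qi for pi, qi in zip(p, q)]
    return p


# Matrix of all pairwise dot products; should be the 3 by 3 identity.
gram = [[dot(u, v) for v in basis] for u in basis]

# Largest entry of a_i minus its projection; should be zero up to rounding.
residuals = [max(abs(ai - pi) for ai, pi in zip(a, project(a, basis)))
             for a in originals]

print(gram)
print(residuals)
```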


## Linear Algebra and Its Applications, Exercise 3.4.26

Exercise 3.4.26. In the Gram-Schmidt orthogonalization process the third component $c'$ is computed as $c' = c - (q_1^Tc)q_1 - (q_2^Tc)q_2$. Verify that $c'$ is orthogonal to both $q_1$ and $q_2$.

Answer: Taking the dot product of $q_1$ and $c'$ we have

$q_1^Tc' = q_1^T \left[ c - (q_1^Tc)q_1 - (q_2^Tc)q_2 \right] = q_1^Tc - q_1^T(q_1^Tc)q_1 - q_1^T(q_2^Tc)q_2$

Since $q_1^Tc$ and $q_2^Tc$ are scalars and $q_1$ and $q_2$ are orthonormal we then have

$q_1^Tc' = q_1^Tc - q_1^T(q_1^Tc)q_1 - q_1^T(q_2^Tc)q_2 = q_1^Tc - (q_1^Tc)q_1^Tq_1 - (q_2^Tc)q_1^Tq_2$

$= q_1^Tc - (q_1^Tc) \cdot 1 - (q_2^Tc) \cdot 0 = q_1^Tc - q_1^Tc = 0$

So $c'$ is orthogonal to $q_1$.

Taking the dot product of $q_2$ and $c'$ we have

$q_2^Tc' = q_2^T \left[ c - (q_1^Tc)q_1 - (q_2^Tc)q_2 \right] = q_2^Tc - q_2^T(q_1^Tc)q_1 - q_2^T(q_2^Tc)q_2$

$= q_2^Tc - (q_1^Tc)q_2^Tq_1 - (q_2^Tc)q_2^Tq_2 = q_2^Tc - (q_1^Tc) \cdot 0 - (q_2^Tc) \cdot 1 = q_2^Tc - q_2^Tc = 0$

So $c'$ is also orthogonal to $q_2$.
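
For a concrete instance of this identity, here is a small Python sketch using exact rational arithmetic. The vectors `q1`, `q2`, and `c` are example values chosen for illustration (they are not from the exercise); `q1` and `q2` were picked to be orthonormal with rational entries so that the check is exact.

```python
from fractions import Fraction as F


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Example orthonormal pair with rational entries (assumed for illustration).
q1 = [F(3, 5), F(4, 5), F(0)]
q2 = [F(-4, 5), F(3, 5), F(0)]
c = [F(1), F(2), F(3)]

# c' = c - (q1^T c) q1 - (q2^T c) q2
k1, k2 = dot(q1, c), dot(q2, c)
c_prime = [ci - k1 * a - k2 * b for ci, a, b in zip(c, q1, q2)]

print(c_prime)                            # the part of c outside span{q1, q2}
print(dot(q1, c_prime), dot(q2, c_prime))  # both dot products are exactly 0
```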


## Linear Algebra and Its Applications, Exercise 3.4.25

Exercise 3.4.25. Given $y = x^2$ over the interval $-1 \le x \le 1$ what is the closest line $C + Dx$ to the parabola formed by $y$?

Answer: This amounts to finding a least-squares solution to the equation $\begin{bmatrix} 1&x \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = y$, where the entries 1, $x$, and $y = x^2$ are understood as functions of $x$ over the interval -1 to 1 (as opposed to being scalar values).

Interpreting the traditional least squares equation $A^TAx = A^Tb$ in this context, here the matrix $A = \begin{bmatrix} 1&x \end{bmatrix}$ and we have

$A^TA = \begin{bmatrix} 1 \\ x \end{bmatrix} \begin{bmatrix} 1&x \end{bmatrix} = \begin{bmatrix} (1, 1)&(1, x) \\ (x, 1)&(x, x) \end{bmatrix}$

where the entries of $A^TA$ are the dot products of the functions, i.e., the integrals of their products over the interval -1 to 1.

We then have

$(1, 1) = \int_{-1}^1 1 \cdot 1 \;\mathrm{d}x = 2$

$(1, x) = (x, 1) = \int_{-1}^1 1 \cdot x \;\mathrm{d}x = \left( \frac{1}{2}x^2 \right) \;\big|_{-1}^1 = \frac{1}{2} \cdot 1^2 - \frac{1}{2} \cdot (-1)^2 = \frac{1}{2} - \frac{1}{2} = 0$

$(x, x) = \int_{-1}^1 x^2 \;\mathrm{d}x = \left( \frac{1}{3}x^3 \right) \;\big|_{-1}^1 = \frac{1}{3} \cdot 1^3 - \frac{1}{3} \cdot (-1)^3 = \frac{1}{3} + \frac{1}{3} = \frac{2}{3}$

so that

$A^TA = \begin{bmatrix} (1, 1)&(1, x) \\ (x, 1)&(x, x) \end{bmatrix} = \begin{bmatrix} 2&0 \\ 0&\frac{2}{3} \end{bmatrix}$

Continuing the interpretation of the least squares equation $A^TAx = A^Tb$ in this context, the role of $b$ is played by the function $y = x^2$, and we have

$A^Ty = \begin{bmatrix} 1 \\ x \end{bmatrix} x^2 = \begin{bmatrix} (1,x^2) \\ (x, x^2) \end{bmatrix}$

where again the entries are dot products of the functions. From above we have

$(1, x^2) = \int_{-1}^1 1 \cdot x^2 \;\mathrm{d}x = \frac{2}{3}$

and from previous exercises we have

$(x, x^2) = \int_{-1}^1 x \cdot x^2 \;\mathrm{d}x = \int_{-1}^1 x^3 \;\mathrm{d}x = 0$

so that

$A^Ty = \begin{bmatrix} \frac{2}{3} \\ 0 \end{bmatrix}$

To get the least squares solution $\bar{C} + \bar{D}x$ we then have

$\begin{bmatrix} 2&0 \\ 0&\frac{2}{3} \end{bmatrix} \begin{bmatrix} \bar{C} \\ \bar{D} \end{bmatrix} = \begin{bmatrix} \frac{2}{3} \\ 0 \end{bmatrix}$

From the second equation we have $\bar{D} = 0$. From the first equation we have $2\bar{C} = \frac{2}{3}$ or $\bar{C} = \frac{1}{3}$.

The line of best fit to the parabola $y = x^2$ over the interval $-1 \le x \le 1$ is therefore the horizontal line with $y$-intercept of $\frac{1}{3}$.
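
The same normal equations can be set up and solved exactly in a few lines. This is a minimal Python sketch under the assumption that all the functions involved are monomials, so that each inner product reduces to $\int_{-1}^1 x^{m+n} \;\mathrm{d}x$; the helper name `monomial_inner` is my own.

```python
from fractions import Fraction


def monomial_inner(m, n):
    """(x^m, x^n): the integral of x^(m+n) over the interval [-1, 1]."""
    p = m + n
    return Fraction(0) if p % 2 else Fraction(2, p + 1)


# Columns of A are 1 = x^0 and x = x^1; the right-hand side is y = x^2.
ata = [[monomial_inner(i, j) for j in (0, 1)] for i in (0, 1)]
aty = [monomial_inner(i, 2) for i in (0, 1)]

# Solve the 2 by 2 system A^T A [C, D]^T = A^T y by Cramer's rule.
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
C = (aty[0] * ata[1][1] - ata[0][1] * aty[1]) / det
D = (ata[0][0] * aty[1] - ata[1][0] * aty[0]) / det

print(C, D)
```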


## Linear Algebra and Its Applications, Exercise 3.4.24

Exercise 3.4.24. As discussed on page 178, the first three Legendre polynomials are 1, $x$, and $x^2 - \frac{1}{3}$. Find the next Legendre polynomial; it will be a cubic polynomial defined for $-1 \le x \le 1$ and will be orthogonal to the first three Legendre polynomials.

Answer: The process of finding the fourth Legendre polynomial is essentially an application of Gram-Schmidt orthogonalization. The first three polynomials are

$v_1 = 1 \qquad v_2 = x \qquad v_3 = x^2 - \frac{1}{3}$

We can find the fourth Legendre polynomial by starting with $x^3$ and subtracting off the projections of $x^3$ on the first three polynomials:

$v_4 = x^3 - \frac{(v_1, x^3)}{(v_1, v_1)}v_1 - \frac{(v_2, x^3)}{(v_2, v_2)}v_2 - \frac{(v_3, x^3)}{(v_3, v_3)}v_3$

$= x^3 - \frac{(1, x^3)}{(1, 1)}\cdot 1 - \frac{(x, x^3)}{(x, x)}x - \frac{(x^2-\frac{1}{3}, x^3)}{(x^2-\frac{1}{3}, x^2-\frac{1}{3})}(x^2-\frac{1}{3})$

For the first term we have

$(1, x^3) = \int_{-1}^1 1 \cdot x^3 \;\mathrm{d}x = \int_{-1}^1 x^3 \;\mathrm{d}x = 0$

so that the first term $\frac{(v_1, x^3)}{(v_1, v_1)}v_1$ does not appear in the expression for $v_4$.

The third term $\frac{(v_3, x^3)}{(v_3, v_3)}v_3$ drops out for the same reason: its numerator is

$(x^2-\frac{1}{3}, x^3) = \int_{-1}^1 (x^2 - \frac{1}{3}) x^3 \;\mathrm{d}x$

$= \int_{-1}^1 x^5 \;\mathrm{d}x - \frac{1}{3} \int_{-1}^1 x^3 \;\mathrm{d}x = 0 - \frac{1}{3} \cdot 0 = 0$

That leaves the second term $\frac{(v_2, x^3)}{(v_2, v_2)}v_2$ with numerator of

$(x, x^3) = \int_{-1}^1 x \cdot x^3 \;\mathrm{d}x = \int_{-1}^1 x^4 \;\mathrm{d}x$

$= \left( \frac{1}{5} x^5 \right) \;\big|_{-1}^1 = \frac{1}{5} \cdot 1^5 - \frac{1}{5} \cdot (-1)^5 = \frac{1}{5} - (-\frac{1}{5}) = \frac{2}{5}$

and denominator

$(x, x) = \int_{-1}^1 x^2 \;\mathrm{d}x = \left( \frac{1}{3}x^3 \right) \;\big|_{-1}^1 = \frac{1}{3} \cdot 1^3 - \frac{1}{3} \cdot (-1)^3 = \frac{1}{3} + \frac{1}{3} = \frac{2}{3}$

We then have

$v_4 = x^3 - \left[ \frac{2}{5}/\frac{2}{3} \right] x = x^3 - \frac{3}{5}x$
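
This Gram-Schmidt step on polynomials can be sketched in Python with exact fractions. In the sketch below a polynomial is represented as a list of coefficients, with entry $k$ the coefficient of $x^k$; that representation and the function names `inner` and `next_orthogonal` are assumptions made for illustration.

```python
from fractions import Fraction


def inner(p, q):
    """(p, q): the integral of p(x) q(x) over [-1, 1].

    Uses the fact that x^n integrates to 0 for odd n and 2/(n+1) for even n.
    """
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total


def next_orthogonal(start, previous):
    """Subtract from `start` its projection on each polynomial in `previous`.

    Assumes the polynomials in `previous` are mutually orthogonal and have
    degree no higher than `start`.
    """
    result = [Fraction(c) for c in start]
    for v in previous:
        coeff = inner(v, start) / inner(v, v)
        for k, vk in enumerate(v):
            result[k] -= coeff * vk
    return result


v1 = [1]                      # 1
v2 = [0, 1]                   # x
v3 = [Fraction(-1, 3), 0, 1]  # x^2 - 1/3
v4 = next_orthogonal([0, 0, 0, 1], [v1, v2, v3])

print(v4)  # coefficients of 1, x, x^2, x^3
```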


## Linear Algebra and Its Applications, Exercise 3.4.23

Exercise 3.4.23. Given the step function $y$ with $y(x) = 1$ for $0 \le x \le \pi$ and $y(x) = 0$ for $\pi < x < 2\pi$, find the following Fourier coefficients:

$a_0 = \frac{(y, 1)}{(1, 1)} \qquad a_1 = \frac{(y, \cos x)}{(\cos x, \cos x)} \qquad b_1 = \frac{(y, \sin x)}{(\sin x, \sin x)}$

Answer: For $a_0$ the numerator is

$(y, 1) = \int_0^{2\pi} y(x) \cdot 1 \;\mathrm{d}x = \int_0^{\pi} 1 \;\mathrm{d}x + \int_{\pi}^{2\pi} 0 \;\mathrm{d}x = \pi$

and the denominator is

$(1, 1) = \int_0^{2\pi} 1^2 \;\mathrm{d}x = 2\pi$

so that $a_0 = \frac{\pi}{2\pi} = \frac{1}{2}$.

For $a_1$ the numerator is

$(y, \cos x) = \int_0^{2\pi} y(x) \cos x \;\mathrm{d}x = \int_0^{\pi} 1 \cdot \cos x \;\mathrm{d}x + \int_{\pi}^{2\pi} 0 \cdot \cos x \;\mathrm{d}x$

$= \int_0^{\pi} \cos x \;\mathrm{d}x = \sin x \;\big|_0^{\pi} = 0 - 0 = 0$

so that $a_1 = 0$.

For $b_1$ the numerator is

$(y, \sin x) = \int_0^{2\pi} y(x) \sin x \;\mathrm{d}x = \int_0^{\pi} 1 \cdot \sin x \;\mathrm{d}x + \int_{\pi}^{2\pi} 0 \cdot \sin x \;\mathrm{d}x$

$= \int_0^{\pi} \sin x \;\mathrm{d}x = (-\cos x) \;\big|_0^{\pi} = -(-1) - (-1) = 1 + 1 = 2$

and the denominator is

$(\sin x, \sin x) = \int_0^{2\pi} \sin^2 x \;\mathrm{d}x = \left[ \frac{1}{2}x - \frac{1}{4} \sin 2x \right] \;\big|_0^{2\pi}$

$= \left[ \frac{1}{2}\cdot(2\pi) - \frac{1}{4} \sin 2\pi \right] - \left[ \frac{1}{2} \cdot 0 - \frac{1}{4} \sin (2 \cdot 0) \right] = \pi - \frac{1}{4} \cdot 0 - 0 + \frac{1}{4} \cdot 0 = \pi$

so that $b_1 = \frac{2}{\pi}$.

So we have $a_0 = \frac{1}{2}$, $a_1 = 0$, and $b_1 = \frac{2}{\pi}$.
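
These values can be checked numerically. The following Python sketch approximates each inner product with a midpoint rule on $[0, 2\pi]$; the choice of the midpoint rule, the number of subintervals `n`, and the function names are assumptions made for this illustration, and the results are approximations rather than exact values.

```python
import math


def y(x):
    """The step function: 1 on [0, pi], 0 on (pi, 2*pi)."""
    return 1.0 if 0 <= x <= math.pi else 0.0


def one(x):
    return 1.0


def inner(f, g, n=20000):
    """Midpoint-rule approximation of the integral of f*g over [0, 2*pi]."""
    h = 2 * math.pi / n
    return h * sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(n))


a0 = inner(y, one) / inner(one, one)
a1 = inner(y, math.cos) / inner(math.cos, math.cos)
b1 = inner(y, math.sin) / inner(math.sin, math.sin)

print(a0, a1, b1)   # compare with 1/2, 0, and 2/pi
print(2 / math.pi)
```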
