## Variance and the sum of squared pairwise differences

The variance $\sigma^2$ of a set of $n$ values $x_1, x_2, ..., x_n$ is usually expressed in terms of squared differences between those values and the mean $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ of those values.

$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$

However, the sum of squared differences $(x_i - \bar{x})^2$ between the values and the mean can also be expressed in terms of the sum of squared pairwise differences $(x_i - x_j)^2$ among the values themselves, without reference to the mean $\bar{x}$.

In particular, we want to show that

$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2$.

To get an expression involving $\bar{x}$, we rewrite the squared difference in the right-hand sum and then expand the result:

$\sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} [(x_i - \bar{x}) - (x_j - \bar{x})]^2$

$= \sum_{i=1}^{n} \sum_{j=1}^{n} [(x_i - \bar{x})^2 - 2 (x_i - \bar{x}) (x_j - \bar{x}) + (x_j - \bar{x})^2]$

$= \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})^2 - 2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x}) + \sum_{i=1}^{n} \sum_{j=1}^{n} (x_j - \bar{x})^2$

Since the squared difference in the first term does not depend on $j$, the first term can be rewritten as

$\sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} n (x_i - \bar{x})^2 = n \sum_{i=1}^{n} (x_i - \bar{x})^2$

Since the squared difference in the third term does not depend on $i$, the third term can be rewritten as

$\sum_{i=1}^{n} \sum_{j=1}^{n} (x_j - \bar{x})^2 = n \sum_{j=1}^{n} (x_j - \bar{x})^2 = n \sum_{i=1}^{n} (x_i - \bar{x})^2$

where in the last step we replaced $j$ as an index with $i$. So the third term is identical to the first term.

We now turn to the second term, $-2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x})$. We can bring the difference $(x_i - \bar{x})$ out of the inner sum, since it does not depend on the index $j$. This gives us

$-2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x}) = -2 \sum_{i=1}^{n} (x_i - \bar{x}) [\sum_{j=1}^{n} (x_j - \bar{x})]$

The sum $\sum_{j=1}^{n} (x_j - \bar{x})$ can then be rewritten as

$\sum_{j=1}^{n} (x_j - \bar{x}) = \sum_{j=1}^{n} x_j - \sum_{j=1}^{n} \bar{x}$

$= \sum_{j=1}^{n} x_j - n \bar{x}$

But we have $\bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j$ by definition, so we then have

$\sum_{j=1}^{n} (x_j - \bar{x}) = \sum_{j=1}^{n} x_j - n \bar{x} = n \bar{x} - n \bar{x} = 0$

We can then substitute this result into the second term as follows:

$-2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x}) = -2 \sum_{i=1}^{n} (x_i - \bar{x}) [\sum_{j=1}^{n} (x_j - \bar{x})]$

$= -2 \sum_{i=1}^{n} (x_i - \bar{x}) \cdot 0 = -2 \sum_{i=1}^{n} 0 = 0$

Now that we know the value of all three terms we have

$\sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2$

$= \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})^2 - 2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x}) + \sum_{i=1}^{n} \sum_{j=1}^{n} (x_j - \bar{x})^2$

$= n \sum_{i=1}^{n} (x_i - \bar{x})^2 + 0 + n \sum_{i=1}^{n} (x_i - \bar{x})^2$

$= 2n \sum_{i=1}^{n} (x_i - \bar{x})^2$

so that

$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2$

which is what we set out to prove.

However, we can further simplify this identity. Since $(x_i - x_j) = 0$ when $i = j$ and $(x_i - x_j)^2 = (x_j - x_i)^2$, we can consider only differences when $i < j$ (i.e., elements above the diagonal, if we consider the pairwise comparisons to form a matrix):

$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2$

$= \frac{1}{2n} [\sum_{i < j} (x_i - x_j)^2 + \sum_{i = j} (x_i - x_j)^2 + \sum_{i > j} (x_i - x_j)^2]$

$= \frac{1}{2n} [\sum_{i < j} (x_i - x_j)^2 + 0 + \sum_{i < j} (x_i - x_j)^2]$

$= \frac{1}{2n} [2 \sum_{i < j} (x_i - x_j)^2] = \frac{1}{n} \sum_{i < j} (x_i - x_j)^2$

From the definition of $\sigma^2$ we then have

$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$

$= \frac{1}{n} [\frac{1}{n} \sum_{i < j} (x_i - x_j)^2]$

$= \frac{1}{n^2} \sum_{i < j} (x_i - x_j)^2$
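The identity is easy to verify numerically. Here is a short Python sketch; the sample values and variable names are arbitrary choices of mine, not part of the derivation:

```python
# Numerical check of the identity sigma^2 = (1/n^2) * sum_{i<j} (x_i - x_j)^2
xs = [2.0, 3.5, -1.0, 7.25, 0.5]  # arbitrary sample
n = len(xs)
mean = sum(xs) / n

# Direct definition: sigma^2 = (1/n) * sum_i (x_i - mean)^2
var_direct = sum((x - mean) ** 2 for x in xs) / n

# Pairwise form: (1/n^2) * sum over pairs i < j of (x_i - x_j)^2
var_pairwise = sum(
    (xs[i] - xs[j]) ** 2 for i in range(n) for j in range(i + 1, n)
) / n ** 2

print(var_direct, var_pairwise)  # the two values agree
```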

## All length-preserving matrices are unitary

I recently read the (excellent) online resource Quantum Computing for the Very Curious by Andy Matuschak and Michael Nielsen. Upon reading the proof that all length-preserving matrices are unitary and trying it out myself, I came to believe that there is an error in the proof as written, specifically with trying to show that off-diagonal entries in $M^\dagger M$ are zero if $M$ is length-preserving.

Using the identity $|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$, a suitable choice of $\left|\psi\right> = \left|e_j\right> + \left|e_k\right>$ with $j \ne k$, and the fact that $M$ is length-preserving, Nielsen first shows that $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$ for $j \ne k$.

He then goes on to write “But what if we’d done something slightly different, and instead of using $\left|\psi\right> = \left|e_j\right> + \left|e_k\right>$ we’d used $\left|\psi\right> = \left|e_j\right> - \left|e_k\right>$? … I won’t explicitly go through the steps – you can do that yourself – but if you do go through them you end up with the equation: $(M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0$.”

I was an undergraduate physics and math major, but either I never worked with bra-ket notation and Hermitian conjugates or I’ve forgotten whatever I knew about them. In any case in working through this I could not get the same result as Nielsen; I simply ended up once again proving that $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$.

After some thought and experimentation I concluded that the key is to choose $\left|\psi\right> = \left|e_j\right> + i\left|e_k\right>$. Below is my (possibly mistaken!) attempt at a correct proof that all length-preserving matrices are unitary.

Proof: Let $M$ be a length-preserving matrix such that for any vector $\left|\psi\right>$ we have $|| M \left|\psi\right> || = || \left|\psi\right> ||$. We wish to show that $M$ is unitary, i.e., $M^\dagger M = I$.

We first show that the diagonal elements of $M^\dagger M$, or $(M^\dagger M)_{jj}$, are equal to 1.

To do this we start with the unit vectors $\left|e_j\right>$ and $\left|e_k\right>$ with 1 in positions $j$ and $k$ respectively, and 0 otherwise. The product $M^\dagger M \left|e_k\right>$ is then the $k$th column of $M^\dagger M$, and $\left<e_j\right| M^\dagger M \left|e_k\right>$ is the $jk$th entry of $M^\dagger M$ or $(M^\dagger M)_{jk}$.

From the general identity $\left<\psi\right| M^\dagger M \left|\psi\right> = || M \left|\psi\right> ||^2$ we also have $\left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2$. But since $M$ is length-preserving we have $|| M \left|e_j\right> ||^2 = || \left|e_j\right> ||^2 = 1^2 = 1$, as $\left|e_j\right>$ is a unit vector.

We thus have $(M^\dagger M)_{jj} = \left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2 = 1$. So all diagonal entries of $M^\dagger M$ are 1.

We next show that the non-diagonal elements of $M^\dagger M$, or $(M^\dagger M)_{jk}$ with $j \ne k$, are equal to zero.

Let $\left|\psi\right> = \left|e_j\right> + \left|e_k\right>$ with $j \ne k$. Since $M$ is length-preserving we have

$|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + \left|e_k\right> ||^2 = 1^2 + 1^2 = 2$

We also have $|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$ where $\left<\psi\right| = \left|\psi\right>^\dagger = (\left|e_j\right> + \left|e_k\right>)^\dagger$. From the definition of the dagger operation and the fact that the nonzero entries of $\left|e_j\right>$ and $\left|e_k\right>$ have no imaginary parts we have $(\left|e_j\right> + \left|e_k\right>)^\dagger = \left<e_j\right| + \left<e_k\right|$.

We then have

$|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$

$= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>$

$= (\left|e_j\right> + \left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + \left|e_k\right>)$

$= (\left<e_j\right| + \left<e_k\right|) M^\dagger M (\left|e_j\right> + \left|e_k\right>)$

$= \left<e_j\right| M^\dagger M \left|e_j\right> + \left<e_j\right| M^\dagger M \left|e_k\right> + \left<e_k\right| M^\dagger M \left|e_j\right> + \left<e_k\right| M^\dagger M \left|e_k\right>$

$= (M^\dagger M)_{jj} + (M^\dagger M)_{jk} + (M^\dagger M)_{kj} + (M^\dagger M)_{kk}$

$= 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj}$

since we previously showed that all diagonal entries of $M^\dagger M$ are 1.

Since $|| M \left|\psi\right> ||^2 = 2$ and also $|| M \left|\psi\right> ||^2 = 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj}$ we thus have $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$ for $j \ne k$.

Now let $\left|\psi\right> = \left|e_j\right> + i\left|e_k\right>$ with $j \ne k$. Again we have $|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2$ since $M$ is length-preserving, so that

$|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + i\left|e_k\right> ||^2$

$= (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)$

Since $i\left|e_k\right>$ has an imaginary part for its (single) nonzero entry, in performing the dagger operation and taking complex conjugates we obtain $(\left|e_j\right> + i\left|e_k\right>)^\dagger = \left<e_j\right| - i\left<e_k\right|$. We thus have

$|| M \left|\psi\right> ||^2 = (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)$

$= (\left<e_j\right| - i\left<e_k\right|)(\left|e_j\right> + i\left|e_k\right>)$

$= \left<e_j | e_j\right> + \left<e_j\right| \left( i\left|e_k\right> \right) - i\left<e_k | e_j\right> - i\left<e_k\right| \left( i\left|e_k\right> \right)$

$= \left<e_j | e_j\right> + i\left<e_j | e_k\right> - i\left<e_k | e_j\right> - i^2\left<e_k | e_k\right>$

$= \left<e_j | e_j\right> + i\left<e_j | e_k\right> - i\left<e_k | e_j\right> + \left<e_k | e_k\right>$

$= 1^2 + i\cdot 0 - i\cdot 0 + 1^2 = 2$

We also have

$|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$

$= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>$

$= (\left|e_j\right> + i\left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + i\left|e_k\right>)$

$= (\left<e_j\right| - i\left<e_k\right|) M^\dagger M (\left|e_j\right> + i\left|e_k\right>)$

$= \left<e_j\right| M^\dagger M \left|e_j\right> + \left<e_j\right| M^\dagger M \left( i\left|e_k\right> \right) - i\left<e_k\right| M^\dagger M \left|e_j\right> - i\left<e_k\right| M^\dagger M \left( i\left|e_k\right> \right)$

$= \left<e_j\right| M^\dagger M \left|e_j\right> + i\left<e_j\right| M^\dagger M \left|e_k\right> - i\left<e_k\right| M^\dagger M \left|e_j\right> - i^2\left<e_k\right| M^\dagger M \left|e_k\right>$

$= (M^\dagger M)_{jj} + i(M^\dagger M)_{jk} - i(M^\dagger M)_{kj} + (M^\dagger M)_{kk}$

$= 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)$

Since $|| M \left|\psi\right> ||^2 = 2$ we have $2 = 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)$ or $0 = i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)$ so that $(M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0$.

But we showed above that $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$. Adding the two equations, the $(M^\dagger M)_{kj}$ terms cancel, leaving $2(M^\dagger M)_{jk} = 0$, so that $(M^\dagger M)_{jk} = 0$ for $j \ne k$. So all nondiagonal entries of $M^\dagger M$ are equal to zero.

Since all diagonal entries of $M^\dagger M$ are equal to 1 and all nondiagonal entries of $M^\dagger M$ are equal to zero, we have $M^\dagger M = I$ and thus the matrix $M$ is unitary.

Since we assumed $M$ was a length-preserving matrix we have thus shown that all length-preserving matrices are unitary.
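The chain of equalities above can be spot-checked numerically. The sketch below (plain Python; the angles and test vectors are arbitrary choices of mine) builds an explicitly unitary $2 \times 2$ matrix as a rotation combined with a phase, confirms that it preserves the lengths of a few vectors including one with a complex component, and verifies $M^\dagger M = I$:

```python
import cmath, math

# A 2x2 unitary matrix built by hand: a rotation combined with a phase.
# The angles are arbitrary choices for this check.
theta, phi = 0.7, 1.3
M = [[math.cos(theta), -math.sin(theta) * cmath.exp(1j * phi)],
     [math.sin(theta),  math.cos(theta) * cmath.exp(1j * phi)]]

def mat_vec(A, v):
    return [sum(A[r][c] * v[c] for c in range(2)) for r in range(2)]

def norm(v):
    return math.sqrt(sum(abs(z) ** 2 for z in v))

# Length preservation on a few test vectors, including a complex one
for v in [[1, 0], [0, 1], [1, 1], [1, 1j]]:
    print(norm(mat_vec(M, v)) - norm(v))  # each difference ≈ 0

# (M†M)_{jk} = sum_r conj(M_{rj}) M_{rk}; the result should be the identity
MdM = [[sum(M[r][j].conjugate() * M[r][k] for r in range(2)) for k in range(2)]
       for j in range(2)]
print(MdM)
```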


## Linear Algebra and Its Applications, Exercise 3.4.28

Exercise 3.4.28. Given the plane $x_1 + x_2 + x_3 = 0$ and the following vectors

$\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} \qquad \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \qquad \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$

in the plane, find an orthonormal basis for the subspace represented by the plane. Report the dimension of the subspace and the number of nonzero vectors produced by Gram-Schmidt orthogonalization.

Answer: We start with the vector $a_1 = (1, -1, 0)$ and normalize it to create $q_1$:

$\|a_1\|^2 = 1^2 + (-1)^2 + 0^2 = 1 + 1 = 2$

$q_1 = a_1/\|a_1\| = \frac{1}{\sqrt{2}} a_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix}$

We then take the second vector $a_2 = (0, 1, -1)$ and create a second orthogonal vector $a_2'$ by subtracting from $a_2$ its projection on $q_1$:

$a_2' = a_2 - (q_1^Ta_2)q_1$

$= a_2 - \left[ \frac{1}{\sqrt{2}} \cdot 0 + (-\frac{1}{\sqrt{2}}) \cdot 1 + 0 \cdot (-1) \right]q_1 = a_2 - (-\frac{1}{\sqrt{2}})q_1 = a_2 + \frac{1}{\sqrt{2}}q_1$

$= \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + \frac{1}{\sqrt{2}} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \end{bmatrix}$

We then normalize $a_2'$ to create $q_2$:

$\|a_2'\|^2 = (\frac{1}{2})^2 + (\frac{1}{2})^2 + (-1)^2 = \frac{1}{4} + \frac{1}{4} + 1 = \frac{3}{2}$

$q_2 = a_2'/\|a_2'\| = a_2'/\sqrt{\frac{3}{2}} = \frac{\sqrt{2}}{\sqrt{3}} \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2\sqrt{3}} \\ \frac{\sqrt{2}}{2\sqrt{3}} \\ -\frac{\sqrt{2}}{\sqrt{3}} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \end{bmatrix}$

Finally, we take the third vector $a_3 = (1, 0, -1)$ and attempt to create another orthogonal vector $a_3'$ by subtracting from $a_3$ its projections on $q_1$ and $q_2$:

$a_3' = a_3 - (q_1^Ta_3)q_1 - (q_2^Ta_3)q_2$

$= a_3 - \left[ \frac{1}{\sqrt{2}} \cdot 1 + (-\frac{1}{\sqrt{2}}) \cdot 0 + 0 \cdot (-1) \right]q_1- \left[ \frac{1}{\sqrt{6}} \cdot 1 + \frac{1}{\sqrt{6}} \cdot 0 + (-\frac{2}{\sqrt{6}}) \cdot (-1) \right] q_2$

$= a_3 - \frac{1}{\sqrt{2}}q_1 - \frac{3}{\sqrt{6}}q_2 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} - \frac{1}{\sqrt{2}} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} - \frac{3}{\sqrt{6}} \begin{bmatrix} \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \end{bmatrix}$

$= \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} - \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix} - \begin{bmatrix} \frac{3}{6} \\ \frac{3}{6} \\ -\frac{6}{6} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} - \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix} - \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

Since $a_3' = 0$ we cannot create a third orthogonal vector to $q_1$ and $q_2$. The vectors

$q_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} \qquad q_2 = \begin{bmatrix} \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \end{bmatrix}$

are an orthonormal basis for the subspace, and the dimension of the subspace is 2.

(In hindsight we could have predicted this result by inspecting the original vectors $a_1$, $a_2$, and $a_3$ and noticing that $a_3 = a_1 + a_2$. Thus only $a_1$ and $a_2$ were linearly independent, $a_3$ being linearly dependent on the first two vectors, so that only two orthonormal basis vectors could be created from the three vectors given.)
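The computation can be reproduced with a few lines of Python. This is a generic (modified) Gram-Schmidt sketch of my own, not code from the exercise; it reports how many nonzero orthonormal vectors survive:

```python
import math

# Gram-Schmidt on the three given vectors; since a_3 = a_1 + a_2,
# the third orthogonalized vector should vanish.
vectors = [[1, -1, 0], [0, 1, -1], [1, 0, -1]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

basis = []  # orthonormal vectors produced so far
for a in vectors:
    w = a[:]
    for q in basis:
        c = dot(q, w)                 # projection coefficient on q
        w = [wi - c * qi for wi, qi in zip(w, q)]
    length = math.sqrt(dot(w, w))
    if length > 1e-10:                # keep only nonzero results
        basis.append([wi / length for wi in w])

print(len(basis))  # → 2
```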

NOTE: This continues a series of posts containing worked out exercises from the (out of print) book Linear Algebra and Its Applications, Third Edition by Gilbert Strang.

If you find these posts useful I encourage you to also check out the more current Linear Algebra and Its Applications, Fourth Edition, Dr Strang’s introductory textbook Introduction to Linear Algebra, Fifth Edition and the accompanying free online course, and Dr Strang’s other books.

## Linear Algebra and Its Applications, Exercise 3.4.27

Exercise 3.4.27. Given the subspace spanned by the three vectors

$a_1 = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \end{bmatrix} \qquad a_2 = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} \qquad a_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix}$

find vectors $q_1$, $q_2$, and $q_3$ that form an orthonormal basis for the subspace.

Answer: We can save some time by noting that $a_1$ and $a_3$ are already orthogonal. We can normalize these two vectors to create $q_1$ and $q_3$:

$\|a_1\|^2 = 1^2 + (-1)^2 + 0^2 + 0^2 = 1 + 1 = 2$

$q_1 = a_1/\|a_1\| = \frac{1}{\sqrt{2}} a_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{bmatrix}$

$\|a_3\|^2 = 0^2 + 0^2 + 1^2 + (-1)^2 = 1 + 1 = 2$

$q_3 = a_3/\|a_3\| = \frac{1}{\sqrt{2}} a_3 = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix}$

We can then compute a third orthogonal vector $a_2'$ by subtracting from $a_2$ its projections on $q_1$ and $q_3$:

$a_2' = a_2 - (q_1^Ta_2)q_1 - (q_3^Ta_2)q_3$

$= a_2 - \left[ \frac{1}{\sqrt{2}} \cdot 0 + (-\frac{1}{\sqrt{2}}) \cdot 1 + 0 \cdot (-1) + 0 \cdot 0 \right]q_1 - \left[ 0 \cdot 0 + 0 \cdot 1 + \frac{1}{\sqrt{2}} \cdot (-1) + (-\frac{1}{\sqrt{2}}) \cdot 0 \right]q_3$

$= a_2 - (-\frac{1}{\sqrt{2}})q_1 - (-\frac{1}{\sqrt{2}})q_3 = a_2 + \frac{1}{\sqrt{2}}q_1 + \frac{1}{\sqrt{2}}q_3$

$= \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} + \frac{1}{\sqrt{2}} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{bmatrix} + \frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} + \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \frac{1}{2} \\ -\frac{1}{2} \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}$

Finally, we normalize $a_2'$ to create $q_2$:

$\|a_2'\|^2 = (\frac{1}{2})^2 + (\frac{1}{2})^2 + (-\frac{1}{2})^2 + (-\frac{1}{2})^2 = \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = 1$

$q_2 = a_2'/\|a_2'\| = a_2' = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}$

An orthonormal basis for the space is therefore

$q_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{bmatrix} \qquad q_2 = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix} \qquad q_3 = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix}$

(It’s worth noting that the solution for this exercise on page 480 is different from the solution given above. That’s presumably because we computed the orthonormal vectors in the order $q_1$, $q_3$, $q_2$ rather than the standard order $q_1$, $q_2$, $q_3$, taking advantage of the fact that the original vectors $a_1$ and $a_3$ were already orthogonal. Recall that a basis for a subspace is not unique, so it is possible to have different orthonormal bases for the same subspace.)
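A quick numerical confirmation that the three vectors above really are orthonormal: the Gram matrix of the basis should be the $3 \times 3$ identity. (Plain Python sketch; the variable names are mine.)

```python
import math

# The orthonormal basis computed above
s = 1 / math.sqrt(2)
q1 = [s, -s, 0, 0]
q2 = [0.5, 0.5, -0.5, -0.5]
q3 = [0, 0, s, -s]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Gram matrix: entry (i, j) is the dot product of the i-th and j-th vectors
qs = [q1, q2, q3]
gram = [[dot(u, v) for v in qs] for u in qs]
print(gram)  # numerically the 3x3 identity
```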


## Linear Algebra and Its Applications, Exercise 3.4.26

Exercise 3.4.26. In the Gram-Schmidt orthogonalization process the third component $c'$ is computed as $c' = c - (q_1^Tc)q_1 - (q_2^Tc)q_2$. Verify that $c'$ is orthogonal to both $q_1$ and $q_2$.

Answer: Taking the dot product of $q_1$ and $c'$ we have

$q_1^Tc' = q_1^T \left[ c - (q_1^Tc)q_1 - (q_2^Tc)q_2 \right] = q_1^Tc - q_1^T(q_1^Tc)q_1 - q_1^T(q_2^Tc)q_2$

Since $q_1^Tc$ and $q_2^Tc$ are scalars and $q_1$ and $q_2$ are orthonormal we then have

$q_1^Tc' = q_1^Tc - q_1^T(q_1^Tc)q_1 - q_1^T(q_2^Tc)q_2 = q_1^Tc - (q_1^Tc)q_1^Tq_1 - (q_2^Tc)q_1^Tq_2$

$= q_1^Tc - (q_1^Tc) \cdot 1 - (q_2^Tc) \cdot 0 = q_1^Tc - q_1^Tc = 0$

So $c'$ is orthogonal to $q_1$.

Taking the dot product of $q_2$ and $c'$ we have

$q_2^Tc' = q_2^T \left[ c - (q_1^Tc)q_1 - (q_2^Tc)q_2 \right] = q_2^Tc - q_2^T(q_1^Tc)q_1 - q_2^T(q_2^Tc)q_2$

$= q_2^Tc - (q_1^Tc)q_2^Tq_1 - (q_2^Tc)q_2^Tq_2 = q_2^Tc - (q_1^Tc) \cdot 0 - (q_2^Tc) \cdot 1 = q_2^Tc - q_2^Tc = 0$

So $c'$ is also orthogonal to $q_2$.
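The same cancellation can be checked numerically with arbitrary data: build an orthonormal pair $q_1$, $q_2$ from random vectors, form $c'$, and confirm both dot products vanish. The dimension and random seed below are arbitrary choices of mine:

```python
import math, random

random.seed(0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

# Orthonormal pair q1, q2 via Gram-Schmidt on two random vectors
a = [random.random() for _ in range(5)]
b = [random.random() for _ in range(5)]
q1 = normalize(a)
b = [bi - dot(q1, b) * q1i for bi, q1i in zip(b, q1)]
q2 = normalize(b)

# c' = c - (q1.c) q1 - (q2.c) q2 for a random c
c = [random.random() for _ in range(5)]
c1, c2 = dot(q1, c), dot(q2, c)
c_prime = [ci - c1 * q1i - c2 * q2i for ci, q1i, q2i in zip(c, q1, q2)]

print(dot(q1, c_prime), dot(q2, c_prime))  # both ≈ 0
```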


## Linear Algebra and Its Applications, Exercise 3.4.25

Exercise 3.4.25. Given $y = x^2$ over the interval $-1 \le x \le 1$ what is the closest line $C + Dx$ to the parabola formed by $y$?

Answer: This amounts to finding a least-squares solution to the equation $\begin{bmatrix} 1&x \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = y$, where the entries 1, $x$, and $y = x^2$ are understood as functions of $x$ over the interval -1 to 1 (as opposed to being scalar values).

Interpreting the traditional least squares equation $A^TAx = A^Tb$ in this context, here the matrix $A = \begin{bmatrix} 1&x \end{bmatrix}$ and we have

$A^TA = \begin{bmatrix} 1 \\ x \end{bmatrix} \begin{bmatrix} 1&x \end{bmatrix} = \begin{bmatrix} (1, 1)&(1, x) \\ (x, 1)&(x, x) \end{bmatrix}$

where the entries of $A^TA$ are the dot products of the functions, i.e., the integrals of their products over the interval -1 to 1.

We then have

$(1, 1) = \int_{-1}^1 1 \cdot 1 \;\mathrm{d}x = 2$

$(1, x) = (x, 1) = \int_{-1}^1 1 \cdot x \;\mathrm{d}x = \left( \frac{1}{2}x^2 \right) \;\big|_{-1}^1 = \frac{1}{2} \cdot 1^2 - \frac{1}{2} \cdot (-1)^2 = \frac{1}{2} - \frac{1}{2} = 0$

$(x, x) = \int_{-1}^1 x^2 \;\mathrm{d}x = \left( \frac{1}{3}x^3 \right) \;\big|_{-1}^1 = \frac{1}{3} \cdot 1^3 - \frac{1}{3} \cdot (-1)^3 = \frac{1}{3} + \frac{1}{3} = \frac{2}{3}$

so that

$A^TA = \begin{bmatrix} (1, 1)&(1, x) \\ (x, 1)&(x, x) \end{bmatrix} = \begin{bmatrix} 2&0 \\ 0&\frac{2}{3} \end{bmatrix}$

Continuing the interpretation of the least squares equation $A^TAx = A^Tb$ in this context, the role of $b$ is played by the function $y = x^2$, and we have

$A^Ty = \begin{bmatrix} 1 \\ x \end{bmatrix} x^2 = \begin{bmatrix} (1,x^2) \\ (x, x^2) \end{bmatrix}$

where again the entries are dot products of the functions. From above we have

$(1, x^2) = \int_{-1}^1 1 \cdot x^2 \;\mathrm{d}x = \frac{2}{3}$

and from previous exercises we have

$(x, x^2) = \int_{-1}^1 x \cdot x^2 \;\mathrm{d}x = \int_{-1}^1 x^3 \;\mathrm{d}x = 0$

so that

$A^Ty = \begin{bmatrix} \frac{2}{3} \\ 0 \end{bmatrix}$

To get the least squares solution $\bar{C} + \bar{D}x$ we then have

$\begin{bmatrix} 2&0 \\ 0&\frac{2}{3} \end{bmatrix} \begin{bmatrix} \bar{C} \\ \bar{D} \end{bmatrix} = \begin{bmatrix} \frac{2}{3} \\ 0 \end{bmatrix}$

From the second equation we have $\bar{D} = 0$. From the first equation we have $2\bar{C} = \frac{2}{3}$ or $\bar{C} = \frac{1}{3}$.

The line of best fit to the parabola $y = x^2$ over the interval $-1 \le x \le 1$ is therefore the horizontal line with $y$-intercept of $\frac{1}{3}$.
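The normal equations can also be solved exactly in a few lines of Python using rational arithmetic. This is a sketch of my own construction; it exploits the fact that $\int_{-1}^1 x^m \,\mathrm{d}x$ is $\frac{2}{m+1}$ for even $m$ and $0$ for odd $m$:

```python
from fractions import Fraction as F

# Exact inner products over [-1, 1] for monomials x^m
def ip(m):
    return F(2, m + 1) if m % 2 == 0 else F(0)

# Normal equations A^T A [C, D]^T = A^T y with A = [1, x] and y = x^2
ATA = [[ip(0), ip(1)],
       [ip(1), ip(2)]]
ATy = [ip(2), ip(3)]  # (1, x^2) and (x, x^2)

# The matrix is diagonal, so the system splits into two scalar equations
C = ATy[0] / ATA[0][0]
D = ATy[1] / ATA[1][1]
print(C, D)  # → 1/3 0
```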



## Linear Algebra and Its Applications, Exercise 3.4.24

Exercise 3.4.24. As discussed on page 178, the first three Legendre polynomials are 1, $x$, and $x^2 - \frac{1}{3}$. Find the next Legendre polynomial; it will be a cubic polynomial defined for $-1 \le x \le 1$ and will be orthogonal to the first three Legendre polynomials.

Answer: The process of finding the fourth Legendre polynomial is essentially an application of Gram-Schmidt orthogonalization. The first three polynomials are

$v_1 = 1 \qquad v_2 = x \qquad v_3 = x^2 - \frac{1}{3}$

We can find the fourth Legendre polynomial by starting with $x^3$ and subtracting off the projections of $x^3$ on the first three polynomials:

$v_4 = x^3 - \frac{(v_1, x^3)}{(v_1, v_1)}v_1 - \frac{(v_2, x^3)}{(v_2, v_2)}v_2 - \frac{(v_3, x^3)}{(v_3, v_3)}v_3$

$= x^3 - \frac{(1, x^3)}{(1, 1)}\cdot 1 - \frac{(x, x^3)}{(x, x)}x - \frac{(x^2-\frac{1}{3}, x^3)}{(x^2-\frac{1}{3}, x^2-\frac{1}{3})}(x^2-\frac{1}{3})$

For the first term we have

$(1, x^3) = \int_{-1}^1 1 \cdot x^3 \;\mathrm{d}x = \int_{-1}^1 x^3 \;\mathrm{d}x = 0$

so that the first term $\frac{(v_1, x^3)}{(v_1, v_1)}v_1$ does not appear in the expression for $v_4$.

The third term $\frac{(v_3, x^3)}{(v_3, v_3)}v_3$ drops out for the same reason: its numerator is

$(x^2-\frac{1}{3}, x^3) = \int_{-1}^1 (x^2 - \frac{1}{3}) x^3 \;\mathrm{d}x$

$= \int_{-1}^1 x^5 \;\mathrm{d}x - \frac{1}{3} \int_{-1}^1 x^3 \;\mathrm{d}x = 0 - \frac{1}{3} \cdot 0 = 0$

That leaves the second term $\frac{(v_2, x^3)}{(v_2, v_2)}v_2$ with numerator of

$(x, x^3) = \int_{-1}^1 x \cdot x^3 \;\mathrm{d}x = \int_{-1}^1 x^4 \;\mathrm{d}x$

$= \left( \frac{1}{5} x^5 \right) \;\big|_{-1}^1 = \frac{1}{5} \cdot 1^5 - \frac{1}{5} \cdot (-1)^5 = \frac{1}{5} - (-\frac{1}{5}) = \frac{2}{5}$

and denominator

$(x, x) = \int_{-1}^1 x^2 \;\mathrm{d}x = \left( \frac{1}{3}x^3 \right) \;\big|_{-1}^1 = \frac{1}{3} \cdot 1^3 - \frac{1}{3} \cdot (-1)^3 = \frac{1}{3} + \frac{1}{3} = \frac{2}{3}$

We then have

$v_4 = x^3 - \left[ \frac{2}{5}/\frac{2}{3} \right] x = x^3 - \frac{3}{5}x$
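We can confirm numerically that $v_4 = x^3 - \frac{3}{5}x$ is orthogonal to the first three Legendre polynomials on $[-1, 1]$. The sketch below uses a simple midpoint rule of my own choosing; the three inner products should all be (numerically) zero:

```python
# Check that v4 = x^3 - (3/5) x is orthogonal on [-1, 1] to 1, x, x^2 - 1/3
def integrate(f, a=-1.0, b=1.0, n=100_000):
    # midpoint-rule quadrature
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

v4 = lambda x: x ** 3 - 0.6 * x
legendre = [lambda x: 1.0, lambda x: x, lambda x: x ** 2 - 1.0 / 3.0]

ips = [integrate(lambda x, g=g: v4(x) * g(x)) for g in legendre]
print(ips)  # all three inner products ≈ 0
```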


## Linear Algebra and Its Applications, Exercise 3.4.23

Exercise 3.4.23. Given the step function $y$ with $y(x) = 1$ for $0 \le x \le \pi$ and $y(x) = 0$ for $\pi < x < 2\pi$, find the following Fourier coefficients:

$a_0 = \frac{(y, 1)}{(1, 1)} \qquad a_1 = \frac{(y, \cos x)}{(\cos x, \cos x)} \qquad b_1 = \frac{(y, \sin x)}{(\sin x, \sin x)}$

Answer: For $a_0$ the numerator is

$(y, 1) = \int_0^{2\pi} y(x) \cdot 1 \;\mathrm{d}x = \int_0^{\pi} 1 \;\mathrm{d}x + \int_{\pi}^{2\pi} 0 \;\mathrm{d}x = \pi$

and the denominator is

$(1, 1) = \int_0^{2\pi} 1^2 \;\mathrm{d}x = 2\pi$

so that $a_0 = \frac{\pi}{2\pi} = \frac{1}{2}$.

For $a_1$ the numerator is

$(y, \cos x) = \int_0^{2\pi} y(x) \cos x \;\mathrm{d}x = \int_0^{\pi} 1 \cdot \cos x \;\mathrm{d}x + \int_{\pi}^{2\pi} 0 \cdot \cos x \;\mathrm{d}x$

$= \int_0^{\pi} \cos x \;\mathrm{d}x = \sin x \;\big|_0^{\pi} = 0 - 0 = 0$

so that $a_1 = 0$.

For $b_1$ the numerator is

$(y, \sin x) = \int_0^{2\pi} y(x) \sin x \;\mathrm{d}x = \int_0^{\pi} 1 \cdot \sin x \;\mathrm{d}x + \int_{\pi}^{2\pi} 0 \cdot \sin x \;\mathrm{d}x$

$= \int_0^{\pi} \sin x \;\mathrm{d}x = (-\cos x) \;\big|_0^{\pi} = -(-1) - (-1) = 1 + 1 = 2$

and the denominator is

$(\sin x, \sin x) = \int_0^{2\pi} \sin^2 x \;\mathrm{d}x = \left[ \frac{1}{2}x - \frac{1}{4} \sin 2x \right] \;\big|_0^{2\pi}$

$= \left[ \frac{1}{2}\cdot(2\pi) - \frac{1}{4} \sin 2\pi \right] - \left[ \frac{1}{2} \cdot 0 - \frac{1}{4} \sin (2 \cdot 0) \right] = \pi - \frac{1}{4} \cdot 0 - 0 + \frac{1}{4} \cdot 0 = \pi$

so that $b_1 = \frac{2}{\pi}$.

So we have $a_0 = \frac{1}{2}$, $a_1 = 0$, and $b_1 = \frac{2}{\pi}$.
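The three coefficients are easy to check numerically. This midpoint-rule sketch is my own; the tolerances are loose because the step function's discontinuity at $\pi$ limits the quadrature accuracy:

```python
import math

# Step function from the exercise: 1 on [0, pi], 0 on (pi, 2*pi)
def y(x):
    return 1.0 if x <= math.pi else 0.0

def integrate(f, a, b, n=100_000):
    # midpoint-rule quadrature
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

two_pi = 2 * math.pi
a0 = integrate(y, 0, two_pi) / integrate(lambda x: 1.0, 0, two_pi)
a1 = (integrate(lambda x: y(x) * math.cos(x), 0, two_pi)
      / integrate(lambda x: math.cos(x) ** 2, 0, two_pi))
b1 = (integrate(lambda x: y(x) * math.sin(x), 0, two_pi)
      / integrate(lambda x: math.sin(x) ** 2, 0, two_pi))
print(a0, a1, b1)  # ≈ 1/2, 0, 2/pi
```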



## Linear Algebra and Its Applications, Exercise 3.4.22

Exercise 3.4.22. Given an arbitrary function $y$ find the coefficient $b_1$ that minimizes the quantity

$\|b_1\sin x - y\|^2 = \int_0^{2\pi} (b_1\sin x - y(x))^2 \;\mathrm{d}x$

(Use the method of setting the derivative to zero.) How does this value of $b_1$ compare with the Fourier coefficient $b_1$? What is $b_1$ if $y(x) = \cos x$?

Answer: We are looking for a value of $b_1$ that minimizes the expression on the right, so we need to differentiate with respect to $b_1$. Expanding the right-hand side of the equation above, we have

$\int_0^{2\pi} (b_1\sin x - y(x))^2 \;\mathrm{d}x = \int_0^{2\pi} [b_1^2\sin^2 x - 2b_1y(x)\sin x + y(x)^2] \;\mathrm{d}x$

$= \int_0^{2\pi} b_1^2\sin^2 x \;\mathrm{d}x - 2 \int_0^{2\pi} b_1y(x)\sin x \;\mathrm{d}x + \int_0^{2\pi} y(x)^2 \;\mathrm{d}x$

Since $b_1$ is not dependent on $x$ we can pull it out of the integral, so that

$\int_0^{2\pi} (b_1\sin x - y(x))^2 \;\mathrm{d}x = b_1^2 \int_0^{2\pi} \sin^2 x \;\mathrm{d}x - 2b_1 \int_0^{2\pi} y(x) \sin x \;\mathrm{d}x + \int_0^{2\pi} y(x)^2 \;\mathrm{d}x$

Differentiating with respect to $b_1$ we have

$\frac{\mathrm{d}}{\mathrm{d}b_1} \int_0^{2\pi} (b_1\sin x - y(x))^2 \;\mathrm{d}x$

$= \frac{\mathrm{d}}{\mathrm{d}b_1} \left[ b_1^2 \int_0^{2\pi} \sin^2 x \;\mathrm{d}x - 2b_1 \int_0^{2\pi} y(x) \sin x \;\mathrm{d}x + \int_0^{2\pi} y(x)^2 \;\mathrm{d}x \right]$

$= 2b_1 \int_0^{2\pi} \sin^2 x \;\mathrm{d}x - 2 \int_0^{2\pi} y(x) \sin x \;\mathrm{d}x$

Equating the derivative to zero gives us

$2b_1 \int_0^{2\pi} \sin^2 x \;\mathrm{d}x = 2 \int_0^{2\pi} y(x) \sin x \;\mathrm{d}x$

or

$b_1 = \left( \int_0^{2\pi} y(x) \sin x \;\mathrm{d}x \right) / \left( \int_0^{2\pi} \sin^2 x \;\mathrm{d}x \right)$

Note that this is identical to the expression for the Fourier coefficient $b_1$ on page 178; the numerator is the dot product of $y(x)$ with $\sin x$ and the denominator is the dot product of $\sin x$ with itself.

If $y(x) = \cos x$ then the numerator of $b_1$ becomes

$\int_0^{2\pi} \cos x \sin x \;\mathrm{d}x = 0$

since $\cos x$ and $\sin x$ are orthogonal, and we therefore have $b_1 = 0$.
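As a quick numerical confirmation, the formula for the minimizing $b_1$ can be evaluated for $y(x) = \cos x$ directly. This midpoint-rule sketch is an illustration of my own:

```python
import math

# b1 = (integral of y(x) sin x) / (integral of sin^2 x) over [0, 2*pi],
# evaluated for the specific choice y(x) = cos x
def integrate(f, a, b, n=100_000):
    # midpoint-rule quadrature
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

two_pi = 2 * math.pi
num = integrate(lambda x: math.cos(x) * math.sin(x), 0, two_pi)
den = integrate(lambda x: math.sin(x) ** 2, 0, two_pi)
b1 = num / den
print(b1)  # ≈ 0
```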


## Linear Algebra and Its Applications, Exercise 3.4.21

Exercise 3.4.21. Given the function $f(x) = \sin 2x$ on the interval $-\pi \le x \le \pi$, what is the closest function $a \cos x + b \sin x$ to $f$? What is the closest line $c + dx$ to $f$?

Answer: To find the closest function $a \cos x + b \sin x$ to the function $f(x) = \sin 2x$ we first project $f$ onto the function $\cos x$ on the given interval to obtain $a$, and then project $f$ onto $\sin x$ to obtain $b$.

We project $f$ onto $\cos x$ by taking the dot product of $f$ with $\cos x$ and then normalizing by dividing by the dot product of $\cos x$ with itself:

$a = (f, \cos x)/(\cos x, \cos x)$

The numerator is

$(f, \cos x) = \int_{-\pi}^{\pi} f(x) \cos x \;\mathrm{d}x = \int_{-\pi}^{\pi} \sin 2x \cos x \;\mathrm{d}x$

$= 2 \int_{-\pi}^{\pi} \sin x \cos^2 x \;\mathrm{d}x$

where we used the trigonometric identity $\sin 2\theta = 2 \sin \theta \cos \theta$.

To integrate we substitute the variable $u = \cos x$ so that $\mathrm{d}u = -\sin x \;\mathrm{d}x$. We then have

$\int \sin x \cos^2 x \;\mathrm{d}x = -\int \cos^2 x (-\sin x) \;\mathrm{d}x$

$= -\int u^2 \;\mathrm{d}u = -\frac{1}{3}u^3 = -\frac{1}{3} \cos^3 x$

We then have

$(f, \cos x) = 2 \int_{-\pi}^{\pi} \sin x \cos^2 x \;\mathrm{d}x = 2 (-\frac{1}{3} \cos^3 x) \;\big|_{-\pi}^{\pi}$

$= -\frac{2}{3} \cos^3 \pi - [-\frac{2}{3} \cos^3 (-\pi)] = -\frac{2}{3} (-1)^3 - [-\frac{2}{3} (-1)^3]$

$= -\frac{2}{3} \cdot (-1) - [-\frac{2}{3} \cdot (-1)] = \frac{2}{3} - \frac{2}{3} = 0$

Since the numerator in the expression for $a$ is zero, we have $a = 0$.

(Note that we do not need to calculate the denominator in the expression for $a$. We know it must be positive, and thus the quotient is defined. See below for a sketch of a proof of this.)

We next project $f$ onto $\sin x$ by taking the dot product of $f$ with $\sin x$ and then normalizing by dividing by the dot product of $\sin x$ with itself:

$b = (f, \sin x)/(\sin x, \sin x)$

The numerator is

$(f, \sin x) = \int_{-\pi}^{\pi} f(x) \sin x \;\mathrm{d}x = \int_{-\pi}^{\pi} \sin 2x \sin x \;\mathrm{d}x$

$= 2 \int_{-\pi}^{\pi} \sin^2 x \cos x \;\mathrm{d}x$

where we used the trigonometric identity $\sin 2\theta = 2 \sin \theta \cos \theta$.

To integrate we substitute the variable $u = \sin x$ so that $\mathrm{d}u = \cos x \;\mathrm{d}x$. We then have

$\int \sin^2 x \cos x \;\mathrm{d}x = \int u^2 \;\mathrm{d}u = \frac{1}{3}u^3 = \frac{1}{3} \sin^3 x$

We then have

$(f, \sin x) = 2 \int_{-\pi}^{\pi} \sin^2 x \cos x \;\mathrm{d}x = 2 (\frac{1}{3} \sin^3 x) \;\big|_{-\pi}^{\pi}$

$= \frac{2}{3} \sin^3 \pi - \frac{2}{3} \sin^3 (-\pi) = \frac{2}{3} (0)^3 - \frac{2}{3} (0)^3$

$= 0 - 0 = 0$

Since the numerator in the expression for $b$ is zero, we have $b = 0$. (Again, we are guaranteed that the denominator is positive and the quotient defined.)

So the closest function $a \cos x + b \sin x$ to $f(x) = \sin 2x$ is $0 \cdot \cos x + 0 \cdot \sin x = 0$.
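As a sanity check, the two numerator integrals can be approximated numerically. The sketch below is not from the book; it uses only Python's standard library, and the midpoint-rule `integrate` helper is our own.

```python
import math

def integrate(f, a, b, n=100000):
    # Midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

f = lambda x: math.sin(2 * x)

# Numerators (f, cos x) and (f, sin x) of the coefficients a and b
num_a = integrate(lambda x: f(x) * math.cos(x), -math.pi, math.pi)
num_b = integrate(lambda x: f(x) * math.sin(x), -math.pi, math.pi)

print(num_a, num_b)  # both are zero up to floating-point error
```

Both integrals vanish because $\sin 2x$ is orthogonal to $\cos x$ and $\sin x$ on $[-\pi, \pi]$, consistent with $a = b = 0$.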

To find the closest function $c + dx$ to the function $f(x) = \sin 2x$ we first project $f$ onto the constant function with the value 1 on the given interval to obtain $c$, and then project $f$ onto the function $x$ to obtain $d$.

We project $f$ onto the constant function with value 1 by taking the dot product of $f$ with 1 and then normalizing by dividing by the dot product of 1 with itself:

$c = (f, 1)/(1, 1)$

The numerator is

$(f, 1) = \int_{-\pi}^{\pi} f(x) \cdot 1 \;\mathrm{d}x = \int_{-\pi}^{\pi} \sin 2x \;\mathrm{d}x$

To integrate we substitute the variable $u = 2x$ so that $\mathrm{d}u = 2 \;\mathrm{d}x$. We then have

$\int \sin 2x \;\mathrm{d}x = \int \frac{1}{2} \sin 2x \cdot 2 \;\mathrm{d}x$

$= \frac{1}{2} \int \sin u \;\mathrm{d}u = \frac{1}{2}(-\cos u) = -\frac{1}{2} \cos 2x$

We then have

$(f, 1) = \int_{-\pi}^{\pi} \sin 2x \;\mathrm{d}x = -\frac{1}{2} \cos 2x \;\big|_{-\pi}^{\pi}$

$= -\frac{1}{2} \cos 2\pi - [-\frac{1}{2} \cos (-2\pi)] = -\frac{1}{2} (1) - [-\frac{1}{2} (1)]$

$= -\frac{1}{2} + \frac{1}{2} = 0$

Since the numerator in the expression for $c = (f, 1)/(1, 1)$ is zero we have $c = 0$. (Recall that the denominator is guaranteed to be positive.)
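The same kind of numerical check applies to $c$: the integral of $\sin 2x$ over $[-\pi, \pi]$ (a full number of periods) is zero. A short stdlib-only sketch, with the midpoint-rule helper again our own:

```python
import math

def integrate(f, a, b, n=100000):
    # Midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Numerator (f, 1) of the coefficient c
num_c = integrate(lambda x: math.sin(2 * x), -math.pi, math.pi)
print(num_c)  # zero up to floating-point error
```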

We project $f$ onto the function $x$ by taking the dot product of $f$ with $x$ and then normalizing by dividing by the dot product of $x$ with itself:

$d = (f, x)/(x, x)$

The numerator is

$(f, x) = \int_{-\pi}^{\pi} f(x) \cdot x \;\mathrm{d}x = \int_{-\pi}^{\pi} x \sin 2x \;\mathrm{d}x$

To integrate this we use integration by parts, taking advantage of the formula $\int u \;\mathrm{d}v = uv - \int v \;\mathrm{d}u$. (The following is adapted from a post on socratic.org.) We let $\mathrm{d}v = \sin 2x \;\mathrm{d}x$ and $u = x$. Then $\mathrm{d}u$ is simply $\mathrm{d}x$, and $v = -\frac{1}{2} \cos 2x$ (the antiderivative of $\sin 2x$, as discussed above).

We then have

$\int x \sin 2x \;\mathrm{d}x = \int u \;\mathrm{d}v = uv - \int v \;\mathrm{d}u$

$= x (-\frac{1}{2} \cos 2x) - \int (-\frac{1}{2} \cos 2x) \;\mathrm{d}x$

$= -\frac{1}{2} x \cos 2x + \frac{1}{2} \int \cos 2x \;\mathrm{d}x$

The second integral we can evaluate by substituting $w = 2x$ and $\mathrm{d}w = 2 \;\mathrm{d}x$ so that

$\int \cos 2x \;\mathrm{d}x = \frac{1}{2} \int \cos w \;\mathrm{d}w = \frac{1}{2} \sin w = \frac{1}{2} \sin 2x$

Substituting for the second integral above we then have

$\int x \sin 2x \;\mathrm{d}x = -\frac{1}{2} x \cos 2x + \frac{1}{2} \int \cos 2x \;\mathrm{d}x = -\frac{1}{2} x \cos 2x + \frac{1}{2} (\frac{1}{2} \sin 2x)$

$= -\frac{1}{2} x \cos 2x + \frac{1}{4} \sin 2x$
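One quick way to check the antiderivative just derived is to differentiate it numerically and compare against the integrand. This is a minimal sketch using a central difference; the step size and sample points are arbitrary choices of ours.

```python
import math

# Claimed antiderivative of x*sin(2x), and the integrand itself
F = lambda x: -0.5 * x * math.cos(2 * x) + 0.25 * math.sin(2 * x)
f = lambda x: x * math.sin(2 * x)

h = 1e-6
for x in (-2.0, -0.5, 0.0, 1.3, 3.0):
    deriv = (F(x + h) - F(x - h)) / (2 * h)  # central-difference estimate of F'(x)
    assert abs(deriv - f(x)) < 1e-5
print("antiderivative checks out")
```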

We then have

$(f, x) = \int_{-\pi}^{\pi} x \sin 2x \;\mathrm{d}x = -\frac{1}{2} x \cos 2x \;\big|_{-\pi}^{\pi} + \frac{1}{4} \sin 2x \;\big|_{-\pi}^{\pi}$

$= -\frac{1}{2} \pi \cos 2\pi - [-\frac{1}{2} (-\pi) \cos (-2\pi)] + \frac{1}{4} \sin 2\pi - \frac{1}{4} \sin 2(-\pi)$

$= -\frac{1}{2} \pi \cdot 1 + \frac{1}{2} (-\pi) \cdot 1 + \frac{1}{4} \cdot 0 - \frac{1}{4} \cdot 0 = -\frac{\pi}{2} - \frac{\pi}{2}= -\pi$

The denominator in the expression for $d$ is

$(x, x) = \int_{-\pi}^{\pi} x^2 \;\mathrm{d}x = \frac{1}{3} x^3 \;\big|_{-\pi}^{\pi}$

$= \frac{1}{3} \pi^3 - \frac{1}{3} (-\pi)^3 = \frac{2}{3} \pi^3$

We then have

$d = (f, x)/(x, x) = -\pi / (\frac{2}{3} \pi^3) = -\frac{3}{2\pi^2}$

The straight line $c + dx$ closest to the function $\sin 2x$ is thus the line $-\frac{3}{2\pi^2} x$.
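We can also double-check $d$ numerically. The following sketch (standard library only; the midpoint-rule `integrate` helper is ours, not from the book) approximates both inner products and compares the quotient with $-\frac{3}{2\pi^2}$.

```python
import math

def integrate(f, a, b, n=200000):
    # Midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

num = integrate(lambda x: x * math.sin(2 * x), -math.pi, math.pi)  # (f, x), exactly -pi
den = integrate(lambda x: x * x, -math.pi, math.pi)                # (x, x), exactly 2*pi**3/3

d = num / den
print(d, -3 / (2 * math.pi ** 2))  # the two values agree to high precision
```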

ADDENDUM: Suppose that $g$ is a continuous function defined on the interval $[a, b]$ and $g(t) \ne 0$ for some $a \le t \le b$. Then we want to show that the inner product $(g, g) > 0$.

The basic idea of the proof is as follows: The function $g^2$ is always nonnegative, and thus its integral over the interval $[a, b]$ is nonnegative as well. If $g(t)$ is nonzero for some $a \le t \le b$ then since $g$ is continuous $g$ will also be nonzero for some interval $[c, d]$ that includes $t$, with $a \le c < d \le b$. This implies that the integral of $g^2$ over that subinterval $[c, d]$ will be positive.

But we also have $\int_a^b g(x)^2 \;\mathrm{d}x \ge \int_c^d g(x)^2 \;\mathrm{d}x$ since $g(x)^2 \ge 0$ and $[c, d]$ is contained within $[a, b]$. So if $\int_c^d g(x)^2 \;\mathrm{d}x > 0$ then we also have $\int_a^b g(x)^2 \;\mathrm{d}x > 0$ and the inner product $(g, g)$ is positive.
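To illustrate the addendum concretely, take $g(x) = \sin 2x$ on $[-\pi, \pi]$: $g$ is continuous and nonzero somewhere on the interval, and indeed $(g, g) = \int_{-\pi}^{\pi} \sin^2 2x \;\mathrm{d}x = \pi > 0$. A numerical sketch (the midpoint-rule helper is our own):

```python
import math

def integrate(f, a, b, n=100000):
    # Midpoint-rule approximation of the integral of f over [a, b]
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

g = lambda x: math.sin(2 * x)

# (g, g) = integral of sin^2(2x) over [-pi, pi], which evaluates to pi
gg = integrate(lambda x: g(x) ** 2, -math.pi, math.pi)
print(gg)  # approximately pi, and in particular positive
```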
