Quantum Country exercise 6

This is one in a series of posts working through the exercises in the Quantum Country online introduction to quantum computing and related topics. The exercises in the original document are not numbered; I have added my own numbers for convenience in referring to them.

Exercise 6. Show that the identity matrix I is unitary.

Answer: Since I is symmetric we have I^T = I, and since the values of I are all real we have \left( I^T \right)^* = I^T. We thus have I^\dagger I = \left( I^T \right)^* I = I^T I = II = I by the definition of I.

Since I^\dagger I = I, the matrix I is unitary.
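
As a quick numerical sanity check (my own addition, assuming Python with NumPy), we can confirm the result directly:

import numpy as np

I = np.eye(2)
# The dagger is the conjugate transpose; for the real identity matrix it is I itself.
assert np.allclose(I.conj().T @ I, np.eye(2))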

Quantum Country exercise 5

This is one in a series of posts working through the exercises in the Quantum Country online introduction to quantum computing and related topics. The exercises in the original document are not numbered; I have added my own numbers for convenience in referring to them.

Exercise 5. Show that the matrix X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} is unitary.

Answer: A matrix U is unitary if U^\dagger U = I where U^\dagger is the adjoint matrix to U (produced by taking the transpose U^T of U and then replacing all values by their complex conjugates) and I is the identity matrix.

Since X is symmetric we have X^T = X, and since the values of X (and thus of X^T) are all real we have \left( X^T \right)^* = X^T. We thus have

X^\dagger X = \left( X^T \right)^* X = X^T X = XX

= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I

Since X^\dagger X = I, the matrix X is unitary.
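
Again as a numerical sanity check (my addition; assumes NumPy), we can verify X^\dagger X = I directly:

import numpy as np

X = np.array([[0, 1], [1, 0]])
# The dagger is the conjugate transpose; X is real and symmetric.
assert np.allclose(X.conj().T @ X, np.eye(2))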

Quantum Country exercise 4

This is one in a series of posts working through the exercises in the Quantum Country online introduction to quantum computing and related topics. The exercises in the original document are not numbered; I have added my own numbers for convenience in referring to them.

Exercise 4. Consider a quantum circuit in which the Hadamard gate H is applied to a quantum state \vert \psi \rangle and then the output is measured in the computational basis. Show that when the state is \frac{\vert 0 \rangle + \vert 1 \rangle}{\sqrt{2}} the measured result is always m = 0, and that when the state is \frac{\vert 0 \rangle - \vert 1 \rangle}{\sqrt{2}} the measured result is always m = 1.

Answer: First consider the case when the state is \frac{\vert 0 \rangle + \vert 1 \rangle}{\sqrt{2}}. Applying H to this state we have

H \left( \frac{\vert 0 \rangle + \vert 1 \rangle}{\sqrt{2}} \right) = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}

= \frac{1}{2} \begin{bmatrix} 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \vert 0 \rangle

So the result of applying H to the state is the state \vert 0 \rangle.

In the general case the state \alpha \vert 0 \rangle + \beta \vert 1 \rangle will produce the result 0 with probability |\alpha|^2 and the result 1 with probability |\beta|^2 when measured in the computational basis. Since the result in this case is \vert 0 \rangle we have \alpha = 1 and \beta = 0, so the result will be m = 0 with probability |1|^2 = 1, with the result m = 1 having probability |0|^2 = 0. So the measured output will always be m = 0.

Now consider the case when the state is \frac{\vert 0 \rangle - \vert 1 \rangle}{\sqrt{2}}. Applying H to this state we have

H \left( \frac{\vert 0 \rangle - \vert 1 \rangle}{\sqrt{2}} \right) = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}

= \frac{1}{2} \begin{bmatrix} 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \vert 1 \rangle

So the result of applying H to the state is the state \vert 1 \rangle, which when measured in the computational basis will always produce the value m = 1.
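
The following NumPy sketch (my addition, not part of Quantum Country) reproduces both cases, computing the measurement probabilities as squared amplitude magnitudes:

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

# Probabilities of m = 0 and m = 1 are the squared amplitude magnitudes.
print(np.abs(H @ plus) ** 2)   # approximately [1, 0]: always m = 0
print(np.abs(H @ minus) ** 2)  # approximately [0, 1]: always m = 1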

Quantum Country exercise 3

This is one in a series of posts working through the exercises in the Quantum Country online introduction to quantum computing and related topics. The exercises in the original document are not numbered; I have added my own numbers for convenience in referring to them.

Exercise 3. Consider a quantum circuit in which first the Hadamard gate H is applied to a quantum state \vert \psi \rangle and then the X gate is applied to the output of the first gate. Explain why the output from this circuit is XH \vert \psi \rangle and not HX \vert \psi \rangle.

Answer: Applying the gate H to the quantum state \vert \psi \rangle is equivalent to multiplying the state vector \begin{bmatrix} \psi_1 \\ \psi_2 \end{bmatrix} (where \psi_1 and \psi_2 are complex values) by the 2 by 2 matrix \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}.

By the rules of matrix multiplication this multiplication occurs from the left as H \vert \psi \rangle and produces a two element column vector as a result, representing the output quantum state \vert \psi' \rangle.

Applying the second gate X to that result again requires multiplying that two-element column vector by the matrix \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} from the left as X \vert \psi' \rangle. That produces another two-element column vector representing the final quantum state \vert \psi'' \rangle output from the quantum circuit.

We thus have

\vert \psi'' \rangle = X \vert \psi' \rangle = X (H \vert \psi \rangle) = XH \vert \psi \rangle

In contrast, the expression HX \vert \psi \rangle amounts to first applying the gate X to \vert \psi \rangle and then applying the Hadamard gate H afterward, the opposite of the given circuit.
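
To make the difference concrete, here is a small NumPy sketch (my addition; the input \vert 0 \rangle is just a sample state) showing that the two orderings generally give different outputs:

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
psi = np.array([1.0, 0.0])  # the state |0>, as a sample input

print(X @ (H @ psi))  # XH|psi>, the circuit output: approx. [0.707, 0.707]
print(H @ (X @ psi))  # HX|psi>, the reversed order: approx. [0.707, -0.707]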

Quantum Country exercise 2

This is one in a series of posts working through the exercises in the Quantum Country online introduction to quantum computing and related topics. The exercises in the original document are not numbered; I have added my own numbers for convenience in referring to them.

Exercise 2. Suppose that instead of the Hadamard matrix H we’d defined a matrix J = \frac{1}{\sqrt{2}} \begin{bmatrix} 1&1 \\ 1&1 \end{bmatrix}. Explain why J would not make a suitable quantum gate by applying it to the quantum state \frac{| 0 \rangle - | 1 \rangle}{\sqrt{2}}.

Answer: We have

\frac{\vert 0 \rangle - \vert 1 \rangle}{\sqrt{2}} = \frac{1}{\sqrt{2}} \left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} - \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right) = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}

so that applying J to \frac{\vert 0 \rangle - \vert 1 \rangle}{\sqrt{2}} gives us

\frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1& 1 \end{bmatrix} \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 - 1 \\ 1 - 1 \end{bmatrix}

= \frac{1}{2} \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = 0

So J applied to \frac{\vert 0 \rangle - \vert 1 \rangle}{\sqrt{2}} gives us the zero vector. The zero vector is not a valid quantum state, since a quantum state must have length 1; a suitable quantum gate must preserve the length of every state it acts on, and J does not.
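
A short NumPy check (my addition) shows the collapse to the zero vector numerically:

import numpy as np

J = np.array([[1, 1], [1, 1]]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)

out = J @ minus
print(out)                  # [0. 0.]
print(np.linalg.norm(out))  # 0.0 -- the length of the state is not preserved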

Quantum Country exercise 1

This is the first in a series of posts working through the exercises in the Quantum Country online introduction to quantum computing and related topics. The exercises in the original document are not numbered; I have added my own numbers for convenience in referring to them.

Exercise 1. For the Hadamard matrix H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1& -1 \end{bmatrix}, verify that HH = I where I is the identity matrix \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.

Answer: We have

HH = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1& -1 \end{bmatrix} \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1& -1 \end{bmatrix}

= \frac{1}{2} \begin{bmatrix} 1 \cdot 1 + 1 \cdot 1 & 1 \cdot 1 + 1 \cdot (-1) \\ 1 \cdot 1 + (-1) \cdot 1 & 1 \cdot 1 + (-1) \cdot (-1) \end{bmatrix}

= \frac{1}{2} \begin{bmatrix} 1 + 1 & 1 - 1 \\ 1 - 1 & 1 + 1 \end{bmatrix}

= \frac{1}{2} \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I

So we have shown that HH = I.
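
As a numerical check (my addition, assuming NumPy):

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
# H is its own inverse: HH should be the 2x2 identity.
assert np.allclose(H @ H, np.eye(2))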

Variance and the sum of squared pairwise differences

The variance \sigma^2 of a set of n values x_1, x_2, ..., x_n is usually expressed in terms of squared differences between those values and the mean \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i of those values.

\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2

However, the sum of squared differences (x_i - \bar{x})^2 between the values and the mean can also be expressed in terms of the sum of squared pairwise differences (x_i - x_j)^2 among the values themselves, without reference to the mean \bar{x}.

In particular, we want to show that

\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2.

To get an expression involving \bar{x} we rewrite the squared difference in the righthand sum and then expand the result:

\sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} [(x_i - \bar{x}) - (x_j - \bar{x})]^2

= \sum_{i=1}^{n} \sum_{j=1}^{n} [(x_i - \bar{x})^2 - 2 (x_i - \bar{x}) (x_j - \bar{x}) + (x_j - \bar{x})^2]

= \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})^2 - 2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x}) + \sum_{i=1}^{n} \sum_{j=1}^{n} (x_j - \bar{x})^2

Since the squared difference in the first term does not depend on j, the first term can be rewritten as

\sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} n (x_i - \bar{x})^2 = n \sum_{i=1}^{n} (x_i - \bar{x})^2

Since the squared difference in the third term does not depend on i, the third term can be rewritten as

\sum_{i=1}^{n} \sum_{j=1}^{n} (x_j - \bar{x})^2 = n \sum_{j=1}^{n} (x_j - \bar{x})^2 = n \sum_{i=1}^{n} (x_i - \bar{x})^2

where in the last step we replaced j as an index with i. So the third term is identical to the first term.

We now turn to the second term, -2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x}). We can bring the difference (x_i - \bar{x}) out of the inner sum, since it does not depend on the index j. This gives us

-2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x}) = -2 \sum_{i=1}^{n} (x_i - \bar{x}) [\sum_{j=1}^{n} (x_j - \bar{x})]

The sum \sum_{j=1}^{n} (x_j - \bar{x}) can then be rewritten as

\sum_{j=1}^{n} (x_j - \bar{x}) = \sum_{j=1}^{n} x_j - \sum_{j=1}^{n} \bar{x}

= \sum_{j=1}^{n} x_j - n \bar{x}

But we have \bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j by definition, so we then have

\sum_{j=1}^{n} (x_j - \bar{x}) = \sum_{j=1}^{n} x_j - n \bar{x} = n \bar{x} - n \bar{x} = 0

We can then substitute this result into the second term as follows:

-2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x}) = -2 \sum_{i=1}^{n} (x_i - \bar{x}) [\sum_{j=1}^{n} (x_j - \bar{x})]

= -2 \sum_{i=1}^{n} (x_i - \bar{x}) \cdot 0 = -2 \sum_{i=1}^{n} 0 = 0

Now that we know the value of all three terms we have

\sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2

= \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x})^2 - 2 \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - \bar{x}) (x_j - \bar{x}) + \sum_{i=1}^{n} \sum_{j=1}^{n} (x_j - \bar{x})^2

= n \sum_{i=1}^{n} (x_i - \bar{x})^2 + 0 + n \sum_{i=1}^{n} (x_i - \bar{x})^2

= 2n \sum_{i=1}^{n} (x_i - \bar{x})^2

so that

\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2

which is what we set out to prove.

However, we can further simplify this identity. Since (x_i - x_j) = 0 when i = j and (x_i - x_j)^2 = (x_j - x_i)^2, we can consider only differences when i < j (i.e., elements above the diagonal, if we consider the pairwise comparisons to form a matrix):

\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2

= \frac{1}{2n} [\sum_{i < j} (x_i - x_j)^2 + \sum_{i = j} (x_i - x_j)^2 + \sum_{i > j} (x_i - x_j)^2]

= \frac{1}{2n} [\sum_{i < j} (x_i - x_j)^2 + 0 + \sum_{i < j} (x_i - x_j)^2]

= \frac{1}{2n} [2 \sum_{i < j} (x_i - x_j)^2] = \frac{1}{n} \sum_{i < j} (x_i - x_j)^2

From the definition of \sigma^2 we then have

\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2

= \frac{1}{n} [\frac{1}{n} \sum_{i < j} (x_i - x_j)^2]

= \frac{1}{n^2} \sum_{i < j} (x_i - x_j)^2
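
The identity is easy to test numerically; here is a minimal NumPy sketch (my addition) comparing the two formulas on random data:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
n = len(x)

# Population variance computed via the mean...
var_mean = np.mean((x - x.mean()) ** 2)

# ...and via squared pairwise differences with i < j only.
sq_diffs = np.subtract.outer(x, x) ** 2
var_pairs = sq_diffs[np.triu_indices(n, k=1)].sum() / n**2

assert np.isclose(var_mean, var_pairs)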

All length-preserving matrices are unitary

I recently read the (excellent) online resource Quantum Computing for the Very Curious by Andy Matuschak and Michael Nielsen. Upon reading the proof that all length-preserving matrices are unitary and trying it out myself, I came to believe that there is an error in the proof as written, specifically with trying to show that off-diagonal entries in M^\dagger M are zero if M is length-preserving.

Using the identity || M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>, a suitable choice of \left|\psi\right> = \left|e_j\right> + \left|e_k\right> with j \ne k, and the fact that M is length-preserving, Nielsen first shows that (M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0 for j \ne k.

He then goes on to write “But what if we’d done something slightly different, and instead of using \left|\psi\right> = \left|e_j\right> + \left|e_k\right> we’d used \left|\psi\right> = \left|e_j\right> - \left|e_k\right>? … I won’t explicitly go through the steps – you can do that yourself – but if you do go through them you end up with the equation: (M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0.”

I was an undergraduate physics and math major, but either I never worked with bra-ket notation and Hermitian conjugates or I’ve forgotten whatever I knew about them. In any case in working through this I could not get the same result as Nielsen; I simply ended up once again proving that (M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0.

After some thought and experimentation I concluded that the key is to choose \left|\psi\right> = \left|e_j\right> + i\left|e_k\right>. Below is my (possibly mistaken!) attempt at a correct proof that all length-preserving matrices are unitary.

Proof: Let M be a length-preserving matrix such that for any vector \left|\psi\right> we have || M \left|\psi\right> || = || \left|\psi\right> ||. We wish to show that M is unitary, i.e., M^\dagger M = I.

We first show that the diagonal elements of M^\dagger M, or (M^\dagger M)_{jj}, are equal to 1.

To do this we start with the unit vectors \left|e_j\right> and \left|e_k\right> with 1 in positions j and k respectively, and 0 otherwise. The product M^\dagger M \left|e_k\right> is then the kth column of M^\dagger M, and \left<e_j\right| M^\dagger M \left|e_k\right> is the jkth entry of M^\dagger M or (M^\dagger M)_{jk}.

From the general identity \left<\psi\right| M^\dagger M \left|\psi\right> = || M \left|\psi\right> ||^2 we also have \left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2. But since M is length-preserving we have || M \left|e_j\right> ||^2 = || \left|e_j\right> ||^2 = 1^2 = 1 since \left|e_j\right> is a unit vector.

We thus have (M^\dagger M)_{jj} = \left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2 =  1. So all diagonal entries of M^\dagger M are 1.

We next show that the non-diagonal elements of M^\dagger M, or (M^\dagger M)_{jk} with j \ne k, are equal to zero.

Let \left|\psi\right> = \left|e_j\right> + \left|e_k\right> with j \ne k. Since M is length-preserving we have

|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + \left|e_k\right> ||^2 = 1^2 + 1^2 = 2

We also have || M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right> where \left<\psi\right| = \left|\psi\right>^\dagger = (\left|e_j\right> + \left|e_k\right>)^\dagger. From the definition of the dagger operation and the fact that the nonzero entries of \left|e_j\right> and \left|e_k\right> have no imaginary parts we have (\left|e_j\right> + \left|e_k\right>)^\dagger = \left<e_j\right| + \left<e_k\right|.

We then have

|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>

= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>

= (\left|e_j\right> + \left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + \left|e_k\right>)

= (\left<e_j\right| + \left<e_k\right|) M^\dagger M (\left|e_j\right> + \left|e_k\right>)

= \left<e_j\right| M^\dagger M \left|e_j\right> + \left<e_j\right| M^\dagger M \left|e_k\right> + \left<e_k\right| M^\dagger M \left|e_j\right> + \left<e_k\right| M^\dagger M \left|e_k\right>

= (M^\dagger M)_{jj} + (M^\dagger M)_{jk} + (M^\dagger M)_{kj} + (M^\dagger M)_{kk}

= 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj}

since we previously showed that all diagonal entries of M^\dagger M are 1.

Since || M \left|\psi\right> ||^2 = 2 and also || M \left|\psi\right> ||^2 = 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj} we thus have (M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0 for j \ne k.

Now let \left|\psi\right> = \left|e_j\right> + i\left|e_k\right> with j \ne k. Again we have || M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 since M is length-preserving, so that

|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + i\left|e_k\right> ||^2

= (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)

Since i\left|e_k\right> has an imaginary part for its (single) nonzero entry, in performing the dagger operation and taking complex conjugates we obtain (\left|e_j\right> + i\left|e_k\right>)^\dagger = \left<e_j\right| - i\left<e_k\right|. We thus have

|| M \left|\psi\right> ||^2 = (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)

= (\left<e_j\right| - i\left<e_k\right|)(\left|e_j\right> + i\left|e_k\right>)

= \left<e_j\right| \left|e_j\right> + \left<e_j\right| i \left|e_k\right> - i \left<e_k\right| \left|e_j\right> - i \left<e_k\right| i \left|e_k\right>

= \left<e_j|e_j\right> + i\left<e_j|e_k\right> - i \left<e_k|e_j\right> - i^2\left<e_k|e_k\right>

= \left<e_j|e_j\right> + i\left<e_j|e_k\right> - i\left<e_k|e_j\right> + \left<e_k|e_k\right>

= 1^2 + i\cdot 0 - i\cdot 0 + 1^2 = 2

We also have

|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>

= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>

= (\left|e_j\right> + i\left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + i\left|e_k\right>)

= (\left<e_j\right| - i\left<e_k\right|) M^\dagger M (\left|e_j\right> + i\left|e_k\right>)

= \left<e_j\right| M^\dagger M \left|e_j\right> + \left<e_j\right| M^\dagger M i\left|e_k\right> - i\left<e_k\right| M^\dagger M \left|e_j\right> - i\left<e_k\right| M^\dagger M i\left|e_k\right>

= \left<e_j\right| M^\dagger M \left|e_j\right> + i\left<e_j\right| M^\dagger M \left|e_k\right> - i\left<e_k\right| M^\dagger M \left|e_j\right> - i^2\left<e_k\right| M^\dagger M \left|e_k\right>

= (M^\dagger M)_{jj} + i(M^\dagger M)_{jk} - i(M^\dagger M)_{kj} + (M^\dagger M)_{kk}

= 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)

Since || M \left|\psi\right> ||^2 = 2 we have 2 = 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right) or 0 = i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right) so that (M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0.

But we showed above that (M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0. Adding the two equations, the (M^\dagger M)_{kj} terms cancel and we get 2(M^\dagger M)_{jk} = 0, so that (M^\dagger M)_{jk} = 0 for j \ne k. So all nondiagonal entries of M^\dagger M are equal to zero.

Since all diagonal entries of M^\dagger M are equal to 1 and all nondiagonal entries of M^\dagger M are equal to zero, we have M^\dagger M = I and thus the matrix M is unitary.

Since we assumed M was a length-preserving matrix we have thus shown that all length-preserving matrices are unitary.
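
To convince myself that the probe \left|e_j\right> + i\left|e_k\right> is really needed, I found a numerical illustration helpful (my own construction, assuming NumPy). If M^\dagger M has ones on the diagonal and purely imaginary off-diagonal entries, then (M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0 even though M^\dagger M \ne I, so M preserves the lengths of all the real probe vectors used above; only a complex probe exposes it as non-length-preserving:

import numpy as np

# G plays the role of M^dagger M: ones on the diagonal, purely imaginary
# off-diagonal entries, so G_jk + G_kj = 0 without G being the identity.
G = np.array([[1.0, 0.5j], [-0.5j, 1.0]])

# Build M as the Hermitian square root of G; then M^dagger M = G, and M
# is not length-preserving.
w, V = np.linalg.eigh(G)
M = V @ np.diag(np.sqrt(w)) @ V.conj().T

e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])

# The real probe vectors all keep their lengths under M...
for psi in (e0, e1, e0 + e1, e0 - e1):
    print(np.linalg.norm(M @ psi), np.linalg.norm(psi))

# ...but the complex probe |e_0> + i|e_1> does not.
psi = e0 + 1j * e1
print(np.linalg.norm(M @ psi), np.linalg.norm(psi))  # approx. 1.0 vs 1.414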

Linear Algebra and Its Applications, Exercise 3.4.28

Exercise 3.4.28. Given the plane x_1 + x_2 + x_3 = 0 and the following vectors

\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} \qquad \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \qquad \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}

in the plane, find an orthonormal basis for the subspace represented by the plane. Report the dimension of the subspace and the number of nonzero vectors produced by Gram-Schmidt orthogonalization.

Answer: We start with the vector a_1 = (1, -1, 0) and normalize it to create q_1:

\|a_1\|^2 = 1^2 + (-1)^2 + 0^2 = 1 + 1 = 2

q_1 = a_1/\|a_1\| = \frac{1}{\sqrt{2}} a_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix}

We then take the second vector a_2 = (0, 1, -1) and create a second orthogonal vector a_2' by subtracting from a_2 its projection on q_1:

a_2' = a_2 - (q_1^Ta_2)q_1

= a_2 - \left[ \frac{1}{\sqrt{2}} \cdot 0 + (-\frac{1}{\sqrt{2}}) \cdot 1 + 0 \cdot (-1) \right]q_1 = a_2 - (-\frac{1}{\sqrt{2}})q_1 = a_2 + \frac{1}{\sqrt{2}}q_1

= \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + \frac{1}{\sqrt{2}} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \end{bmatrix}

We then normalize a_2' to create q_2:

\|a_2'\|^2 = (\frac{1}{2})^2 + (\frac{1}{2})^2 + (-1)^2 = \frac{1}{4} + \frac{1}{4} + 1 = \frac{3}{2}

q_2 = a_2'/\|a_2'\| = a_2'/\sqrt{\frac{3}{2}} = \frac{\sqrt{2}}{\sqrt{3}} \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{2}}{2\sqrt{3}} \\ \frac{\sqrt{2}}{2\sqrt{3}} \\ -\frac{\sqrt{2}}{\sqrt{3}} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \end{bmatrix}

Finally, we take the third vector a_3 = (1, 0, -1) and attempt to create another orthogonal vector a_3' by subtracting from a_3 its projections on q_1 and q_2:

a_3' = a_3 - (q_1^Ta_3)q_1 - (q_2^Ta_3)q_2

= a_3 - \left[ \frac{1}{\sqrt{2}} \cdot 1 + (-\frac{1}{\sqrt{2}}) \cdot 0 + 0 \cdot (-1) \right]q_1- \left[ \frac{1}{\sqrt{6}} \cdot 1 + \frac{1}{\sqrt{6}} \cdot 0 + (-\frac{2}{\sqrt{6}}) \cdot (-1) \right] q_2

= a_3 - \frac{1}{\sqrt{2}}q_1 - \frac{3}{\sqrt{6}}q_2 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} - \frac{1}{\sqrt{2}} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} - \frac{3}{\sqrt{6}} \begin{bmatrix} \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \end{bmatrix}

= \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} - \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix} - \begin{bmatrix} \frac{3}{6} \\ \frac{3}{6} \\ -\frac{6}{6} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} - \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \end{bmatrix} - \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

Since a_3' = 0 we cannot create a third vector orthogonal to q_1 and q_2. The vectors

q_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \end{bmatrix} \qquad q_2 = \begin{bmatrix} \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \\ -\frac{2}{\sqrt{6}} \end{bmatrix}

are an orthonormal basis for the subspace, and the dimension of the subspace is 2.

(In hindsight we could have predicted this result by inspecting the original vectors a_1, a_2, and a_3 and noticing that a_3 = a_1 + a_2. Thus only a_1 and a_2 were linearly independent, a_3 being linearly dependent on the first two vectors, so that only two orthonormal basis vectors could be created from the three vectors given.)
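
The calculation can be checked with a small Gram-Schmidt routine (my own sketch, assuming NumPy; the function name and tolerance are arbitrary):

import numpy as np

def gram_schmidt(vectors, tol=1e-10):
    """Return the orthonormal vectors produced by Gram-Schmidt, skipping zero remainders."""
    basis = []
    for v in vectors:
        # Subtract the projections onto the vectors found so far.
        w = v - sum((q @ v) * q for q in basis)
        if np.linalg.norm(w) > tol:
            basis.append(w / np.linalg.norm(w))
    return basis

a1 = np.array([1.0, -1.0, 0.0])
a2 = np.array([0.0, 1.0, -1.0])
a3 = np.array([1.0, 0.0, -1.0])

qs = gram_schmidt([a1, a2, a3])
print(len(qs))  # 2: a3 = a1 + a2 contributes no new direction
for q in qs:
    print(q)    # matches q_1 and q_2 above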

NOTE: This continues a series of posts containing worked out exercises from the (out of print) book Linear Algebra and Its Applications, Third Edition by Gilbert Strang.

If you find these posts useful I encourage you to also check out the more current Linear Algebra and Its Applications, Fourth Edition, Dr Strang’s introductory textbook Introduction to Linear Algebra, Fifth Edition and the accompanying free online course, and Dr Strang’s other books.

Linear Algebra and Its Applications, Exercise 3.4.27

Exercise 3.4.27. Given the subspace spanned by the three vectors

a_1 = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \end{bmatrix} \qquad a_2 = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} \qquad a_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix}

find vectors q_1, q_2, and q_3 that form an orthonormal basis for the subspace.

Answer: We can save some time by noting that a_1 and a_3 are already orthogonal. We can normalize these two vectors to create q_1 and q_3:

\|a_1\|^2 = 1^2 + (-1)^2 + 0^2 + 0^2 = 1 + 1 = 2

q_1 = a_1/\|a_1\| = \frac{1}{\sqrt{2}} a_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{bmatrix}

\|a_3\|^2 = 0^2 + 0^2 + 1^2 + (-1)^2 = 1 + 1 = 2

q_3 = a_3/\|a_3\| = \frac{1}{\sqrt{2}} a_3 = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix}

We can then compute a third orthogonal vector a_2' by subtracting from a_2 its projections on q_1 and q_3:

a_2' = a_2 - (q_1^Ta_2)q_1 - (q_3^Ta_2)q_3

= a_2 - \left[ \frac{1}{\sqrt{2}} \cdot 0 + (-\frac{1}{\sqrt{2}}) \cdot 1 + 0 \cdot (-1) + 0 \cdot 0 \right]q_1 - \left[ 0 \cdot 0 + 0 \cdot 1 + \frac{1}{\sqrt{2}} \cdot (-1) +  (-\frac{1}{\sqrt{2}}) \cdot 0 \right]q_3

= a_2 - (-\frac{1}{\sqrt{2}})q_1 - (-\frac{1}{\sqrt{2}})q_3 = a_2 + \frac{1}{\sqrt{2}}q_1 + \frac{1}{\sqrt{2}}q_3

= \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} + \frac{1}{\sqrt{2}} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{bmatrix} + \frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 0 \end{bmatrix} + \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \frac{1}{2} \\ -\frac{1}{2} \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}

Finally, we normalize a_2' to create q_2:

\|a_2'\|^2 = (\frac{1}{2})^2 + (\frac{1}{2})^2 + (-\frac{1}{2})^2 + (-\frac{1}{2})^2 = \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = 1

q_2 = a_2'/\|a_2'\| = a_2' = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}

An orthonormal basis for the space is therefore

q_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{bmatrix} \qquad q_2 = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix} \qquad q_3 = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix}

(It’s worth noting that the solution for this exercise on page 480 is different from the solution given above. That’s presumably because we computed the orthonormal vectors in the order q_1, q_3, q_2 rather than the standard order q_1, q_2, q_3, taking advantage of the fact that the original vectors a_1 and a_3 were already orthogonal. Recall that a basis for a subspace is not unique, so it is possible to have different orthonormal bases for the same subspace.)
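
As a check (my addition, assuming NumPy), we can verify that the columns are orthonormal and span the same subspace as the original vectors:

import numpy as np

q1 = np.array([1.0, -1.0, 0.0, 0.0]) / np.sqrt(2)
q2 = np.array([1.0, 1.0, -1.0, -1.0]) / 2
q3 = np.array([0.0, 0.0, 1.0, -1.0]) / np.sqrt(2)
Q = np.column_stack([q1, q2, q3])

# Orthonormal columns: Q^T Q should be the 3x3 identity.
assert np.allclose(Q.T @ Q, np.eye(3))

# Each a_i lies in the span of the basis: projecting onto it changes nothing.
A = np.column_stack([[1.0, -1.0, 0.0, 0.0],
                     [0.0, 1.0, -1.0, 0.0],
                     [0.0, 0.0, 1.0, -1.0]])
assert np.allclose(Q @ (Q.T @ A), A)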

NOTE: This continues a series of posts containing worked out exercises from the (out of print) book Linear Algebra and Its Applications, Third Edition by Gilbert Strang.

If you find these posts useful I encourage you to also check out the more current Linear Algebra and Its Applications, Fourth Edition, Dr Strang’s introductory textbook Introduction to Linear Algebra, Fifth Edition and the accompanying free online course, and Dr Strang’s other books.
