All length-preserving matrices are unitary

I recently read the (excellent) online resource Quantum Computing for the Very Curious by Andy Matuschak and Michael Nielsen. Upon reading the proof that all length-preserving matrices are unitary and trying it out myself, I came to believe that there is an error in the proof as written, specifically with trying to show that off-diagonal entries in $M^\dagger M$ are zero if $M$ is length-preserving.

Using the identity $|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$ , a suitable choice of $\left|\psi\right> = \left|e_j\right> + \left|e_k\right>$ with $j \ne k$ , and the fact that $M$ is length-preserving, Nielsen first shows that $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$ for $j \ne k$ .

He then goes on to write “But what if we’d done something slightly different, and instead of using $\left|\psi\right> = \left|e_j\right> + \left|e_k\right>$ we’d used $\left|\psi\right> = \left|e_j\right> - \left|e_k\right>$ ? … I won’t explicitly go through the steps – you can do that yourself – but if you do go through them you end up with the equation: $(M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0$ .”

I was an undergraduate physics and math major, but either I never worked with bra-ket notation and Hermitian conjugates or I’ve forgotten whatever I knew about them. In any case in working through this I could not get the same result as Nielsen; I simply ended up once again proving that $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$ .

After some thought and experimentation I concluded that the key is to choose $\left|\psi\right> = \left|e_j\right> + i\left|e_k\right>$ . Below is my (possibly mistaken!) attempt at a correct proof that all length-preserving matrices are unitary.

Proof: Let $M$ be a length-preserving matrix such that for any vector $\left|\psi\right>$ we have $|| M \left|\psi\right> || = || \left|\psi\right> ||$ . We wish to show that $M$ is unitary, i.e., $M^\dagger M = I$ .

We first show that the diagonal elements of $M^\dagger M$ , or $(M^\dagger M)_{jj}$ , are equal to 1.

To do this we start with the unit vectors $\left|e_j\right>$ and $\left|e_k\right>$ with 1 in positions $j$ and $k$ respectively, and 0 otherwise. The product $M^\dagger M \left|e_k\right>$ is then the $k$ th column of $M^\dagger M$ , and $\left<e_j\right| M^\dagger M \left|e_k\right>$ is the $jk$ th entry of $M^\dagger M$ or $(M^\dagger M)_{jk}$ .

From the general identity $\left<\psi\right| M^\dagger M \left|\psi\right> = || M \left|\psi\right> ||^2$ we also have $\left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2$ . But since $M$ is length-preserving we have $|| M \left|e_j\right> ||^2 = || \left|e_j\right> ||^2 = 1^2 = 1$ since $\left|e_j\right>$ is a unit vector.

We thus have $(M^\dagger M)_{jj} = \left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2 = 1$ . So all diagonal entries of $M^\dagger M$ are 1.

We next show that the non-diagonal elements of $M^\dagger M$ , or $(M^\dagger M)_{jk}$ with $j \ne k$ , are equal to zero.

Let $\left|\psi\right> = \left|e_j\right> + \left|e_k\right>$ with $j \ne k$ . Since $M$ is length-preserving we have

$|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + \left|e_k\right> ||^2 = 1^2 + 1^2 = 2$

We also have $|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$ where $\left<\psi\right| = \left|\psi\right>^\dagger = (\left|e_j\right> + \left|e_k\right>)^\dagger$ . From the definition of the dagger operation and the fact that the nonzero entries of $\left|e_j\right>$ and $\left|e_k\right>$ have no imaginary parts we have $(\left|e_j\right> + \left|e_k\right>)^\dagger = \left<e_j\right| + \left<e_k\right|$ .

We then have

$|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$

$= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>$

$= (\left|e_j\right> + \left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + \left|e_k\right>)$

$= (\left<e_j\right| + \left<e_k\right|) M^\dagger M (\left|e_j\right> + \left|e_k\right>)$

$= (M^\dagger M)_{jj} + (M^\dagger M)_{jk} + (M^\dagger M)_{kj} + (M^\dagger M)_{kk}$

$= 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj}$

since we previously showed that all diagonal entries of $M^\dagger M$ are 1.

Since $|| M \left|\psi\right> ||^2 = 2$ and also $|| M \left|\psi\right> ||^2 = 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj}$ we thus have $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$ for $j \ne k$ .

Now let $\left|\psi\right> = \left|e_j\right> + i\left|e_k\right>$ with $j \ne k$ . Again we have $|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2$ since $M$ is length-preserving, so that

$|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + i\left|e_k\right> ||^2$

$= (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)$

Since $i\left|e_k\right>$ has an imaginary part for its (single) nonzero entry, in performing the dagger operation and taking complex conjugates we obtain $(\left|e_j\right> + i\left|e_k\right>)^\dagger = \left<e_j\right| - i\left<e_k\right|$ . We thus have

$|| M \left|\psi\right> ||^2 = (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)$

$= (\left<e_j\right| - i\left<e_k\right|)(\left|e_j\right> + i\left|e_k\right>)$

$= \left<e_j|e_j\right> + i\left<e_j|e_k\right> - i \left<e_k|e_j\right> - i^2\left<e_k|e_k\right>$

$= \left<e_j|e_j\right> + i\left<e_j|e_k\right> - i\left<e_k|e_j\right> + \left<e_k|e_k\right>$

$= 1^2 + i\cdot 0 - i\cdot 0 + 1^2 = 2$

We also have

$|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>$

$= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>$

$= (\left|e_j\right> + i\left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + i\left|e_k\right>)$

$= (\left<e_j\right| - i\left<e_k\right|) M^\dagger M (\left|e_j\right> + i\left|e_k\right>)$

$= (M^\dagger M)_{jj} + i(M^\dagger M)_{jk} - i(M^\dagger M)_{kj} + (M^\dagger M)_{kk}$

$= 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)$

Since $|| M \left|\psi\right> ||^2 = 2$ we have $2 = 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)$ or $0 = i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)$ so that $(M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0$ .

But we showed above that $(M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0$ . Adding the two equations the terms for $(M^\dagger M)_{kj}$ cancel out and we get $(M^\dagger M)_{jk} = 0$ for $j \ne k$ . So all nondiagonal entries of $M^\dagger M$ are equal to zero.

Since all diagonal entries of $M^\dagger M$ are equal to 1 and all nondiagonal entries of $M^\dagger M$ are equal to zero, we have $M^\dagger M = I$ and thus the matrix $M$ is unitary.

Since we assumed $M$ was a length-preserving matrix we have thus shown that all length-preserving matrices are unitary.

4 Responses to All length-preserving matrices are unitary

Alper Paksoy says:

March 1, 2020 at 12:14 pm

The notation and the terms in this proof are somewhat different from what we have been using in linear algebra courses I have been taking (we call the “dagger”, for example, the complex conjugate or the Hermitian form of a matrix) and sorry that I did not go through all of your proof but it inspired me a great deal in completing my own proof: Here is the one I came up with: https://github.com/apaksoy/alaff/blob/master/Theorem%202_2_4_4.pdf

I have benefited quite a bit from studying Strang’s “Introduction to Linear Algebra, 5E” book when I was taking the introductory linear algebra course in addition to Axler’s “Linear Algebra Done Right, 3E”. I think both are great texts but in quite different ways. Thx!

hecker says:

March 1, 2020 at 5:25 pm

Thanks for your comment! Yes, the bra-ket notation can be somewhat difficult to understand if you’re not used to it, but it seems to be the prevalent notation for those working with complex matrices in the context of quantum mechanics. I looked at your proof and noted that the key steps are the same as in mine: you choose x = e_j + e_k for one step in the proof and then x = e_j + ie_k for a second step in the proof. That’s the secret to getting two equations that you can add together in order to cancel one of the terms and equate the remaining term to zero.

Alper Paksoy says:

March 14, 2020 at 1:54 pm

One of the homework in the linear algebra course I am taking implies there is a much shorter proof for this theorem if you are allowed to use singular value decomposition. See Homework 2.5.1.8. at the following link if you would like to take a look: https://github.com/apaksoy/alaff/blob/master/HW%202_5_1.pdf

- hecker says:
  
  March 14, 2020 at 6:51 pm
  
  Thanks for pointing readers to this!