All length-preserving matrices are unitary

I recently read the (excellent) online resource Quantum Computing for the Very Curious by Andy Matuschak and Michael Nielsen. Upon reading the proof that all length-preserving matrices are unitary and trying it out myself, I came to believe that there is an error in the proof as written, specifically with trying to show that off-diagonal entries in M^\dagger M are zero if M is length-preserving.

Using the identity || M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>, a suitable choice of \left|\psi\right> = \left|e_j\right> + \left|e_k\right> with j \ne k, and the fact that M is length-preserving, Nielsen first shows that (M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0 for j \ne k.

He then goes on to write “But what if we’d done something slightly different, and instead of using \left|\psi\right> = \left|e_j\right> + \left|e_k\right> we’d used \left|\psi\right> = \left|e_j\right> - \left|e_k\right>? … I won’t explicitly go through the steps – you can do that yourself – but if you do go through them you end up with the equation: (M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0.”

I was an undergraduate physics and math major, but either I never worked with bra-ket notation and Hermitian conjugates or I’ve forgotten whatever I knew about them. In any case in working through this I could not get the same result as Nielsen; I simply ended up once again proving that (M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0.

After some thought and experimentation I concluded that the key is to choose \left|\psi\right> = \left|e_j\right> + i\left|e_k\right>. Below is my (possibly mistaken!) attempt at a correct proof that all length-preserving matrices are unitary.

Proof: Let M be a length-preserving matrix such that for any vector \left|\psi\right> we have || M \left|\psi\right> || = || \left|\psi\right> ||. We wish to show that M is unitary, i.e., M^\dagger M = I.

We first show that the diagonal elements of M^\dagger M, or (M^\dagger M)_{jj}, are equal to 1.

To do this we start with the unit vectors \left|e_j\right> and \left|e_k\right> with 1 in positions j and k respectively, and 0 otherwise. The product M^\dagger M \left|e_k\right> is then the kth column of M^\dagger M, and \left<e_j\right| M^\dagger M \left|e_k\right> is the jkth entry of M^\dagger M or (M^\dagger M)_{jk}.

From the general identity \left<\psi\right| M^\dagger M \left|\psi\right> = || M \left|\psi\right> ||^2 we also have \left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2. But since M is length-preserving we have || M \left|e_j\right> ||^2 = || \left|e_j\right> ||^2 = 1^2 = 1 since \left|e_j\right> is a unit vector.

We thus have (M^\dagger M)_{jj} = \left<e_j\right| M^\dagger M \left|e_j\right> = || M \left|e_j\right> ||^2 =  1. So all diagonal entries of M^\dagger M are 1.

We next show that the non-diagonal elements of M^\dagger M, or (M^\dagger M)_{jk} with j \ne k, are equal to zero.

Let \left|\psi\right> = \left|e_j\right> + \left|e_k\right> with j \ne k. Since M is length-preserving we have

|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + \left|e_k\right> ||^2 = 1^2 + 1^2 = 2

We also have || M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right> where \left<\psi\right| = \left|\psi\right>^\dagger = (\left|e_j\right> + \left|e_k\right>)^\dagger. From the definition of the dagger operation and the fact that the nonzero entries of \left|e_j\right> and \left|e_k\right> have no imaginary parts we have (\left|e_j\right> + \left|e_k\right>)^\dagger = \left<e_j\right| + \left<e_k\right|.

We then have

|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>

= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>

= (\left|e_j\right> + \left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + \left|e_k\right>)

= (\left<e_j\right| + \left<e_k\right|) M^\dagger M (\left|e_j\right> + \left|e_k\right>)

= \left<e_j\right| M^\dagger M \left|e_j\right> + \left<e_j\right| M^\dagger M \left|e_k\right> + \left<e_k\right| M^\dagger M \left|e_j\right> + \left<e_k\right| M^\dagger M \left|e_k\right>

= (M^\dagger M)_{jj} + (M^\dagger M)_{jk} + (M^\dagger M)_{kj} + (M^\dagger M)_{kk}

= 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj}

since we previously showed that all diagonal entries of M^\dagger M are 1.

Since || M \left|\psi\right> ||^2 = 2 and also || M \left|\psi\right> ||^2 = 2 + (M^\dagger M)_{jk} + (M^\dagger M)_{kj} we thus have (M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0 for j \ne k.

Now let \left|\psi\right> = \left|e_j\right> + i\left|e_k\right> with j \ne k. Again we have || M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 since M is length-preserving, so that

|| M \left|\psi\right> ||^2 = || \left|\psi\right> ||^2 = || \left|e_j\right> + i\left|e_k\right> ||^2

= (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)

Since i\left|e_k\right> has an imaginary part for its (single) nonzero entry, in performing the dagger operation and taking complex conjugates we obtain (\left|e_j\right> + i\left|e_k\right>)^\dagger = \left<e_j\right| - i\left<e_k\right|. We thus have

|| M \left|\psi\right> ||^2 = (\left|e_j\right> + i\left|e_k\right>)^\dagger (\left|e_j\right> + i\left|e_k\right>)

= (\left<e_j\right| - i\left<e_k\right|)(\left|e_j\right> + i\left|e_k\right>)

= \left<e_j\right| \left|e_j\right> + \left<e_j\right| i \left|e_k\right> - i \left<e_k\right| \left|e_j\right> - i \left<e_k\right| i \left|e_k\right>

= \left<e_j|e_j\right> + i\left<e_j|e_k\right> - i \left<e_k|e_j\right> - i^2\left<e_k|e_k\right>

= \left<e_j|e_j\right> + i\left<e_j|e_k\right> - i\left<e_k|e_j\right> + \left<e_k|e_k\right>

= 1^2 + i\cdot 0 - i\cdot 0 + 1^2 = 2

We also have

|| M \left|\psi\right> ||^2 = \left<\psi\right| M^\dagger M \left|\psi\right>

= \left|\psi\right>^\dagger M^\dagger M \left|\psi\right>

= (\left|e_j\right> + i\left|e_k\right>)^\dagger M^\dagger M (\left|e_j\right> + i\left|e_k\right>)

= (\left<e_j\right| - i\left<e_k\right|) M^\dagger M (\left|e_j\right> + i\left|e_k\right>)

= \left<e_j\right| M^\dagger M \left|e_j\right> + \left<e_j\right| M^\dagger M i\left|e_k\right> - i\left<e_k\right| M^\dagger M \left|e_j\right> - i\left<e_k\right| M^\dagger M i\left|e_k\right>

= \left<e_j\right| M^\dagger M \left|e_j\right> + i\left<e_j\right| M^\dagger M i\left|e_k\right> - i\left<e_k\right| M^\dagger M \left|e_j\right> - i^2\left<e_k\right| M^\dagger M \left|e_k\right>

= (M^\dagger M)_{jj} + i(M^\dagger M)_{jk} - i(M^\dagger M)_{kj} + (M^\dagger M)_{kk}

= 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right)

Since || M \left|\psi\right> ||^2 = 2 we have 2 = 2 + i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right) or 0 = i\left((M^\dagger M)_{jk} - (M^\dagger M)_{kj}\right) so that (M^\dagger M)_{jk} - (M^\dagger M)_{kj} = 0.

But we showed above that (M^\dagger M)_{jk} + (M^\dagger M)_{kj} = 0. Adding the two equations the terms for (M^\dagger M)_{kj} cancel out and we get (M^\dagger M)_{jk} = 0 for j \ne k. So all nondiagonal entries of M^\dagger M are equal to zero.

Since all diagonal entries of M^\dagger M are equal to 1 and all nondiagonal entries of M^\dagger M are equal to zero, we have M^\dagger M = I and thus the matrix M is unitary.

Since we assumed M was a length-preserving matrix we have thus shown that all length-preserving matrices are unitary.

This entry was posted in Uncategorized. Bookmark the permalink.

4 Responses to All length-preserving matrices are unitary

  1. Alper Paksoy says:

    The notation and the terms in this proof are somewhat different from what we have been using in linear algebra courses I have been taking (we call the “dagger”, for example, the complex conjugate or the Hermitian form of a matrix) and sorry that I did not go through all of your proof but it inspired me a great deal in completing my own proof: Here is the one I came up with: https://github.com/apaksoy/alaff/blob/master/Theorem%202_2_4_4.pdf

    I have benefited quite a bit from studying Strang’s “Introduction to Linear Algebra, 5E” book when I was taking the introductory linear algebra course in addition to Axler’s “Linear Algebra Done Right, 3E”. I think both are great texts but in quite different ways. Thx!

  2. hecker says:

    Thanks for your comment! Yes, the bra-ket notation can be somewhat difficult to understand if you’re not used to it, but it seems to be the prevalent notation for those working with complex matrices in the context of quantum mechanics. I looked at your proof and noted that the key steps are the same as in mine: you choose x = e_j + e_k for one step in the proof and then x = e_j + ie_k for a second step in the proof. That’s the secret to getting two equations that you can add together in order to cancel one of the terms and equate the remaining term to zero.

  3. Alper Paksoy says:

    One of the homework in the linear algebra course I am taking implies there is a much shorter proof for this theorem if you are allowed to use singular value decomposition. See Homework 2.5.1.8. at the following link if you would like to take a look: https://github.com/apaksoy/alaff/blob/master/HW%202_5_1.pdf

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s