In working out the answers to exercise 2.6.14 in Gilbert Strang’s Linear Algebra and Its Applications, Third Edition, I noticed one of the downsides of the book: while Strang’s focus on practical applications is usually welcome, sometimes in his desire to avoid abstract concepts and arguments he hand-waves his way through important points and leaves the reader somewhat confused. At least, I was confused by his discussion of rule 2V on page 123, in which he doesn’t really provide much background (let alone a real proof) for why the composition of two linear transformations should itself be a linear transformation.

As I’ve done before in a couple of cases, I thought it was worth stopping and reviewing the basic definition and consequent properties of linear transformations, ignoring the connection with matrices and focusing just on the abstract concept.

1) Definition of a linear transformation. First, a linear transformation is a function from one vector space to another vector space (which may be itself). So if we have two vector spaces $V$ and $W$, a linear transformation $A$ takes a vector in $V$ and produces a vector in $W$. In other words, $A : V \to W$ and $A(x)$ is a vector in $W$ for every $x$ in $V$, using function notation. (For clarity I’ll continue to use function notation for the rest of this post.)

What makes a linear transformation linear is that it has the property that

$$A(cx + dy) = cA(x) + dA(y)$$

for any $x$ and $y$ in $V$ and any scalars $c$ and $d$ that could be used to multiply vectors in $V$ and $W$.
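As a quick sanity check, the property can be spot-checked numerically. This is only a sketch: the map `A` below (from R^2 to R^3), the helpers `add` and `scale`, and the sample values are all made up for illustration, and agreement on a few inputs does not prove linearity in general.

```python
from fractions import Fraction

def add(u, v):
    """Componentwise sum of two vectors represented as tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Multiply every component of the vector v by the scalar c."""
    return tuple(c * a for a in v)

def A(v):
    """Hypothetical linear map from R^2 to R^3 (illustration only)."""
    x1, x2 = v
    return (2 * x1 + x2, x1 - 3 * x2, 4 * x2)

x, y = (1, 2), (-3, 5)
c, d = Fraction(1, 2), 7

lhs = A(add(scale(c, x), scale(d, y)))     # A(cx + dy)
rhs = add(scale(c, A(x)), scale(d, A(y)))  # cA(x) + dA(y)
print(lhs == rhs)  # True
```

Exact `Fraction` arithmetic is used so that the comparison is not affected by floating-point rounding.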

2) Alternate definition of a linear transformation. Note that the property above is often expressed instead in the form of two simpler properties:

$$A(x + y) = A(x) + A(y)$$

$$A(cx) = cA(x)$$

for any $x$ and $y$ in $V$ and any scalar $c$ that could be used to multiply vectors in $V$ and $W$.

This alternate definition is equivalent to the definition in (1) above, as shown by the following argument:

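Suppose we have $A(cx + dy)$. Since $x$ and $y$ are vectors in $V$ and $c$ and $d$ are scalars, by the definition of a vector space we know that $cx$ and $dy$ are also vectors in $V$. (Vector spaces are closed under scalar multiplication.) By the alternate definition we thus have $A(cx + dy) = A(cx) + A(dy)$. By the same definition we also have $A(cx) = cA(x)$ and $A(dy) = dA(y)$ so that $A(cx) + A(dy) = cA(x) + dA(y)$. Combining the equations we see that $A(cx + dy) = cA(x) + dA(y)$.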

Note also that the original property reduces to $A(x + y) = A(x) + A(y)$ if $c = d = 1$ and reduces to $A(cx) = cA(x)$ if $d = 0$.
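To make the two simpler properties concrete, here is a small sketch that tests them separately on one linear map and one non-linear map. The functions `rotate90` and `shift` and the checkers `is_additive` and `is_homogeneous` are hypothetical names chosen for this example; a passing check on sample values is evidence, not a proof, while a single failing check is enough to show a map is not linear.

```python
def add(u, v):
    """Componentwise sum of two vectors represented as tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Multiply every component of the vector v by the scalar c."""
    return tuple(c * a for a in v)

def is_additive(f, x, y):
    """Check f(x + y) == f(x) + f(y) for one pair of vectors."""
    return f(add(x, y)) == add(f(x), f(y))

def is_homogeneous(f, c, x):
    """Check f(cx) == c f(x) for one scalar and one vector."""
    return f(scale(c, x)) == scale(c, f(x))

def rotate90(v):
    """Linear: rotate a vector in R^2 by 90 degrees counterclockwise."""
    return (-v[1], v[0])

def shift(v):
    """Not linear: translate a vector in R^2 by (1, 1)."""
    return (v[0] + 1, v[1] + 1)

x, y, c = (1, 2), (3, -4), 5
print(is_additive(rotate90, x, y), is_homogeneous(rotate90, c, x))  # True True
print(is_additive(shift, x, y), is_homogeneous(shift, c, x))        # False False
```

The translation fails both properties, which is why a map of the form "multiply, then add a fixed nonzero vector" is not a linear transformation in this sense.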

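3) Applying a linear transformation to an arbitrary linear combination of vectors. Suppose we have a linear transformation $A$ from $V$ to $W$, an arbitrary set of vectors $v_1$, $v_2$, through $v_n$ in $V$ and an arbitrary set of scalars $c_1$, $c_2$, through $c_n$. Then we have

$$A\left(\sum_{i=1}^{n} c_i v_i\right) = \sum_{i=1}^{n} c_i A(v_i)$$

This is easily proved using induction: First, for $n = 2$ from the definition in (1) above we have

$$A(c_1 v_1 + c_2 v_2) = c_1 A(v_1) + c_2 A(v_2)$$

Now suppose for some $k \ge 2$ we have

$$A\left(\sum_{i=1}^{k} c_i v_i\right) = \sum_{i=1}^{k} c_i A(v_i)$$

Then for $n = k + 1$ we have

$$\begin{aligned}
A\left(\sum_{i=1}^{k+1} c_i v_i\right) &= A\left(1 \cdot \sum_{i=1}^{k} c_i v_i + c_{k+1} v_{k+1}\right) \\
&= 1 \cdot A\left(\sum_{i=1}^{k} c_i v_i\right) + c_{k+1} A(v_{k+1}) \\
&= \sum_{i=1}^{k} c_i A(v_i) + c_{k+1} A(v_{k+1}) \\
&= \sum_{i=1}^{k+1} c_i A(v_i)
\end{aligned}$$

Since the proposition is true for $n = 2$, and is true for $n = k + 1$ whenever it is true for $n = k$ with $k \ge 2$, it is true for all $n \ge 2$. (The case $n = 1$ is just the property $A(c_1 v_1) = c_1 A(v_1)$ from (2) above.)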
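The same statement can be spot-checked for a combination of four vectors. This is a sketch under assumptions: the map `A` (here from R^3 to R^2), the helper `linear_combination`, and the sample vectors and scalars are invented for illustration.

```python
def add(u, v):
    """Componentwise sum of two vectors represented as tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Multiply every component of the vector v by the scalar c."""
    return tuple(c * a for a in v)

def linear_combination(scalars, vectors):
    """Return c_1 v_1 + c_2 v_2 + ... + c_n v_n."""
    total = (0,) * len(vectors[0])
    for c, v in zip(scalars, vectors):
        total = add(total, scale(c, v))
    return total

def A(v):
    """Hypothetical linear map from R^3 to R^2 (illustration only)."""
    x1, x2, x3 = v
    return (x1 + 2 * x2, 3 * x3 - x1)

vectors = [(1, 0, 2), (0, 1, -1), (3, 3, 3), (-2, 5, 0)]
scalars = [2, -1, 4, 3]

lhs = A(linear_combination(scalars, vectors))               # A applied to the combination
rhs = linear_combination(scalars, [A(v) for v in vectors])  # combination of the images
print(lhs, rhs)  # (60, 43) (60, 43)
```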

4) The composition of two linear transformations. Suppose $A$ is a linear transformation from a vector space $V$ to a vector space $W$ and $B$ is a linear transformation from a vector space $U$ to $V$. We define their composition $AB$ to be $(AB)(u) = A(B(u))$ for all $u$ in $U$; the result is a vector in $W$.

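We can show that $AB$ is a linear transformation as follows: Given $u_1$ and $u_2$ in $U$ and scalars $c$ and $d$ we have

$$B(c u_1 + d u_2) = cB(u_1) + dB(u_2)$$

since $B$ is a linear transformation and

$$A(cB(u_1) + dB(u_2)) = cA(B(u_1)) + dA(B(u_2))$$

since $A$ is a linear transformation.

Since

$$\begin{aligned}
(AB)(c u_1 + d u_2) &= A(B(c u_1 + d u_2)) \\
&= A(cB(u_1) + dB(u_2)) \\
&= cA(B(u_1)) + dA(B(u_2)) \\
&= c(AB)(u_1) + d(AB)(u_2)
\end{aligned}$$

we see that $AB$ is a linear transformation as well.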
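The argument above can be illustrated with a short sketch. The two maps `A` and `B`, the helper `compose`, and the sample inputs are hypothetical choices for this example; the check confirms the linearity property for the composition on one input only and is not a substitute for the proof.

```python
def add(u, v):
    """Componentwise sum of two vectors represented as tuples."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Multiply every component of the vector v by the scalar c."""
    return tuple(c * a for a in v)

def B(u):
    """Hypothetical linear map from U = R^2 to V = R^3."""
    return (u[0] + u[1], 2 * u[0], -u[1])

def A(v):
    """Hypothetical linear map from V = R^3 to W = R^2."""
    return (v[0] - v[2], v[1] + 3 * v[2])

def compose(f, g):
    """Return the function u -> f(g(u)), that is, apply g first and then f."""
    return lambda u: f(g(u))

AB = compose(A, B)

u1, u2 = (1, 2), (-4, 3)
c, d = 3, -2

lhs = AB(add(scale(c, u1), scale(d, u2)))     # (AB)(c u1 + d u2)
rhs = add(scale(c, AB(u1)), scale(d, AB(u2)))  # c (AB)(u1) + d (AB)(u2)
print(lhs, rhs)  # (11, 22) (11, 22)
```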

Finally, if we have a third linear transformation $C$ from a vector space $T$ to $U$ then the result of applying $C$ and then $AB$ to form the composition $(AB)C$ is the same as applying $BC$ then $A$ to form the composition $A(BC)$. (In other words, composition of linear transformations is associative.) For the proof of this see the answers to exercise 2.6.14.
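Associativity can also be seen in a small sketch. The three maps `A`, `B`, and `C` below are hypothetical examples, and `compose` is a helper defined here; both groupings end up evaluating the same nested call, which is the essence of why they agree.

```python
def C(t):
    """Hypothetical linear map from T = R^2 to U = R^2."""
    return (t[0] - t[1], 2 * t[1])

def B(u):
    """Hypothetical linear map from U = R^2 to V = R^3."""
    return (u[0] + u[1], 2 * u[0], -u[1])

def A(v):
    """Hypothetical linear map from V = R^3 to W = R^2."""
    return (v[0] - v[2], v[1] + 3 * v[2])

def compose(f, g):
    """Return the function t -> f(g(t)), that is, apply g first and then f."""
    return lambda t: f(g(t))

AB_then_C = compose(compose(A, B), C)  # (AB)C: apply C, then AB
A_then_BC = compose(A, compose(B, C))  # A(BC): apply BC, then A

t = (3, 4)
print(AB_then_C(t), A_then_BC(t))  # (15, -26) (15, -26)
```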

NOTE: This continues a series of posts containing worked out exercises from the (out of print) book Linear Algebra and Its Applications, Third Edition by Gilbert Strang.

If you find these posts useful I encourage you to also check out the more current Linear Algebra and Its Applications, Fourth Edition, Dr Strang’s introductory textbook Introduction to Linear Algebra, Fourth Edition and the accompanying free online course, and Dr Strang’s other books.