While working through the answers to exercise 2.6.14 in Gilbert Strang’s Linear Algebra and Its Applications, Third Edition, I noticed one of the downsides of the book: while Strang’s focus on practical applications is usually welcome, his desire to avoid abstract concepts and arguments sometimes leads him to hand-wave his way through important points, leaving the reader somewhat confused. At least, I was confused by his discussion of rule 2V on page 123, in which he doesn’t provide much background (let alone a real proof) for why the composition of two linear transformations should itself be a linear transformation.
As I’ve done before in a couple of cases, I thought it was worth stopping and reviewing the basic definition and consequent properties of linear transformations, ignoring the connection with matrices and focusing just on the abstract concept.
1) Definition of a linear transformation. First, a linear transformation is a function from one vector space to another vector space (which may be itself). So if we have two vector spaces $V$ and $W$, a linear transformation $T$ takes a vector $v$ in $V$ and produces a vector $w$ in $W$. In other words $w = T(v)$ using function notation. (For clarity I’ll continue to use function notation for the rest of this post.)
What makes a linear transformation linear is that it has the property that
$$T(a v_1 + b v_2) = a T(v_1) + b T(v_2)$$
for any $v_1$ and $v_2$ in $V$ and any scalars $a$ and $b$ that could be used to multiply vectors in $V$ and $W$.
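As a quick illustration of this property, here is a minimal sketch in Python. The maps `T` and `shift` and the helper `respects_combination` are hypothetical examples chosen for this check, with vectors in the plane represented as tuples; they are not taken from Strang's text. A single passing check does not prove that a map is linear, but a single failing check is enough to show that it is not.

```python
from fractions import Fraction

def add(u, v):
    """Componentwise sum of two vectors represented as tuples."""
    return tuple(x + y for x, y in zip(u, v))

def scale(c, v):
    """Multiply every component of the vector v by the scalar c."""
    return tuple(c * x for x in v)

def T(v):
    """A hypothetical linear map from R^2 to R^2."""
    x, y = v
    return (2 * x + y, x - 3 * y)

def shift(v):
    """A hypothetical map that is NOT linear (it translates by (1, 0))."""
    x, y = v
    return (x + 1, y)

def respects_combination(f, v1, v2, a, b):
    """Check f(a*v1 + b*v2) == a*f(v1) + b*f(v2) for one choice of inputs."""
    lhs = f(add(scale(a, v1), scale(b, v2)))
    rhs = add(scale(a, f(v1)), scale(b, f(v2)))
    return lhs == rhs

v1, v2 = (1, 2), (-3, 5)
a, b = Fraction(1, 2), 4  # exact arithmetic avoids floating-point noise

print(respects_combination(T, v1, v2, a, b))      # True
print(respects_combination(shift, v1, v2, a, b))  # False
```

The translation fails because it moves the zero vector, which no linear transformation can do.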
2) Alternate definition of a linear transformation. Note that the property above is often expressed instead in the form of two simpler properties:
$$T(v_1 + v_2) = T(v_1) + T(v_2)$$
$$T(a v_1) = a T(v_1)$$
for any $v_1$ and $v_2$ in $V$ and any scalar $a$ that could be used to multiply vectors in $V$ and $W$.
This alternate definition is equivalent to the definition in (1) above, as shown by the following argument:
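Suppose we have $T(a v_1 + b v_2)$. Since $v_1$ and $v_2$ are vectors in $V$ and $a$ and $b$ are scalars, by the definition of a vector space we know that $a v_1$ and $b v_2$ are also vectors in $V$. (Vector spaces are closed under scalar multiplication.) By the alternate definition we thus have $T(a v_1 + b v_2) = T(a v_1) + T(b v_2)$. By the same definition we also have $T(a v_1) = a T(v_1)$ and $T(b v_2) = b T(v_2)$ so that $T(a v_1) + T(b v_2) = a T(v_1) + b T(v_2)$. Combining the equations we see that $T(a v_1 + b v_2) = a T(v_1) + b T(v_2)$.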
Note also that the original property reduces to $T(v_1 + v_2) = T(v_1) + T(v_2)$ if $a = b = 1$ and reduces to $T(a v_1) = a T(v_1)$ if $b = 0$.
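3) Applying a linear transformation to an arbitrary linear combination of vectors. Suppose we have a linear transformation $T$ from $V$ to $W$, an arbitrary set of vectors $v_1$, $v_2$, through $v_n$ in $V$ and an arbitrary set of scalars $a_1$, $a_2$, through $a_n$. Then we have
$$T(a_1 v_1 + a_2 v_2 + \cdots + a_n v_n) = a_1 T(v_1) + a_2 T(v_2) + \cdots + a_n T(v_n)$$
This is easily proved using induction: First, for $n = 2$ from the definition in (1) above we have
$$T(a_1 v_1 + a_2 v_2) = a_1 T(v_1) + a_2 T(v_2)$$
Now suppose for some $k \ge 2$ we have
$$T(a_1 v_1 + \cdots + a_k v_k) = a_1 T(v_1) + \cdots + a_k T(v_k)$$
Then for $n = k+1$ we have
$$T(a_1 v_1 + \cdots + a_k v_k + a_{k+1} v_{k+1}) = T(1 \cdot (a_1 v_1 + \cdots + a_k v_k) + a_{k+1} v_{k+1})$$
$$= T(a_1 v_1 + \cdots + a_k v_k) + a_{k+1} T(v_{k+1}) = a_1 T(v_1) + \cdots + a_k T(v_k) + a_{k+1} T(v_{k+1})$$
where the second step applies the definition in (1) to the two vectors $a_1 v_1 + \cdots + a_k v_k$ and $v_{k+1}$, and the last step uses the assumption for $k$. Since the proposition is true for $n = 2$ and is also true for $n = k+1$ whenever it is true for $n = k$, for any $k \ge 2$, it is true for all $n \ge 2$.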
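To see the result for a linear combination of more than two vectors in a setting that does not start from a matrix, here is a small sketch that treats cubic polynomials as vectors and differentiation as the linear transformation. The coefficient-tuple representation, the helper names (`combine`, `D`), and the sample data are all assumptions made for this illustration.

```python
from fractions import Fraction
from functools import reduce

def add(u, v):
    """Componentwise sum of two vectors represented as tuples."""
    return tuple(x + y for x, y in zip(u, v))

def scale(c, v):
    """Multiply every component of the vector v by the scalar c."""
    return tuple(c * x for x in v)

def combine(scalars, vectors):
    """Return the linear combination a_1 v_1 + ... + a_n v_n."""
    terms = [scale(a, v) for a, v in zip(scalars, vectors)]
    return reduce(add, terms)

def D(p):
    """Differentiate c0 + c1 x + c2 x^2 + c3 x^3, stored as (c0, c1, c2, c3).

    The result is the quadratic c1 + 2 c2 x + 3 c3 x^2, stored as
    (c1, 2*c2, 3*c3).
    """
    return tuple(k * c for k, c in enumerate(p))[1:]

vectors = [(1, 0, 2, 0), (0, 3, 0, -1), (4, 1, 1, 1)]
scalars = [2, Fraction(-1, 3), 5]

lhs = D(combine(scalars, vectors))               # transform the combination
rhs = combine(scalars, [D(p) for p in vectors])  # combine the transforms

print(lhs == rhs)  # True
```

Both sides come out to the quadratic $4 + 18x + 16x^2$ for this particular choice of polynomials and scalars.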
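4) The composition of two linear transformations. Suppose $S$ is a linear transformation from a vector space $U$ to a vector space $V$ and $T$ is a linear transformation from a vector space $V$ to $W$. We define their composition $T \circ S$ to be $(T \circ S)(u) = T(S(u))$ for all $u$ in $U$; the result $(T \circ S)(u)$ is a vector in $W$.
We can show that $T \circ S$ is a linear transformation as follows: Given $u_1$ and $u_2$ in $U$ and scalars $a$ and $b$ we have
$$(T \circ S)(a u_1 + b u_2) = T(S(a u_1 + b u_2)) = T(a S(u_1) + b S(u_2))$$
since $S$ is a linear transformation and
$$T(a S(u_1) + b S(u_2)) = a T(S(u_1)) + b T(S(u_2))$$
since $T$ is a linear transformation.
Since
$$(T \circ S)(a u_1 + b u_2) = a T(S(u_1)) + b T(S(u_2)) = a (T \circ S)(u_1) + b (T \circ S)(u_2)$$
we see that $T \circ S$ is a linear transformation as well.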
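The argument can be spot-checked numerically. The sketch below composes two hypothetical linear maps, `S` from $R^3$ to $R^2$ and `T` from $R^2$ to $R^2$, and confirms the defining property for the composition on one choice of vectors and scalars. The maps, the `compose` helper, and the sample values are assumptions chosen for illustration; the check complements the proof rather than replacing it.

```python
def add(u, v):
    """Componentwise sum of two vectors represented as tuples."""
    return tuple(x + y for x, y in zip(u, v))

def scale(c, v):
    """Multiply every component of the vector v by the scalar c."""
    return tuple(c * x for x in v)

def S(u):
    """A hypothetical linear map from U = R^3 to V = R^2."""
    x, y, z = u
    return (x + 2 * z, y - z)

def T(v):
    """A hypothetical linear map from V = R^2 to W = R^2."""
    x, y = v
    return (3 * x - y, x + y)

def compose(outer, inner):
    """Return the function u -> outer(inner(u))."""
    return lambda u: outer(inner(u))

TS = compose(T, S)  # apply S first, then T

u1, u2 = (1, 0, 2), (-1, 4, 3)
a, b = 3, -2

lhs = TS(add(scale(a, u1), scale(b, u2)))
rhs = add(scale(a, TS(u1)), scale(b, TS(u2)))

print(lhs, rhs)    # (23, -3) (23, -3)
print(lhs == rhs)  # True
```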
Finally, if we have a third linear transformation $R$ from a vector space $W$ to $X$ then the result of applying $T \circ S$ and then $R$ to form the composition $R \circ (T \circ S)$ is the same as applying $S$ then $R \circ T$ to form the composition $(R \circ T) \circ S$. (In other words, composition of linear transformations is associative.) For the proof of this see the answers to exercise 2.6.14.
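Associativity can also be observed directly in code. In the sketch below the three maps `S`, `T`, and `R` are hypothetical examples, and the two ways of grouping the composition are compared on a few sample vectors. The agreement here comes from how function composition works in general, so the check illustrates the statement but says nothing specific to linearity.

```python
def compose(outer, inner):
    """Return the function u -> outer(inner(u))."""
    return lambda u: outer(inner(u))

def S(u):
    """A hypothetical linear map from U = R^3 to V = R^2."""
    x, y, z = u
    return (x + 2 * z, y - z)

def T(v):
    """A hypothetical linear map from V = R^2 to W = R^2."""
    x, y = v
    return (3 * x - y, x + y)

def R(w):
    """A hypothetical linear map from W = R^2 to X = R^3."""
    x, y = w
    return (x - y, 2 * x, y)

left = compose(R, compose(T, S))   # R applied after (T after S)
right = compose(compose(R, T), S)  # (R after T) applied after S

samples = [(1, 0, 2), (-1, 4, 3), (0, 0, 0)]
print(all(left(u) == right(u) for u in samples))  # True
print(left((1, 0, 2)))                            # (14, 34, 3)
```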
NOTE: This continues a series of posts containing worked out exercises from the (out of print) book Linear Algebra and Its Applications, Third Edition by Gilbert Strang.
If you find these posts useful I encourage you to also check out the more current Linear Algebra and Its Applications, Fourth Edition, Dr Strang’s introductory textbook Introduction to Linear Algebra, Fourth Edition and the accompanying free online course, and Dr Strang’s other books.