Multiplication as a transformation

The idea of a transformation can seem more complicated than it really is at first, so before diving into how $2 \times 2$ matrices transform two-dimensional space, or how $3 \times 3$ matrices transform three-dimensional space, let's go over how plain old numbers—a.k.a. $1 \times 1$ matrices—can be considered transformations of one-dimensional space.
One-dimensional space is simply the number line.
Number Line
What happens when you multiply every number on the line by a particular value, like two? One way to visualize this is as follows:
We keep a copy of the original line for reference, then slide each number on the line to two times that number.
Similarly, multiplication by $\dfrac{1}{2}$ could be visualized like this:
And, so the negative numbers don't feel neglected, here is multiplication by negative three:
For those of you fond of fancy terminology, these animated actions could be described as "linear transformations of one-dimensional space". The word transformation means the same thing as the word function: something which takes in a number and outputs a number, like $f(x) = 2x$. However, while we typically visualize functions with graphs, people tend to use the word transformation to indicate that you should instead visualize some object moving, stretching, squishing, etc. So, the function $f(x) = 2x$ visualized as a transformation gives us the multiplication-by-two video above. It moves the point one on the number line to where two starts off, moves two to where four starts off, etc.
The technical definition of a linear transform is a function $f(x)$ that satisfies two properties:
  • $f(x+y) = f(x) + f(y)$
  • $f(cx) = cf(x)$
In one dimension, $x$ and $y$ are numbers, as opposed to, say, vectors. In this special case of one dimension, the only functions satisfying these properties look like multiplication by a constant—i.e., $f(x) = kx$ for some constant $k$.
Why? Plug in $x = 1$ to the second property, and we see $f(c) = cf(1)$. If we interpret $c$ as a variable and $f(1)$ as some constant, the entire function is just multiplication by $f(1)$. For instance, if $f(1) = 8$, then $f(x) = 8x$. In fact, we don't even need the first property, since once we know $f$ looks like $f(x) = kx$, the fact that $f(x+y) = f(x) + f(y)$ follows from the distributive law: $k(x + y) = kx + ky$.
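This argument is easy to check numerically. Here is a small sketch in Python (the function names and the sample values are our own, not from the text) verifying that a function of the form $f(x) = kx$ satisfies both properties, and that the whole function is recovered from the single value $f(1)$:

```python
# A 1D linear transformation is multiplication by a constant k.
# Here we pick k = 8 to match the example in the text.
def f(x, k=8):
    return k * x

# The function is completely determined by where it sends the number one:
k_recovered = f(1)

# Check both defining properties on sample inputs:
x, y, c = 3.0, 5.0, 7.0
assert f(x + y) == f(x) + f(y)   # first property (additivity)
assert f(c * x) == c * f(x)      # second property (scaling)

# Knowing f(1) alone reproduces f everywhere:
assert all(f(t) == k_recovered * t for t in [-2.0, 0.0, 1.5])
```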
This might seem like an awfully complicated way to describe multiplication, especially given that the first property above is pointless. However, the importance of linearity comes when $f$ is a function of vectors. The fact that in one dimension $f$ is completely determined by where it takes the number one has a much more interesting analog in higher dimensions.
Before we move on to two-dimensional space, there's one simple but important fact we should keep in the back of our minds. Suppose you watch one of these transformations, knowing that it uses multiplication by some number but without knowing what that number is:
You can easily figure out which number is being multiplied into the line by following the number one. In this case, one lands where negative three started off, so you can tell that the animation represents multiplication by negative three.
Thinking of numbers as transformations gives an alternate interpretation of multiplication.
If we apply two transformations in a row, for instance multiplication by two, then by three,
the total action is the same as some other single transformation, in this case multiplication by six:
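As a quick numerical check (a sketch with helper names of our own invention), composing multiplication by two with multiplication by three acts on every point exactly like multiplication by six, and following the number one is enough to identify that single equivalent transformation:

```python
def times_two(x):
    return 2 * x

def times_three(x):
    return 3 * x

def composed(x):
    # Apply multiplication by two first, then multiplication by three.
    return times_three(times_two(x))

# Following the number one reveals the single equivalent transformation:
assert composed(1) == 6

# And the composition agrees with multiplication by six everywhere we test:
assert all(composed(x) == 6 * x for x in [-4, -1, 0, 2.5, 10])
```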
This gives a convoluted understanding of multiplication, which will be surprisingly useful as an analogy for matrix multiplication.
Person A: What is four times five?
Person B: Well, consider the unique linear transformation which takes one to four, as well as the unique linear transformation which takes one to five, then apply each of these transformations, one after the other, and four times five will be whatever number one lands on.
Person A: ... that's the stupidest thing I've ever heard.

What do linear transformations in two dimensions look like?

A two-dimensional linear transformation is a special kind of function which takes in a two-dimensional vector $\left[ \begin{array}{c} x \\ y \end{array} \right]$ and outputs another two-dimensional vector. As before, our use of the word transformation indicates we should think about smooshing something around, which in this case is two-dimensional space.
Here are some examples:
For our purposes, what makes a transformation linear is the following geometric rule: The origin must remain fixed, and all lines must remain lines. So, all the transformations in the above animation are examples of linear transformations, but the following are not:
As in one dimension, what makes a two-dimensional transformation linear is that it satisfies two properties:
  • $f(\textbf{v}+\textbf{w}) = f(\textbf{v})+f(\textbf{w})$
  • $f(c\textbf{v}) = cf(\textbf{v})$
Only now, $\textbf{v}$ and $\textbf{w}$ are vectors instead of numbers. While in one dimension the first property was useless, it now plays a more important role because, in some sense, it determines how the two different dimensions play together during a transformation.

Following specific vectors during a transformation

Imagine you are watching one particular transformation, like this one:
How could you describe this transformation to a friend who is not watching the same animation? You can no longer describe it using a single number, the way we could just follow the number one in the one-dimensional case. To help keep track of everything, let's put a green arrow over the vector $\greenD{\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]}$, a red arrow over the vector $\redD{\left[\begin{array}{c} 0 \\ 1 \end{array} \right]}$, and fix a copy of the grid in the background.
Now it's a lot easier to see where things land. Watch the animation again, and focus on the vector $\left[ \begin{array}{c} 1 \\ 1 \end{array} \right]$. We can more easily follow it to see that it lands on the vector $\left[\begin{array}{c} 4 \\ -2 \end{array} \right]$.
We can represent this fact with the following notation:
$$\left[ \begin{array}{c} 1 \\ 1 \end{array} \right] \rightarrow \left[ \begin{array}{c} 4 \\ -2 \end{array} \right]$$
Practice Problem: Where does the point at $\left[ \begin{array}{c} -1 \\ 0 \end{array}\right]$ end up after the plane has undergone the transformation in the above video?

Practice Problem: Even though it has gone off screen, can you predict where the point $\left[ \begin{array}{c} 3 \\ 0 \end{array}\right]$ has landed?

Notice, a vector like $\left[ \begin{array}{c} 2 \\ 0 \end{array} \right]$, which starts off as $2$ times the green arrow, continues to be $2$ times the green arrow after the transformation. Since the green arrow lands on $\greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]}$, we can deduce that
$$\left[ \begin{array}{c} 2 \\ 0 \end{array} \right] \rightarrow 2 \cdot \greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]} = \left[ \begin{array}{c} 2 \\ -4 \end{array} \right].$$
And in general
$$\left[ \begin{array}{c} x \\ 0 \end{array} \right] = x \cdot \greenD{\left[\begin{array}{c} 1 \\ 0 \end{array}\right]} \rightarrow x \cdot \greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]} = \left[ \begin{array}{c} x \\ -2x \end{array} \right]$$
Similarly, the destination of the entire $y$-axis is determined by where the red arrow $\redD{\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]}$ lands, which for this transformation is $\redD{\left[ \begin{array}{c} 3 \\ 0 \end{array} \right]}$.
Practice Problem: After the plane has undergone the transformation illustrated above, where does the general point $\left[ \begin{array}{c} 0 \\ y \end{array}\right]$ on the $y$-axis land?

In fact, once we know where $\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]$ and $\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]$ land, we can deduce where every point on the plane must go. For example, let's follow the point $\left[ \begin{array}{c} -1 \\ 2 \end{array} \right]$ in our animation:
It starts at negative one times the green arrow plus two times the red arrow, so after the transformation it must end up at negative one times where the green arrow lands plus two times where the red arrow lands, which means
$$-1 \cdot \greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]} + 2 \cdot \redD{\left[ \begin{array}{c} 3 \\ 0 \end{array} \right]} = \left[ \begin{array}{c} 5 \\ 2 \end{array} \right]$$
This ability to break up a vector in terms of its components both before and after the transformation is what's so special about linear transformations.
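The tactic above can be sketched in a few lines of Python. This is a sketch for the particular transformation followed in the text, where the green arrow lands on $(1, -2)$ and the red arrow lands on $(3, 0)$; the function name is our own:

```python
# Where the basis arrows land under this particular transformation
# (read off from the animation in the text):
green_lands = (1, -2)   # image of the green arrow (1, 0)
red_lands = (3, 0)      # image of the red arrow (0, 1)

def transform(x, y):
    """Break (x, y) into x*(1,0) + y*(0,1), then reassemble the
    same combination of the transformed arrows."""
    gx, gy = green_lands
    rx, ry = red_lands
    return (x * gx + y * rx, x * gy + y * ry)

# The point followed in the text: (-1, 2) lands on (5, 2).
assert transform(-1, 2) == (5, 2)
```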
Practice Problem: Use this same tactic to compute where the vector $\left[ \begin{array}{c} 1 \\ -1 \end{array}\right]$ lands.

Representing two dimensional linear transforms with matrices

In general, each vector $\left[ \begin{array}{c} x \\ y \end{array} \right]$ can be broken down as follows:
$$\left[ \begin{array}{c} x \\ y \end{array} \right] = x\greenD{\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]} + y\redD{\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]}$$
So, if the green arrow $\greenD{\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]}$ lands on some vector $\greenD{\left[ \begin{array}{c} a \\ c \end{array} \right]}$ and the red arrow $\redD{\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]}$ lands on some vector $\redD{\left[ \begin{array}{c} b \\ d \end{array} \right]}$, then the vector $\left[ \begin{array}{c} x \\ y \end{array} \right]$ must land on
$$x \cdot \greenD{\left[ \begin{array}{c} a \\ c \end{array} \right]} + y \cdot \redD{\left[ \begin{array}{c} b \\ d \end{array} \right]} = \left[ \begin{array}{c} \greenD{a}x + \redD{b}y \\ \greenD{c}x + \redD{d}y \end{array} \right].$$
A really nice way to describe all this is to represent a given linear transform with the matrix below:
$$\textbf{A} = \left[\begin{array}{cc} a & b \\ c & d \end{array} \right]$$
In this matrix, the first column tells us where $\greenD{\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]}$ lands, and the second column tells us where $\redD{\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]}$ lands. Now we can describe where any vector $\textbf{v} = \left[ \begin{array}{c} x \\ y \end{array} \right]$ lands very compactly as the matrix-vector product
$$\textbf{Av} = \left[\begin{array}{c} ax + by \\ cx + dy \end{array}\right].$$
In fact, this is where the definition of a matrix-vector product comes from.
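To tie everything together, here is a sketch of that definition in Python, using the transformation followed throughout this section (green arrow to $(1, -2)$, red arrow to $(3, 0)$, so the matrix has columns $(1, -2)$ and $(3, 0)$); the names are our own:

```python
# The matrix of the transformation from the text: its first column is
# where (1, 0) lands, and its second column is where (0, 1) lands.
A = [[1, 3],
     [-2, 0]]

def apply(A, v):
    """Matrix-vector product, defined exactly as in the formula above:
    (x, y) -> (a*x + b*y, c*x + d*y)."""
    (a, b), (c, d) = A
    x, y = v
    return (a * x + b * y, c * x + d * y)

# The columns really are the images of the basis vectors:
assert apply(A, (1, 0)) == (1, -2)
assert apply(A, (0, 1)) == (3, 0)

# And it reproduces the point followed earlier: (-1, 2) lands on (5, 2).
assert apply(A, (-1, 2)) == (5, 2)
```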
So in the same way that one-dimensional linear transformations can be described as multiplication by some number, namely whichever number one lands on top of, two-dimensional linear transformations can always be described by a $2 \times 2$ matrix, namely the one whose first column indicates where $\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]$ lands and whose second column indicates where $\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]$ lands.