Matrices as transformations

Learn how exactly 2x2 matrices act as transformations of the plane.

Introduction

Thinking of a matrix as a transformation of space leads to a deeper understanding of matrix operations. This viewpoint motivates how we define operations like matrix multiplication, and it gives us a nice excuse to draw pretty pictures. This material touches on linear algebra (usually a college topic).

Multiplication as a transformation

The idea of a "transformation" can seem more complicated than it really is at first, so before diving into how $2 \times 2$ matrices transform $2$-dimensional space, or how $3 \times 3$ matrices transform $3$-dimensional space, let's go over how plain old numbers (a.k.a. $1 \times 1$ matrices) can be considered transformations of $1$-dimensional space.
"$1$-dimensional space" is simply the number line.
[Image: the number line]
What happens when you multiply every number on the line by a particular value, like $2$? One way to visualize this is as follows:
We keep a copy of the original line for reference, then slide each number on the line to $2$ times that number.
Similarly, multiplication by $\dfrac{1}{2}$ could be visualized like this:
And so that negative numbers don't feel neglected, here is multiplication by $-3$:
For those of you fond of fancy terminology, these animated actions could be described as "Linear transformations of $1$-dimensional space". The word “transformation” means the same thing as “function”: something which takes in a number and outputs a number, like $f(x) = 2x$. However, while we typically visualize functions with their graphs, people tend to use the word “transformation” to indicate that you should instead visualize some object moving, stretching, squishing, etc. So the function $f(x) = 2x$ visualized as a transformation gives us the "Multiplication by $2$" video above. It moves the point $1$ on the number line to where $2$ starts off, moves $2$ to where $4$ starts off, etc.
The technical definition of a “linear” transform is a function $f(x)$ that satisfies the two properties
• $f(x+y) = f(x) + f(y)$
• $f(cx) = cf(x)$
Here, in 1 dimension, $x$ and $y$ are numbers, as opposed to, say, vectors. In this special case of 1 dimension, the only functions satisfying these properties look like multiplication by a constant, i.e. $f(x) = kx$ for some constant $k$.
Why? Plug in $x=1$ to the second property, and we see $f(c) = cf(1)$. If we interpret $c$ as a variable and $f(1)$ as some constant, the entire function is just multiplication by $f(1)$. For instance, if $f(1) = 8$, then $f(x) = 8x$. In fact, we don't even need the first property, since once we know $f$ looks like $f(x) = kx$, the fact that $f(x+y) = f(x)+f(y)$ follows from the distributive law: $k(x + y) = kx + ky$.
This might seem like an awfully complicated way to describe multiplication, especially given that the first property above is pointless. However, the importance of linearity comes when $f$ is a function of vectors. The fact that in $1$-dimension $f$ is completely determined by where it takes the number $1$ has a much more interesting analog in higher dimensions.
Before we move on to $2$-dimensional space, there's one simple but important fact we should keep in the back of our minds. Suppose you watch one of these transformations, knowing that it's multiplication by some number, but without knowing what that number is, like this one:
You can easily figure out which number is being multiplied into the line by $\goldE{\text{following }1}$. In this case, $1$ lands where $-3$ started off, so you can tell that the animation represents multiplication by $-3$.
Thinking of numbers as transformations gives an alternate interpretation of multiplication.
If we apply two transformations in a row, for instance multiplication by $2$, then by $3$,
the total action is the same as some other single transformation, in this case multiplication by $6$:
This gives a convoluted understanding of multiplication, which will be surprisingly useful as an analogy for matrix multiplication.
Person A: What is $4 \times 5$?
Person B: Well, consider the unique linear transform which takes $1$ to $4$, as well as the unique linear transform which takes $1$ to $5$, then apply each of these transformations, one after the other, and $4\times 5$ will be whatever number $1$ lands on.
Person A: ...that's the stupidest thing I've ever heard.
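Person B's description, stripped of the sarcasm, can be sketched in a few lines of code (the function names here are made up for illustration): composing "multiply by $2$" with "multiply by $3$" gives a single transformation, and following the number $1$ identifies it as "multiply by $6$".

```python
# Applying "multiply by 2" and then "multiply by 3" in sequence...
def times_2(x):
    return 2 * x

def times_3(x):
    return 3 * x

def composed(x):
    return times_3(times_2(x))

# ...is the same as the single transformation "multiply by 6",
# which we can identify by following the number 1.
assert composed(1) == 6
assert all(composed(x) == 6 * x for x in [-4, 0, 1.5, 10])
```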

What do linear transformations in $2$ dimensions look like?

A $2$-dimensional linear transformation is a special kind of function which takes in a $2$-dimensional vector $\left[ \begin{array}{c} x \\ y \end{array} \right]$ and outputs another $2$-dimensional vector. As before, our use of the word “transformation” indicates we should think about smooshing something around, which in this case is $2$-dimensional space. Here are some examples:
For our purposes, what makes a transformation linear is the following geometric rule: The origin must remain fixed, and all lines must remain lines. So all the transforms in the above animation are examples, but the following are not:
As in 1 dimension, what makes a transformation “linear” is that it satisfies the two properties
$f(\textbf{v}+\textbf{w}) = f(\textbf{v})+f(\textbf{w})$
$f(c\textbf{v}) = cf(\textbf{v})$
where $\textbf{v}$ and $\textbf{w}$ are now vectors instead of numbers. While in 1-dimension the first property was useless, it now plays a more important role, because in some sense it determines how the two different dimensions play together during a transformation.

Following specific vectors during a transformation

Imagine you are watching one particular transformation, like this one
How could you describe this to a friend who is not watching the same animation? You can no longer describe it using a single number, the way we could just follow the number $1$ in the one dimensional case. To help keep track of everything, let's put a green arrow over the vector $\greenD{\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]}$, put a red arrow over the vector $\redD{\left[\begin{array}{c} 0 \\ 1 \end{array} \right]}$, and fix a copy of the grid in the background.
Now it's a lot easier to see where things land. For example, if we watch the animation again and focus on the vector $\left[ \begin{array}{c} 1 \\ 1 \end{array} \right]$, we can follow it and see that it lands on the vector $\left[\begin{array}{c} 4 \\ -2 \end{array} \right]$.
We can represent this fact with the following notation:
$\left[ \begin{array}{c} 1 \\ 1 \end{array} \right] \rightarrow \left[ \begin{array}{c} 4 \\ -2 \end{array} \right]$
Practice Problem: Where does the point at $\left[ \begin{array}{c} -1 \\ 0 \end{array}\right]$ end up after the plane has undergone the transformation in the above video?
Practice Problem: Even though it has gone off screen, can you predict where the point $\left[ \begin{array}{c} 3 \\ 0 \end{array}\right]$ has landed?
Notice, a vector like $\left[ \begin{array}{c} 2 \\ 0 \end{array} \right]$, which starts off as $2$ times the green arrow, continues to be $2$ times the green arrow after the transformation. Since the green arrow lands on $\greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]}$, we can deduce that
$\left[ \begin{array}{c} 2 \\ 0 \end{array} \right] \rightarrow 2 \cdot \greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]} = \left[ \begin{array}{c} 2 \\ -4 \end{array} \right]$.
And in general
\begin{aligned} \left[ \begin{array}{c} x \\ 0 \end{array} \right] = x \cdot \greenD{\left[\begin{array}{c} 1 \\ 0 \end{array}\right]} &\rightarrow x \cdot \greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]} = \left[ \begin{array}{c} x \\ -2x \end{array} \right] \\ \end{aligned}
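This rule for the $x$-axis is easy to check in code. The sketch below hard-codes the landing spot $\greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]}$ of the green arrow from this particular animation; the function name is hypothetical.

```python
# In this transformation the green arrow [1, 0] lands on [1, -2].
# Every point [x, 0] on the x-axis is x times the green arrow,
# so it must land on x times [1, -2].
green_lands_on = (1, -2)

def image_of_x_axis_point(x):
    return (x * green_lands_on[0], x * green_lands_on[1])

assert image_of_x_axis_point(2) == (2, -4)   # matches [2, 0] -> [2, -4] above
```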
Similarly, the destination of the entire $y$-axis is determined by where the red arrow $\redD{\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]}$ lands, which for this transformation is $\redD{\left[ \begin{array}{c} 3 \\ 0 \end{array} \right]}$.
Practice Problem: After the plane has undergone the transformation illustrated above, where does the general point $\left[ \begin{array}{c} 0 \\ y \end{array}\right]$ on the $y$-axis land?
In fact, once we know where $\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]$ and $\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]$ land, we can deduce where every point on the plane must go. For example, let's follow the point $\left[ \begin{array}{c} -1 \\ 2 \end{array} \right]$ in our animation:
It starts at $-1$ times the green arrow plus $2$ times the red arrow, and it also ends at $-1$ times the green arrow plus $2$ times the red arrow, now with the arrows in their new positions. After the transformation, this means it lands on
$-1 \cdot \greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]} + 2 \cdot \redD{\left[ \begin{array}{c} 3 \\ 0 \end{array} \right]} = \left[ \begin{array}{c} 5 \\ 2 \end{array} \right]$
This ability to break up a vector in terms of its components both before and after the transformation is what's so special about linear transformations.
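The same tactic can be written as a short function. This sketch uses the two landing spots from this animation, $\greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]}$ and $\redD{\left[ \begin{array}{c} 3 \\ 0 \end{array} \right]}$; the names `green`, `red`, and `transform` are made up for illustration.

```python
# Landing spots of the two basis arrows for this transformation:
green = (1, -2)   # where [1, 0] lands
red = (3, 0)      # where [0, 1] lands

def transform(x, y):
    # The point x*[1, 0] + y*[0, 1] must land on x*green + y*red.
    return (x * green[0] + y * red[0],
            x * green[1] + y * red[1])

assert transform(-1, 2) == (5, 2)   # matches the computation above
```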
Practice Problem: Use this same tactic to compute where the vector $\left[ \begin{array}{c} 1 \\ -1 \end{array}\right]$ lands.
In general, each vector $\left[ \begin{array}{c} x \\ y \end{array} \right]$ can be broken down as
$\left[ \begin{array}{c} x \\ y \end{array} \right] = x\greenD{\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]} + y\redD{\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]}$
If the green arrow $\greenD{\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]}$ lands on some vector $\greenD{\left[ \begin{array}{c} a \\ c \end{array} \right]}$, and the red arrow $\redD{\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]}$ lands on some vector $\redD{\left[ \begin{array}{c} b \\ d \end{array} \right]}$, then the vector $\left[ \begin{array}{c} x \\ y \end{array} \right]$ must land on
$x \cdot \greenD{\left[ \begin{array}{c} a \\ c \end{array} \right]} + y \cdot \redD{\left[ \begin{array}{c} b \\ d \end{array} \right]} = \left[ \begin{array}{c} \greenD{a}x + \redD{b}y \\ \greenD{c}x + \redD{d}y \end{array} \right]$
It is common to package the landing spots of these two arrows into a single $2 \times 2$ matrix:
$\textbf{A} = \left[\begin{array}{cc} a & b \\ c & d \end{array} \right]$
where the first column tells us where $\greenD{\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]}$ lands and the second column tells us where $\redD{\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]}$ lands. Now we can describe where any vector $\textbf{v} = \left[ \begin{array}{c} x \\ y \end{array} \right]$ lands very compactly as the matrix-vector product
$\textbf{Av} = \left[\begin{array}{c} ax + by \\ cx + dy \end{array}\right]$
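For the transformation in this article's animation, the matrix has columns $\greenD{\left[ \begin{array}{c} 1 \\ -2 \end{array} \right]}$ and $\redD{\left[ \begin{array}{c} 3 \\ 0 \end{array} \right]}$, and the matrix-vector product reproduces the landing spots we found by eye. A sketch using numpy (an assumption; any matrix library would do):

```python
import numpy as np

# The transformation from the animation: the columns record where the
# green arrow [1, 0] and the red arrow [0, 1] land.
A = np.array([[ 1, 3],
              [-2, 0]])

v = np.array([1, 1])
assert list(A @ v) == [4, -2]   # [1, 1] lands on [4, -2], as in the animation

w = np.array([-1, 2])
assert list(A @ w) == [5, 2]    # [-1, 2] lands on [5, 2], as computed above
```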
So in the same way that $1$-dimensional linear transforms could be described as multiplication by some number, namely whichever number $1$ lands on top of, $2$-dimensional linear transforms can always be described by a $2 \times 2$ matrix, namely the one whose first column indicates where $\left[ \begin{array}{c} 1 \\ 0 \end{array} \right]$ lands, and whose second column indicates where $\left[ \begin{array}{c} 0 \\ 1 \end{array} \right]$ lands.