## Linear algebra

### Unit 2: Lesson 5

Finding inverses and determinants

# Deriving a method for determining inverses

Determining a method for constructing inverse transformation matrices. Created by Sal Khan.

## Want to join the conversation?

- What is the difference between a transformation matrix and a permutation matrix?(6 votes)
- A permutation matrix contains only ones and zeroes, with exactly one 1 in each row and each column. All it can do is move entries around in the matrix/vector it is multiplied with, so only some **very** limited transformations can be represented with a permutation matrix. For instance, here's a permutation matrix to swap row 2 and row 1 in a matrix/vector with 3 rows:

[0 1 0]

[1 0 0]

[0 0 1]

It is the identity matrix with row 2 and row 1 swapped. Simple, huh? Please multiply that matrix with this column vector

[x]

[y]

[z]

to verify that it does what we think it should do.

A transformation matrix has a lot more freedom. It can stretch along one axis independently from others. For instance, here's a stretch of 5 in the x axis and a shrink by a half in the z axis:

[ 5 0 0]

[ 0 1 0]

[ 0 0 1/2]

Transformation matrices can also rotate, flip, project, etc.: http://en.wikipedia.org/wiki/Linear_mapping#Examples_of_linear_transformation_matrices. (19 votes)
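
To check the two matrices in this answer numerically, here is a minimal Python sketch. The helper name `mat_vec` and the sample vector `[2, 3, 4]` (standing in for `[x, y, z]`) are assumptions made for illustration; they do not come from the original answer.

```python
def mat_vec(m, v):
    """Multiply a matrix (stored as a list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Permutation matrix: the identity with row 1 and row 2 swapped.
P = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]

# Stretch by 5 along x, leave y alone, shrink by a half along z.
D = [[5, 0, 0],
     [0, 1, 0],
     [0, 0, 0.5]]

v = [2, 3, 4]  # arbitrary sample values for [x, y, z]

swapped = mat_vec(P, v)    # [3, 2, 4]: the first two entries trade places
stretched = mat_vec(D, v)  # [10, 3, 2.0]: x times 5, z times one half
print(swapped, stretched)
```

The permutation matrix only rearranges the entries, while the second matrix changes their values, which is the extra freedom the answer describes.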

- In my current textbook (and I'm sure other places discussing this topic), **invertible** is a term that means the same thing as the term **non-singular**, such that there are a finite number of row operations you can do to get to the identity matrix. (All those row operations merged together are called the inverse of the matrix.)

Likewise, **non-invertible** corresponds with **singular**, such that there is no matrix that can produce the identity matrix for a singular matrix.

It's been pretty confusing for me with different terminology in my class and online, but I hope this helps out someone! (4 votes)
- Mathematics progresses at different rates among different people with different perspectives for different reasons all the time, so you end up with differences in language, interpretation, visualization, focus, goal and strategy... all of which leads to differences in vocabulary. You will (hopefully) find that each type of vocabulary reveals the math in a different way, so it is useful to master as many versions of an idea as you can. (6 votes)

- I do not understand how he got the numbers at 3 min 38 seconds. Can you please explain? (4 votes)
- S is the transformation matrix we're trying to solve for, and I is the identity matrix.

The idea is that if we apply the transformation to I, that is, S × I, we should get S itself, since I is the identity.

So let us apply to each column of I (the identity) what we know about this transformation, [a1, a2, a3] -> [a1, a2+a1, a3-a1]. The result should be S. (4 votes)
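
The recipe in this answer can be sketched in a few lines of Python. The function name `transform` and the list-of-rows layout are assumptions chosen for illustration; the rule itself is the one quoted above.

```python
def transform(col):
    """Row operation from the video: [a1, a2, a3] -> [a1, a2 + a1, a3 - a1]."""
    a1, a2, a3 = col
    return [a1, a2 + a1, a3 - a1]

# Columns of the 3x3 identity matrix (the standard basis vectors).
identity_columns = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Each transformed basis vector becomes one column of S.
s_columns = [transform(col) for col in identity_columns]

# Transpose the list of columns into the usual list-of-rows layout.
S = [list(row) for row in zip(*s_columns)]
print(S)  # [[1, 0, 0], [1, 1, 0], [-1, 0, 1]]
```

The printed rows match the matrix S read out in the transcript: 1, 0, 0, then 1, 1, 0, then minus 1, 0, 1.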

- Can anyone please help me out with a way to calculate faster the inverse of a matrix?(2 votes)
- I'm having a problem. I've multiplied S1 × A by hand and with an app. Both times I got

1 -1 -1

-2 2 3

-2 1 4

Not

1 -1 -1

0 1 2

0 2 5

Any thoughts on what I did wrong? (2 votes)
- Wrong order. Unlike multiplication of ordinary numbers, A*B is generally not the same as B*A.

If you are unsure which order to use, think of each matrix as a function: you start with f(x), and to apply another function afterwards you write g(f(x)), expanding to the left. That's how I think of it. (3 votes)
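
Both results in this thread can be reproduced with a short Python sketch. The matrices S1 and A are the ones read out in the transcript; the helper name `matmul` and the variable names are assumptions for illustration.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

# S1 encodes the first set of row operations; A is the matrix from the video.
S1 = [[1, 0, 0],
      [1, 1, 0],
      [-1, 0, 1]]
A = [[1, -1, -1],
     [-1, 2, 3],
     [1, 1, 4]]

s1_a = matmul(S1, A)  # row operations applied to A (the order used in the video)
a_s1 = matmul(A, S1)  # the reversed order

print(s1_a)  # [[1, -1, -1], [0, 1, 2], [0, 2, 5]]
print(a_s1)  # [[1, -1, -1], [-2, 2, 3], [-2, 1, 4]]
```

The second product is exactly the matrix the question reports, which suggests the factors were entered as A times S1 rather than S1 times A.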

- What is row echelon form?(1 vote)
- What is a homomorphism? (This is an out-of-context question.) (2 votes)
- Is it possible to convert a matrix into row echelon form by transforming the row vectors instead of column vectors?(2 votes)
- I'm confused by the order of multiplying the matrices. We have matrix A. I thought we would be multiplying this matrix by the transformed identity matrix to get our result. However, that doesn't work out; instead, we multiply I by A (I comes first). Is there a general rule in linear algebra that says which matrix should be first in the multiplication (since the commutative law doesn't hold)? (1 vote)
- If you perform a transformation T, then perform a transformation S, where A is the matrix corresponding to T and B is the matrix corresponding to S, then it would be B * A * v, where v is the vector you are transforming.(2 votes)
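
As a small check of this rule, the sketch below uses two made-up 2x2 matrices: A (for T) rotates by 90 degrees counterclockwise and B (for S) stretches x by 2. These values, the vector v, and the helper names are assumptions chosen only for illustration.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

A = [[0, -1], [1, 0]]  # matrix of T: rotate 90 degrees counterclockwise
B = [[2, 0], [0, 1]]   # matrix of S: stretch x by 2
v = [1, 0]

step_by_step = mat_vec(B, mat_vec(A, v))  # apply T first, then S
combined = mat_vec(matmul(B, A), v)       # B * A * v, as in the answer
wrong_order = mat_vec(matmul(A, B), v)    # A * B * v applies S first instead

print(step_by_step, combined, wrong_order)  # [0, 1] [0, 1] [0, 2]
```

The transformation applied first sits closest to the vector, so its matrix is the rightmost factor.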

- Why is the word "singular" used for matrices that aren't invertible? Are all functions that aren't both 1-1 and onto called "singular"?(1 vote)
- It's just an arbitrary term they (the math community) chose. The term "singular" is only used for matrices according to Wikipedia, but I am sure it is also used for linear operators (i.e. a linear transformation T: V -> V). Note that singular matrices are also square. (2 votes)

## Video transcript

I have this matrix A here that I
want to put into reduced row echelon form. And we've done this
multiple times. You just perform a bunch
of row operations. But what I want to show you in
this video is that those row operations are equivalent to
linear transformations on the column vectors of A. So let me show you by example. So if we just want to put A into
reduced row echelon form, the first step that we might
want to do if we wanted to zero out these entries right
here, is-- let me do it right here-- is we'll keep our
first entry the same. So for each of these column
vectors, we're going to keep the first entry the same. So they're going to be
1, minus 1, minus 1. And actually, let me
simultaneously construct my transformation. So I'm saying that my row
operation I'm going to perform is equivalent to a linear
transformation on the column vector. So it's going to be a
transformation that's going to take some column vector,
a1, a2, and a3. It's going to take each of these
and then do something to them, do something to them
in a linear way. They'll be linear
transformations. So we're keeping the
first entry of our column vector the same. So this is just going
to be a1. This is a line right here. That's going to be a1. Now, what can we do if
we want to get to reduced row echelon form? We'd want to make
this equal to 0. So we would want to replace our
second row with the second row plus the first row, because
then these guys would turn out to be 0. So let me write that on
my transformation. I'm going to replace the second
row with the second row plus the first row. Let me write it out here. Minus 1 plus 1 is 0. 2 plus minus 1 is 1. 3 plus minus 1 is 2. Now, we also want
to get a 0 here. So let me replace my third
row with my third row minus my first row. So I'm going to replace my third
row with my third row minus my first row. So 1 minus 1 is 0. 1 minus minus 1 is 2. 4 minus minus 1 is 5,
just like that. So you see this was just a
linear transformation. And any linear transformation
you could actually represent as a matrix vector product. So for example, this
transformation, I could represent it. To figure out its transformation
matrix, so if we say that T of x is equal to,
I don't know, let's call it some matrix S times x. We already used the matrix A. So I have to pick
another letter. So how do we find S? Well, we just apply the
transformation to all of the column vectors, or the standard
basis vectors of the identity matrix. So let's do that. So the identity matrix-- I'll
draw it really small like this-- the identity matrix looks
like this, 1, 0, 0, 0, 1, 0, 0, 0, 1. That's what that identity
matrix looks like. To find the transformation
matrix, we just apply this guy to each of the column
vectors of this. So what do we get? I'll do it a little
bit bigger. We apply it to each of
these column vectors. But we see the first row
always stays the same. So the first row is always going
to be the same thing. So 1, 0, 0. I'm essentially applying it
simultaneously to each of these column vectors, saying,
look, when you transform each of these column vectors, their
first entry stays the same. The second entry becomes
the second entry plus the first entry. So 0 plus 1 is 1. 1 plus 0 is 1. 0 plus 0 is 0. Then the third entry gets
replaced with the third entry minus the first entry. So 0 minus 1 is minus 1. 0 minus 0 is 0. 1 minus 0 is 1. Now notice, when I apply this
transformation to the column vectors of our identity matrix,
I essentially just performed those same
row operations that I did up there. I performed those exact same
row operations on this identity matrix. But we know that this is
actually the transformation matrix, that if we multiply
it by each of these column vectors, or by each of these
column vectors, we're going to get these column vectors. So you can view it this way. This right here, this
is equal to S. This is our transformation
matrix. So we could say that if we
create a new matrix whose columns are S times this column
vector, S times 1, minus 1, 1. And then the next column is S
times-- I wanted to do it in that other color-- S times
this guy, minus 1, 2, 1. And then the third column is
going to be S times this third column vector, minus 1, 3, 4. We now know we're applying this
transformation, this is S, times each of these
column vectors. That is the matrix
representation of this transformation. This guy right here will
be transformed to this right here. Let me do it down here. I wanted to show that stuff that
I had above here as well. Well, I'll just draw an arrow. That's probably the
simplest thing. This matrix right here
will become that matrix right there. So another way you could
write it, this is equivalent to what? What is this equivalent to? When you take a matrix and you
multiply it times each of the column vectors, when you
transform each of the column vectors by this matrix, this
is the definition of a matrix-matrix product. This is equal to our matrix S--
I'll do it in pink-- this is equal to our matrix S, which
is 1, 0, 0, 1, 1, 0, minus 1, 0, 1, times our matrix
A, times 1, minus 1, 1, minus 1, 2, 1, minus 1, 3, 4. So let me make this
very clear. This is our transformation
matrix S. This is our matrix A. And when you perform this
product you're going to get this guy right over here. I'll just copy and paste it. Edit, copy, and let
me paste it. You're going to get that
guy just like that. Now the whole reason why I'm
doing that is just to remind you that when we perform each of
these row operations, we're just multiplying. We're performing a linear
transformation on each of these columns. And it is completely equivalent
to just multiplying this guy by some matrix S. In this case, we took the
trouble of figuring out what that matrix S is. But any of these row operations
that we've been doing, you can always represent
them by a matrix multiplication. So this leads to a very
interesting idea. When you put something in
reduced row echelon form, let me do it up here. Actually, let's just finish what
we started with this guy. Let's put this guy in reduced
row echelon form. Let me call this first S. Let's call that S1. So this guy right here
is equal to that first S1 times A. We already showed that
that's true. Now let's perform another
transformation. Let's just do another set of
row operations to get us to reduced row echelon form. So let's keep our middle
row the same, 0, 1, 2. And let's replace the first row
with the first row plus the second row, because I
want to make this a 0. So 1 plus 0 is 1. Let me do it in another color. Minus 1 plus 1 is 0. Minus 1 plus 2 is 1. Now, I want to replace the third
row with, let's say the third row minus 2 times
the second row. So that's 0 minus 2
times 0, is 0. 2 minus 2 times 1 is 0. 5 minus 2 times 2 is 1. 5 minus 4 is 1. We're almost there. We just have to zero out
these guys right there. Let's see if we can get this
into reduced row echelon form. So what is this? I just performed another
linear transformation. Actually, let me write this. Let's say if this was our first
linear transformation, what I just did is I performed
another linear transformation, T2. I'll write it in a different
notation, where you give me some vector, some column
vector, x1, x2, x3. What did I just do? What was the transformation
that I just performed? My new vector, I made the top
row equal to the top row plus the second row. So it's x1 plus x2. I kept the second
row the same. And then the third row, I
replaced it with the third row minus 2 times the second row. That was a linear transformation
we just did. And we could represent this
linear transformation as being, we could say T2 applied
to some vector x is equal to some transformation matrix
S2, times our vector x. Because if we applied this
transformation matrix to each of these columns, it's
equivalent to multiplying this guy by this transformation
matrix. So you could say that this guy
right here-- we haven't figured out what this is, but
I think you get the idea-- this matrix right here is going
to be equal to this guy. It's going to be equal
to S2 times this guy. What is this guy right here? Well, this guy is equal
to S1 times A. It's going to be S2
times S1, times A. Fair enough. And you could have gotten
straight here if you just multiplied S2 times S1. This could be some
other matrix. If you just multiplied it by
A, you'd go straight from there to there. Fair enough. Now, we still haven't gotten
this guy into reduced row echelon form. So let's try to get there. I've run out of space below
him, so I'm going to have to go up. So let's go upwards. What I want to do is, I'm going
to keep the third row the same, 0, 0, 1. Let me replace the second row
with the second row minus 2 times the third row. So we'll get a 0, we'll get a 1
minus 2 times 0, and we'll get a 2 minus 2 times 1. So that's a 0. Let's replace the first
row with the first row minus the third row. So 1 minus 0 is 1. 0 minus 0 is 0. 1 minus 1 is 0, just
like that. Let's just actually write what
our transformation was. Let's call it T3. I'll do it in purple. T3 is the transformation of some
vector x-- let me write it like this-- of some
vector x1, x2, x3. What did we do? We replaced the first row with
the first row minus the third row, x1 minus x3. We replaced the second row with
the second row minus 2 times the third row. So it's x2 minus 2 times x3. Then the third row just
stayed the same. So obviously, this could
also be represented. T3 of x could be equal to some
other transformation matrix, S3 times x. So this transformation, when
you multiply it to each of these columns, is equivalent to
multiplying this guy times this transformation matrix,
which we haven't found yet. We can write it. So this is going to be equal to
S3 times this matrix right here, which is S2, S1, A. And what do we have here? We got the identity matrix. We put it in reduced
row echelon form. We got the identity matrix. We already know from previous
videos that if the reduced row echelon form of something is the
identity matrix, then we are dealing with an
invertible transformation, or an invertible matrix. Because this obviously could be
the transformation matrix for some transformation. Let's just call this
transformation, I don't know, did I already use T? Let's just call it Tnaught for
our transformation applied to some vector x, that might
be equal to Ax. So we know that this
is invertible. We put it in reduced
row echelon form. We put its transformation
matrix in reduced row echelon form. And we got the identity
matrix. So that tells us that
this is invertible. But something even more
interesting happened. We got here by performing
some row operations. And we said those row operations
were completely equivalent to multiplying this
guy right here by multiplying our original transformation
matrix by a series of transformation matrices that
represent our row operations. And when we multiplied all this,
this was equal to the identity matrix. Now, in the last video we said
that the inverse matrix, so if this is Tnaught, Tnaught inverse
could be represented-- it's also a linear
transformation-- It can be represented by some inverse
matrix that we just called A inverse times x. And we saw that the inverse
transformation matrix times our transformation matrix is
equal to the identity matrix. We saw this last time. We proved this to you. Now, something very
interesting here. We have a series of matrix
products times this guy, times this guy, that also got me
the identity matrix. So this guy right here, this
series of matrix products, this must be the same thing as
my inverse matrix, as my inverse transformation matrix. And so we could actually
calculate it if we wanted to. Just like we did, we actually
figured out what S1 was. We did it down here. We could do a similar operation
to figure out what S2 was, S3 was, and then
multiply them all out. We would have actually
constructed A inverse. I guess, something more
interesting we could do instead of doing that, what if
we applied these same matrix products to the identity
matrix. So the whole time we did
here, when we did our first row operation. So we have here, we
have the matrix A. Let's say we have an identity
matrix on the right. Let's call that I,
right there. Now, our first linear
transformation we did-- we saw that right here-- that
was equivalent to multiplying S1 times A. The first set of row operations
was this. It got us here. Now, if we perform that same set
of row operations on the identity matrix, what
are we going to get? We're going to get
the matrix S1. S1 times the identity
matrix is just S1. All of the columns of anything
times the identity times the standard basis columns, it'll
just be equal to itself. You'll just be left
with that S1. This is S1 times I. That's just S1. Fair enough. Now, you performed your next row
operation and you ended up with S2 times S1, times A. Now if you performed that same
row operation on this guy right there, what
would you have? You would have S2 times S1,
times the identity matrix. Now, our last row operation we
represented with the matrix product S3. We're multiplying it by the
transformation matrix S3. So if you did that, you
have S3, S2, S1 A. But if you perform the same
exact row operations on this guy right here, you have
S3, S2, S1, times the identity matrix. Now when you did this, when
you performed these row operations here, this got you
to the identity matrix. Well, what are these going
to get you to? When you just performed the same
exact row operations you performed on A to get to the
identity matrix, if you performed those same exact row
operations on the identity matrix, what do you get? You get this guy right here. Anything times that identity
matrix is going to be equal to itself. So what is that right there? That is A inverse. So we have a generalized way
of figuring out the inverse for a transformation matrix. What I can do is, let's
say I have some transformation matrix A. I can set up an augmented
matrix where I put the identity matrix right there,
just like that, and I perform a bunch of row operations. And you could represent them
as matrix products. But you perform a bunch of row
operations on all of them. You perform the same operations
you perform on A as you would do on the
identity matrix. By the time you have A as an
identity matrix, you have A in reduced row echelon form. By the time A is like that, your
identity matrix, having performed the same exact
operations on it, it is going to be transformed into
A's inverse. This is a very useful tool for
solving actual inverses. Now, I've explained
the theoretical reason why this works. In the next video we'll
actually solve this. Maybe we'll do it for the
example that I started off with in this video.