Lesson 3: Transformations and matrix multiplication
Matrix product associativity
Showing that matrix products are associative. Created by Sal Khan.
- In the review, (I know this doesn't have to do with the example), if S: x -> y and T: y -> z shouldn't the composition be T(S(x)), rather than S(T(x))? Just clarifying.(30 votes)
- Yes, there is an error in the video.
If S: X -> Y and T: Y -> Z, this means that whatever you put into T has to be a member of the set Y.
S on the other hand returns a member of the set Y.
So you can put S(a), where a ∈ X, into T:
T(S(a))
And another way to write this would be:
(T o S)(a)
Note that T o S is read from right to left. So first you apply the rightmost transformation on a, then whatever your output of S is, you put into T.
You could also think of "T o S" as "T after S".
You could of course also solve that problem by simply redefining the transformations S and T such that:
S: Y -> Z and T: X -> Y (Switching the sets of the transformations.)
Sal, please correct this error in the video. It's a bit confusing. :-)
You don't have to redo the video, just make a remark that:
T: X -> Y and S: Y -> Z.(26 votes)
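The right-to-left order can be sketched in code. This is only an illustration, assuming made-up matrices for S: R^2 -> R^3 and T: R^3 -> R^1; the helper names `matvec` and `compose` are hypothetical and not from the video.

```python
def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v, checking dimensions."""
    if any(len(row) != len(v) for row in m):
        raise ValueError("matrix columns must match vector length")
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def compose(t, s):
    """Return T o S, i.e. the function a -> T(S(a)), read as "T after S"."""
    return lambda a: t(s(a))

# Assumed example matrices: S maps R^2 -> R^3, T maps R^3 -> R^1.
S_MATRIX = [[1, 0], [0, 1], [1, 1]]  # 3x2
T_MATRIX = [[1, 2, 3]]               # 1x3

def S(a):
    return matvec(S_MATRIX, a)

def T(y):
    return matvec(T_MATRIX, y)

t_after_s = compose(T, S)    # well-typed: the output of S is a valid input to T
result = t_after_s([2, 5])   # S gives [2, 5, 7], then T gives [33]
print(result)
```

With these dimensions, `compose(S, T)([2, 5])` raises a `ValueError`, because a vector in X is not a valid input to T. That is the mismatch described above.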
- What if AB is undefined but BC is defined and the product of BC with A is also defined--wouldn't the parentheses matter then?(6 votes)
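- A sketch of why that situation cannot arise, assuming A is m×n, B is p×q and C is q×r (so that BC is defined). The dimension symbols m, n, p, q, r are labels introduced here, not from the video.

```latex
\begin{aligned}
AB \text{ is defined} &\iff n = p, \\
BC \in \mathbb{R}^{p \times r} \;\Longrightarrow\; A(BC) \text{ is defined} &\iff n = p.
\end{aligned}
```

Both conditions are the same, so A(BC) is defined exactly when AB is. If AB is undefined, the product of A with BC is undefined too, and the parentheses never decide whether the triple product exists.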
- Hey, I have a general question regarding associativity. So, as proven in the video, (AB)C=A(BC)=ABC. But are ABC and all its equivalents equal to ACB or BCA? Thanks.(2 votes)
- Not necessarily. A(BC)=(AB)C is what "associative" means, but notice that we don't change the order there. Being able to switch order, AB=BA, is a different property called "commutativity", and matrix multiplication is not commutative.(5 votes)
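A small numeric check makes the difference concrete. This is a sketch with arbitrarily chosen 2x2 matrices (not from the video) and a hypothetical helper `matmul`. One example cannot prove a general rule, but it is enough to show that reordering can change the result.

```python
def matmul(a, b):
    """Product of matrices a and b, each given as a list of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Arbitrary example matrices, chosen only for illustration.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

abc = matmul(matmul(A, B), C)   # same order, grouped as (AB)C
acb = matmul(matmul(A, C), B)   # B and C swapped

print(matmul(A, B) == matmul(B, A))  # False: AB and BA differ here
print(abc == acb)                    # False: ABC and ACB differ here
```

Regrouping keeps the order A, B, C and only moves the parentheses. Swapping B and C changes the order, and for these matrices it changes the answer.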
- I don't understand. What is h o g o f?(1 vote)
- The composition of h with g and f, or the multiplication of each transformation's matrix.
You should probably watch the previous videos in the playlist.(4 votes)
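Written out as a formula, assuming (as the video does) that h, g and f are represented by the matrices A, B and C respectively:

```latex
(h \circ g \circ f)(\vec{x}) \;=\; h\bigl(g(f(\vec{x}))\bigr) \;=\; A\bigl(B(C\vec{x})\bigr) \;=\; (ABC)\,\vec{x}
```

Read it from right to left: apply f first, then g, then h.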
- In this video, shouldn't it be T: Z -> Y? Otherwise there might be a problem with the dimensions when calculating.(1 vote)
Video transcript
We know that if we have some
linear transformation s, that's a transformation from x to y
-- and these are just sets, sets of vectors -- and T is a
linear transformation from y to z -- that we can construct a
composition of s with T that is a linear transformation
from x all the way to z. We saw this several
videos ago. And the definition of our linear
transformation, or the composition of our linear
transformation. So the composition of s with t,
applied to some vector x in our set x, our domain,
is equal to s of t of x. This was our definition. And then we went on and we said,
look, if s of x can be represented as the matrix
multiplication a x, the matrix vector product, and if T of x
can be represented -- or the transformation T can be
represented-- as the product of the matrix b with x, we saw
that this thing right here -- which is of course, if we just
write this way, this is equal to a times T of x, which is
just b x -- we saw in multiple videos now that this is
equivalent to, by our definition of matrix products,
the matrix a b -- right? When you take the product of
two matrices you just get another matrix -- the
product a b times x. So you take essentially the
first linear transformation in your composition, its matrix,
which was a, and you take the product with the second one. Fair enough, all of this
is review so far. Let's take three linear
transformations. Let's say that I have the linear
transformation h, and when I apply that to a vector
x, it's equivalent to multiplying my vector
x by the matrix a. Let's say I have the linear
transformation g. When I applied that to a vector
x, it's equivalent to multiplying that vectrix -- that
vector, there should be a new concept called a vectrix --
it's equivalent to multiplying that vector times
the matrix b. And then I have a final linear
transformation f. When it's applied to some vector
x, it's equivalent to multiplying that vector
x times the matrix c. Now what I'm curious about is
what happens when I take the composition of h with g, and
then I take the composition of that with f -- these are all
linear transformations -- and then I apply that to
some vector x. Which is necessarily going to be
in the domain of this guy. I haven't actually drawn out
their domain and co-domain definitions, but I think
you get the idea. So let's explore what this
is a little bit. Well by the definition of
what a -- let's go back. By this definition right here
of what composition even means, we can just apply that
to this right here. So we could just imagine this as
being our s, and then this is our T right there. Then what is this going
to be equal to? If we just do a straight up
pattern match right there, this is going to be equal to
s, the transformation s, applied to the transformation
f, applied to x. So s is h of g. So it is h -- or I should say h
of g -- the composition of h with g, that is our s. And then I apply that
to f applied to x. f is our t. I apply that to f applied
to x, just like that. Now what is this equal to? Now we can imagine that
this is our x. If we just pattern match,
according to this definition, that this and this guy right
here, that this is our t, and that this is our s. And so if we just pattern
match here, this is equal to what? This is just straight from the
definition of a composition. So it's equal to s of -- s is
our h -- so h of t, which in this case is g, g
applied to x. But instead of an x, we have
this vector here, which was the transformation
f applied to x. So g of f of x. That's what this is equal to. The composition of h with
g, and then the composition of that with f,
all of that applied to x, is equal
to h of g of f of x. Now what is this equal to? Well this is equal to -- I'll
do it right here -- this is equal to h, the transformation
h, applied to -- what is this term right here? I'll do it in pink. What is this? That is the composition of
g and f applied to x. You can just replace s with g,
and T with f, and you'll get that right there. So this is just equal to the
composition of g with f applied to x. That's all that is. Now, what is this equal
to right there? And it's probably confusing
to see two parentheses in different colors, but
you get the idea. What is this equal to? Well, just go back to your
definition of the composition -- I just want to make it very
clear what we're doing. This is, if you imagine this
being your T and then this being your s, this is just the
composition of s with T, applied to x. So this is just equal to --
I'll write it this way. This is equal to -- I shouldn't
write s's -- this is a composition of h with the
composition of g and f. And then all of that
applied to x. Now why did I do all of this? Well one, to show you that the
composition is associative. I went all the way here and then
I went all the way back. And essentially it doesn't
matter where you put the parentheses. The composition of h with g with
f, is equivalent to the composition of h with the
composition of g and f. That these two things are
equivalent, and essentially these two things, you can
just re-write them. The parentheses are essentially
unnecessary. You can write this as a
composition of h with g with f, all of that applied to x. Now, I took the time to say
that each of these linear transformations I can
represent as matrix multiplications. Why did I do that? Well we saw before, that any
composition, when you take the composition of s with T, the
matrix version of this transformation of this
composition is going to be equal to the product -- by our
definition of matrix matrix products -- the product of the
s's transformation matrix and T's transformation matrix. So what are these going
to be equal to? So this one right here --
if you think of this transformation right here, this
statement right here, its matrix version of it--
so let me write that. A matrix version of the
composition of h with g, and then the composition of that
with f, applied to x, is going to be equal to -- and we've
seen this before -- the product of these matrices. So this composition, its matrix
is going to be a b. h and g, their matrices
are a and b. So it's going to be a b -- and
I'll do it in parentheses. And then you take that matrix,
and you take the product -- so this guy's matrix representation
is a b, right? And this guy's matrix
representation is c. So the matrix representation
of this whole thing is this guy, taking the product of a b,
and then taking the product of that with c. So a b. and then c. And then if you look at this
guy right here -- and of course all of that times a
vector x, all of that times some vector x, right there. That's the vector x. Now let's look at this
one right here. If we take the composition of
h with the composition of g and f, and apply all of that to
some vector x, what is that equivalent to? Well this composition right
here, the matrix version of it, I guess we can say, is going
to be the product b c. And we're going to
apply that to x. So we're going to have
the product b c. And then we're going to take the
product of that with this guy's matrix representation,
which is a. And we've shown this before. We never showed it with
three, but it extends. I kind of showed it extends, so
you can just keep applying the definition. You can keep applying this
property right here, and so it'll just naturally extend. Because every time, we're
just taking the composition of two things. Even though it looks like we're
taking the composition of three, we're taking the
composition of two things first here. And then we get its matrix
representation. And then we take the composition
of that with this other thing. So the matrix representation of
the entire composition is going to be this matrix
times this matrix. Which I did here. Similarly, here we take first
the composition of these two linear transformations, and
their matrix representation will be that right there. And then we take the composition
of that with that. So its entire matrix
representation is going to be this guy's matrix times this
guy's matrix. So a times b c. And of course, all of that
applied to the vector x. Now, in this video I've shown
you that these two things are equivalent. If anything, the parentheses
are completely unnecessary. And I showed you that there. They both essentially boil
down to h of g of f of x. So these two things
are equivalent. So we could say, essentially,
that these two things over here are equivalent. Or that a b, the product a b,
and then taking the product of that matrix with the matrix c,
is equivalent to taking the product a with the matrix b c. Which is just another
product matrix. Or another way of saying it is
that these parentheses don't matter, that all of this
is just equal to a b c. Or -- I mean, this is just a
statement that matrix products exhibit the associative
property. It doesn't matter where you
put the parentheses. And you know, sometimes it's
confusing to me, the word associative. It just means it doesn't matter
where you put the parentheses. Matrix products do not exhibit
the commutative property. We saw that in the last video. In general, we cannot make
the statement that a b is equal to b a. We cannot do that. And in fact in the last video
-- I think it was the last video -- I showed you that if a
b is defined, sometimes b a is not even defined. Or if b a is defined, sometimes
a b isn't defined. So it's not commutative. It is associative, though. In the next video, I'll see if
matrix products are actually distributive.
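The argument in this video can be spot-checked numerically. This is a minimal sketch, assuming hypothetical matrices a (2x3), b (3x2) and c (2x2) for h, g and f; the helpers `matmul` and `matvec` are illustrative names, not anything from the video. It compares h of g of f of x with both groupings of the matrix product applied to x. A single example does not prove associativity; the composition argument above is what does that.

```python
def matmul(a, b):
    """Product of matrices a and b, each given as a list of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def matvec(m, v):
    """Product of matrix m with vector v."""
    return [sum(r * x for r, x in zip(row, v)) for row in m]

# Assumed example data: A is 2x3, B is 3x2, C is 2x2, x is in R^2.
A = [[1, 0, 2], [0, 1, 1]]
B = [[1, 2], [0, 1], [3, 0]]
C = [[2, 1], [1, 1]]
x = [1, -1]

def h(v): return matvec(A, v)   # h(x) = Ax
def g(v): return matvec(B, v)   # g(x) = Bx
def f(v): return matvec(C, v)   # f(x) = Cx

nested = h(g(f(x)))                          # h of g of f of x
left = matvec(matmul(matmul(A, B), C), x)    # ((AB)C) x
right = matvec(matmul(A, matmul(B, C)), x)   # (A(BC)) x

print(nested, left, right)  # all three agree for this example
```

The matrices are deliberately not all square, so each product is defined in only one order and the grouping is the only thing that varies.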