# Matrix vector products

Defining and understanding what it means to take the product of a matrix and a vector. Created by Sal Khan.

Video transcript

In the last couple of videos,
I already exposed you to the idea of a matrix, which is
really just an array of numbers, usually a 2-dimensional
array. Actually it's always
a 2-dimensional array for our purposes. So if I have an m by n matrix,
the m is just the number of rows, and then the n is just
the number of columns. So let me write out
the m by n matrix. So I'll just specify, let's have
the m by n matrix A, it's a capital bold A. And it is equal to, I'll be as
general as possible, first entry is in, I'll just call that
lowercase a, it's in row 1 column 1. The next entry is
row 1 column 2. And you go all the way
to row 1 column n, you have n columns. And then when you go down, you
go to the next row, it will be row 2 column 1. And then you keep going
all the way down to row m column 1. And then of course, what? This entry is going to be, row
2, let me write that a little smaller, row 2 column 2. And you go all the way, and
you're going to have row m column n. And so if you think about it,
you're going to have how many total entries here? You're going to have m entries
this way, n that way. So you're going to have m times
n total entries. And I think you're pretty
familiar with this idea already of a matrix, you
probably saw this in your Algebra II classes. So what we want to do now in
this video is relate our notion of a matrix to everything
we already know about vectors. Or maybe introduce some
operations that allow matrix and vectors to interact
with each other. And maybe the most natural one
is multiplication, or taking the product. So what I'm going to do in this
video is define what it means when we take the product
of our matrix A, of any matrix A, I've written this as
general as possible, with some vector x. And our definition will only
work if x, the vector we're multiplying A by, has
the same number of components as A has columns. So this is only valid for an x
that looks like this: x1, x2, all the way down to x n. So let me be very clear with
this: this vector can be drawn at a different height
than the matrix. What matters is that however many
entries of A you have in this direction, and you have n of them
here, you have the same n components in this vector
right here. And if you have that constraint,
if the length of your vector, or the number of
components in vector is equal to the number of columns in your
matrix, then we define this product to be equal to --
so this is my vector x -- so this is a definition. There's nothing in nature that
told us it had to be defined this way. It's just human beings, or
mathematicians, decided that this is a useful convention to
define the multiplication, or the product, of a matrix
and a vector. So we'll define A times
our vector x. These are both bold, this is
a matrix, that's a vector. And the convention, if I didn't
draw the little vector symbol, your textbooks would
just bold out the x, so that it'll be a lowercase x. Lower case is vector, uppercase
is matrix, both of them are bolded. That tells you that you're
not just dealing with regular numbers. So we're defining this to be
equal to -- let me write it out fairly large. You're going to take each row,
and we're going to show you that there's multiple ways to
kind of visualize this, but it's going to be a11 times x1,
let me write that down. So a11 times x1 plus a12 times
x2, all the way to plus a1n times xn. So the product of this matrix,
this m by n matrix and this n component vector, will be a new
vector, the first entry of which is essentially each
of these entries times a corresponding entry here,
and you add them all up. And as you can see, that's
already looking fairly similar to a dot product, and I'll
discuss that in a second. But let me finish my definition
before I start talking about what it
means, or what it might be related to. So that was that first row
right there, it'll just look like that. We just multiply that
times this thing to get that row there. Now the second row -- I want to
do it in a different color -- remember this is
a definition. Human beings came
up with this. Nothing about nature said we
had to do it this way, but it's just nice and convenient. So our second row will have a21
times x1, we'll just do the whole thing over again,
but this time we're multiplying this row times
this column vector. So a21 times x1 plus a22 times
x2 all the way until we get to -- I wanted to do that in
magenta -- a2n times xn. So we multiplied this entire row
times that entire column. This term times that term, plus
this term times this term. All the way down to plus
this last term times that last term. And we keep doing this for every
row until we get to the m-th row, and then the
m-th row will be am1. This is the m-th row
first column. am1 times x1 plus -- it's hard
to keep switching colors -- plus am2 times x2, all the way
until we get to amn times xn. So what is this vector
going to look like? It's essentially going to have
-- let's say we call this vector-- Let's say it's
equal to vector b. What does vector b look like? How many entries is
it going to have? Well it has an entry for each
row of this, right? We're taking each row and we're
essentially taking the dot product of this row vector
with this column vector. And I'll be a little bit
more formal with the notation in a second. But I think you understand that
this is a dot product. The first component times the
first component plus the second component times the
second component plus the third component times the third
component, all the way to
the n-th component times the n-th component. So this is essentially the dot
product of this row vector. We've been writing all of our
vectors as columns, so we could call them column vectors,
but here we're just writing them as rows. And we can be a little bit
more specific with the notation in a second, but what's
this going to look like? Well we're doing this
m times, so we're going to have m entries. You're going to have b1, b2,
all the way to bm. If you viewed these all as
matrices, you can kind of view it as -- and this will
eventually work for the matrix math we're going to learn --
this is an m by n matrix and we're multiplying it by -- how
many rows does this guy have? He has n rows. He has n components, and
he has 1 column. So m by n times an n by 1, you
essentially can ignore these middle two terms, and you'll
end up with -- how many rows does this guy have? He has m rows, and 1 column. These middle two terms have to
be equal to each other just for the multiplication to be
defined, and then you're left with an m by 1 matrix. So this was all abstract, let
me actually apply it to some actual numbers. But it's important to actually
set the definition. Now that we have the definition
we can apply it to some actual matrices
and vectors. So let's say we have
the matrix. Let's say I want to multiply the
matrix minus 3, 0, 3, 2. Now I'll do this
one in yellow. 1, 7, minus 1, 9. And I want to multiply
that by the vector. Now how many components,
or rows, does this vector have to have? Well my matrix times vector
product, or multiplication, is only defined if my vector has
as many components as this matrix has columns. So we have 1, 2, 3, 4 columns. So this guy's going to have 4
components for us even to be able to multiply
them, otherwise it wouldn't be defined. So let me put 4 entries here. Let's say it's 2, minus 3,
4, and then minus 1. So what is this going
to be equal to? The first term of this is going
to be the dot product of this first row with
this vector. And then the second entry is
going to be the dot product of this row vector with
this column. So let's do it. So it's going to be minus 3
times 2, I'm not going to color code it, minus 3 times 2
plus 0 times minus 3 plus 3 times 4 plus 2 times minus 1. And now my second row, or I
guess my second component in this vector, is going to be 1
times 2 plus 7 times negative 3 plus minus 1 times 4
plus 9 times minus 1. And so what does this
simplify to? This is equal to minus 3 times
2 is minus 6 plus 0 plus 12. This is 12. Minus 2. And then this is simplified to
2 minus 21 minus 4 minus 9. So this is equal to this top
term, let's see, I have a minus 6 plus 12 is
6 minus 2 is 4. And then I have 2 minus
21 is minus 19. I want to make sure I get
the math right here. Minus 21 minus 9 is minus 30 and
I have a minus 34 and then I have a plus 2, so minus 32. So that's my product
right there. And let me be very
clear right here. Everything we've been used to
right now, we've been writing our vectors as column vectors. But you can view each of these
right here as a row vector. But let me be even better. Let's say that vector, let
me call vector a, a1. So let me define vector a1 is
equal to minus 3, 0, 3, 2. And let me define vector a2 to
be equal to 1, 7, minus 1, 9. So all I did is I wrote these
guys, but I wrote them in our standard vector form. I wrote them as column
vectors. So what we can define to turn
these guys into row vectors is the transpose function. In transpose, you just turn the
rows into columns and the columns into rows. So if this is a1, then a1
transpose will just be the row version of this. So it's minus 3, 0, 3, 2. And then a2 transpose would be
equal to 1, 7, minus 1, and 9. And then this multiplication
right here, we can rewrite it as -- we have vector a1
transpose for the first row. These are vectors now,
row vectors. And then this is a2 transpose. The transpose should be
the super script. This vector can be written
exactly like this because this is the first row, this
is the second row. Times the vector, let me just
call this vector x, that right there is vector x. We can now rewrite the
definition as this would be equal to what? This first row right here
that we wrote out, this was a1 dot x. You know all about
the dot products. The first row was a1 dot x. It's minus 3 times 2 plus 0
times minus 3 plus 3 times 4 plus 2 times minus 1. It's a1 dot x. And this is useful because
when I defined the dot product, I only defined it with
column vectors like this. And I'm dotting 2
column vectors. I haven't formally defined
a row vector times a column vector. So now I can say if this is just
a standard column vector, like we've been working with, I
can write my matrix as each row is the transpose
of a column vector, or it's a row vector. Then I can write this product
as just the dot products of each of these row vectors, transposed
back into column vectors, with this
vector right here. And then obviously the second
row is going to be a2 dot x. The second row is a2 dot x, is 1
times 2 plus 7 times minus 3 minus 1 times 4 plus
9 times minus 1. So just like that. So this is one way to view it. Matrix times the vector is just
like the transpose of its rows dotted with the vector
you're multiplying it by. This is one way to perceive
matrix multiplication. Now the other way to perceive
it -- let me do it with a different example. Those numbers are getting
a little bit tiresome. Let's say I have the matrix A,
nice and bold, is equal to 3, 1, 0, 3, 2, 4, 7, 0, minus
1, 2, 3, and 4. And I need to multiply this
times a 4 component vector. So let me call vector x is equal
to x1, x2, x3, and x4. Now instead of viewing these as
row vectors, we could view A as a set of column vectors. We could call this thing
right here vector v1. We call this thing right
here vector v2. We call this thing right
here vector v3. And we call this thing
right here vector v4. Then we could rewrite our matrix
A as being equal to just a bunch of column
vectors. So we could rewrite it as v1,
v2, v3, and v4. So how can the matrix
multiplication be interpreted in this context? Well what did we do? When we multiply these guys,
all of the elements in here always get multiplied by x1. Let me start some of the
multiplication here, just from our definition. So if I multiply A times x,
I'll start it off, maybe I won't do the whole thing. I just want you to
see the pattern. It's 3 times x1 plus 1 times
x2 plus 0 times x3 plus 3 times x4. That's the first entry. And then you have 2 times x1
plus 4 times x2 all the way. And then you finally have
minus 1 times x1 plus 2 times x2. You get the idea. But what's happening here? This first vector is always
being multiplied by the scalar x1. In fact you can view this part
of the entries right here. We're just multiplying this guy
times the scalar of x1 in every case. You have 3, 2, minus
1, 3, 2, minus 1. We're multiplying by
the scalar of x1. And then we're adding that to
this guy times the scalar x2 and then we're adding that to
this guy times the scalar x3. So we can rewrite A times x as
being equal to the scalar x1 times the vector v1 plus
the scalar x2. This is the scalar x1 times the
vector v1 plus the scalar x2 times the vector v2. I want to do that in yellow. Plus x3 times the vector
v3 plus the scalar x4 times the vector v4. And obviously if we had n terms
here, we'd have to have n vectors here, and we
could just make this more general to n. But what's interesting here is
now the product Ax can be interpreted as a linear
combination. These are just arbitrary numbers
depending on what our vector x is. So depending on our vector
x, we're taking a linear combination of the column
vectors of A. So this is a linear
combination of column vectors of A. So this is really interesting. I'm sure you've been exposed
to matrix multiplication in the past. But I really want you
to absorb these two ways of interpreting it, because
they'll be important when we talk about column spaces
and things like that in the future. Actually there's another way:
you can also interpret it as a transformation
of this vector x. But I won't cover that in this
video just for brevity. But you can interpret it as a
weighted combination, or a linear combination of the column
vectors of A, where the vector x dictates what
the weights on each of the columns are. Or you can interpret it as,
essentially, the dot product of the row vectors, or you could
define the row vectors as a transpose of
column vectors. The dot product of those column
vectors, each of the corresponding column vectors,
with your vector x. So these are both completely
valid interpretations, and hopefully this video at least
gives you a working knowledge of what matrix multiplication
is. And even better, gives you a
little bit deeper sense of all of the different ways that
it can be interpreted.
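
The two interpretations above can be sketched in a few lines of plain Python. This is only an illustration under some assumptions: a matrix is stored as a list of rows, a vector as a flat list, and the function names `matvec_rows` and `matvec_columns` are made up for this sketch rather than taken from the video. It reuses the 2 by 4 example from earlier, so both views should reproduce the product worked out by hand.

```python
# Illustrative sketch: function names and the list-of-rows layout are assumptions.

def check_shapes(A, x):
    """The product is only defined if x has as many components as A has columns."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("x must have as many components as A has columns")


def matvec_rows(A, x):
    """Row view: entry i of the product is the dot product of row i of A with x."""
    check_shapes(A, x)
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def matvec_columns(A, x):
    """Column view: the product is x1*v1 + x2*v2 + ... + xn*vn,
    where vj is the j-th column of A."""
    check_shapes(A, x)
    m = len(A)
    result = [0] * m
    for j, xj in enumerate(x):
        column = [A[i][j] for i in range(m)]                    # column vector vj
        result = [r + xj * c for r, c in zip(result, column)]   # add xj * vj
    return result


# The 2 by 4 matrix and the 4-component vector from the worked example.
A = [[-3, 0, 3, 2],
     [1, 7, -1, 9]]
x = [2, -3, 4, -1]

print(matvec_rows(A, x))     # [4, -32]
print(matvec_columns(A, x))  # [4, -32]
```

Both functions return the same m-entry result; they differ only in whether the work is grouped by rows (m dot products) or by columns (a running sum of n scaled column vectors).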