# Matrix vector products

Defining and understanding what it means to take the product of a matrix and a vector. Created by Sal Khan.
Video transcript
In the last couple of videos, I already exposed you to the idea of a matrix, which is really just an array of numbers, usually a 2-dimensional array. Actually it's always a 2-dimensional array for our purposes. So if I have an m by n matrix, the m is just the number of rows, and then the n is just the number of columns. So let me write out the m by n matrix. So I'll just specify, let's have the m by n matrix A, it's a capital bold A. And it is equal to, I'll be as general as possible, first entry is in, I'll just call that lowercase a, it's in row 1 column 1. The next entry is row 1 column 2. And you go all the way to row 1 column n, you have n columns. And then when you go down, you go to the next row, it will be row 2 column 1. And then you keep going all the way down to row m column 1. And then of course, what? This entry is going to be, row 2, let me write that a little smaller, row 2 column 2. And you go all the way, and you're going to have row m column n. And so if you think about it, you're going to have how many total entries here? You're going to have m entries this way, n that way. So you're going to have m times n total entries. And I think you're pretty familiar with this idea already of a matrix, you probably saw this in your Algebra II classes. So what we want to do now in this video is relate our notion of a matrix to everything we already know about vectors. Or maybe introduce some operations that allow matrices and vectors to interact with each other. And maybe the most natural one is multiplication, or taking the product. So what I'm going to do in this video is define what it means when we take the product of our matrix A, of any matrix A, I've written this as general as possible, with some vector x. And our definition will only work if x, the vector we're multiplying A by, has the same number of components as A has columns. So this is only valid for an x that looks like this: x1, x2, all the way down to x n. 
So let me be very clear with this: this vector, I guess you could say, can have a different height than this matrix. What matters is that the same number of a's you have in this direction, you have n a's here, then you have n components of this vector right here. And if you have that constraint, if the length of your vector, or the number of components in your vector, is equal to the number of columns in your matrix, then we define this product to be equal to -- so this is my vector x -- so this is a definition. There's nothing in nature that told us it had to be defined this way. It's just that human beings, or mathematicians, decided that this is a useful convention to define the multiplication, or the product, of a matrix and a vector. So we'll define A times our vector x. These are both bold, this is a matrix, that's a vector. And the convention, if I didn't draw the little vector symbol, your textbooks would just bold out the x, so that it'll be a lowercase x. Lowercase is vector, uppercase is matrix, both of them are bolded. That tells you that you're not just dealing with regular numbers. So we're defining this to be equal to -- let me write it out fairly large. You're going to take each row, and we're going to show you that there are multiple ways to kind of visualize this, but it's going to be a11 times x1, let me write that down. So a11 times x1 plus a12 times x2, all the way to plus a1n times xn. So the product of this matrix, this m by n matrix and this n component vector, will be a new vector, the first entry of which is essentially each of these entries times a corresponding entry here, and you add them all up. And as you can see, that's already looking fairly similar to a dot product, and I'll discuss that in a second. But let me finish my definition before I start talking about what it means, or what it might be related to. So that was that first row right there, it'll just look like that. We just multiply that times this thing to get that row there. 
Now the second row -- I want to do it in a different color -- remember this is a definition. Human beings came up with this. Nothing about nature said we had to do it this way, but it's just nice and convenient. So our second row will have a21 times x1, we'll just do the whole thing over again, but this time we're multiplying this row times this column vector. So a21 times x1 plus a22 times x2 all the way until we get to -- I wanted to do that in magenta -- a2n times xn. So we multiplied this entire row times that entire column. This term times that term, plus this term times this term. All the way down to plus this last term times that last term. And we keep doing this for every row until we get to the m-th row, and then the m-th row will be am1. This is the m-th row first column. am1 times x1 plus -- it's hard to keep switching colors -- plus am2 times x2, all the way until we get to amn times xn. So what is this vector going to look like? Let's say we call this vector b. What does vector b look like? How many entries is it going to have? Well it has an entry for each row of this, right? We're taking each row and we're essentially taking the dot product of this row vector with this column vector. And I'll be a little bit more formal with the notation in a second. But I think you understand that this is a dot product. The first component times the first component plus the second component times the second component plus the third component times the third component, all the way to the n-th component times the n-th component. So this is essentially the dot product of this row vector with x. We've been writing all of our vectors as columns, so we could call them column vectors; here you're just writing them as rows. And we can be a little bit more specific with the notation in a second, but what's this going to look like? Well we're doing this m times, so we're going to have m entries. 
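For reference, the definition described so far can be written out in symbols. This is only a restatement of what was said above, using b for the resulting vector as in the transcript; the bold LaTeX notation is an editorial choice, not something shown in the video.

$$
\mathbf{A}\mathbf{x} =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\
\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n
\end{bmatrix}
= \mathbf{b}
$$

The vector on the right has one entry per row of A, so it has m entries.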
You're going to have b1, b2, all the way to bm. If you viewed these all as matrices, you can kind of view it as -- and this will eventually work for the matrix math we're going to learn -- this is an m by n matrix and we're multiplying it by -- how many rows does this guy have? He has n rows. He has n components, and he has 1 column. So m by n times an n by 1, you essentially can ignore these middle two terms, and they'll result with -- how many rows does this guy have? He has m rows, and 1 column. These middle two terms have to be equal to each other just for the multiplication to be defined, and then you're left with an m by 1 matrix. So this was all abstract, let me actually apply it to some actual numbers. But it's important to actually set the definition. Now that we have the definition we can apply it to some actual matrices and vectors. So let's say we have the matrix. Let's say I want to multiply the matrix minus 3, 0, 3, 2. Now I'll do this one in yellow. 1, 7, minus 1, 9. And I want to multiply that by the vector. Now how many components, or rows, does this vector have to have? Well my matrix times vector product, or multiplication, is only defined if my vector has as many components as this matrix has columns. So we have 1, 2, 3, 4 columns. So this guy's going to have 4 components for us even to be able to multiply them, otherwise it wouldn't be defined. So let me put 4 entries here. Let's say it's 2, minus 3, 4, and then minus 1. So what is this going to be equal to? The first term of this is going to be the dot product of this first row with this vector. And then the second entry is going to be the dot product of this row vector with this column. So let's do it. So it's going to be minus 3 times 2, I'm not going to color code it, minus 3 times 2 plus 0 times minus 3 plus 3 times 4 plus 2 times minus 1. 
And now my second row, or I guess my second component in this vector, is going to be 1 times 2 plus 7 times negative 3 plus minus 1 times 4 plus 9 times minus 1. And so what does this simplify to? This is equal to minus 3 times 2 is minus 6 plus 0 plus 12. This is 12. Minus 2. And then this is simplified to 2 minus 21 minus 4 minus 9. So this is equal to this top term, let's see, I have a minus 6 plus 12 is 6 minus 2 is 4. And then I have 2 minus 21 is minus 19. I want to make sure I get the math right here. Minus 21 minus 9 is minus 30 and I have a minus 34 and then I have a plus 2, so minus 32. So that's my product right there. And let me be very clear right here. Everything we've been used to right now, we've been writing our vectors as column vectors. But you can view each of these right here as a row vector. But let me be even better. Let's say that vector, let me call vector a, a1. So let me define vector a1 is equal to minus 3, 0, 3, 2. And let me define vector a2 to be equal to 1, 7, minus 1, 9. So all I did is I wrote these guys, but I wrote them in our standard vector form. I wrote them as column vectors. So what we can define to turn these guys into row vectors is the transpose function. In transpose, you just turn the rows into columns and the columns into rows. So if this is a1, then a1 transpose will just be the row version of this. So it's minus 3, 0, 3, 2. And then a2 transpose would be equal to 1, 7, minus 1, and 9. And then this multiplication right here, we can rewrite it as -- we have vector a1 transpose for the first row. These are vectors now, row vectors. And then this is a2 transpose. The transpose should be the super script. This vector can be written exactly like this because this is the first row, this is the second row. Times the vector, let me just call this vector x, that right there is vector x. We can now rewrite the definition as this would be equal to what? This first row right here that we wrote out, this was a1 dot x. 
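The numeric example above can be checked with a few lines of Python. This is a minimal sketch in plain Python, not anything shown in the video; the helper names `dot` and `matvec_by_rows` are made up for illustration. It follows the row view: one dot product per row of the matrix.

```python
def dot(u, v):
    """Dot product of two equal-length lists of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec_by_rows(A, x):
    """Product A x, computed as one dot product per row of A."""
    return [dot(row, x) for row in A]


# The 2 by 4 matrix and the 4-component vector from the example.
A = [[-3, 0, 3, 2],
     [1, 7, -1, 9]]
x = [2, -3, 4, -1]

b = matvec_by_rows(A, x)
print(b)  # [4, -32]
```

Passing a vector with the wrong number of components raises an error, which mirrors the rule that the product is only defined when x has as many components as A has columns.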
You know all about the dot products. The first row was a1 dot x. It's minus 3 times 2 plus 0 times minus 3 plus 3 times 4 plus 2 times minus 1. It's a1 dot x. And this is useful because when I defined the dot product, I only defined it with column vectors like this. And I'm dotting 2 column vectors. I haven't formally defined a row vector times a column vector. So now I can say if this is just a standard column vector, like we've been working with, I can write my matrix so that each row is the transpose of a column vector, or it's a row vector. Then I can write this product as just the dot product of each of those column vectors, the ones you get by transposing the rows back into columns, with this vector right here. And then obviously the second row is going to be a2 dot x. The second row is a2 dot x, which is 1 times 2 plus 7 times minus 3 plus minus 1 times 4 plus 9 times minus 1. So just like that. So this is one way to view it. Matrix times the vector is just like the transpose of its rows dotted with the vector you're multiplying it by. This is one way to perceive matrix multiplication. Now the other way to perceive it -- let me do it with a different example. Those numbers are getting a little bit tiresome. Let's say I have the matrix A, nice and bold, is equal to 3, 1, 0, 3, 2, 4, 7, 0, minus 1, 2, 3, and 4. And I need to multiply this times a 4 component vector. So let me call vector x is equal to x1, x2, x3, and x4. Now instead of viewing these as row vectors, we could view A as a set of column vectors. We could call this thing right here vector v1. We call this thing right here vector v2. We call this thing right here vector v3. And we call this thing right here vector v4. Then we could rewrite our matrix A as being equal to just a bunch of column vectors. So we could rewrite it as v1, v2, v3, and v4. So how can the matrix multiplication be interpreted in this context? Well what did we do? When we multiply these guys, all of the elements in here always get multiplied by x1. 
Let me start some of the multiplication here, just from our definition. So if I multiply A times x, I'll start it off, maybe I won't do the whole thing. I just want you to see the pattern. It's 3 times x1 plus 1 times x2 plus 0 times x3 plus 3 times x4. That's the first entry. And then you have 2 times x1 plus 4 times x2 all the way. And then you finally have minus 1 times x1 plus 2 times x2. You get the idea. But what's happening here? This first vector is always being multiplied by the scalar x1. In fact you can view this part of the entries right here. We're just multiplying this guy times the scalar of x1 in every case. You have 3, 2, minus 1, 3, 2, minus 1. We're multiplying by the scalar of x1. And then we're adding that to this guy times the scalar x2 and then we're adding that to this guy times the scalar x3. So we can rewrite A times x as being equal to the scalar x1 times the vector v1 plus the scalar x2. This is the scalar x1 times the vector v1 plus the scalar x2 times the vector v2. I want to do that in yellow. Plus x3 times the vector v3 plus the scalar x4 times the vector v4. And obviously if we had n terms here, we'd have to have n vectors here, and we could just make this more general to n. But what's interesting here is now the product Ax can be interpreted as a linear combination. These are just arbitrary numbers depending on what our vector x is. So depending on our vector x, we're taking a linear combination of the column vectors of A. So this is a linear combination of column vectors of A. So this is really interesting. I'm sure you've been exposed to matrix multiplication in the past. But I really want you to absorb these two ways of interpreting it, because they'll be important when we talk about column spaces and things like that in the future. Actually there's other ways you can actually interpret that as a transformation of this vector x. But I won't cover that in this video just for brevity. 
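The column view can also be sketched in Python. The function name `matvec_by_columns` and the numeric values chosen for x are assumptions made for illustration only; the transcript keeps x1 through x4 symbolic. The code scales each column of A by the matching component of x and adds the scaled columns together.

```python
def matvec_by_columns(A, x):
    """Product A x, built up as x[0]*column_0 + x[1]*column_1 + ..."""
    m, n = len(A), len(A[0])
    if len(x) != n:
        raise ValueError("x needs one component per column of A")
    result = [0] * m
    for j in range(n):
        # Pull out column j of A (v1, v2, ... in the transcript).
        column_j = [A[i][j] for i in range(m)]
        # Add x[j] times that column to the running total.
        for i in range(m):
            result[i] += x[j] * column_j[i]
    return result


# The 3 by 4 matrix from the second example.
A = [[3, 1, 0, 3],
     [2, 4, 7, 0],
     [-1, 2, 3, 4]]
x = [1, 2, 3, 4]  # hypothetical values, not from the video

print(matvec_by_columns(A, x))  # [17, 31, 28]
```

Computing the same product row by row, as dot products, gives the same three numbers, which is the point of having two interpretations of one definition.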
But you can interpret it as a weighted combination, or a linear combination, of the column vectors of A, where the vector x dictates what the weights on each of the columns are. Or you can interpret it as, essentially, the dot product of the row vectors, or you could define the row vectors as the transposes of column vectors. Then it's the dot product of those column vectors, each of the corresponding column vectors, with your vector x. So these are both completely valid interpretations, and hopefully this video at least gives you a working knowledge of what matrix multiplication is. And even better, gives you a little bit deeper sense of all of the different ways that it can be interpreted.
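As a compact summary, the two interpretations can be written side by side. The symbols follow the transcript's naming: the rows of A are the transposes of column vectors a1 through am, and the columns of A are v1 through vn. The LaTeX layout itself is an editorial sketch.

$$
\mathbf{A}\mathbf{x}
=
\begin{bmatrix}
\mathbf{a}_1^{T} \\ \mathbf{a}_2^{T} \\ \vdots \\ \mathbf{a}_m^{T}
\end{bmatrix}
\mathbf{x}
=
\begin{bmatrix}
\mathbf{a}_1 \cdot \mathbf{x} \\ \mathbf{a}_2 \cdot \mathbf{x} \\ \vdots \\ \mathbf{a}_m \cdot \mathbf{x}
\end{bmatrix}
\qquad\text{and}\qquad
\mathbf{A}\mathbf{x}
=
\begin{bmatrix}
\mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_n
\end{bmatrix}
\mathbf{x}
=
x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + \cdots + x_n\mathbf{v}_n
$$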