Matrix vector products as linear transformations

Matrix Vector Products as Linear Transformations. Created by Sal Khan.

Video transcript

I think you're pretty familiar with the idea of matrix vector products, and what I want to do in this video is show you that taking a product of a vector with a matrix is equivalent to a transformation. It's actually a linear transformation. Let's say we have some matrix A and let's say that its columns are the column vectors v1, v2, all the way to vn. So this guy has n columns. Let's say it has m rows. So it's an m by n matrix. And let's say I define some transformation. Let's say my transformation goes from Rn to Rm. Rn is the domain. I can take any vector in Rn and it will map it to some vector in Rm. And I define my transformation. So T of x, where x is some vector in Rn, is equal to A-- this is this A. Let me write it in this color right here. And it should be bolded. I kind of get careless sometimes with the bolding. But big bold A times the vector x. So the first thing you might say is, Sal, this transformation looks very odd relative to how we've been defining transformations or functions so far. So the first thing we have to just feel comfortable with is the idea that this is a transformation. So what are we doing? We're taking something from Rn and then what does A x produce? If we write A x like this, where x is x1, x2, and so on, it's going to have n terms because it's in Rn. This can be rewritten as x1 times v1 plus x2 times v2, all the way to xn times vn. So it's going to be a sum of a bunch of these column vectors. And each of these column vectors, v1, v2, all the way to vn, what set are they members of? This is an m by n matrix, so the matrix has m rows, and each of these column vectors will have m entries. So all of these guys are members of Rm. So if I just take a linear combination of all of these guys, I'm going to get another member of Rm. So this guy right here is going to be a member of Rm, another vector.
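To make the column picture concrete, here is a minimal Python sketch of A x computed as x1 times v1 plus x2 times v2, and so on. The function name `matvec_by_columns` and the particular 3 by 2 matrix are made up for illustration; they are not from the video.

```python
def matvec_by_columns(columns, x):
    """Compute A x as x1*v1 + x2*v2 + ... + xn*vn.

    columns is the list [v1, ..., vn] of the columns of A (each of length m),
    and x is a vector with n entries.
    """
    m = len(columns[0])
    result = [0] * m
    for x_j, v_j in zip(x, columns):
        for i in range(m):
            result[i] += x_j * v_j[i]  # add x_j times the i-th entry of column v_j
    return result


# Hypothetical example: m = 3 rows, n = 2 columns, so T maps R2 to R3.
v1 = [1, 0, 2]
v2 = [-1, 3, 1]
x = [4, 5]

image = matvec_by_columns([v1, v2], x)
print(image)  # [-1, 15, 13], a vector with m = 3 entries
```

The input has n = 2 entries and the output has m = 3 entries, which matches the idea of a mapping from Rn to Rm.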
So clearly, by multiplying my vector x times A, I'm creating a mapping from Rn-- and let me pick another color-- to Rm. And I'm saying it in very general terms. Maybe n is 3, maybe m is 5. Who knows? But I'm saying it in very general terms. And so if this is a particular instance, a particular member of the set Rn, so it's that vector, our transformation or our function is going to map it to this guy right here. And this guy will be a member of Rm and we could call him A x. Or maybe if we said A x equaled b we could call him the vector b-- whatever. But this is our transformation mapping. So this does fit our kind of definition or our terminology for a function or a transformation as a mapping from one set to another. But it still might not be satisfying because everything we saw before looked kind of like this. If we had a transformation I would write it like the transformation of-- I would write, you know, x1 and x2 and xn is equal to, and I'd write m terms here separated by commas. How does this relate to that? And to show that I'll do a specific example. So let's say that I had the matrix-- let me use a different letter. Let's say I have my matrix B and it is a fairly simple matrix. Its first row is 2, minus 1 and its second row is 3, 4. And I define some transformation. So I define some transformation T. And it goes from R2 to R2. And I define T. T of some vector x is equal to this matrix, B times that vector x. Now what would that equal? Well the matrix is right there. Let me write it in purple. 2, minus 1, 3, and 4 times x. x1, x2. And so what does this equal? Well this equals another vector. It equals a vector in the co-domain R2 where the first term is 2 times x1. I'm just doing the definition of matrix vector multiplication. 2 times x1 plus minus 1 times x2, or minus x2. That's that row times our vector. And then the second row times that vector. We get 3 times x1. Plus 4 times x2. So this is what we might be more familiar with.
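As a quick check of this example, the sketch below writes the transformation both ways: once as the matrix B times x, and once as the component formula with 2 x1 minus x2 in the first entry and 3 x1 plus 4 x2 in the second. The function names `T_matrix` and `T_formula` and the test vector are assumptions chosen for illustration.

```python
# The matrix B from the example: first row 2, -1; second row 3, 4.
B = [[2, -1],
     [3, 4]]


def T_matrix(x):
    """T(x) = B x, computed row by row (each row dotted with x)."""
    return [sum(b * x_j for b, x_j in zip(row, x)) for row in B]


def T_formula(x1, x2):
    """The same transformation written out component by component."""
    return [2 * x1 - x2, 3 * x1 + 4 * x2]


x = [1, 2]  # an arbitrary test vector in R2
print(T_matrix(x))    # [0, 11]
print(T_formula(*x))  # [0, 11]
```

Both forms give the same output for any input, which is the point of the example: the matrix product is just another way of writing the same transformation.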
I could rewrite this transformation as T of x1 x2 is equal to 2x1 minus x2 comma-- let me scroll over a little bit, comma 3x1 plus 4x2. So hopefully you're satisfied that a matrix multiplication, it isn't some new, exotic form of transformation. That they really are just another way. This statement right here is just another way of writing this exact transformation right here. Now, the next question you might ask and I already told you the answer to this at the beginning of the video is, is multiplication by a matrix always going to be a linear transformation? Now what are the two constraints for being a linear transformation? We know that the transformation of two vectors, a plus b, the sum of two vectors should be equal to the sum of their transformations. The transformation of a plus the transformation of b. And then the other requirement is that the transformation of a scaled version of a vector should be equal to a scaled version of the transformation. These are our two requirements for being a linear transformation. So let's see if matrix multiplication applies there. And I've touched on this in the past and I've even told you that you should prove it. I've already assumed you know it, but I'll prove it to you here because I'm tired of telling you that you should prove it. I should do it at least once. So let's see, matrix multiplication. If I multiply a matrix A times some vector x, we know that-- let me write it this way. We know that this is equivalent to-- I said our matrix. Let's say this is an m by n matrix. We can write any matrix as just a series of column vectors. So this guy could have n column vectors. So let's say it's v1, v2, all the way to vn column vectors. And each of these guys are going to have m components. Times x1, x2, all the way down to xn. And we've seen this multiple, multiple times before. This, by the definition of matrix vector multiplication is equal to x1 times v1. That times that. 
This scalar times that vector plus x2 times v2, all the way to plus xn times vn. This was by definition of a matrix vector multiplication. And of course-- and I did this at the top of the video-- this vector right here is going to be a member of Rm. It's going to have m components. So what happens if I take some matrix A, some m by n matrix A, and I multiply it times the sum of two vectors a plus b? So I could rewrite this as this thing right here. So it's my matrix A times the sum of a plus b, where the first term will just be a1 plus b1. Second term is a2 plus b2, all the way down to a n plus bn. This is the same thing as this. I'm not saying A of a plus b. I'm saying A times. Maybe I should put a dot right there. I'm multiplying the matrix. I want to be careful with my notation. This is the matrix vector multiplication. It's not some type of new matrix dot product. But this is the same thing as this multiplication right here. And based on what I just told you up here, which we've seen multiple, multiple times, this is the same thing as a1 plus b1 times the first column in A, which is that vector right there. This A is the same as this A. So times v1. Plus a2 plus b2 times v2, all the way to plus a n plus bn times vn. Each xi term here is just being replaced by an ai plus bi term. So the x1 here is replaced by a1 plus b1 here. This is equivalent to this. And then from the fact that scalar multiplication of vectors exhibits the distributive property, we can say that this is equal to a1 times v1. Let me actually write all of the a1 terms. Let me write this. a1 times v1 plus b1 times v1 plus a2 times v2 plus b2 times v2, all the way to plus a n times vn plus bn times vn. And then if we just re-associate this, if we just group all of the a terms together, we get-- let me write it this way. a1 times v1 plus a2 times v2 plus, all the way, a n times vn.
I just grabbed all the a terms. We get that plus all the b terms. All the b terms I'll do in this color. All the b terms are like that. So plus b1 times v1 plus b2 times v2, all the way to plus bn times vn. That's that guy right there. This is equivalent to this statement up here; I just regrouped everything, which is of course equivalent to that statement over there. But what's this equal to? These v's are, remember, the columns of the matrix capital A. So this is equal to the matrix capital A times a1, a2, all the way down to a n, which was our vector a. And what's this equal to? Again these v's are the columns of A, so it's equal to plus the matrix A times my vector b. b1, b2, all the way down to bn. This is my vector b. We just showed you that if I add my two vectors, a and b, and then multiply the sum by the matrix, it's completely equivalent to multiplying each of the vectors times the matrix first and then adding them up. So we've satisfied-- and this is for any m by n matrix-- we've now satisfied this first condition right there. And then what about the second condition? And this one's even more straightforward to understand. Let me write it this way. The matrix capital A times the vector c times lowercase a. So I'm multiplying my vector times the scalar first. This is equal to-- I can write my big matrix A. I've already labeled its columns. It's v1, v2, all the way to vn. That's my matrix A. And then, what does ca look like? ca, you just multiply the scalar times each of the terms of a. So it's ca1, ca2, all the way down to c a n. And what does this equal? We know this, we've seen this multiple times before right there. So it just equals-- I'll write a little bit lower. That equals c a1 times this column vector, times v1. Plus c a2 times v2, this guy, all the way to plus c a n times vn.
And if you just factor this c out, once again, scalar multiplication times vectors exhibits the distributive property. I believe I've done a video on that, but it's very easy to prove. So this will be equal to c times-- I'll just stay in one color right now-- a1 v1 plus a2 v2 plus all the way to a n vn. And what is this thing equal to? Well that's just our matrix uppercase A times our vector lowercase a. Maybe I'm overloading the letter A. My matrix uppercase A times my vector lowercase a. Where the lowercase a is just this thing right here, a1, a2 and so forth. This thing up here was the same thing as that. So I just showed you that if I take my matrix and multiply it times some vector that was multiplied by a scalar first, that's equivalent to first multiplying the matrix times the vector and then multiplying by the scalar. So we've shown you that matrix vector products satisfy this condition of linear transformations and this condition. So the big takeaway right here, and this is an important takeaway, is that matrix multiplication, or a matrix product with a vector, is always a linear transformation. And this is a bit of a side note. In the next video I'm going to show you that any linear transformation-- this is incredibly powerful-- can be represented by a matrix product. Any linear transformation on any vector can be equivalently, I guess, written as a product of that vector with a matrix. That has huge repercussions and you know, just as a side note, kind of tying this back to your everyday life. You have your Xbox, your Sony Playstation and you know, you have these 3D graphic programs where you're running around and shooting at things.
And the way that the software renders those programs where you can see things from every different angle, you have a cube, then if you kind of move this way a little bit, the cube will look more like this and it gets rotated, and you move up and down. These are all transformations of vectors, or the positions of vectors, and I'll do that in a lot more detail. And all of that is really just matrix multiplication. So all of these things that you're doing in your fancy 3D games on your Xbox or your Playstation, they're all just matrix multiplications. And I'm going to prove that to you in the next video. And so when you have these graphics cards or these graphics engines-- you know, we're jumping away from the theoretical-- all these graphics processors are is hardwired matrix multipliers. If I have just a generalized CPU of some type, I have to write in software how to multiply matrices. But if I'm making an Xbox or something and 99% of what I'm doing is just rotating these abstract objects and displaying them in transformed ways, I should have a dedicated piece of hardware, a chip, where all it does-- it's hardwired into it-- is multiply matrices. And that's what those graphics processors or graphics engines really are.
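Going back to the two conditions for a linear transformation that were proved in this video, here is a small Python sketch that checks both of them numerically for one particular matrix. The matrix, the two vectors, the scalar, and the helper names `matvec`, `add` and `scale` are all made up for illustration. A check like this on one example is not a proof; it only shows the two properties in action.

```python
def matvec(A, x):
    """Matrix vector product: each row of A dotted with x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def add(u, v):
    """Entry by entry sum of two vectors of the same length."""
    return [u_i + v_i for u_i, v_i in zip(u, v)]


def scale(c, v):
    """Scalar c times the vector v."""
    return [c * v_i for v_i in v]


# Hypothetical 2 by 3 matrix, so the transformation goes from R3 to R2.
A = [[1, 2, 0],
     [0, -1, 3]]
a = [1, 2, 3]
b = [4, 0, -1]
c = 5

# First condition: A times (a + b) equals A a plus A b.
lhs_add = matvec(A, add(a, b))
rhs_add = add(matvec(A, a), matvec(A, b))
print(lhs_add, rhs_add)  # [9, 4] [9, 4]

# Second condition: A times (c a) equals c times (A a).
lhs_scale = matvec(A, scale(c, a))
rhs_scale = scale(c, matvec(A, a))
print(lhs_scale, rhs_scale)  # [25, 35] [25, 35]
```

Integer entries are used so that both sides can be compared with exact equality; with floating point entries a small tolerance would be needed.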