
Matrix vector products as linear transformations

Created by Sal Khan.

  • At 6 minutes or so you say there are only 2 things a linear transformation must satisfy: preserve scalar multiplication and addition. What about the 0 vector mapping to the 0 vector in the new dimension? I thought that was one too. GREAT video though. Not trying to be a critic, just curious actually. Khan Academy is an awesome project. Keep it up!
    (56 votes)
    • AlexHuang861
      the 0 vector mapping to the 0 vector can be derived from the preservation of scalar multiplication and vector addition properties. Take T(0) = T(0 + 0) = T(0) + T(0) by definition of the addition property. Then you can subtract T(0) from both sides and you get T(0) = 0 as you wanted.
      (91 votes)
  • Petrie (Peter S. Asiain III)
    How exactly are matrices used in computer science or physics? I mean yeah I heard that it is related to graphics in computer science and it is related to vector quantities in physics, but how do I exactly apply matrices to these? Someone please give an example either in computer science or physics and explain to me exactly how do we work with matrices. Thanks in advance!
    (10 votes)
  • Bleakwise
    Also, some people (like myself) work much better with tangible objects than all these laws, rules and properties. If I could "see" linear transformations geometrically, graphed out and visualized, the theory would be much more digestible.
    (4 votes)
    • Bleakwise
      Oh wow, we just drop the results of the sin/cos/tan functions in the rotation matrix? Seems simple enough.

      What I am confused about is in how we decided to use these specific trig functions....
      that is
      [Cos(theta) , -Sin(theta)]
      [Sin(theta) , Cos(theta)]

      I understand vertical V1 is multiplied by X and vertical V2 is multiplied by Y, but still don't see how they were built.

      Does the "arrangement" the trig functions are in ever change (when doing rotations)? I guess I don't see how you arrived at that matrix so I'm taking you up on your offer :), that is, I'm confused about how you picked which trig functions to use in the matrix. I recognize the results of the trig functions fine (I'm more familiar with SOHCAHTOA aka hypSin(theta) or hypCos(theta), not xCos(theta) or -ySin(theta)).

      I see Wikipedia has a sheet on various R2 matrix calculations, but I'm still lost as to how those matrices were derived. I hope you're more clear than Wiki as I mostly work in R3, and I will need to calculate rotations of Z as well.

      I think the key lies in figuring out how to do any kind of transformation, not just rotations. It appears that, for example, in R2 the matrix has the form
      [x transformation, y transformation]
      [x transformation, y transformation]

      Reading your response below, a R3 rotation would be described in a 4x4 matrix?
      [x transformation, y transformation, z transformation, w transformation]
      [x transformation, y transformation, z transformation, w transformation]
      [x transformation, y transformation, z transformation, w transformation]
      [x transformation, y transformation, z transformation, w transformation]
      (2 votes)
  • Bleakwise
    I would really like to see a demonstration on using Linear Transformations to describe a rotation and a relocation in a 3d space.

    Would I need a 3x3 matrix to do that? A 3x4? All this theory is fine and well, but some examples on specific applications such as the ones mentioned above would be great.
    (4 votes)
    • newbarker
      If you had an object in 3D space, with a 3x3 matrix you can rotate, scale, stretch, flip, project. You cannot translate it (relocate). You don't need a 4x4 to translate. You could do that with a 3x4 as you suggest. A 3x4 would be very inconvenient though. As it isn't square, it wouldn't have an inverse. Quite often we want to do the opposite transform and the inverse matrix is handy in that it undoes the transformation. Another thing we want to do is combine transformations into 1 transformation. In the matrix world, we do this by multiplying the transformation matrices together. A 4x4 entity means they can be combined easily. The product of two 3x4 matrices on the other hand isn't even defined.

      N.B. When using a 4x4 matrix, the 3D points are typically augmented with an extra coordinate we call w. w is typically set to 1. This augmentation is required to allow the product of a 4x4 matrix and a 4x1 vector to be defined.

      I don't have any examples to point you at right now, but if I find some I'll edit this answer.
      (6 votes)
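As a rough illustration of the 4x4 approach described in this answer, here is a minimal sketch in plain Python. The helper names (`matmul`, `transform`) and the specific rotation and translation values are assumptions chosen for the example, not anything from the video. It builds a 90 degree rotation about the z axis and a translation, combines them by matrix multiplication, and applies the result to a point augmented with w = 1.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def transform(M: Matrix, point: Tuple[int, int, int]) -> Tuple[int, ...]:
    """Apply a 4x4 matrix to a 3D point augmented with w = 1."""
    augmented = [point[0], point[1], point[2], 1]
    result = [sum(m * c for m, c in zip(row, augmented)) for row in M]
    return tuple(result[:3])


# Rotation by 90 degrees about the z axis (cos = 0, sin = 1), no translation.
rotate_z90 = [[0, -1, 0, 0],
              [1,  0, 0, 0],
              [0,  0, 1, 0],
              [0,  0, 0, 1]]

# Translation by (5, 0, 2); the offsets live in the fourth column.
translate = [[1, 0, 0, 5],
             [0, 1, 0, 0],
             [0, 0, 1, 2],
             [0, 0, 0, 1]]

# Rotate first, then translate: the rightmost matrix acts first.
combined = matmul(translate, rotate_z90)

print(transform(combined, (1, 0, 0)))  # (5, 1, 2)
```

Because both matrices are 4x4, their product is again 4x4, which is the convenience the answer points out: any chain of rotations and translations collapses into one matrix.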
  • Joel
    At , the matrix multiplication he performs does not make sense to me. It looks like at first he's treating v1, v2, v3... as the column vectors of matrix A, which would have dimension 1xm (causing it to have the expected mxn dimensions, as there are n vectors), but then he multiplies them by the x vector, which is an nx1 matrix. You cannot perform matrix multiplication between a 1xm and an nx1 matrix. Am I overlooking something?
    (5 votes)
    • Gobot
      A has n vectors, which are each m x 1. So you can't multiply them by x as a vector (as x is n x 1), but that is not what is happening. He is multiplying them by the elements of x, so x1, x2 to xn, and then summing the result. Each element of x is just a scalar, which obviously can multiply the vector columns of A. This is just another way to go through the mechanics of multiplying.. using the elements of x as coefficients of the vectors of A, and it gives the same answer as doing it say the dot product way.
      (3 votes)
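The two ways of computing the product mentioned in this answer can be compared directly. The following is a small sketch in plain Python; the function names and the 2x3 example matrix are made up for illustration. One function takes the dot product of each row with x, the other scales each column of A by the matching entry of x and sums the results.

```python
from typing import List

Matrix = List[List[int]]  # stored as a list of rows
Vector = List[int]


def matvec_rows(A: Matrix, x: Vector) -> Vector:
    """Entry i of A x is the dot product of row i with x."""
    return [sum(a * x_j for a, x_j in zip(row, x)) for row in A]


def matvec_columns(A: Matrix, x: Vector) -> Vector:
    """A x as x1*v1 + x2*v2 + ... + xn*vn, where vj is column j of A."""
    m = len(A)
    result = [0] * m
    for j, x_j in enumerate(x):
        for i in range(m):
            result[i] += x_j * A[i][j]  # add x_j times column j
    return result


A = [[1, 2, 3],
     [4, 5, 6]]   # m = 2 rows, n = 3 columns, each column is in R^2
x = [1, 0, -1]    # a vector in R^3

print(matvec_rows(A, x))     # [-2, -2]
print(matvec_columns(A, x))  # [-2, -2]
```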
  • learnwalksleeprepeat
    I have an extremely basic question ...
    Is multiplying a matrix with a vector the same as multiplying a vector with a matrix (i.e. does the order matter?)
    Sal says in the beginning of this video that "taking a product of a vector with a matrix is equivalent to a transformation" ... should that sentence be "taking a product of a matrix with a vector is equivalent to a transformation."

    Sorry about nit-picking on possibly trivial elements ... it's because one does not know if something is important or not until one has fully surveyed the subject :)
    (3 votes)
    • Derek M.
      Order of matrix multiplication does matter.
      Transforming a vector x by a matrix A is mathematically written as Ax, and can also be described by:
      "Left multiplying x by A."
      Sometimes when the context is clear, when we say "multiplying of x by A", it is clear and obvious we mean left multiplication, i.e. Ax.
      (2 votes)
  • Noah Schwartz
    Does every matrix A have a matrix B (where A != B) such that Ax = y and Bx = y?
    For example, take Sal's 2x2 matrix B = [2, -1; 3, 4], where a = 2, b = -1, c = 3, and d = 4. The matrix vector product Bx was equal to [2x_1 - x_2; 3x_1 + 4x_2]. However, if we had a new matrix A with a = -x_2/x_1, b = 2x_1/x_2, c = 4x_2/x_1, and d = 3x_1/x_2, then, for any x_1 and x_2, Ax = Bx = y.
    Is this right, and if so, what does it mean when you deal with matrix inverses? If you have C as an inverse, and you do Cy = x, do there exist many possible C's where Cy = x instead of only one C?

    Thanks.
    (2 votes)
    • Nabil Daoud
      Not quite. What you have shown is that two different matrices can transform a specific vector to the same image. By making your "new matrix A" (matrix B) dependent on the vector this holds only for the specific vector. (Also notice that your new matrix falls apart if x_1 or x_2 = 0. I think if your construction does not work for x = [1 0] and x = [0 1], then you're looking for trouble.)

      You should try a specific example for x_1 and x_2 != 0. You'll get two distinct matrices A != B that will transform your x_1 and x_2 to the same x_1' and x_2'. Yay! so far so good, but the two matrices, A and B will not transform a different x_1 and x_2 to the same image.

      Consider for example that both a rotation and a reflection can take a specific vector to the same image, but will not take all vectors (the entire space) to the same image.

      Some reflections transform specific vectors to the same vector, but that does not mean that they are the identity transformation.
      (2 votes)
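The rotation versus reflection point in this answer can be seen with a concrete pair of matrices. This is only a sketch: the particular matrices (a 90 degree counterclockwise rotation and a reflection across the line y = x) are chosen here for illustration and are not from the thread.

```python
from typing import List


def matvec(A: List[List[int]], x: List[int]) -> List[int]:
    """Matrix-vector product, with A stored as a list of rows."""
    return [sum(a * x_j for a, x_j in zip(row, x)) for row in A]


rotation = [[0, -1],
            [1,  0]]   # rotate 90 degrees counterclockwise
reflection = [[0, 1],
              [1, 0]]  # reflect across the line y = x

e1 = [1, 0]
e2 = [0, 1]

# Both matrices send e1 to the same image...
print(matvec(rotation, e1), matvec(reflection, e1))  # [0, 1] [0, 1]
# ...but they disagree on e2, so they are different transformations.
print(matvec(rotation, e2), matvec(reflection, e2))  # [-1, 0] [1, 0]
```

Two matrices that agree on every vector, in particular on e1 and e2, have the same columns and are therefore the same matrix.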
  • Stefan
    Why are we checking whether things are linear transformations? Are there some perks to being linear?
    (2 votes)
    • Tejas
      Linear transformations are the simplest, and cover a very wide range of possible transformations of vectors. On the other hand, non-linear transformations do not work very well if you change your coordinate grid, making them very rare. But the main reason is that a linear transformation can always be represented as a matrix-vector product, which allows some neat simplifications.
      (2 votes)
  • fel.weathers
    Can you have a linear transformation if there is an absolute value in the transformation matrix?
    (1 vote)
  • Peter Barke
    On , why did Sal insist on writing a bold A?
    I thought only vectors were bolded?
    (2 votes)
    • Matthew Daly
      A[ v ] is a symbol that references a vector (since it is a member of R^m), so it makes sense to make it bold. But, as Sal suggests, he forgets to do it often and the world doesn't collapse. Indeed, if it were up to me, he would have made T bold as well, since it is a vector function, but meh.
      (1 vote)

Video transcript

I think you're pretty familiar with the idea of matrix vector products, and what I want to do in this video is show you that taking a product of a vector with a matrix is equivalent to a transformation. It's actually a linear transformation. Let's say we have some matrix A and let's say that its columns are v1, v2, all the way to vn. So this guy has n columns. Let's say it has m rows. So it's an m by n matrix.

And let's say I define some transformation. Let's say my transformation goes from Rn to Rm. This is the domain. I can take any vector in Rn and it will map it to some vector in Rm. And I define my transformation. So T of x, where this is some vector in Rn, is equal to A-- this is this A. Let me write it in this color right here. And it should be bolded. I kind of get careless sometimes with the bolding. But big bold A times the vector x.

So the first thing you might say is, Sal, this transformation looks very odd relative to how we've been defining transformations or functions so far. So the first thing we have to just feel comfortable with is the idea that this is a transformation. So what are we doing? We're taking something from Rn, and then what does A x produce? If we write A x like this, if this is x where it's x1, x2, all the way to xn. It's going to have n terms because it's in Rn. This can be rewritten as x1 times v1 plus x2 times v2, all the way to xn times vn. So it's going to be a sum of a bunch of these column vectors.

And each of these column vectors, v1, v2, all the way to vn, what set are they members of? This is an m by n matrix, so the matrix has m rows, and each of these column vectors will have m entries. So all of these guys are members of Rm. So if I just take a linear combination of all of these guys, I'm going to get another member of Rm. So this guy right here is going to be a member of Rm, another vector.
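Written out, the rewriting of A x described above looks like the following. The general statement uses the transcript's own symbols; the small numerical case (m = 3, n = 2) is a made-up example added only to make the bookkeeping concrete.

```latex
% General form: A is m-by-n with columns v_1, ..., v_n, each in R^m.
A\mathbf{x}
  = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_n \end{bmatrix}
    \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
  = x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + \cdots + x_n\mathbf{v}_n \in \mathbb{R}^m

% Hypothetical example with m = 3, n = 2:
\begin{bmatrix} 1 & 2 \\ 0 & 1 \\ 4 & -1 \end{bmatrix}
\begin{bmatrix} 3 \\ 5 \end{bmatrix}
  = 3\begin{bmatrix} 1 \\ 0 \\ 4 \end{bmatrix}
  + 5\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}
  = \begin{bmatrix} 13 \\ 5 \\ 7 \end{bmatrix}
```

The input has n = 2 entries and the output has m = 3 entries, which is the mapping from Rn to Rm.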
So clearly, by multiplying my vector x times A, I'm creating a mapping from Rn-- and let me pick another color-- to Rm. And I'm saying it in very general terms. Maybe n is 3, maybe m is 5. Who knows? But I'm saying it in very general terms. And so if this is a particular instance, a particular member of the set Rn, so it's that vector, our transformation or our function is going to map it to this guy right here. And this guy will be a member of Rm and we could call him A x. Or maybe if we said A x equaled b we could call him the vector b-- whatever. But this is our transformation mapping.

So this does fit our kind of definition or our terminology for a function or a transformation as a mapping from one set to another. But it still might not be satisfying because everything we saw before looked kind of like this. If we had a transformation I would write it like the transformation of-- I would write, you know, x1 and x2 and xn is equal to. I'd write m terms here in commas. How does this relate to that? And to do that I'll do a specific example.

So let's say that I had the matrix-- let me use a different letter. Let's say I have my matrix B and it is a fairly simple matrix. It's 2, minus 1 in the first row and 3, 4 in the second row. And I define some transformation T. And it goes from R2 to R2. And I define T. T of some vector x is equal to this matrix, B times that vector x. Now what would that equal? Well the matrix is right there. Let me write it in purple. 2, minus 1, 3, and 4 times x. x1, x2. And so what does this equal? Well this equals another vector. It equals a vector in the co-domain R2 where the first term is 2 times x1-- I'm just doing the definition of matrix vector multiplication-- 2 times x1 plus minus 1 times x2, or minus x2. That's that row times our vector. And then the second row times that vector. We get 3 times x1 plus 4 times x2. So this is what we might be more familiar with.
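The matrix B example can be checked with a few lines of plain Python. This is a sketch, and the function names `T_matrix` and `T_formula` are invented here: one computes B times x row by row, the other uses the component formula (2x1 - x2, 3x1 + 4x2), and the two agree on every test vector.

```python
from typing import Tuple

B = [[2, -1],
     [3,  4]]


def T_matrix(x1: int, x2: int) -> Tuple[int, int]:
    """T(x) = B x, computed as each row of B times the vector (x1, x2)."""
    return (B[0][0] * x1 + B[0][1] * x2,
            B[1][0] * x1 + B[1][1] * x2)


def T_formula(x1: int, x2: int) -> Tuple[int, int]:
    """The same transformation written component by component."""
    return (2 * x1 - x2, 3 * x1 + 4 * x2)


# A handful of arbitrary test vectors in R^2.
for x in [(1, 0), (0, 1), (3, -2), (-5, 7)]:
    assert T_matrix(*x) == T_formula(*x)

print(T_matrix(3, -2))  # (8, 1)
```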
I could rewrite this transformation as T of x1 x2 is equal to 2x1 minus x2 comma-- let me scroll over a little bit, comma 3x1 plus 4x2. So hopefully you're satisfied that a matrix multiplication isn't some new, exotic form of transformation. It really is just another way of writing it. This statement right here is just another way of writing this exact transformation right here.

Now, the next question you might ask, and I already told you the answer to this at the beginning of the video, is: is multiplication by a matrix always going to be a linear transformation? Now what are the two constraints for being a linear transformation? We know that the transformation of the sum of two vectors, a plus b, should be equal to the sum of their transformations: the transformation of a plus the transformation of b. And then the other requirement is that the transformation of a scaled version of a vector should be equal to a scaled version of the transformation. These are our two requirements for being a linear transformation.

So let's see if matrix multiplication applies there. And I've touched on this in the past and I've even told you that you should prove it. I've already assumed you know it, but I'll prove it to you here because I'm tired of telling you that you should prove it. I should do it at least once. So let's see, matrix multiplication. If I multiply a matrix A times some vector x, we know that-- let me write it this way. Let's say this is an m by n matrix. We can write any matrix as just a series of column vectors. So this guy could have n column vectors. So let's say it's v1, v2, all the way to vn column vectors. And each of these guys is going to have m components. Times x1, x2, all the way down to xn. And we've seen this multiple, multiple times before. This, by the definition of matrix vector multiplication is equal to x1 times v1. That times that.
This scalar times that vector plus x2 times v2, all the way to plus xn times vn. This was by definition of a matrix vector multiplication. And of course, and I did this at the top of the video, this vector right here is going to be a member of Rm. It's going to have m components.

So what happens if I take some matrix A, some m by n matrix A, and I multiply it times the sum of two vectors a plus b? So I could rewrite this as this thing right here. So my matrix A times the sum of a plus b. The first term will just be a1 plus b1. Second term is a2 plus b2, all the way down to a n plus bn. This is the same thing as this. I'm not saying A of a plus b. I'm saying A times. Maybe I should put a dot right there. I'm multiplying the matrix. I want to be careful with my notation. This is the matrix vector multiplication. It's not some type of new matrix dot product. But this is the same thing as this multiplication right here.

And based on what I just told you up here, which we've seen multiple, multiple times, this is the same thing as a1 plus b1 times the first column in A, which is that vector right there. This A is the same as this A. So times v1. Plus a2 plus b2 times v2, all the way to plus an plus bn times vn. Each xi term here is just being replaced by an ai plus bi term. So each x1 here is replaced by an a1 plus b1 here. This is equivalent to this.

And then from the fact that we know that scalars times vectors exhibit the distributive property, we can say that this is equal to a1 times v1. Let me actually write all of the a1 terms. Let me write this. a1 times v1 plus b1 times v1 plus a2 times v2 plus b2 times v2, all the way to plus a n times vn plus bn times vn. And then if we just re-associate this, if we just group all of the a terms together, we get a1 plus a-- sorry. a1 plus-- let me write it this way. a1 times v1 plus a2 times v2 plus, all the way, a n times vn.
I just grabbed all the a terms. We get that plus all the b terms. All the b terms I'll do in this color. All the b terms are like that. So plus b1 times v1 plus b2 times v2, all the way to plus bn times vn. That's that guy right there. This is equivalent to this statement up here; I just regrouped everything, which is of course equivalent to that statement over there.

But what's this equal to? These are, remember, the columns of the matrix capital A. So this is equal to the matrix capital A times a1, a2, all the way down to a n, which was our vector a. And what's this equal to? These v1's are the columns of A, so it's equal to plus the matrix A times my vector b: b1, b2, all the way down to bn. This is my vector b. We just showed you that if I add my two vectors, a and b, and then multiply the sum by the matrix, it's completely equivalent to multiplying each of the vectors times the matrix first and then adding them up. And this is for an m by n matrix. So we've now satisfied this first condition right there.

And then what about the second condition? And this one's even more straightforward to understand. c times a1, so let me write it this way. The matrix capital A times the vector c lowercase a. So I'm multiplying my vector times the scalar first. This is equal to-- I can write my big matrix A. I've already labeled its columns. It's v1, v2, all the way to vn. That's my matrix A. And then, what does ca look like? ca, you just multiply the scalar times each of the terms of a. So it's ca1, ca2, all the way down to c a n. And what does this equal? We know this, we've seen this multiple times before right there. So it just equals-- I'll write a little bit lower. That equals c a1 times this column vector, times v1. Plus c a2 times v2, times this guy, all the way to plus c a n times vn.
And if you just factor this c out, once again, scalar multiplication times vectors exhibits the distributive property. I believe I've done a video on that, but it's very easy to prove. So this will be equal to c times-- I'll just stay in one color right now-- a1 v1 plus a2 v2 plus all the way to a n vn. And what is this thing equal to? Well that's just our matrix uppercase A times our vector lowercase a. Maybe I'm overloading the letter A. The lowercase a is just this thing right here, a1, a2 and so forth. This thing up here was the same thing as that.

So I just showed you that if I take my matrix and multiply it times some vector that was multiplied by a scalar first, that's equivalent to first multiplying the matrix times a vector and then multiplying by the scalar. So we've shown you that matrix vector products satisfy this condition of linear transformations and this condition. So the big takeaway right here, and this is an important takeaway: matrix multiplication, or matrix products with vectors, is always a linear transformation.

And this is a bit of a side note. In the next video I'm going to show you that any linear transformation-- this is incredibly powerful-- can be represented by a matrix product. Any linear transformation on any vector can be equivalently, I guess, written as a product of that vector with a matrix. It has huge repercussions and, you know, just as a side note, kind of tying this back to your everyday life. You have your Xbox, your Sony Playstation and you know, you have these 3D graphic programs where you're running around and shooting at things.
And the way that the software renders those programs where you can see things from every different angle, you have a cube then if you kind of move this way a little bit, the cube will look more like this and it gets rotated, and you move up and down, these are all transformations of matrices. And we'll do this in more detail. These are all transformations of vectors or the positions of vectors and I'll do that in a lot more detail. And all of that is really just matrix multiplication. So all of these things that you're doing in your fancy 3D games on your Xbox or your Playstation, they're all just matrix multiplications. And I'm going to prove that to you in the next video. And so when you have these graphics cards or these graphics engines, all they are-- you know, we're jumping away from the theoretical. But all these graphics processors are, are hard wired matrix multipliers. If I have just a generalized, some type of CPU, I have to in software write how to multiply matrices. But if I'm making an Xbox or something and 99% of what I'm doing is just rotating these abstract objects and displaying them in transformed ways, I should have a dedicated piece of hardware, a chip, that all it does-- it's hard wired into it-- is multiplying matrices. And that's what those graphics processors or graphics engines really are.
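The two conditions proved in this transcript, A(a + b) = Aa + Ab and A(ca) = c(Aa), can also be spot-checked numerically. The sketch below is plain Python with made-up helper names and random integer data; integers are an assumption made here so the comparisons are exact. A passing check on one random example is of course not a proof, it only illustrates what the proof says.

```python
import random
from typing import List

Matrix = List[List[int]]
Vector = List[int]


def matvec(A: Matrix, x: Vector) -> Vector:
    """Matrix-vector product, with A stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def add(u: Vector, v: Vector) -> Vector:
    return [u_i + v_i for u_i, v_i in zip(u, v)]


def scale(c: int, v: Vector) -> Vector:
    return [c * v_i for v_i in v]


random.seed(0)
m, n = 5, 3  # an m-by-n matrix maps R^n to R^m
A = [[random.randint(-9, 9) for _ in range(n)] for _ in range(m)]
a = [random.randint(-9, 9) for _ in range(n)]
b = [random.randint(-9, 9) for _ in range(n)]
c = random.randint(-9, 9)

# Condition 1: A(a + b) equals Aa + Ab.
additive = matvec(A, add(a, b)) == add(matvec(A, a), matvec(A, b))
# Condition 2: A(ca) equals c(Aa).
homogeneous = matvec(A, scale(c, a)) == scale(c, matvec(A, a))

print(additive, homogeneous)  # True True
```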