A projection onto a subspace is a linear transformation

Showing that a projection onto a subspace is a linear transformation. Created by Sal Khan.

Want to join the conversation?

  • Mark Henwood
    Why is it that Sal can cancel
    (A^T A)^-1 (A^T A)
    but in
    A (A^T A)^-1 A^T
    the A cannot be cancelled? Or equivalently, what property would allow the movement of A, if it were allowed? Working with numbers x, y and z, I know that x*y*z = y*x*z = y*z*x ... but what property lets me do this, and why is it not valid in matrix algebra?

    Mark
    (7 votes)
    • newbarker
      The property that would allow the movement of A is called commutativity. As you said, when working with numbers or in plain algebra, the multiplication order can be swapped around without affecting the result.

      http://en.wikipedia.org/wiki/Commutative_property

      Matrix multiplication is not commutative, so in the 2nd expression, matrix A cannot be relocated to sit just before A^T.

      Why is it not commutative? My explanation is that the way it is defined makes it non-commutative. The extreme case is with non-square matrices: consider matrix C, which is a 3x2 matrix (3 rows, 2 columns), and matrix D, which is 2x11 (2 rows, 11 columns). The product CD is a 3x11 matrix, while the product DC isn't even permitted/defined. For square matrices, my advice is to create a couple of 2x2 matrices with entries a, b, c, etc., multiply them, and then multiply them with the order swapped to see that the result is different (a quick numerical sketch follows this answer).

      I hope I've helped a bit, but please leave a comment if what I've said didn't properly address your question.
      (12 votes)
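A quick numerical sketch of the suggestion above, using two made-up 2x2 matrices and numpy; almost any generic choice of entries shows the same thing.

```python
import numpy as np

# Two arbitrary 2x2 matrices -- the specific entries don't matter much.
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A @ B)  # [[2 1]
              #  [4 3]]
print(B @ A)  # [[3 4]
              #  [1 2]]
# The two products differ, so matrix multiplication is not commutative in general.
```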
  • SteveSargentJr
    At one point Sal references a video--which video is he referring to? Since the playlists have been reshuffled, I'm not sure where to find it...
    (3 votes)
  • jacob.hastings
    Why can't you just say that x is equal to Ay? That would imply that x is a member of V, so its projection onto V would just be equal to itself. If x and Ay are not equal, that would mean that multiplying by A^T is not a linear transformation; it would be equivalent to a relation outputting the same value for multiple inputs, which precludes it from being a function. Why can this be ignored? I have to be missing something here.
    (2 votes)
    • Kyler Kathan
      If ab = ac, then we can say that b = c, but only if a ≠ 0. A similar rule applies to matrix-vector multiplication: if A x⃑ = A y⃑, then x⃑ = y⃑ only if A is invertible. Since we're dealing with an n×k matrix, we don't know that it's invertible in the context of the video (and it probably isn't, since n would have to equal k). A tiny numeric sketch of this follows below.
      (4 votes)
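Here is a tiny numeric sketch of the cancellation point, with a made-up singular (non-invertible) square matrix: Ax⃑ = Ay⃑ even though x⃑ ≠ y⃑, so A cannot simply be cancelled.

```python
import numpy as np

# A singular 2x2 matrix: its second row is all zeros, so it is not invertible.
A = np.array([[1.0, 0.0],
              [0.0, 0.0]])
x = np.array([1.0, 1.0])
y = np.array([1.0, 2.0])

print(A @ x, A @ y)          # both products are [1. 0.]
print(np.array_equal(x, y))  # False -- equal products, unequal inputs
```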
  • siddhantsaboo
    He says a couple of times "the projection of v onto x."
    Shouldn't it be the projection of x onto V? Or are they one and the same thing?
    (2 votes)
  • Muhsin Al Namli
    Why did Sal say that x is a member of Rn?
    Shouldn't it be in Rk?
    (2 votes)
    • GC
      I got confused there too. I think 'x' is a vector not included in the span of V. It would have been clearer with a diagram, but I think 'x' is like the vector 'x' in the prior video, where it is outside the subspace V (V in that video was a plane, a 2-dimensional subspace of R3). So 'x' extended into R3 (outside the plane). We can therefore break 'x' into 2 components,
      1) its projection into the subspace V, and
      2) the component orthogonal to the subspace...

      If 'x' had been contained in the span of V there would have been no reason to break it into its components.
      (1 vote)
  • Shaun.Lahert
    Because the matrix A is a basis matrix, meaning its columns are linearly independent, aren't the null space and the left null space both trivial? So how can it have members other than the 0 vector?
    (1 vote)
    • FishHead
      Remember, the LEFT null space is the null space of the transpose of A. A is the matrix with linearly independent columns; the transpose of A is a different matrix. (A quick numerical sketch follows this answer.)

      If you go back in the video he's talking about the left nullspace, not N(A), which as you pointed out would indeed be trivial.
      (3 votes)
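A small numpy sketch of this distinction, using a made-up 3x2 basis matrix. By rank-nullity, dim N(A) = (number of columns) - rank, and the same formula applied to A^T gives the dimension of the left null space.

```python
import numpy as np

# A 3x2 matrix whose columns are linearly independent (a "basis matrix").
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

rank = np.linalg.matrix_rank(A)   # rank(A) = rank(A^T)
print(A.shape[1] - rank)          # 0 -> N(A) is trivial (only the zero vector)
print(A.T.shape[1] - rank)        # 1 -> N(A^T), the left null space, is NOT trivial
```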
  • Janis Auzins
    When Sal multiplies both sides of the equation by (A^T A)^-1, how does he know where to put it? In matrix multiplication, order matters, right? Is it simply sufficient to put the (A^T A)^-1 in the same place on both sides (in this case, right at the start)? Because surely if he had put (A^T A)^-1 at the start on the left side and at the end on the right side, that would not be an equality anymore? What if he had put both of them at the end of the respective sides, would that have been legal?
    (1 vote)
    • Bob Fred
      When talking about matrices:
      if you have 3 matrices, A, B and C, all with inverses,
      and say you have A = BC,
      then the following are valid:
      B^-1 A = (identity matrix) C -> B^-1 A = C
      A B^-1 = BC B^-1, which doesn't simplify any further
      (id) = A^-1 BC
      (id) = BC A^-1
      Matrix multiplication is associative, which means A(BC) = (AB)C.
      Matrix multiplication is not commutative, which means in general AB ≠ BA (AB is not equal to BA).
      (2 votes)
  • a.somjp
    Is A^T A invertible only if A is a square matrix?
    (1 vote)
  • Jeremy
    If we are working in a space with dimension n (in other words, x is a vector with n components), I think I understand the concept of a projection onto some subspace with a dimension of k, which is less than n. But what would the projection of x be onto the space Rn? Or does it even make sense to talk about a projection of an n-dimensional vector onto Rn?

    My sense is that projecting the (n-dimensional) vector x onto Rn just yields the vector x. So it almost isn't worth even talking about it as a 'projection' at all. So, is there such a thing as a projection of an n-dimensional vector x onto Rn?
    (1 vote)
  • Tarun Akash
    Wait, he said you have k vectors in Rn that span V (a subspace of Rn) as a 'basis'. Is that possible?
    (1 vote)

Video transcript

We've defined the notion of a projection onto a subspace, but I haven't shown you yet that it's definitely a linear transformation. Nor have I shown you that if you know a basis for a subspace, how do you actually find a projection onto it? So let's see if we can make any progress on that here. So let's say I have some subspace. Let's say v is a subspace of Rn. And let's say I've got some basis vectors for v. So let's say these are my basis vectors, basis vector 1, 2, and I have k of them. I don't know what v's dimension is, but let's say it's k. It's got k basis vectors, so it is a basis for v. And that means that any vector-- that's not v vector, that's v subspace-- now that means that any vector-- let me call some vector, I don't know, let's say, any vector a that is a member of my subspace can be represented. That means that a can be represented as a linear combination of these guys. So I'll make my linear combination. Let's say it is, I don't know, y1 times b1, plus y2 times b2, all the way to plus yk times bk. That's what the definition of a basis is. The span of these guys is your subspace v, so any member of v can be represented as a linear combination of my basis vectors. Now, if I were to construct a matrix-- let's make it an n by k matrix-- whose columns are essentially the basis vectors of my subspace. So A looks like this, the first column is my first basis vector. My second column is my second basis vector, and I go all the way to my k'th column-- and I have k columns-- which is going to be my k'th basis vector. If I'm going to have my k'th basis vector-- let me make the closing bracket the same color as my opening bracket, just like that-- it's going to have n rows, because each of these basis vectors are members of Rn. Remember, v is a subspace of Rn, so each of these guys are going to have n terms. So this matrix is going to have n rows. Now, saying that any member of the subspace v can be represented as a linear combination of these basis vectors, is equivalent to saying that any member, that a, that any member a of our subspace v can be represented as the product of our matrix A, times some vector y, where Ay is equal to a, for some y that is a member of Rk. Now why is this statement and that statement equivalent? Well you can imagine, if you were to just multiply this times some vector y in Rk, so it's y1, y2, all the way down to yk, this is going to be equal to y1 times b1, plus y2 times b2, all the way to plus yk times bk, which is the same thing as this. So you can always pick the right linear combination. You can always pick the right member, yk, so that you get the right linear combination of your basis vectors to get any member of your subspace v. So any member of my subspace, right there, can be represented as the product of the matrix A with some vector in Rk. Now we don't know much about this vector here in Rk. Now, the projection-- let's say that x is just some arbitrary member of Rn-- the projection of x onto our subspace v, that is by definition going to be a member of your subspace. Or another way of saying it is that this guy, the projection onto v of x is going to be equal to my matrix A, is going to be equal to-- I'll do it in blue-- is going to be equal to A times some vector y, for some vector y in Rk. If we knew what that vector y was, if we could always find it, then we would have a formula, so to speak, for figuring out the projection of x onto v. But we don't have that yet.
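As a concrete sketch of this setup (the basis vectors here are made up purely for illustration): stacking the basis vectors as the columns of A makes Ay exactly the linear combination y1 b1 + ... + yk bk.

```python
import numpy as np

# Made-up basis for a 2-dimensional subspace V of R^3 (so n = 3, k = 2).
b1 = np.array([1.0, 0.0, 1.0])
b2 = np.array([0.0, 1.0, 1.0])
A = np.column_stack([b1, b2])    # the n x k matrix whose columns are the basis vectors

y = np.array([2.0, -1.0])        # some member of R^k
print(A @ y)                     # [ 2. -1.  1.]
print(2.0 * b1 + (-1.0) * b2)    # the same vector: Ay = y1*b1 + y2*b2
```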
All I've said is, any member of v can be represented as a product of our matrix A, which has the basis for v as columns, and some member of Rk. That just comes out of the fact that these guys span v, that any member of v is a linear combination of those guys. We know that the projection of x onto v is a member of our subspace v, it has to be inside of v. So it can also be represented this way. Now what was our definition of our projection? Our definition of our projection, we say-- Well, let me write it this way. We know that x can be represented as the sum of the projection onto v of x, plus some member of v complement. Or maybe I could even write, plus the projection onto the orthogonal complement of v. You could write it this way. I could have also written this as w, where w is a member of v complement. Actually, let me write it that way. That might make it simpler. I don't want to get too many projections here-- plus w, where w is a unique member of the orthogonal complement of v. Or you could say it this way, if you subtract a projection of x onto v from both sides, you get x minus the projection of x onto v, is equal to w. Or another way to say it is that this guy right here is going to be a member of the orthogonal complement of v, right, because this is the same thing as w. Now what's the orthogonal complement of v? We go back to this matrix here. I have these basis vectors. Right here is the columns. So the column space of A is going to be equal to v, right? The column space of A is just the span of these basis vectors. And by definition, that is going to be equal to my subspace v. Now what is the orthogonal complement of v? The orthogonal complement of v is going to be the orthogonal complement of my column space. And what's the orthogonal complement of a column space? That's equal to the null space of A transpose, or you could also call that the left null space of A. But we've seen that many, many videos ago. So we could say that x minus the projection of x onto v is a member of-- let me write it this way-- x minus the projection onto v of x is a member of the orthogonal complement of my column space of my matrix, which is equal to the null space of A transpose. That's the orthogonal complement of v. This is the same thing as the orthogonal complement of v. But what does this mean? What does this mean right here? This means that if I take A transpose, and I multiply it times this vector, because it's a member of A transpose's null space. So if I multiply it times that vector right there-- so, x minus the projection of x onto v-- then I'm going to get 0. I'm going to get the 0 vector. That's the definition of a null space. So let's write this out a little bit more. Let's see if we can algebraically manipulate it a bit. So if we distribute this matrix vector product, we get A transpose times the vector x, minus A transpose, times the projection-- actually let me write this this way. Instead of keeping on writing the projection onto v of x, what did we say earlier in this video? We said the projection of x onto v can be represented as the matrix product of the matrix A, times some vector y in Rk. That's where we started off the video. So let me write it that way, because that's what's going to simplify our work a little bit. So I'm going to distribute the A transpose, A transpose times x. And then A transpose, minus A transpose, times this thing. This thing I can write as A times some vector y, and this is just a byproduct of the notion that the projection is a member of our subspace.
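Restating the argument of this passage in symbols (nothing new here, just the transcript's steps collected in one place):

$$x = \operatorname{proj}_V x + w, \qquad w \in V^{\perp}, \qquad \operatorname{proj}_V x = A\vec{y}$$

$$V^{\perp} = \big(C(A)\big)^{\perp} = N(A^{T}) \;\Longrightarrow\; A^{T}\big(x - A\vec{y}\big) = \vec{0}$$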
Because it's a member of our subspace, it's going to be some linear combination of the column vectors of A. We saw that up here, so it can be represented in this way. So instead of projection onto v of x, I can just write Ay. This thing and this thing are equivalent, because this thing is a member of v. And then all of that is going to be equal to 0. And then if we add this to both sides of this equation, we get that A transpose x is equal to A transpose A times y. Now this is interesting. Remember where we started off here. We said that the projection onto v of x is equal to Ay for some y that is a member of Rk. If we knew what that y was, if we could always solve for that y, then the projection of x would be well defined. And we could always just figure it out. Now can we solve for y here? Well, we'll be able to solve for y if we can take the inverse of that matrix. If this matrix is always invertible, then we're always going to be able to solve for y here. Because we just take the inverse of this matrix and multiply it times both sides of this equation, on the left. Now if you remember, three videos ago, I think it was three videos ago, I showed you that if I have a matrix A whose columns are linearly independent, then A transpose A is always invertible. The whole reason why I did that video was for this moment right here. Now, what about our matrix A? Well, our matrix A has column vectors that form the basis for a subspace. By definition, basis vectors are linearly independent. So our matrix A has columns that are linearly independent. And if you watched that video, and if you believe what I told you, then you'll know that A transpose A, in our case, is going to be invertible. It has to be invertible. So let's take the inverse of it and multiply it times both sides. If we take A transpose A inverse-- we know that this exists, because A has linearly independent columns-- and multiply it times this side right here, A transpose x. And then on this side we get-- well, we're going to do the same thing-- A transpose A inverse, times this thing right here, A transpose Ay. These two things when you multiply them, when you multiply the inverse of a matrix, times the matrix, you're just going to get the identity matrix. So that's just going to be equal to the identity matrix. And the identity matrix times y is just going to be y, so we get-- and this is a vector-- so if I flip them around, I get that y is equal to this expression right here. A transpose A inverse, which'll always exist, times A transpose, times x. Now we said the projection of x onto v is going to be equal to A times y, for some y. Well we just solved for the y using our definition of a projection. We just were able to solve for y. So now, we can define our projection of x onto v as a matrix vector product. So we can write the projection onto v of our vector x is equal to A, times y, and y is just equal to that thing right there. So A times A transpose A inverse-- which always exists because A has linearly independent columns-- times A transpose, times x. And this thing right here, this long convoluted thing, that's just some matrix, some matrix which always exists for any subspace that has some basis. So we've just been able to express the projection of x onto a subspace as a matrix vector product. And any transformation that can be expressed as a matrix-vector product is a linear transformation.
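A minimal numpy sketch of the resulting formula, proj_V(x) = A (A^T A)^-1 A^T x; the basis vectors and x below are made up, and only the shape of the computation matters.

```python
import numpy as np

# Made-up basis of a 2-dimensional subspace V of R^3, stacked as columns of A.
A = np.column_stack([[1.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0]])
x = np.array([3.0, 1.0, 2.0])             # arbitrary vector in R^n

# Solve (A^T A) y = A^T x for y, then the projection is A y.
y = np.linalg.solve(A.T @ A, A.T @ x)     # A^T A is invertible: independent columns
proj = A @ y

# Equivalently, build the projection matrix P = A (A^T A)^-1 A^T once and reuse it.
P = A @ np.linalg.inv(A.T @ A) @ A.T
print(np.allclose(proj, P @ x))           # True

# Sanity check: x - proj is orthogonal to every column of A (it lies in N(A^T)).
print(np.allclose(A.T @ (x - proj), 0.0)) # True
```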
And not only did we show that it's a linear transformation, we showed that, look, if you can give me the basis for v, I'm going to make those basis vectors the columns of some matrix A. And then if I take matrix A, if I take its transpose, if I take A transpose times A, and invert it, and if I multiply them all out in this way, I'm going to get the transformation matrix for the projection. Now this might seem really complicated, and it is hard to do by hand for many, many projections, but this is super useful if you're going to do some three-dimensional graphical programming. Let's say you have some three-dimensional object, and you want to know what it looks like from the point of view of some observer. So let's say you have some observer. Some observer's point of view is essentially going to be some subspace. You want to see the projection of this cube onto the subspace-- how it would look to the person who's essentially looking at this flat screen right there. How would that cube look from this point of view? Well if you know the basis for this subspace, you can just apply this transformation. You can make a matrix whose columns are these basis vectors for this observer's point of view. And then you can apply this to every vector in this cube in R3, and you'll know exactly how this cube should look from this person's point of view. So this is actually a super useful result.
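A rough sketch of the graphics use case described above (the viewing plane and cube are invented for the example): build the projection matrix once from a basis of the observer's plane, then apply it to every vertex.

```python
import numpy as np

# Made-up basis for the observer's 2-dimensional "screen" plane in R^3
# (here simply the xy-plane, to keep the output easy to read).
A = np.column_stack([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T      # projection matrix onto the plane

# The eight vertices of a unit cube, one per row.
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
                dtype=float)

projected = cube @ P.T                    # project every vertex onto the screen plane
print(projected)                          # every z-component is 0: the flattened view
```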