Projections onto subspaces with orthonormal bases

Projections onto subspaces with orthonormal bases. Created by Sal Khan.

Want to join the conversation?

  • If the columns of the basis are orthonormal, the rows must be too:
    A^T A = I, so (A^T A)^T = I^T, i.e. A A^T = I.
    This makes the projection matrix equal to I.
    How can this be?
    (13 votes)
    • AlexHuang861
      The basis is not always a basis of R^n, so the matrix A you are talking about is not always square. Most of the time you are projecting onto a proper subspace of R^n, so it has fewer than n basis vectors. In that case, if the columns are orthonormal, then yes, (A^T)A = I, but A(A^T) is NOT necessarily equal to I. It is easy to find a counterexample where A(A^T) = I fails. However, if you have an orthogonal matrix, that is, a square matrix whose columns are orthonormal, then the rows and the columns both form orthonormal bases and the projection matrix is the identity. In fact, any square matrix A with linearly independent columns makes the projection matrix A(A^T A)^-1 A^T equal to I, because its columns then span all of R^n.
      (18 votes)
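A small check of the point made in the reply above, as a sketch only. The 3 by 2 matrix A below is a made-up example (it does not come from the video), and the helpers `transpose` and `matmul` are hypothetical names written in plain Python for illustration.

```python
# Hypothetical example: a 3x2 matrix whose two columns are orthonormal.
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]

A = [[1, 0],
     [0, 1],
     [0, 0]]

At = transpose(A)
AtA = matmul(At, A)   # k x k, here 2 x 2
AAt = matmul(A, At)   # n x n, here 3 x 3

print(AtA)  # [[1, 0], [0, 1]]
print(AAt)  # [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
```

Here A^T A is the 2 by 2 identity, while A A^T is a 3 by 3 matrix that zeroes out the third coordinate, so it projects onto a plane rather than leaving every x unchanged.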
  • brandenyoussef
    Isn't A^T * A = I as well? So why doesn't projection of x onto v = x?
    (5 votes)
    • Tejas
      A^T * A is I, but A * A^T is not necessarily I. This is because we are no longer dotting the column vectors, which happen to be orthonormal, with themselves. We are instead dotting the rows with themselves. The row vectors, unlike the column vectors, might not be orthonormal.
      (16 votes)
  • dadowns1159
    This and the previous video begin giving reasons why orthonormal bases make things easier to work with.

    Perhaps this was mentioned or implied, but since the standard basis is orthonormal, do the benefits outlined in this and the previous video apply to all problems where we are using the standard basis?

    For example, could we just use Proj_V(x) = AA^Tx when we are working with the standard basis?
    (5 votes)
  • jimmy simaundu
    What happens if the Gram-Schmidt procedure is applied to a list of vectors that is not linearly independent?
    (2 votes)
  • Joel
    About the w in the orthogonal complement of the subspace: does the upside-down T represent the orthogonal complement?
    (1 vote)
    • Robert
      Yes, that upside-down T does indicate the orthogonal complement of a subspace. So if V is a subspace, V^⊥ would be the orthogonal complement of V, and we sometimes read it as "V perp", where "perp" is short for "perpendicular."
      (3 votes)
  • Hemen Taleb
    I have an observational question.
    Sal writes proj_V x = (v_1 dot x)v_1 + ... and so on.
    In the previous video, we said that the (v_i dot x) values are just the entries of the vector x with respect to B, where v_1, v_2, ..., v_k are the basis vectors of B. Does this mean that taking proj_V x = (v_1 dot x)v_1 + ... is the same thing as taking [x]_B (x with respect to B) times the basis vectors of the subspace V (V from this video)? That is,

    proj_V x = [x]_B * (v_1, v_2, ..., v_k): the projection of x onto the subspace V is equal to x written with respect to the basis of our subspace V, times the basis vectors of our subspace V.

    I hope it wasn't too convoluted.
    (1 vote)
    • Stenten
      Unfortunately since x is outside of the subspace V, it cannot be represented solely in terms of coordinates with respect to the basis vectors of V. In other words, you cannot rewrite x as [x]_B.

      You can, however, write the projection of x onto V in V's coordinate system, since the projection lies in the subspace V. What you're calling [x]_B would be this projection written in basis B. You could then of course convert this projection into standard basis by multiplying B times this "[x]_B".
      (3 votes)
  • Dylan Laing
    I would love to see how to turn an equation of a plane into an orthonormal basis.
    (1 vote)
    • Noah Schwartz
      An orthonormal basis is just a set of vectors that are mutually orthogonal and normalized (each has length 1), and the equation of a plane in R3, ax + by + cz = d, gives you all the information you need to build one. For a plane in R3, all you need are two orthogonal vectors. It doesn't matter which vectors they are, as long as v1 and v2 are orthogonal to each other and both lie in the plane. Pick any x and y and solve the plane equation for z (this needs c to be nonzero), then do it again with different numbers, and you get two points on the plane. The difference between them is the first vector of your basis, v1. Finding the second is a bit trickier, but you are already given the normal vector to the plane, n = <a, b, c>, and v2 must be orthogonal to both v1 and n. So, to find v2, all you have to do is take the cross product of v1 and n: v2 = v1 x n. Now normalize v1 and v2 and you have your orthonormal basis: {v1/||v1||, (v1 x n)/||v1 x n||}.
      (2 votes)
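The recipe in the reply above can be sketched in plain Python. This is only an illustration under stated assumptions: the plane x + 2y + 2z = 3 is a made-up example, the coefficient c is assumed nonzero so that z can be solved for, and the helper names (`cross`, `dot`, `normalize`, `point_on_plane`) are hypothetical. Note that when d is not 0 the plane does not pass through the origin, so the two vectors form an orthonormal basis for the directions parallel to the plane.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two vectors in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    """Scale v to length 1."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def point_on_plane(n, d, x, y):
    """Solve ax + by + cz = d for z; assumes c != 0."""
    a, b, c = n
    return [x, y, (d - a * x - b * y) / c]

n = [1.0, 2.0, 2.0]   # normal vector <a, b, c> of the made-up plane
d = 3.0

p = point_on_plane(n, d, 0.0, 0.0)
q = point_on_plane(n, d, 1.0, 0.0)

v1 = [qi - pi for qi, pi in zip(q, p)]  # difference of two points on the plane
v2 = cross(v1, n)                       # orthogonal to both v1 and n

u1, u2 = normalize(v1), normalize(v2)
print(u1, u2)
```

Both u1 and u2 come out orthogonal to n and to each other, which is what the cross-product step is meant to guarantee.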
  • Vinod P
    Just would like to clarify one more point. If A = [v1, v2, ..., vk] (the vectors as columns), shouldn't A^T be the matrix whose rows are v1, v2, ..., vk? I see that in this video Sal instead writes A^T with rows v1^T, v2^T, ..., vk^T.
    (1 vote)
  • Vinod P
    In this video, Sal refers to A = [v1, ..., vk] as the matrix whose columns are the vectors of an orthonormal basis for the subspace V of R^n. Based on this assumption, Sal is able to arrive at the conclusion at the end that the projection of a vector x in R^n onto V can be determined using the expression AA^Tx. But under the same assumption, shouldn't AA^T evaluate to the identity matrix I_k, in which case the projection of x onto V would equal x itself? Is there a basic flaw in the argument that I have presented here?
    (1 vote)
  • josephroxasme
    If the columns of A form an orthonormal basis of a subspace, then the projection of a vector x onto that subspace becomes [AA^T]x.
    therefore [AA^T ]x = sum{ (x.u_i)u_i } where i = 1,2,3...
    is this correct?
    (1 vote)

Video transcript

We saw in the last video that orthonormal bases make for good coordinate systems, coordinate systems where it's easy to figure out the coordinates. Let's see if there are other useful reasons to have an orthonormal basis. Let's say I have some subspace V, and V is a subspace of Rn. And let's say we have B, which is an orthonormal basis for V. B is equal to v1, v2, all the way to vk. Saying it is an orthonormal basis for V is just a fancy way of saying that all of these vectors have length 1, and they're all orthogonal with respect to each other.
Now, we've seen many times before that if I have just any member of Rn, say some vector x, then x can be represented as a sum of a member of V, some vector v that is in our subspace, and some vector w that is in the orthogonal complement of our subspace. Let me write that down: x is equal to v plus w, where v is a member of my subspace and w is a member of my subspace's orthogonal complement. We saw this when I was doing the whole set of videos on orthogonal complements. Now what is this v right here? By definition, that is the projection of x onto V. And w would be the projection of x onto V's orthogonal complement.
And we know from the past that this is not an easy thing to find. Let's say I set up some matrix A that has my basis vectors as the columns, so A looks like this: v1, v2, all the way to vk. We learned before that a general way of figuring out the projection is that the projection of any vector x onto V is equal to A times A transpose A inverse, times A [transpose], times x. And that is a pain to figure out. But let's see if the assumption that these guys are orthonormal, that this is an orthonormal set, in any way simplifies this.
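For reference, the setup so far can be written out in symbols. This is only a restatement of what the video describes; the notation \operatorname{Proj}_V and V^{\perp} is assumed here, and the formula includes the final A transpose that the spoken version leaves out at first.

```latex
\[
x = v + w, \qquad v \in V,\quad w \in V^{\perp}, \qquad v = \operatorname{Proj}_V x
\]
\[
A = \begin{bmatrix} v_1 & v_2 & \cdots & v_k \end{bmatrix}, \qquad
\operatorname{Proj}_V x = A\,(A^{T}A)^{-1}A^{T}x
\]
```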
So the first thing we can do is just explore this a little bit. This vector v is a member of our subspace, which means it can be represented as a linear combination of my basis vectors. So instead of v, I can write that x is equal to c1 times v1, plus c2 times v2, all the way to plus ck times vk, plus w. The linear combination is some unique member of my subspace V; that's v right there, and you can also view it as the projection of x onto the subspace V. So x is represented as some member of V plus some member of V's orthogonal complement, the w right there.
Now what happens if we take both sides of this equation and dot them with one of these guys, let's say vi? So vi is the ith basis vector in the basis B for my subspace. If I take vi dot x, what am I going to get? This is going to be c1 times vi dot v1, plus c2 times vi dot v2, and you keep going. Eventually you'll get to the ith term, which will be ci times vi dot vi. And then, assuming that i isn't 1, 2, or k, you keep going until you get to ck times vi dot vk. We saw this in the last video. I'm just dotting both sides. But we also have this w term, so then we add vi dot w.
Just to clarify things: in the last video, we assumed that x was inside of the subspace, so that x could be represented with coordinates here. Now x could be any member of Rn, and we're just looking at the projection of x. And because it's any member of Rn, it's going to be some combination of these guys plus some member of V's orthogonal complement. Now when I take the dot product of one of my basis vectors, the ith basis vector, with both sides of this equation, the left side is just vi dot x, but on the right side something very similar happens to what we saw in the last video. What is vi dot v1?
Well, they're different members of this orthonormal set, so they're orthogonal. So that's going to be 0. vi dot v2, that's 0, assuming i doesn't equal 2. vi dot vi is 1, so this term is just going to be ci. vi dot vk, that's also 0. It doesn't matter what our constant is, because 0 times anything is 0. And then what is vi dot w? Well, by definition w is a member of the orthogonal complement of V, which means it is orthogonal to every member of V. Well, vi is a member of V, so these two guys are orthogonal. So that is also equal to 0. And just like that, you get ci is equal to vi dot x.
So what does this do? This is very similar to the result we got last time. But remember, we're not assuming that x is a member of V. In that case, the ci's would be the coordinates for x. In this case, we're looking for the projection of x onto V, the member of V that is x's component in V. So if we now want to find the projection of x onto V, it's equal to these ci's times their respective basis vectors, but now we know what the ci's are. Each one is that basis vector dotted with your vector x. So just like that, we get a pretty simple way of figuring out the projection onto a subspace with an orthonormal basis. c1 is just going to be v1 dot x, and then we multiply that times the vector v1. The next coefficient, the one on v2, is going to be v2 dot x, times the vector v2. And then you go all the way to plus vk dot x times vk.
And I don't know if you remember what we did when we took the projection of x onto some line L, where L is equal to the span of some unit vector u. You know, t times u for t any real number, that's just a line, the span of some unit vector.
Where we assume this vector u has length 1. Then the projection onto a line just simplified to the formula, let me write it this way, x dot u times the vector u. This was a projection onto a line. Notice that when we're dealing with an orthonormal basis for a subspace, when you take a projection of any vector in Rn onto that subspace, you're essentially just finding the projection onto the line spanned by each of these vectors and adding them up, right? x dot v1 times the vector v1, x dot v2 times the vector v2, and so on. You're taking x's projection onto the line spanned by each of these guys. That's all it is. And clearly this is a much, much simpler way of finding a projection than going through this mess of saying A times the inverse of A transpose A, times A transpose (I forgot that A transpose when I wrote it the first time), times x. This is clearly a lot easier.
But you might say, OK, this is easier, but you told me that a projection is a linear transformation, so I want to figure out the matrix here. So let's see if being orthonormal in any way simplifies this. We could always just figure it out for any particular x: take the dot product of x with each of the basis vectors, those will be the coefficients, multiply those coefficients times the basis vectors, add them up, and you have your projection. But some of us might actually want the transformation matrix. So let's figure out what it is.
Let me just rewrite what we already know. We already know that the projection of x onto any subspace V is equal to A times A transpose A inverse, times A transpose, times x, where A's column vectors are just the basis vectors v1, v2, all the way to vk. Now, let's see if the assumption that these guys are an orthonormal basis simplifies it at all. Let's take the case in particular of A transpose A. A transpose A is going to be equal to what? Let's think about this.
These guys are members of Rn, so A is going to be an n by k matrix. So A is n by k, and A transpose right here is k by n. A k by n matrix times an n by k matrix is going to be k by k, so A transpose A is going to be k by k. And what is A transpose equal to? Well, each of these columns is going to become a row. So the first row here is going to be v1 transpose. The second row is going to be v2 transpose. Then you go all the way down, and the kth row is going to be vk transpose. Just like that. And then A is, of course, this thing right there. You have v1 as a column, like that. You have v2, like that. And then you keep going and you have vk, just like that.
What's going to happen when we take this product? I'm going to get a k by k matrix. Let me write it big so I can explain it reasonably. So what's the first row, first column going to be? It's going to be this row dotted with this column, or v1 dot v1. Well, v1 dot v1, that's nice, that's just 1. And then what's the second row, second column? You get your row from this guy and your column from that guy. This row dotted with that column is v2 dot v2, so that's nice. That'll be a 1. And in general, if you're finding the entry in the ith row and ith column, anything along the diagonal, you take the ith row with the ith column, vi dot vi. So you're just going to have 1's that go all the way down the diagonal.
Now what about everything else? Let's say that you're looking for this entry right here, which is the first row, second column. This is going to be the dot product of this row, v1, with this column right there, v2. So this is going to be v1 dot v2.
But these two guys are orthogonal, so what's that going to be equal to? It's going to be equal to 0. This one right here is going to be v1 dot v3. Well, that's going to be 0. v1 dot anything other than v1 is going to be 0. Similarly, for everything here in the second row, the row vector is v2. The first column in the second row is going to be v2 dot v1, which is clearly 0. And then you have v2 dot v2, which is 1. And then v2 dot all the rest of the stuff is going to be 0. They're all orthogonal with respect to each other. So if your row and your column are the same, you're dotting a vector with itself, so you get 1, because all their lengths are 1. But if your row and column are not the same, you're taking the dot product of two different members of your orthonormal basis. And they're all orthogonal, so you're just going to get a bunch of 0's.
Now, what is this? You have 0's everywhere, with 1's down the diagonal. It's a k by k matrix. This is the k by k identity matrix. So normally this formula was our way of finding the transformation matrix for the projection of x onto some subspace. But if we assume an orthonormal basis, then A transpose A becomes the k by k identity matrix. And what's the inverse of the identity matrix? A transpose A inverse becomes the inverse of the k by k identity matrix, which is just the k by k identity matrix. So the projection onto V of our vector x simplifies to A times the inverse of the identity matrix, which is just the identity matrix. So it's just A times Ik, times A transpose (I always forget that second A transpose right there), times x. And we can just ignore the Ik. That does nothing to it. So it's just equal to A times A transpose, times x, which is a huge simplification.
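The chain of steps in the video up to this point can be summarized in symbols. This is a restatement using the video's names (v_i, c_i, w, A, I_k); the \operatorname{Proj}_V notation and the `cases` layout are assumptions of this sketch and need the amsmath package.

```latex
\[
v_i \cdot x = \sum_{j=1}^{k} c_j\,(v_i \cdot v_j) + v_i \cdot w = c_i
\quad\Longrightarrow\quad
\operatorname{Proj}_V x = \sum_{i=1}^{k} (v_i \cdot x)\,v_i
\]
\[
(A^{T}A)_{ij} = v_i \cdot v_j =
\begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}
\quad\Longrightarrow\quad
A^{T}A = I_k
\]
\[
\operatorname{Proj}_V x = A\,(A^{T}A)^{-1}A^{T}x = A\,I_k^{-1}A^{T}x = A\,A^{T}x
\]
```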
I still have to do a matrix product, but finding the transpose of a matrix is pretty straightforward. You just switch the rows and the columns. Before, first multiplying A transpose times A, that's a lot of work. And it's a huge amount of work to find the inverse of that thing. But now, since we assumed that these columns here form an orthonormal set, this just gets reduced to the identity matrix. And the projection of x onto V is just equal to A times A transpose, times x, where A is the matrix whose column vectors are the basis vectors for our subspace V. Anyway, hopefully that gives you even more appreciation for orthonormal bases.
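As a rough numerical check of the two formulas from the video, here is a minimal plain-Python sketch. The orthonormal basis and the vector x are made-up values, and the function names `project_by_sum` and `project_by_matrix` are hypothetical; this is an illustration under those assumptions, not a general-purpose implementation.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def project_by_sum(basis, x):
    """Sum of (v_i . x) v_i over an orthonormal basis of the subspace."""
    result = [0.0] * len(x)
    for v in basis:
        c = dot(v, x)  # coefficient c_i = v_i . x
        result = [r + c * vi for r, vi in zip(result, v)]
    return result

def project_by_matrix(basis, x):
    """Compute A A^T x, where the basis vectors are the columns of A."""
    n = len(x)
    # Entry (r, c) of A A^T is the sum over basis vectors of v[r] * v[c].
    P = [[sum(v[r] * v[c] for v in basis) for c in range(n)]
         for r in range(n)]
    return [dot(row, x) for row in P]

s = 1 / math.sqrt(2)
basis = [[s, s, 0.0], [0.0, 0.0, 1.0]]  # made-up orthonormal basis of a plane in R^3
x = [3.0, 1.0, 4.0]                     # made-up vector to project

by_sum = project_by_sum(basis, x)
by_matrix = project_by_matrix(basis, x)
print(by_sum, by_matrix)
```

Up to floating-point rounding, both calls give the same vector, approximately [2.0, 2.0, 4.0], which matches the claim that summing the line projections and multiplying by A A^T are the same operation when the basis is orthonormal.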