
# Projections onto subspaces with orthonormal bases

## Video transcript

We saw in the last video that orthonormal bases make for good coordinate systems, coordinate systems where it's easy to figure out the coordinates. Let's see if there are other useful reasons to have an orthonormal basis.

Let's say I have some subspace V, and V is a subspace of Rn. And let's say we have B, which is an orthonormal basis for V. B is equal to v1, v2, all the way to vk. Saying it's an orthonormal basis is just a fancy way of saying that all of these vectors have length 1, and they're all orthogonal with respect to each other.

Now, we've seen many times before that if I have any member of Rn, let's say some vector x, then x can be represented as a sum of some vector v that is in our subspace V and some vector w that is in the orthogonal complement of our subspace. So x is equal to v plus w, where v is a member of my subspace and w is a member of my subspace's orthogonal complement. We saw this when I was doing the whole set of videos on orthogonal complements.

Now what is this vector v? By definition, it is the projection of x onto V. And w is the projection of x onto V's orthogonal complement. And we know from the past that this is not an easy thing to find. If I set up some matrix A that has my basis vectors as its columns, so A looks like v1, v2, all the way to vk, we learned a general way of figuring out what the projection is: the projection of any vector x onto V is equal to A, times the inverse of A transpose A, times A transpose, times x. And that is a pain to figure out. But let's see if the assumption that these guys are orthonormal, that this is an orthonormal set, in any way simplifies this.
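
Written out in symbols, the setup so far looks roughly like this. It is only a restatement of what was just said; the notation $V^{\perp}$ for the orthogonal complement and $\operatorname{proj}_V$ for the projection is assumed here for convenience and is not taken from the video.

$$
\begin{aligned}
x &= v + w, \qquad v \in V,\; w \in V^{\perp},\\
\operatorname{proj}_V x &= v = A\,(A^{T}A)^{-1}A^{T}x, \qquad A = \begin{bmatrix} v_1 & v_2 & \cdots & v_k \end{bmatrix}.
\end{aligned}
$$
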
So the first thing we can do is just explore this a little bit. This vector v is a member of our subspace, which means it can be represented as a linear combination of my basis vectors. So instead of v, I can write c1 times v1, plus c2 times v2, all the way to plus ck times vk. That's some unique member of my subspace V, and you can also view it as the projection of x onto the subspace V. So x is equal to c1 times v1, plus c2 times v2, all the way to plus ck times vk, plus w: some member of V, plus some member of V's orthogonal complement.

Now what happens if we take both sides of this equation and dot them with one of these guys, let's say vi, the ith basis vector in the basis B for my subspace? What am I going to get? On the left I have vi dot x. On the right, this is going to be c1 times vi dot v1, plus c2 times vi dot v2, and you keep going. Eventually you'll get to the ith term, which will be ci times vi dotted with vi. And then, assuming that i isn't k, eventually we'll get to ck times vi dotted with vk. We saw this in the last video. I'm just dotting both sides. But we also have this w term, so then we'll go plus vi dot w.

Now, just to clarify things, in the last video we assumed that x was inside of the subspace, so that x could be represented with coordinates with respect to B. Now x could be any member of Rn, and we're just looking at the projection of x. And because it's any member of Rn, it's going to be some combination of these guys plus some member of V's orthogonal complement. Now when I take the dot product of the ith basis vector with both sides of this equation, the left side is just vi dot x, but on the right side something very similar happens to what we saw in the last video. What is vi dot v1?
Well, they're different members of this orthonormal set, so they're orthogonal. So that's going to be 0. vi dot v2, that's 0, assuming i doesn't equal 2. vi dot vi is 1, so this term is just going to be ci. vi dot vk, that's also 0. It doesn't matter what our constant is because 0 times anything is 0. And then what is vi dot w? Well, by definition w is a member of the orthogonal complement of V, which means it is orthogonal to every member of V. And vi is a member of V, so these two guys are orthogonal. So that is also equal to 0. And just like that, you get ci is equal to vi dot x.

So what does this do? This is a very similar result to what we got last time. But remember, we're not assuming that x is a member of V. In that case, the ci's would be the coordinates of x. In this case, we're looking for the projection of x onto V, the member of V that is x's component in V. So if we now want to find the projection of x onto V, it's equal to these ci's times their respective basis vectors, but now we know what the ci's are. Each one is that basis vector dotted with your vector x. So just like that, we get a pretty simple way of figuring out the projection onto a subspace with an orthonormal basis.

So let's see, c1 is just going to be v1 dot x, and then we're going to multiply that times the vector v1. The next coefficient, the one on v2, is going to be v2 dot x, times the vector v2. And then you go all the way to plus vk dot x, times vk. So the projection of x onto V is equal to v1 dot x times v1, plus v2 dot x times v2, all the way to plus vk dot x times vk.

And I don't know if you remember what we did when we took the projection of x onto some line L, where L is equal to the span of some unit vector u. That's the set of all t times u, for t any real number, and that's just a line.
Because that unit vector, call it u, has length 1, the projection onto the line just simplified to the formula x dot u, times the vector u. That was a projection onto a line. Notice that when we're dealing with an orthonormal basis for a subspace, and you take a projection of any vector in Rn onto that subspace, you're essentially just finding the projection onto the line spanned by each of these vectors and adding them up, right? x dot v1, times the vector v1, is x's projection onto the line spanned by v1, and likewise for each of these guys. That's all it is. And clearly this is a much, much simpler way of finding a projection than going through this mess of A, times the inverse of A transpose A, times A transpose, times x.

But you might say, OK, this is easier, but you told me that a projection is a linear transformation, so I want to figure out the matrix here. For any particular x, we can always just take the dot product with each of the basis vectors, those will be the coefficients, multiply those coefficients times the basis vectors, add them up, and you know your projection. But some of us might actually want the transformation matrix. So let's figure out what it is, and let's see if being orthonormal in any way simplifies it.

Let me just rewrite what we already know. We already know that the projection of x onto any subspace V is equal to A, times the inverse of A transpose A, times A transpose, times x, where A's column vectors are just the basis vectors v1, v2, all the way to vk. Now let's see if the assumption that these guys are an orthonormal basis simplifies this at all. Let's take in particular the case of A transpose A. A transpose A is going to be equal to what? Let's think about this.
The basis vectors are members of Rn, so A is going to be an n by k matrix. So A is n by k, and A transpose is k by n. k by n times n by k is going to be k by k, so A transpose A is going to be a k by k matrix. And what is A transpose equal to? Well, each of the columns of A is going to become a row. So the first row here is going to be v1 transpose. The second row is going to be v2 transpose. Then you go all the way down, and the kth row is going to be vk transpose. And then A is, of course, the matrix with columns v1, v2, and you keep going until you have vk.

What's going to happen when we take this product? Let's do a couple of entries. When I take this product, I'm going to get a k by k matrix. So what's the first row, first column going to be? It's going to be the first row of A transpose dotted with the first column of A, or v1 dot v1. Well, that's nice, that's just 1. And then what's the second row, second column? You get your row from A transpose and your column from A, so it's the second row dotted with the second column, v2 dot v2. That'll be a 1. And in general, if you're finding anything along the diagonal, the entry in the ith row and ith column, you're going to take the ith row of A transpose with the ith column of A, which is vi dot vi. So you're just going to have 1's that go all the way down the diagonal.

Now what about everything else? Let's say that you're looking for the entry in the first row, second column. It's going to be the dot product of the first row of A transpose with the second column of A. So this is going to be v1 dot v2.
But these two guys are orthogonal, so what's that going to be equal to? It's going to be equal to 0. The next one over is going to be v1 dot v3. Well, that's going to be 0. v1 dot anything other than v1 is going to be 0. Similarly, everything in the second row is going to be v2 dotted with something. The first column in the second row is going to be v2 dot v1, which is clearly 0. And then you have v2 dot v2, which is 1. And then v2 dot all the rest of the stuff is going to be 0. They're all orthogonal with respect to each other. So if your row and your column are the same, you're dotting a vector with itself, so you get 1, because all their lengths are 1. But if your row and column are not the same, you're taking the dot product of two different members of your orthonormal basis. And they're all orthogonal, so you're just going to get a bunch of 0's.

Now, what is this? You have 0's everywhere, with 1's down the diagonal, and it's a k by k matrix. This is the k by k identity matrix. So normally, our way of finding the transformation matrix for the projection of x onto some subspace was A, times the inverse of A transpose A, times A transpose. But if we assume an orthonormal basis, then A transpose A becomes the k by k identity matrix. And what's the inverse of the identity matrix? It's just the identity matrix. So the inverse of A transpose A is just the k by k identity matrix. So the projection of our vector x onto V simplifies to A times Ik, times A transpose, times x. And we can just ignore the Ik. It does nothing. So it's just equal to A times A transpose, times x, which is a huge simplification.
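
To make the simplification concrete, here is a minimal sketch in plain Python, using exact fractions so the checks are not blurred by rounding. The basis vectors, the vector x, and the helper names `transpose` and `matmul` are all made up for illustration; none of them come from the video.

```python
from fractions import Fraction as F

def transpose(M):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

# An assumed orthonormal basis for a plane V inside R3 (n = 3, k = 2).
v1 = [F(1, 3), F(2, 3), F(2, 3)]
v2 = [F(2, 3), F(1, 3), F(-2, 3)]

A = transpose([v1, v2])   # 3 by 2 matrix whose columns are v1 and v2
At = transpose(A)         # 2 by 3 matrix whose rows are v1 and v2

gram = matmul(At, A)      # A transpose A, a 2 by 2 matrix
P = matmul(A, At)         # A A transpose, the 3 by 3 projection matrix

x = [[F(9)], [F(0)], [F(0)]]   # x written as a column vector
proj = matmul(P, x)            # projection of x onto V

# gram equals the 2 by 2 identity, and proj equals the column (5, 4, -2).
```

Note that `gram` (A transpose A) is the small k by k identity, while `P` (A times A transpose) is an n by n matrix and is generally not the identity.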
I still have to do a matrix product, but finding the transpose of a matrix is pretty straightforward. You just switch the rows and the columns. Before, first multiplying A transpose times A, that's a lot of work, and it's a huge amount of work to find the inverse of that thing. But now, since we assumed that these columns form an orthonormal set, A transpose A just gets reduced to the identity matrix. And the projection of x onto V is just equal to A times A transpose, times x, where A is the matrix whose column vectors are the basis vectors for our subspace V. Anyway, hopefully that gives you even more appreciation for orthonormal bases.
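
The coefficient view from earlier in the video, where each ci is vi dot x and the projection is the sum of the projections onto the lines spanned by the basis vectors, can be sketched the same way. This is an illustrative example only: the function names `dot` and `project`, the basis, and the vector x are assumptions, and `project` is only valid when the basis passed in really is orthonormal.

```python
from fractions import Fraction as F

def dot(a, b):
    """Dot product of two vectors stored as lists."""
    return sum(p * q for p, q in zip(a, b))

def project(x, basis):
    """Projection of x onto span(basis), assuming basis is orthonormal."""
    proj = [F(0)] * len(x)
    for v in basis:
        c = dot(v, x)  # coefficient ci = vi dot x
        proj = [p + c * vj for p, vj in zip(proj, v)]  # add ci times vi
    return proj

# An assumed orthonormal basis for a plane V inside R3.
basis = [
    [F(1, 3), F(2, 3), F(2, 3)],
    [F(2, 3), F(1, 3), F(-2, 3)],
]

x = [F(9), F(0), F(0)]
v = project(x, basis)                    # the member of V: (5, 4, -2)
w = [xj - vj for xj, vj in zip(x, v)]    # the leftover piece: (4, -4, 2)

# w should be orthogonal to every basis vector, so it lies in the
# orthogonal complement of V.
residual_dots = [dot(w, b) for b in basis]   # both entries are 0
```

Here the coefficients are 3 and 6, so the projection is 3 times v1 plus 6 times v2, with no matrix inverse needed.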