If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked.

Main content

The Gram-Schmidt process

Finding an orthonormal basis for a subspace using the Gram-Schmidt Process. Created by Sal Khan.

Want to join the conversation?

Video transcript

Let's say I have a set of linearly independent vectors, V1, V2, all the way to Vk, that are a basis for V. We've seen this many times before. That's nice. It's a basis, but we've learned over the last few videos it would be even better to have an orthonormal basis for V. What we're going to address in this video is, is there some way that we can construct an orthonormal basis for V, given that we already have this basis, which I'm assuming isn't orthonormal. It'll work whether or not it's orthonormal. It'll just generate another orthonormal basis. But can we somehow, just given any basis, generate an orthonormal basis for V, and then be able to benefit from all of the properties of having an orthonormal basis? Let's see if we can make some headway here. Let's think about it for simple cases. Let's say that I have a one-dimensional subspace. Let's call that V1, for it's a one-dimensional subspace. I'm going to say it's the span of just this vector V1. Now, or you could say the vector V1 is the basis for the subspace V1. If I had this simple case, how can I ensure that this is orthonormal? What I could do is I could define some vector, let's call it U1. Obviously, this is orthogonal to all of the other guys. There aren't any other guys here. [INAUDIBLE] vectors there, but there aren't any other members of this set, so it's orthogonal to everything else because there isn't anything else. And then to make its length equal one, you can take this vector and divide it by its length. So if we define some vector U1 to be equal to V1 divided by the length of V1, then the length of U1 is going to be what? It's going to be the length of V1 over the length of V1 like this. This is just a constant right here. This is going to be 1 over the length of V1. That could be 5, times length of V1, which is just equal to 1. This one will have a length of 1. So just like that, if we have the set of just U1, we could say that just the set of U1 is an orthonormal basis for V1, for the subspace V1. Now, that was trivially easy. If k happens to be 1, we're done. That was super easy. We just have to divide by our length. You have to just essentially just normalize this vector, and you have an orthonormal basis because there's nothing else to be orthogonal with. Well let's complicated the issue a little bit. Let's say we go to two dimensions. Let's say we have a subspace V2. That's the span of, let's say it's the first two of these vectors. It's the span of V1 and the vector V2. Now, we know that V1, V1 can easily be represented as-- or, V1 is a linear combination of U1. How do I know that? Well, I can just multiply both sides of this times the length of V1. We have U1 times the length of V1 is equal to V1. So we could say this is the same thing. This is equivalent to saying the span of the vector U1 and the vector V2, where U1 is the vector we got up here. How can I say this? Because anything that can be a linear combination of these guys can be a linear combination of those guys. Because where ever you had a V1, you can replace the V1 with the linear combination of U1 that gives you V1. So you just multiply U1 times that scalar and you get it. I think you get the point. But how can we ensure that this is an orthonormal set? What do we do? Let's graph it. This is going to be a plane in Rn. So let's just make our chalkboard here, or our video board here, the plane. So we have U1, which is a unit vector. It has length 1. So that is U1. V1 and V2 are linearly independent, that's by definition of a basis. So you can't represent V2 as a linear multiple or linear combination of V1. Likewise, you can't represent V2 as a linear combination of U1 because U1 is a linear combination of U1. So, V1 is not going to be on the line spanned by U1. So I could actually draw that. The line spanned by U1 is just like this. This is the line spanned by U1. Let me draw it a little bit better than that. I don't want to make it too dark. So this is the line. I'll do one last try. The line spanned by U1 looks like this. It looks like that, it's just a line. And that is the subspace V1, right? With the span of U1. So this is equal to the span of U1. All we did is normalize V1 right there to get to U1. So the span of V1 is the same span of U1. That is this subspace. That's this line in Rn. We have vector V2, which is linearly independent from V1 and U1. V2 would look, let's just say something like that. V2. Now, what we want to do is replace V2 with another vector that is definitely orthogonal to this, and will still be able to construct V2 with some combination of this and our new vector. Well, the most obvious vector would be some vector that is orthogonal to V1. So it's orthogonal, so it's a member of the orthogonal complement of V1. If you look at it just visually here, if I add some member of V1 to that member of the orthogonal complement of V1, I'm going to get V2. In fact, we've seen that multiple times. We know that any vector in Rn, let's say V2, can be represented as a sum of two vectors. I'll call them x and y, where x is a member of V1 and y is a member of the orthogonal complement of V1. We've seen that multiple times. Now what was this thing by definition? We're looking for this thing. This is the x. And this is the y. We're looking for the y, because if we can find this vector y, then if we replace V with that vector y, we'll still be able to generate V because you can take a multiple of U and add it to y and get V. And so anything that you were able to generate with V2 before, you can now generate with our U1, multiples of that, plus multiples of our new vector. So, we're trying to solve, we're trying to figure out what this vector y is right there. So how do we do it? Well, it's going to be V2 minus this vector x. Right? What is this vector x, by definition? This vector x is, by definition, the projection of V2 onto the subspace V1. So, the vector that we're trying to find, the vector y, and if we find it, and that we can replace V2 with y, the vector y, is just equal to V2-- let me write it this way-- minus the projection of V2 onto V1. That's what y is. Remember, if we can replace V2-- the reason why the span of U1, V2 is the same thing as the span of U1-- let me call this y2 here. I'm going to just call that y2, because we're probably going to have to use y's in the future. You want y2. So the reason why this span is the same thing as the span of U1 and y2 is because I can generate V2 with a linear combination of U1 and y2, right? I can scale up U1 and then add y2 and I'll get to V2. So anything that I could have generated with V2 I can get with a linear combination of these guys. That's why these are equivalent. And what's neat about this now is that these guys are orthogonal complements-- or these guys are orthogonal relative to each other, right? By definition, y was a member of the orthogonal complements. If you dot these two guys, you're going to get 0. How can we actually solve for it? Well, this is useful as well because V1 has an orthonormal basis. And we saw, I think it was two or three videos ago, the neat thing about orthonormal bases is, it's very easy to determine the projection onto those bases. It's essentially, let me write it over here, the projection of the vector V2 onto the subspace V1 is equal to V2-- let me write it this way-- V2 dotted with the subspaces V1's first basis vector, which is the vector U1, right? That's its first basis vector. We're dealing with an orthonormal basis-- times U1. And then if we had more basis vectors, we would say plus V2 dotted with our next basis vector, times that basis vector, and so on and so forth. But V1 only has one basis vector. It only has the basis vector U1 right here. Right? It is only spanned by that. So, we can rewrite this thing right here. ys, this vector right here that I'm going to replace V2 with, is equal to V2 minus the projection of V2 onto the subspace V1, on to that line, which is just this. V2 dot U1 times the vector U1. And just like that, we've solved for y. So we have a basis for V2, where this guy and this guy are orthogonal. This guy is the unit vector. He's been normalized, but this guy hasn't been normalized yet. So to normalize it, let's just define some other vector as U2 and do the same thing. Let's just normalize it. So U2 is equal to y2, divided by the length of y2. Now, we can say that the subspace V2 is equal to the span of U1 and, instead of y2, I'll put U2 there, because I can generate y2 by just scaling up U2. What's neat about this now is, I have two unit vectors, or two normalized vectors, and they're orthogonal with respect to each and they span what V1 and V2 spanned originally. Now, we need to keep going. What about if we want to go to V3 here, what do we do? Well, we do the same idea. So let's define the subspace V3. This is going to be a three-dimensional subspace. Once we get beyond V3, it becomes a little hard to visualize, but I think you're going to see the pattern after this step. If we define V3 as equal to the span of these guys, U1, U2, and then, in our original basis, V3. I didn't write it there but there's a V3 there. So in our original non-orthonormal basis, we have V3. What is this going to look like? Or how can we turn this into an orthonormal basis? So, if you visualize all of these, so the span of U1 and U2, our subspace V2 is just going to be a plane. It's going to be a plane in R3. It's going to look like that right there. And so, our new span is going to be everything in that plane, all linear combinations of things in that plane, plus linear combinations of our vector V3, which is linearly independent from these guys because it was linearly independent from the stuff we used to construct these guys. So V3 is going to pop out of this plane. It can't be represented by the linear combination of those guys. So let's say V3 pops out of the plane, like that. Now, we want another vector that can represent everything of this span, but it is orthogonal to these dudes right here. Or another way to think about it, it's orthogonal to the plane. So we want another vector that is orthogonal to the plane. So let's call that vector y3, y sub 3, not y to the third power, y sub 3. And if we figure out our y sub three, we can replace V3 with it because V3 can be represented as a linear combination of U1 and U2. Right? That's going to be a linear combination of U1, U2 is going to be some vector in this plane plus y3. I can represent this guy with this green vector plus this vector right here. So if we replace him with y3, we can still get to V3 and we can still get to all the linear combinations that V3 can help construct. So what is this y3? Well, by the same exact logic, this green vector right here is the projection of my vector V3 onto the subspace V2 and your vector y3 is just equal to the vector V3 minus the projection of V3 onto V2. So, what is that going to be? Well, the projection-- I'll write it here-- the projection onto the subspace V2 of V3, of the vector V3 is going to be equal to-- we're going to use the same exact logic we did before. We saw this two or three videos ago. Because V2 is defined with an orthonormal basis, we can say that the projection of V3 onto that subspace is V3, dot our first basis vector, dot U1, times our first basis vector, plus V3 dot our second basis vector, our second orthonormal basis vector, times our second orthonormal basis vector. It's that easy. That's one of the neat things about having an orthonormal basis was. This is an orthonormal basis. So we can define the projection in this way. So y is just going to be V3 minus this business right here, and I'll just write out. So y3 is going to be V3 minus the projection of V3 onto V2. So minus that business right there. Let me copy it, and then let me paste it. And then you get y3. And y3 is nice. So if we replace this with y3, which we can because we can now get V3 as a linear combination of these guys and y3. That's nice because all of these guys are orthogonal with respect to each other, but y3 does not have length 1 yet. It hasn't been normalized. So we can just replace y3 with another unit vector. We'll just say, U3 is equal to y3 divided by the length of y3, whatever that might be. But if we have what y3 is, we can figure out its length and divide by that. And then if you put-- so this is equal to the span of U1, U2, and U3. So U3 might be a scaled-down version of y3. We're talking about linear combinations when we talk about span, so you can scale them up and add it to the projection. You're still going to get V3. And now, these are all normalized vectors. And so you would not have an orthonormal basis for the subspace V3. I think you get the pattern now. You can keep doing this. You can define V4, when I defined the problem I just said that we're dealing with a k dimensional subspace. We go up to Vk. But you just keep going until you go up to your k. If k was 3, we'd be done. If k is 4, you say you define a subspace V4 is equal to the span of U1, U2, U3, and then throw our non-orthonormal vector, V4, into the mix. And so you can replace V4 with y4 is equal V4 minus the projection. It becomes a little hard to visualize. You're not projecting on to a three-dimensional space. The projection of V4 onto our subspace V3. It's completely analogous to this. It's just V3 is now a three-dimensional space, not a plane. You're [? finding ?] this projection onto this. This is definitely going to be orthogonal to everything else, to our subspace V3. And you can construct V4 with y4 because V4, by definition, is equal to-- we can just rearrange this-- y4 plus the projection of V4 onto V3. So you can construct V4 with y4 and some linear combination of these guys. So I can replace this guy with y4 and then I would normalize y4. I would divide it by its length and I'd get U4. And I just keep doing that until I get up to k. If I do V5, I do the process over and over and over again. And this process of creating an orthonormal basis is called the Gram-Schmidt Process. And it might seem a little abstract, the way I did it here, but in the next video I'm actually going to find orthonormal bases for subspaces. You'll see it's not too bad when you have to deal with real numbers.