
# Projections onto subspaces with orthonormal bases

Projections onto subspaces with orthonormal bases. Created by Sal Khan.

## Want to join the conversation?

- Isn't A^T * A = I as well? So why doesn't projection of x onto v = x?(5 votes)
- A^T * A is I, but A * A^T is not necessarily. This is because we are no longer dotting the column vectors, which happen to be orthonormal, with themselves. We are instead dotting the rows with themselves. The row vectors, unlike the column vectors, might not be orthonormal.(20 votes)
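
A small numeric check can make this reply concrete. The sketch below is only an illustration: the matrix `A`, its two columns, and the helper functions `transpose` and `matmul` are assumptions made up for this example and do not come from the video. It uses exact fractions so that the products come out exactly.

```python
from fractions import Fraction as F

def transpose(m):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)  # the rows of bt are the columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

# Hypothetical example: A is 3 x 2 (n = 3, k = 2) and its columns
# v1 = (3/5, 4/5, 0) and v2 = (0, 0, 1) are orthonormal.
A = [[F(3, 5), F(0)],
     [F(4, 5), F(0)],
     [F(0),    F(1)]]

At = transpose(A)
AtA = matmul(At, A)  # k x k: columns of A dotted with each other
AAt = matmul(A, At)  # n x n: rows of A dotted with each other

print(AtA)  # the 2 x 2 identity
print(AAt)  # a 3 x 3 matrix that is not the identity
```

Here the rows of `A` are (3/5, 0), (4/5, 0) and (0, 1), which are not orthonormal, so `AAt` has entries such as 9/25 and 12/25 instead of 1's and 0's.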

- This and the previous video begin telling reasons why orthonormal bases make things easier to work with.

Perhaps this was mentioned or implied but if the standard-basis is orthonormal, then do the benefits outlined in this and the previous video apply to all problems where we are using the standard basis?

For example, could we just use Projv(x) = AA^Tx when we are working with the standard basis?(5 votes)
- What happens if the Gram-Schmidt procedure is applied to a list of vectors that is not linearly independent?(2 votes)
- In this video, Sal refers to A = {v1,....vk} as the matrix with the column space composed of the orthonormal elements of the basis vector V for R^n. Based on this assumption, Sal is able to arrive at the conclusion at the end that the projection of a vector x which is an element of R^n onto V can be determined using the equation AA^Tx. But under the same assumption, shouldn't AA^T evaluate to the identity matrix Ik, in which case the result obtained is projection of x onto V is equal to x itself! Is there a basic flaw to the argument that I have presented here?(2 votes)
- did u just say a matrix multiplied by its transpose is a scalar multiplied by the identity matrix? notation confusing as hell(1 vote)

- At 1:50, w is in the orthogonal complement of the subspace. Does the upside down T represent the orthogonal complement?(1 vote)
- Yes, that upside down T does indicate the orthogonal complement of a subspace. So if V is a subspace, V^⊥ would be the orthogonal complement of V, and we sometimes read it as "V perp", where the 'perp' is short for "perpendicular."(3 votes)

- I have an observational question.

Sal writes the proj_Vx = (v_1 dot x)v_1 +....etc etc

My question is. From previous video, we said that (v_1 dot x) entries are just the entries of the vector x with respect to B where v_1, v_2...v_k were the basis vectors of B. Does this mean, when we take the proj_Vx = (v_1 dot x)v_1 +....etc etc, is it the same thing as taking [x]_B (x w/ respect to B) times all the basis vectors of the subspace V (V from this video). such that

Proj_Vx = [x]_B*(v_1, v_2, ...v_k), that is, the projection of x onto the subspace V is equal to x written with respect to the basis of our subspace V times the basis vectors of our subspace V?

I hope it wasn't too convoluted.(1 vote)
- Unfortunately, since x is outside of the subspace V, it cannot be represented solely in terms of coordinates with respect to the basis vectors of V. In other words, you cannot rewrite x as [x]_B.

You can, however, write the projection of x onto V in V's coordinate system, since the projection lies in the subspace V. What you're calling [x]_B would be this projection written in basis B. You could then of course convert this projection into standard basis by multiplying B times this "[x]_B".(3 votes)

- I would love to see how to turn an equation of a plane into an orthonormal basis.(1 vote)
- An orthonormal basis is just a set of vectors that are mutually orthogonal and normalized (length equal to 1), and the equation of a plane in R3, ax + by + cz = d, gives you all the information you need to build one. For a plane in R3, all you need are two orthogonal vectors. It doesn't matter which vectors they are, as long as v1 and v2 are orthogonal to each other and both lie in the plane. Pick any x and y and solve the plane equation for z, then do it again with different numbers, and you get two points on the plane. The difference between them is the first vector of your basis, v1. Finding the second is a bit trickier, but you are already given the normal vector to the plane, n = <a, b, c>, and v2 must be orthogonal to both v1 and n. So, to find v2, all you have to do is take the cross product of v1 and n: v2 = v1 × n. Now normalize v1 and v2 and you have your orthonormal basis: {v1/||v1||, (v1 × n)/||v1 × n||}.(2 votes)
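
The recipe in this reply can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names (`point_on_plane`, `plane_orthonormal_basis`), the sample plane x + 2y + 2z = 4, and the choice of the two points are all hypothetical, and the code assumes c is nonzero so that the equation can be solved for z. Note that when d is not 0 the plane does not pass through the origin, so the two vectors form an orthonormal basis for the parallel plane through the origin.

```python
import math

def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two vectors in R3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normalize(u):
    """Scale a nonzero vector to length 1."""
    length = math.sqrt(dot(u, u))
    return tuple(a / length for a in u)

def point_on_plane(a, b, c, d, x, y):
    """Solve ax + by + cz = d for z (assumes c != 0)."""
    return (x, y, (d - a * x - b * y) / c)

def plane_orthonormal_basis(a, b, c, d):
    n = (a, b, c)                              # normal vector of the plane
    p = point_on_plane(a, b, c, d, 0.0, 0.0)   # first point on the plane
    q = point_on_plane(a, b, c, d, 1.0, 0.0)   # second point on the plane
    v1 = tuple(qi - pi for qi, pi in zip(q, p))  # direction lying in the plane
    v2 = cross(v1, n)                          # orthogonal to both v1 and n
    return normalize(v1), normalize(v2)

# Hypothetical example plane: x + 2y + 2z = 4
u1, u2 = plane_orthonormal_basis(1.0, 2.0, 2.0, 4.0)
print(u1, u2)
```

Both returned vectors have length 1, are orthogonal to each other, and are orthogonal to the normal vector, which can be checked with `dot`.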

- Just would like to clarify one more point. If A = {v1, v2, ..., vk}, shouldn't A^T be equal to {v1; v2; ...; vk}, with each vector on its own row? I see that in this video Sal uses the notation A^T = {v1^T; v2^T; ...; vk^T} instead.(1 vote)
- cant even tell what you're talking about with ur notation(1 vote)

- if A is an orthonormal basis of a subspace, then the projection of a vector x onto that subspace becomes [AA^T]x.

therefore [AA^T]x = sum{ (x.u_i)u_i } where i = 1,2,3...

is this correct?(1 vote)
- wait so at 15:49 A*A^T isn't the identity even if A^T*A is?(1 vote)

## Video transcript

We saw in the last video that
orthonormal bases make for good coordinate systems,
coordinate systems where it's easy to figure out
the coordinates. That's what we did in
the last video. Let's see if there are other
useful reasons to have an orthonormal basis. So we already know, let's say
I have some subspace V. Let's say V is a
subspace of Rn. And let's say we have B, which
is an orthonormal basis. B is equal to v1, v2,
all the way to vk. And it is an orthonormal basis
for V, which is just a fancy way of saying that all of these
vectors have length 1, and they're all orthogonal with
respect to each other. Now, we've seen many times
before that if I have just any member of Rn-- So let's say that
I have some vector x that is a member of Rn, then x can
be represented as a sum of a member of V, some vector v
that is in our subspace, and some vector w that is
in the orthogonal complement of our subspace. Let me write that down: x is equal to v plus w, where v is a member of my
subspace, and w is a member of my subspace's orthogonal
complement. We saw this when I was doing
the whole set of videos on orthogonal complements. Now what is this vector v
right here? By definition, that is the
projection of x onto V. And w would be the projection
of x onto V's orthogonal complement. And we know from past videos
that the projection is not an easy thing to find. Let's say I set up some matrix
A, that has my basis vectors as the columns-- So if I set
up some matrix A that looks like this, v1, v2, all the way
to vk, we learned before that if we wanted to figure out, and
have a kind of a general way of figuring out what the
projection is, we learned that the projection of any vector x
onto V is equal to A times the inverse of A transpose A,
times A transpose, times x. And this was a pain
to figure out. But let's see if the assumption
that these guys are orthonormal, or that this is an
orthonormal set, in any way simplifies this. So the first thing we can
do is just explore this a little bit. This vector v is a member
of our subspace, which means it can be represented as
a linear combination of my basis vectors. So I can write x is equal to,
instead of v, c1 times v1, plus c2 times
v2, all the way to plus ck times vk, plus w. This linear combination is just
some unique member of my subspace V. So that's v right there, and you
can also view this as the projection of x onto
the subspace V. So x can be represented as some
member of V plus some member of V's orthogonal
complement, which is w right there. Now what happens if we take both
sides of this equation and dot them with one of these
guys, let's say, vi? Let's dot both sides of
this equation with vi. So if I take vi dot x, where vi
is the ith basis vector in
the basis B for my subspace, what am I going to get? This is going to be c1 times
vi dot v1, plus c2 times vi dot v2, plus you're
going to keep going. Eventually you'll get to the
ith term, which will be ci times vi dotted with vi. And then, you know, assuming
that i isn't 1, 2, or k, eventually we'll get to ck
times vi dotted with vk. We saw this in the last video. I'm just dotting both sides. But we also have this w term. So then we'll go
plus vi dot w. Now, you know, just to clarify
things, in the last video, we assumed that x was inside of the
subspace, so that x could be represented with
coordinates here. Now x could be any member of Rn,
and we're just looking at the projection of x. And because it's any member,
it's going to be some combination of these guys plus
some member of V's orthogonal complement. Now when I take the dot product
of one of my basis vectors, the ith basis vector,
with both sides of this equation, the left side is just
vi dot x, but on the right side something very similar
happens to what we saw in the last video. What is vi dot v1? Well, assuming i isn't 1, they're different members
of this orthonormal set, so they're orthogonal. So that's going to be 0. vi dot
v2, that's 0, assuming i doesn't equal 2. vi dot vi is 1, so this term is just going
to be ci. vi dot vk, that's also 0. It doesn't matter what our
constant is because 0 times anything is 0. And then what is vi dot w? Well, by definition w is a
member of the orthogonal complement of V, which means
it is orthogonal to every member of V. Well, vi is a member of V, so
these two guys are orthogonal. So that is also equal to 0. And just like that, you get ci
is equal to vi dot x, just
like that. So what does this give us? This is a very
similar result to the one we got last time. But remember we're not looking
for-- We're not assuming that x is a member of V. In that case then, you know,
the ci's would be the coordinates for x. In this case, we're looking for
the projection of x onto V, or the member of V that is
kind of x's component in V, or that represents x's
projection onto V. So if we now want to find the
projection of x onto V, it's equal to these ci's times
their respective basis vectors, but now we know
what the ci's are. Each one is the corresponding basis vector
dotted with your vector x. So just like that, we get a
pretty simple way of figuring out the projection onto
a subspace with an orthonormal basis. So let's see, c1 is just
going to be v1 dot x. That's c1, and then we're going
to multiply that times the vector v1. That's a vector too. And then the next, I guess we
could say, you know, the next coefficient on v2 is going
to be v2 dot x times the vector v2. And then you're going to
go all the way to plus vk dot x times vk. And I don't know if you remember
what we did when we took the projection of
x onto some line. When we were taking the
projection of x onto some line L, where L is equal to the
span of some unit vector u, meaning all the vectors t times u for t any real
number, that's just a line, the span of
some unit vector, where we assume u
has length 1. Then the projection onto a line
just simplified to the formula x dot-- let me write
it this way --x dot u times the vector u. This was a projection
onto a line. Notice when we're dealing with
an orthonormal basis for a subspace, when you take a
projection of any vector in Rn onto that subspace, it's
essentially, you're just finding the projection onto the
line spanned by each of these vectors, and adding those up, right? x dot v1 times
the vector v1 is the projection onto the line spanned by v1. You're taking x's projection
onto the line spanned by each of these guys. That's all it is. But clearly this is a much, much
simpler way of finding a projection than going through
this mess of saying A times the inverse of A transpose A,
times A transpose-- I forgot that A transpose when I wrote
it the first time --times x. This is clearly a lot easier. But you might say, OK, this is
easier but you told me that a projection is a linear
transformation. You've told me it's a linear
transformation, so I want to figure out the matrix here. So let's see if being
orthonormal in any way simplifies this. So we could always just figure
out for any particular x. We can just apply the dot
product with each of the basis vectors, those will be the
coefficients, and then apply those coefficients times the
basis vectors, add them up and you know your projection. But, you know, some of us
might actually want the transformation matrix. So let's figure out
what it is. Let me just rewrite what
we already know. We already know that the
projection onto any subspace V of x is equal to A times
the inverse of A transpose A, times A transpose, times x, where A's column vectors are
just the basis vectors v1, v2, all the way to vk. Now, let's see if the assumption
that these guys are an orthonormal basis,
let's see if this simplifies it at all. Let's take the case in
particular of A transpose A. A transpose A is going
to be equal to what? It's going to be equal
to A transpose-- Let's think about this. These guys are members of
Rn, so A is going to be an n by k matrix. So A is n by k, and A transpose
right here is k by n, times an n by k. We're going to have
a k by k product. k by n times n by k is
going to be k by k. A transpose A is going
to be k by k. And what is A transpose
equal to? Well each of these columns
is going to become a row. So the first row here is going
to be v1 transpose. The second row here is going
to be v2 transpose. Then you're going to go
all the way down. The kth row there is going
to be vk transpose. Just like that. And then A is, of course,
this thing right there. So A looks like this. You have v1, like that. You have v2, like that. And then you keep going and you
have vk, just like that. What's going to happen when
we take this product? Let's do a couple of
rows right here. So when I take this product,
I'm going to get a k by k matrix. Let me write it big so I can
explain it reasonably. So what's the first row, first
column going to be? It's going to be this
row dotted with this column, or v1 dot v1. Well, v1 dot v1, that's
nice, that's just 1. And then what's the second
row, second column? Well, you're going to get your row
from this guy and your column from that guy. This row dotted with that
column, so v2 dot v2, so that's nice. That'll be a 1. And in general, if you're
finding the ii entry of the product, anything along the
diagonal, you're going to take the ith row
with the ith column, vi dot vi. So you're just going to have 1's
that go all the way down the diagonal. Now what about everything
else? Let's say that you're looking
for this entry right here, which is the first row,
second column. This is going to be
the dot product of this row, v1 transpose, with this column
right there, v2. So this is going to
be v1 dot v2. But these two guys are
orthogonal, so what's that going to be equal to? It's going to be equal to 0. This one right here is going
to be v1 dot v3. Well that's going to be 0. v1 dot anything other than
v1 is going to be 0. Similarly, everything here
in the second row is going to be v2 dotted with some column. The first column in the second
row is going to be v2 dot v1, which is clearly 0. And then you have v2
dot v2, which is 1. And then v2 dot all the rest
of stuff is going to be 0. They're all orthogonal with
respect to each other. And so everything else, if your
row and your column is not the same-- Well, if your
row and your column is the same, you're going to be dotting
the same vectors, so you're going to be getting
1, because all their lengths are 1. But if your row and column are
not the same, you're going to be taking the dot product of two
different members of your orthonormal basis. And they're all orthogonal, so
you're just going to get a bunch of 0's. Now, what is this? You have 0's everywhere, with
1's down the diagonal. It's a k by k matrix. This is the k by k identity
matrix. So normally this general formula was
our way of finding the transformation matrix
for the projection of x onto some subspace. But that simplifies: if we assume
an orthonormal basis, then A transpose A becomes the k
by k identity matrix. And so what's the inverse
of the identity matrix? So A transpose A inverse becomes
the inverse of the k by k identity matrix,
which is just the k by k identity matrix. So this simplifies to the
projection onto V of our vector x, simplifies to A
times the inverse of the identity matrix-- It was just
the identity matrix. So it's just A times Ik, times
A transpose-- I always forget that second A transpose
right there-- times x. And we could just ignore this. That does nothing to it. So it's just equal to A times A
transpose, times x, which is a huge simplification. I still have to do a matrix
product, but finding the transpose of a matrix is
pretty straightforward. You just switch the rows
and the columns. Before, multiplying the transpose
times A was a lot of work, and it was a huge amount
of work to find the inverse of that product. But now, since we assumed that
these columns here form an orthonormal set, A transpose A
just gets reduced to the identity matrix. And the projection of x onto V
is just equal to A times A transpose, times x, where A is the
matrix whose column vectors are the basis
vectors for our subspace V. Anyway, hopefully that gives you
even more appreciation for orthonormal bases.
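
The two results from the video, the sum of (vi dot x) times vi and the matrix product A times A transpose times x, can be compared on a small example. The sketch below is an illustration under stated assumptions rather than anything shown in the video: the basis vectors `v1` and `v2`, the vector `x`, and the function names `project_by_dots`, `projection_matrix`, and `matvec` are all made up for this example. It uses exact fractions to avoid rounding.

```python
from fractions import Fraction as F

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def project_by_dots(basis, x):
    """Sum of (v_i . x) v_i over an orthonormal basis of the subspace."""
    proj = [F(0)] * len(x)
    for v in basis:
        c = dot(v, x)  # coefficient c_i = v_i . x
        proj = [p + c * vi for p, vi in zip(proj, v)]
    return proj

def projection_matrix(basis):
    """The n x n matrix A A^T, where the columns of A are the basis vectors."""
    n = len(basis[0])
    return [[sum(v[i] * v[j] for v in basis) for j in range(n)]
            for i in range(n)]

def matvec(matrix, x):
    """Matrix times vector, with the matrix stored as a list of rows."""
    return [dot(row, x) for row in matrix]

# Hypothetical orthonormal basis for a 2-dimensional subspace V of R3.
v1 = [F(3, 5), F(4, 5), F(0)]
v2 = [F(0), F(0), F(1)]
basis = [v1, v2]
x = [F(1), F(2), F(3)]

p_dots = project_by_dots(basis, x)   # (v1 . x) v1 + (v2 . x) v2
P = projection_matrix(basis)         # A A^T
p_matrix = matvec(P, x)              # A A^T x

# w = x - proj_V(x) should be orthogonal to every basis vector of V.
residual = [xi - pi for xi, pi in zip(x, p_dots)]

print(p_dots == p_matrix)
print([dot(v, residual) for v in basis])
```

Both approaches give the same vector, and the leftover part of x is orthogonal to each basis vector, which matches the decomposition of x into a member of V plus a member of V's orthogonal complement.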