A projection onto a subspace is a linear transformation
Showing that a projection onto a subspace is a linear transformation. Created by Sal Khan.
Want to join the conversation?
- Why is it that at 12:36 Sal cancels Inverse(transpose(A) A) transpose(A) A, but at 13:55 A Inverse(transpose(A) A) transpose(A) cannot cancel out? Or equivalently, what property would allow the movement of A, if it was allowed? In working with numbers x, y and z I know that x*y*z = y*x*z = y*z*x ... but what property lets me do this and why is it not valid in Matrix Algebra?
Mark (7 votes)
- The property that would allow the movement of A is called commutativity. As you said, when working with numbers or in plain algebra, the multiplication order can be swapped around without affecting the result.
http://en.wikipedia.org/wiki/Commutative_property
Matrix multiplication is not commutative, so in the 2nd expression, matrix A cannot be relocated just before transpose(A).
Why is it not commutative? My explanation is that the way it is defined makes it non-commutative. The extreme case is with non-square matrices: consider matrix C, which is a 3x2 matrix (3 rows, 2 columns), and matrix D, which is 2x11 (2 rows, 11 columns). The product CD is a 3x11 matrix. The product DC isn't even defined. For square matrices, my advice is to create a couple of 2x2 matrices with entries a, b, c, etc., multiply them, then multiply with the order swapped to see that the result is different.
I hope I've helped a bit, but please leave a comment if what I've said didn't properly address your question.(12 votes)
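The suggestion in the answer above can be tried directly. This is a minimal Python sketch, not anything from the video: the helper `matmul` and the entries of `F`, `G`, `C` and `D` are illustrative assumptions. Only the 3x2 and 2x11 shapes come from the answer.

```python
def matmul(X, Y):
    """Multiply matrices stored as lists of rows; raise if the shapes do not fit."""
    if len(X[0]) != len(Y):
        raise ValueError("inner dimensions do not match")
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Two arbitrary 2x2 matrices (made-up values for illustration).
F = [[1, 2], [3, 4]]
G = [[0, 1], [1, 0]]

FG = matmul(F, G)
GF = matmul(G, F)
print(FG)
print(GF)
print(FG == GF)   # swapping the order changes the product here

# Non-square case: C is 3x2 and D is 2x11, so CD is 3x11 but DC is undefined.
C = [[1, 0], [0, 1], [1, 1]]
D = [[1] * 11, [2] * 11]
CD = matmul(C, D)
print(len(CD), len(CD[0]))

try:
    matmul(D, C)
    dc_defined = True
except ValueError:
    dc_defined = False
print(dc_defined)
```

Some particular pairs of matrices do commute, so the sketch only shows that the order cannot be swapped in general.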
- At 11:25 Sal references a video--which video is he referring to? Since the playlists have been reshuffled I'm not sure where to find it... (3 votes)
- The one where he shows A^TA is invertible? It's here: https://www.khanacademy.org/math/linear-algebra/matrix_transformations/matrix_transpose/v/lin-alg--showing-that-a-transpose-x-a-is-invertible (3 votes)
- At 10:36, why can't you just say that x is equal to Ay? This would imply that x is a member of V, so its projection onto V would just be equal to itself. If x and Ay are not equal, that would mean that multiplying by A^T is not a linear transformation; it would be equivalent to a relation outputting the same value for multiple inputs, which precludes it as a function. Why can this be ignored? I have to be missing something here. (2 votes)
- If ab = ac, then we can say that b = c, but only if a ≠ 0. A similar rule applies to matrix-vector multiplication. If A x⃑ = A y⃑, then x⃑ = y⃑ only if A is invertible. Since we're dealing with an nxk matrix, we don't know that it's invertible in the context of the video (and it probably isn't since n would have to equal k). (4 votes)
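To make the point about cancelling concrete, here is a small Python sketch in which A^T x = A^T (Ay) holds even though x and Ay differ. The matrix `A`, the vector `x` and the helper functions are made-up illustrations, not values from the video.

```python
def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

# Hypothetical 3x2 matrix whose columns are a basis for the xy-plane in R3.
A = [[1, 0],
     [0, 1],
     [0, 0]]
At = transpose(A)

x = [1, 2, 3]        # a vector that is NOT in the plane
y = [1, 2]           # coordinates of the projection of x in this basis
Ay = matvec(A, y)    # the projection itself

lhs = matvec(At, x)
rhs = matvec(At, Ay)
print(lhs == rhs)    # A^T x and A^T A y agree
print(x == Ay)       # yet x and Ay are different vectors

# The difference x - Ay is sent to the zero vector by A^T.
difference = [a - b for a, b in zip(x, Ay)]
print(matvec(At, difference))
```

Multiplying by A^T is still a perfectly good function. It just maps several different inputs to the same output, which is why it cannot be cancelled.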
- At 9:20 he says a couple of times the projection of v onto x. Shouldn't it be the projection of x onto V? Or are they one and the same thing? (2 votes)
- Yup, you are right, he confused it that one time. (3 votes)
- At 4:11, why did Sal say that x is a member of Rn? Shouldn't it be in Rk? (2 votes)
- I got confused there too. I think 'x' is a vector not included in the span of V. It would have been clearer with a diagram, but I think 'x' is like the vector 'x' in the prior video, where it is outside the subspace V (V in that video was a plane, R2). So 'x' extended into R3 (outside the plane). We can therefore break 'x' into 2 components,
1) its projection into the subspace V, and
2) the component orthogonal to the subspace...
If 'x' had been contained in the span of V there would have been no reason to break it into its components.(1 vote)
- Because the matrix A is a basis matrix, meaning its columns are linearly independent, aren't the nullspace and the left nullspace both trivial? So how can it have members other than the 0 vector? (1 vote)
- Remember, the LEFT nullspace is the null space of the transpose of A. A is the linearly independent matrix, the transpose of A is a different matrix.
If you go back in the video he's talking about the left nullspace, not N(A), which as you pointed out would indeed be trivial.(3 votes)
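A small Python sketch of the distinction in the answer above, using a made-up 3x2 matrix with linearly independent columns. The search over small integer vectors is only a spot check that nothing nonzero is sent to zero by `A`, not a proof.

```python
def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

# Hypothetical 3x2 matrix; its two columns are linearly independent.
A = [[1, 0],
     [1, 1],
     [0, 1]]
At = transpose(A)

# A nonzero vector in the left nullspace: A^T w = 0 even though w != 0.
w = [1, -1, 1]
print(matvec(At, w))

# Spot check of N(A): among small integer vectors y, only y = 0 gives Ay = 0.
kernel_hits = [[y1, y2]
               for y1 in range(-3, 4)
               for y2 in range(-3, 4)
               if matvec(A, [y1, y2]) == [0, 0, 0]]
print(kernel_hits)
```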
- When Sal multiplies both sides of the equation by (A^T A)^-1, how does he know where to put it? In matrix multiplication, order matters, right? Is it simply sufficient to put the (A^T A)^-1 in the same place on both sides (in this case, right at the start)? Because surely if he had put (A^T A)^-1 at the start on the left side and at the end on the right side, that would not be an equality anymore? What if he had put both of them at the end of the respective sides, would that have been legal? (1 vote)
- When talking about matrices:
if you have 3 matrices A, B and C with inverses,
say you have A = BC,
then the following are valid:
B^-1 A = (identity matrix) C -> B^-1 A = C
A B^-1 = B C B^-1, which doesn't simplify any further
(id) = A^-1 B C
(id) = B C A^-1
Matrix multiplication is associative, which means A(BC) = (AB)C.
Matrix multiplication is not commutative, which means AB ≠ BA in general. (2 votes)
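The rule in the answer above (multiply both sides on the same side) can be checked numerically. This is a hedged Python sketch: the matrices `B` and `C` and the helpers `matmul` and `inverse_2x2` are illustrative assumptions, and the inverse helper only handles 2x2 matrices.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix using the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

# Made-up invertible matrices; A is defined so that A = BC holds.
B = [[1, 1], [0, 1]]
C = [[2, 0], [1, 1]]
A = matmul(B, C)

B_inv = inverse_2x2(B)

left = matmul(B_inv, A)    # B^-1 A = B^-1 B C = C
right = matmul(A, B_inv)   # A B^-1 = B C B^-1, which need not equal C

print(left == C)
print(right == C)
```

Multiplying on the left of one side and on the right of the other would compare `left` with `right`, and in this example those two matrices differ.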
- Is A^T A invertible only if A is a square matrix? (1 vote)
- If we are working in a space with dimension n, (in other words, x is a vector with n components), I think I understand the concept of a projection on to some subspace with a dimension of k which is less than n. But what would the projection of x be onto the space Rn? Or does it even make sense to talk about a projection of an n-dimensional vector to Rn?
My sense is that projecting the (n-dimensional) vector x onto Rn just yields the vector x. So it almost isn't worth even talking about it as a 'projection' at all. So, is there such a thing as a projection of an n-dimensional vector x onto Rn? (1 vote)
- An n-dimensional (real) vector already exists in R^n, so I don't think there's any point in projecting it onto R^n, as it would yield the vector itself. (1 vote)
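As a quick check of this answer, the formula from the video, A (A^T A)^-1 A^T, can be evaluated for a basis of the whole space. The sketch below assumes n = 2 and a made-up, non-orthogonal basis of R2, with small helper functions written only for this example. The result is the identity matrix, so every x projects to itself.

```python
from fractions import Fraction

def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Multiply matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix using the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns (1, 0) and (1, 1) form a hypothetical basis for all of R2.
A = [[1, 1],
     [0, 1]]
At = transpose(A)

# Projection matrix A (A^T A)^-1 A^T.
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

identity = [[1, 0], [0, 1]]
x = [3, -7]
print(P == identity)
print(matvec(P, x) == x)
```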
- Wait, he said you have k vectors in Rn that span V (a subspace of Rn) as a 'basis'. Is that possible? (1 vote)
Video transcript
We've defined the notion of a
projection onto a subspace, but I haven't shown you yet that
it's definitely a linear transformation. Nor have I shown you how,
if you know a basis for a subspace, you actually
find a projection onto it. So let's see if we can make
any progress on that here. So let's say I have
some subspace. Let's say v is a
subspace of Rn. And let's say I've got some
basis vectors for v. So let's say these are my basis
vectors, basis vector 1, 2, and I have k of them. I don't know what v's
dimension is, but let's say it's k. It's got k basis vectors,
so it is a basis for v. And that means that any vector--
that's not v vector, that's v subspace-- now that
means that any vector-- let me call some vector, I don't know,
let's say, any vector a that is a member of my subspace
can be represented. That means that a can be
represented as a linear combination of these guys. So I'll make my linear
combination. Let's say it is, I don't know,
y1 times b1, plus y2 times b2, all the way to plus
yk times bk. That's what the definition
of a basis is. The span of these guys is your
subspace v, so any member of v can be represented as a linear combination of my basis vectors. Now, if I were to construct a
matrix-- let's make it an n by k matrix-- whose columns are
essentially the basis vectors of my subspace. So A looks like this,
the first column is my first basis vector. My second column is my second
basis vector, and I go all the way to my k'th column, and I have
k columns, is going to be my k'th basis vector. If I'm going to have my k'th
basis vector-- let me make the closing bracket the same color
as my opening bracket, just like that-- it's going to have
n rows, because each of these basis vectors is
a member of Rn. Remember, v is a subspace of Rn,
so each of these guys are going to have n terms.
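As a small illustration of the construction just described, here is a Python sketch that builds an n by k matrix whose columns are basis vectors. The vectors `b1` and `b2` (so n = 3 and k = 2) are made-up values for illustration, not ones from the video.

```python
# Hypothetical basis for a 2-dimensional subspace v of R3 (n = 3, k = 2).
b1 = [1, 0, 1]
b2 = [0, 1, 1]
basis = [b1, b2]

# Build A as a list of rows so that column j of A is the j'th basis vector.
A = [list(row) for row in zip(*basis)]

n_rows = len(A)      # one row per component of a vector in Rn
k_cols = len(A[0])   # one column per basis vector

print(A)
print(n_rows, k_cols)

# Reading the columns back out recovers the basis vectors.
columns = [[row[j] for row in A] for j in range(k_cols)]
print(columns == basis)
```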
So this matrix is going to have n rows. Now, saying that any member
of the subspace v can be represented as a linear
combination of these basis vectors, is equivalent to saying
that any member, that a, that any member a of our
subspace v can be represented as the product of our matrix A,
times some vector y, where Ay is equal to a, for some y,
that is a member of Rk. Now why is this statement and
that statement equivalent? Well you can imagine, if you
were to just multiply this times some vector y in Rk, so
it's y1, y2, all the way down to yk, this is going to be equal
to y1 times b1, plus y2 times b2, all the way to plus yk
times bk, which is the same thing as this. So you can always pick the
right linear combination. You can always pick the right
member of Rk, so that you get the right linear combination of
your basis vectors to get any member of your subspace v. So any member of my subspace,
right there, can be represented as the product
of the matrix A with some vector in Rk. Now we don't know much about
this vector here in Rk. Now, the projection-- let's
say that x is just some arbitrary member of Rn-- the
projection of x onto our subspace v, that is by
definition going to be a member of your subspace. Or another way of saying it is
that this guy, the projection onto v of x is going to be equal
to my matrix A, is going to be equal to-- I'll do it in
blue-- is going to be equal to A times some vector y, or
some vector y in Rk. If we knew what that vector y
was, if we could always find it, then we would have a
formula, so to speak, for figuring out the projection
of x onto v. But we don't have that yet. All I've said is, any member of
v can be represented as a product of our matrix A, which
has the basis for v as columns, and some
member of Rk. That just comes out of the fact
that these guys span v, that any member of
v is a linear combination of those guys. We know that the projection of
x onto v is a member of our subspace v, it has to
be inside of v. So it can also be represented
this way. Now what was our definition
of our projection? Our definition of our
projection, we say-- Well, let me write
it this way. We know that x can be
represented as the sum of the projection onto v of x, plus
some member of v complement. Or maybe I could even write,
plus the projection onto the orthogonal complement of v. You could write this way. I could have also written this
as w, where w is a member of v complement. Actually, let me write
it that way. That might make it simpler. I don't want to get too many
projections here-- plus w, where w is a unique member
of the orthogonal complement of v. Or you could say it this way,
if you subtract a projection of x onto v from both sides, you
get x minus the projection of x onto v, is equal to w. Or another way to say it is that
this guy right here is going to be a member of the
orthogonal complement of v, right, because this is
the same thing as w. Now what's the orthogonal
complement of v? We go back to this
matrix here. I have these basis vectors. Right here is the columns. So the column space of A is
going to be equal to v, right? The column space of
A is just the span of these basis vectors. And by definition, that is
going to be equal to my subspace v. Now what is the orthogonal
complement of v? The orthogonal complement of v
is going to be the orthogonal complement of my column space. And what's the orthogonal
complement of a column space? That's equal to the null space
of A transpose, or you could also call that the left
null space of A. But we've seen that many,
many, videos ago. So we could say that x minus the
projection of x onto v is a member of-- let me write
it this way-- x minus the projection onto v of x is a
member of the orthogonal complement of my column space
of my matrix, which is equal to the null space
of A transpose. That's the orthogonal
complement of v. This is the same thing as the
orthogonal complement of v. But what does this mean? What does this mean
right here? This means that if I take A
transpose, and I multiply it times this vector, because it's
a member of A transpose's null space. So if I multiply it times that
vector right there-- so, x minus the projection of x onto v-- then
I'm going to get 0. I'm going to get the 0 vector. That's the definition
of a null space. So let's write this out
a little bit more. Let's see if we can
algebraically manipulate it a bit. So if we distribute this matrix
vector product, we get A transpose times the vector
x, minus A transpose, times the projection-- actually let
me write this this way. Instead of keep writing the
projection onto v of x, what did we say earlier
in this video? We said the projection of x onto
v can be represented as the matrix product of
the matrix A, times some vector y in Rk. That's where we started
off the video. So let me write it that way,
because that's what going to simplify our work
a little bit. So I'm going to distribute
the A transpose, A transpose times x. And then A transpose, minus A
transpose, times this thing. This thing I can write as A
times some vector y, and this is just a byproduct of the
notion that the projection is a member of our subspace. Because it's a member of our
subspace, it's going to be some linear combination of
the column vectors of A. We saw that up here, so it can
be represented in this way. So instead of projection onto
v of x, I can just write Ay. This thing and this thing are
equivalent, because this thing is a member of v. And then all of that is going
to be equal to 0. And then if we add this to both
sides of this equation, we get that A transpose x is
equal to A transpose A of y. Now this is interesting. Remember where we started
off here. We said that the projection onto
v of x is equal to Ay for some y that is a member of Rk. If we knew what that y was, if
we could always solve for that y, then the projection of
x would be well defined. And we could always just
figure it out. Now can we solve for y here? Well, we'll be able to solve
for y if we can take the inverse of that matrix. If this major is always
invertible, then we're always going to be able to
solve for y here. Because we just take the inverse
of this matrix and multiply it on the
left of both sides of this equation. Now if you remember, three
videos ago, I think it was three videos ago, I showed you
that if I have a matrix A whose columns are linearly
independent, then A transpose A is always invertible. The whole reason why I did that
video was for this moment right here. Now, what about our matrix A? Well, our matrix A has column
vectors that form the basis for a subspace. By definition, basis vectors
are linearly independent. So our matrix A has columns that
are linearly independent. And if you watched that video,
and if you believe what I told you, then you'll know that A
transpose A, in our case, is going to be invertible. It has to be inveritible. So let's take the inverse
of it and multiply it times both sides. If we take A transpose A
inverse-- we know that this exists, because A has linearly
independent columns-- and multiply it times this side
right here, A transpose x. And then on this side we get--
well, we're going to do the same thing-- A transpose A
inverse, times this thing right here, A transpose Ay. These two things when you
multiply them, when you multiply the inverse of a
matrix, times the matrix, you're just going to get
the identity matrix. So that's just going to be equal
to the identity matrix. And the identity matrix times y
is just going to be y, so we get-- and this is a vector--
so if I flip them around, I get that y is equal to this
expression right here. A transpose A inverse, which'll
always exist, times A transpose, times x. Now we said the projection of x
onto v is going to be equal to A times y, for some y. Well we just solved for the y
using our definition of a projection. We just were able
to solve for y. So now, we can define our
projection of x onto v as a matrix vector product. So we can write the projection
onto v of our vector x is equal to A, times y, and
y is just equal to that thing right there. So A times A transpose A
inverse-- which always exists because A has linearly
independent columns-- times A transpose, times x. And this thing right here, this
long convoluted thing, that's just some matrix, some
matrix which always exists for any subspace that
has some basis. So we've just been able to
express the projection of x onto a subspace as a matrix
vector product. And any transformation that can be written as a
matrix vector product is a linear
transformation. And not only did we show that
it's a linear transformation, we showed that, look, if you can
give me a basis for v, I'm going to make those basis
vectors the columns of some matrix A. And then if I take matrix A, if
I take its transpose, if I take A transpose times A, and
invert it, and if I multiply them all out in this way,
I'm going to get the transformation matrix
for the projection. Now this might seem really
complicated, and it is hard to do by hand for many, many
projections, but this is super useful if you're going to do
some three-dimensional graphical programming. Let's say you have some
three-dimensional object, and you want to know what it looks
like from the point of view of some observer. So let's say you have
some observer. Some observer's point of view
is essentially going to be some subspace. You want to see what the
projection of this cube onto the subspace, how would it
look to the person who's essentially on to this flat
screen right there. How would that cube look from
this point of view? Well if you know the basis for
this subspace, you can just apply this transformation. You can make a matrix whose
columns are these basis vectors for this observer's
point of view. And then you can apply this to
every vector in this cube in R3, and you'll know exactly how
this cube should look from this person's point of view. So this is actually a
super useful result.
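The result of the video, that the projection of x onto v equals A (A^T A)^-1 A^T times x, can be sketched in Python for the cube example. Everything specific here is an assumption made for illustration: the basis chosen for the observer's plane, the unit cube, the test vectors `u` and `w`, and the helper functions (the inverse helper only handles the 2x2 case, which is all a plane needs).

```python
from fractions import Fraction

def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Multiply matrices stored as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix using the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical basis for the observer's plane: b1 = (1, 0, 0), b2 = (0, 1, 1).
A = [[1, 0],
     [0, 1],
     [0, 1]]
At = transpose(A)

# Transformation matrix for the projection: A (A^T A)^-1 A^T.
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

# Project the 8 corners of the unit cube onto the plane.
cube = [[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)]
projected = [matvec(P, corner) for corner in cube]
for corner, image in zip(cube, projected):
    print(corner, "->", [str(value) for value in image])

# Each residual x - Px should be in the null space of A^T.
residuals_ok = all(
    matvec(At, [xi - pi for xi, pi in zip(corner, image)]) == [0, 0]
    for corner, image in zip(cube, projected)
)

# Spot check of linearity on two made-up vectors and a scalar.
u, w, c = [1, 2, 3], [4, 0, -2], 5
additive = (matvec(P, [a + b for a, b in zip(u, w)])
            == [a + b for a, b in zip(matvec(P, u), matvec(P, w))])
homogeneous = matvec(P, [c * a for a in u]) == [c * a for a in matvec(P, u)]
print(residuals_ok, additive, homogeneous)
```

Exact fractions are used instead of floating point so that the equality checks are not affected by rounding. A real graphics program would more likely use a numerical library.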