Main content

## Linear algebra

### Course: Linear algebra > Unit 2

Lesson 7: Transpose of a matrix- Transpose of a matrix
- Determinant of transpose
- Transpose of a matrix product
- Transposes of sums and inverses
- Transpose of a vector
- Rowspace and left nullspace
- Visualizations of left nullspace and rowspace
- rank(a) = rank(transpose of a)
- Showing that A-transpose x A is invertible

© 2023 Khan AcademyTerms of usePrivacy PolicyCookie Notice

# Showing that A-transpose x A is invertible

Showing that (transpose of A)(A) is invertible if A has linearly independent columns. Created by Sal Khan.

## Want to join the conversation?

- Although not explicitly stated, it's obvious that the n (in n x k) must be >= k isn't it? I say this because if there were more columns than rows, one couldn't make all the column vectors linearly independent.(36 votes)
- Astute observation, the amount rows must indeed be equal to or greater than the columns in order for it to be linearly independent, otherwise you'd get some non-pivot columns.(19 votes)

- If a square matrix needs all columns/rows to be linearly independent, and also determinant not equal to 0 in order to be invertible, so is determinant just the kind of measure of non-linear-dependence of rows/columns of a matrix?(4 votes)
- Yes it is. If the determinant is not zero, then the rows and columns will be linearly independent and if the determinant is zero, then the rows and columns will not be linearly independent.(3 votes)

- At6:25,Sal mentions the transpose of the "reverse product". Is "reverse product" an actual term used in Linear Algebra or did Sal just make it up on the spot?(3 votes)
- So does this only work if you multiply a matrix by its transpose in that order or can you switch them around?

Also, if you try this with a matrix that doesn't have linearly independent rows then does that mean you know for sure that the product won't be invertible?(3 votes)- It's only true if A is a square matrix. Because AxA(transpose) =/= A(transpose)xA that's why we can't say that A x A-transpose is invertible. You can prove it if you follow the same process for A x A-transpose. You won't end up at the same conclusion.(1 vote)

- Why must the Null(A) only contain the 0 vector? Can't it contain any perpendicular vector, especially when the Rank(A)<n?(2 votes)
- As I understand it the columns/ vectors must be linearly independent. Then you are solving the function Ax=0, so x must be a vector with k elements.

If the rank is less than n like you offer, or in other words k<n, then you can get them in the form similar to an identity matrix with extra 0s below each 1. You also know that the vector x you are multiplying is going to have k elements. Hopefully you can see the only vector k that will make the 0 vector is another 0 vector, if not I can show it.

if n=k, it's a similar argument except you will literally get an identity matrix.

if k>n, so more columns than rows it is impossible to make the matrix linearly independent. There will not be enough pivot columns to fill each column.

To deal with the case you specifically offer let's use a 3x2 matrix. Linear independence means it will eventually be reduced to [<1,0,0>,<0,1,0>] (Hopefully that makes sense what it should look like.) Now your solution is make a dot product with a perpendicular vector, which we could observe is <0,0,1> So we have a 3x2 multiplied by a 3x1. This cannot be done due to the dimensions

If you have learned about left nullspaces, or the null space of the transpose of a matrix, that's what <0,0,1> is here. or it could be <0,0,a> where a is any number. left nullspace can also be shown that the transpose of the left nullspace times A on the left, so x^T * A = 0

Let me know if that didn't make sense.(2 votes)

- What happens if the column vectors of A are not L.I?

(A^T x A) is still invertible?

what if instead of taking the product of (A^T x A), I take the product of (A x A^T)?

By the way, thanks a lot Sal, I've Learned too much with your videos.(3 votes)- For your second question:

If (A_t)A is invertible, then so is A(A_t), because

A(A_t) = ((A_t)_t)(A_t) = (B_t)B, which is also the transpose of a matrix times the matrix.(0 votes)

- In this video Sal mentions that the dot product of the transpose of a vector to itself is equivalent to the product of the vector to itself, i.e., y^T . y = y . y. This is definitely intuitive but is there a formal proof, if at all, given in any other video?(2 votes)
- Using the rules for matrix multiplication, what's the product of the matrix A = [a1 a2 ... an] and the matrix A^T (a matrix consisting of one column vector)?

For a = (a1, a2, ..., an), what's a dot a?

These calculations are just the direct applications of the definitions of matrix multiplication and dot product, which, as definitions, are not provable.

If you're not sure how to calculate these, search "khan academy matrix multiplication" and "khan academy dot product" at DuckDuckGo for the videos.(1 vote)

- Does (A^T A)^-1 represent any particular mathematic property or definition? Maybe covariance or dispersion? what do the individual elements mean?

Thanks(2 votes)- I don't know. But since for any matrix B,

rank(B) = rank(B^T) = rank(row space of B),

both the columns and rows of "B = (A^T)A" are linearly independent sets, and so

both rref(B) and rref(B ^T) are identity matrices, and the solution spaces for "Bx=b" and "(B^T)x=c" are just fixed vectors, with no free variables and so in general no vector spaces (unless it's the null space).(1 vote)

- Heyyy, i just thought, in order for the column vectors to be linearly independent (L.I) and for the row vectors also to be L.I. The number columns and the number of rows have to be ATLEAST equal to each other. Isn't it? I mean you atleast need the same number of variables as the number of equations?(1 vote)
- That's correct. The rows being independent, the columns being independent, and the matrix being invertible are all equivalent properties, and only square matrices are invertible.(2 votes)

- What is the utility of showing that A^T * A is invertible? In real-life, is there ever a need to multiply a matrix by its transpose to make it invertible?(1 vote)
- Yes. The least squares regression method uses A^T A and is one of the most important methods of statistics.(2 votes)

## Video transcript

OK. I've got some matrix A. It's an n by k matrix. Let's say it's not just
any n by k matrix. This matrix A has a bunch of
columns that are all linearly independent. So, a1. a2, all the way through ak
are linearly independent. They are linearly independent
columns. Let me write that down. a1, a2, all the column
vectors of A. All the way through ak are
linearly independent. Now, what does that mean? That means that the only
solution to x1 times a1 plus x2 times a2, plus all the
way to xk times ak. The only solution to this is all
of these x's have to be 0. So, all xi's must
be equal to 0. That's what linear independence
implies. Or another way to write it is
all the solutions to this equation x1, x2, all
the way down to xk equaling the zero vector. That all the solutions to this
are all of these entries have to be equal to 0. This is just another way of
writing this right there. We've seen it multiple times. That's the zero vector
right there. So if all of these have to be 0,
that's like saying that the only solution to ax is equal
to 0, is x is equal to the zero vector. Or another way to say it-- this
is all coming out of the fact that this guy's columns
are linearly independent. So linear independence
of columns. Based on that, we can say, since
the only solution to ax is equal to 0 is x is equal to
the zero vector, we know that the null space of a must be
equal to the zero vector. Or it's a set with the just
the zero vector in it. And that is all a
bit of review. Now, n by k. We don't know its dimensions. It may or may not be
a square matrix. So we don't know, necessarily,
whether it's invertible and all of that. But maybe we can construct an
invertible matrix with it. So, let's study a transpose
times a. a transpose times a. A is an n by k matrix. A transpose will be
a k by n matrix. So, A transpose a is going
to be a k by k matrix. So it's a square matrix. So that's a nice place
to start for an invertible matrix. So let's see if it is
actually invertible. We don't know anything
about A. All we know is its columns
are linearly independent. Let's see if A transpose
a is invertible. Essentially, to show that it's
invertible, if we can show that all of its columns are
linearly independent, then we'll know it's invertible. If we have any-- and I'll
get back to this at the end of the video. But if you have a square
matrix with linearly independent columns-- remember,
the linearly independent columns all are
associated with pivot columns when you put them in reduced
row echelon form. So if you have a square matrix,
then you're going to have exactly-- so if it's a k by
k matrix, that means you're going to have k-- that means
that the reduced row echelon form of a matrix will have k
pivot columns and be k by k. And be a square k by k matrix. And there's only one k by k
matrix with k pivot columns. And that's the identity
matrix. The k by k identity matrix. And if when you do something
to reduce row echelon form, and it you got the identity
matrix, that means that your matrix is invertible. I could have probably left that
to the end of the video, but I just want to show you. If we can show that-- we already
know that this guy's square, that a transpose
A is a square matrix. If we can show that, given
that a has linearly independent columns, that a
transpose times A also has linearly independent columns,
and given the columns are linearly independent, and it's a
square matrix, that tells us that when we put it into reduced
row echelon form, we'll get the identity matrix. And that tells us that this
thing would be invertible. Let's see if we can prove that
all of this guy's columns are linearly independent. So let's say I have
some vector V. Let's say my vector V is a
member of the null space of a transpose A. That means that if I take a
transpose A times my vector v, I'm going to get the
zero vector. Fair enough? Now, what happens if I multiply
both sides of the equation times the transpose
of this guy? So I'll get a v transpose--
actually let me just do it right here. I multiply v transpose
on this side, and v transpose on this side. You could view this as a
matrix vector product. Right? Or, in general, if you take a
row vector times a column vector, it's essentially
their dot product. So this right-hand side of the
equation, you dot anything with the zero vector. That is just going to
be the zero vector. Now what is the left-hand side
of this going to be? We've seen this before. If you have the transpose of--
we can view this as, even though it's a transpose of a
vector, you can view it as a-- it is a row factor,
but you could also view it as a matrix. Right? Let's say v is a
k by 1 matrix. v transpose will be
a 1 by k matrix. We've seen this before. That that is equal to the
reverse product, the transpose of the reverse product. Or if we take the product of two
things and transpose it, that's the same thing as taking
the reverse product of the transposes of either
of those two matrices. So given that, we can replace
this right here with a times a vector v transpose-- and we're
multiplying this vector times av times this vector
right here. And that is going to be equal
to the zero vector. Now, what is this? If I'm taking some vector's
transpose, and let's say this is a vector. Remember, even though I have a
matrix vector product right here, when I multiply a matrix
times this vector, it will result in another vector. So this is a vector, and this
is a vector right here. And if I take some vector and I
multiply its transpose times that vector-- we've
seen this before. That is the same thing
as y dot y. These two statements
are identical. So this thing right here is the
same thing as av dot av. And so what does the right-hand
side equal? The right-hand side is going
to be equal to 0. Actually let me just make
a correction up here. When I take v transpose times
the zero vector, v transpose is going to have k elements. And then the zero vector is also
going to have k elements. And when I take this product
that's like dotting it. You're taking the dot
product of v and 0. So this is a dot product of v
with the zero vector which is equal to zero, the
scalar zero. So this right here's
the scalar zero. I want to make sure
I clarify that. It wouldn't make sense
otherwise. So the right-hand side, when I
multiply the zero vector times the transpose of v, gets
just the number zero. No vector zero there. So this av dot av is going
to be equal to 0. Or we could say that the
magnitude, or the length, of av squared is equal to 0. Or that tells us that av
has to be equal to 0. The only vector whose length
is 0, is the zero vector. So av-- let me switch colors. Using that a little
bit too much. So we know that av
must be equal to 0, to the zero vector. This must be equal to the zero
vector since its length is 0. Now, we started off with saying
v is a member of the null space of a transpose A. v can be any member of the null
space of a transpose A. But then from that assumption,
it turns out that V also has to be a member of the
null space of A. That av is equal to 0. Let's write that down. If v is a member of the null
space of a transpose A, then v is a member of the
null space of a. Now, our null space of A,
because A's columns are linearly independent, it only
contains one vector. It only contains the
zero vector. So, if this guy's a member of
the null space of A transpose A, and he has to be a member
of the null space of A, there's only one thing
he can be. There's only one entry there. So then v has to be equal
to the zero vector. Or another way to say that is,
any v that's in our null space of a transpose A has to
be the zero vector. Or the null space of a transpose
A is equal to the null space of a which is equal
to just the zero factor sitting there. Now, what does that do for us? That tells us that the only
solution to a transpose A times some vector x equal to
zero, this says that the only solution is the zero vector is
equal to the zero vector. Right? Because the null space of a
transpose A is the same as the null space of a. And that just has the
zero vector in it. The null space is just
the solution to this. So if the only solution to the
null space is this, that means that the columns of a transpose
A are linearly independent. You could, essentially, write
all of the linear combinations of the columns by the weights
of the entries of x. We actually did that
at the beginning. It's the same argument
we used up here. So if all of their columns are
linearly independent, and I said it over here, a transpose
A has linearly independent columns, and it's a square
matrix, that was from the definition of it. So we now know that A transpose
A if I were to put it-- let me do this way. That tells me that the reduced
row echelon form of a transpose A is going to be equal
to the k by k identity matrix which tells me that a
transpose A is invertible. Which is a pretty neat result. I started with the
matrix that has linearly independent columns. So it wasn't just any matrix. It wasn't just any run
of the mill matrix. It did have linearly independent
columns, but it might have weird dimensions. It's not necessarily
a square matrix. But I could construct
a square matrix. a transpose A with it. And we now know that
it also has linearly independent columns. It's a square matrix. And therefore it is invertible.