
Showing that A-transpose x A is invertible

Showing that (transpose of A)(A) is invertible if A has linearly independent columns. Created by Sal Khan.

Want to join the conversation?

  • newbarker:
    Although not explicitly stated, it's obvious that the n (in n x k) must be >= k isn't it? I say this because if there were more columns than rows, one couldn't make all the column vectors linearly independent.
    (37 votes)
  • CodeLoader:
    If a square matrix needs all of its columns/rows to be linearly independent, and also a nonzero determinant, in order to be invertible, is the determinant just a kind of measure of the linear independence of a matrix's rows/columns?
    (4 votes)
  • SteveSargentJr:
    At , Sal mentions the transpose of the "reverse product". Is "reverse product" an actual term used in linear algebra, or did Sal just make it up on the spot?
    (3 votes)
  • Kyle Delaney:
    So does this only work if you multiply a matrix by its transpose in that order or can you switch them around?

    Also, if you try this with a matrix that doesn't have linearly independent rows then does that mean you know for sure that the product won't be invertible?
    (3 votes)
  • Muhammad Moosa:
    Why must the Null(A) only contain the 0 vector? Can't it contain any perpendicular vector, especially when the Rank(A)<n?
    (2 votes)
    • loumast17:
      As I understand it, the columns/vectors must be linearly independent. Then you are solving the equation Ax = 0, so x must be a vector with k elements.

      If the rank is less than n as you suggest, or in other words k < n, then row reduction gets A into a form similar to an identity matrix with extra rows of 0s below the pivots. You also know that the vector x you are multiplying has k elements. Hopefully you can see that the only x that produces the zero vector is the zero vector itself; if not, I can show it.

      If n = k, it's a similar argument, except you will literally get an identity matrix.

      If k > n, so more columns than rows, it is impossible for the columns to be linearly independent. There will not be enough pivots for every column.

      To deal with the case you specifically offer, let's use a 3x2 matrix. Linear independence means it can eventually be reduced to [<1,0,0>, <0,1,0>] (hopefully it's clear what that should look like). Now your suggestion amounts to taking a dot product with a perpendicular vector, which here would be <0,0,1>. But that would be a 3x2 matrix multiplied by a 3x1 vector, which cannot be done because the dimensions don't match.

      If you have learned about left null spaces, or the null space of the transpose of a matrix, that's what <0,0,1> is here; or it could be <0,0,a>, where a is any number. The left null space can also be described as the set of vectors x whose transpose, multiplied by A on the left, gives 0, i.e. x^T * A = 0.

      Let me know if that didn't make sense.
      (2 votes)
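
      A quick numerical sketch of this point, assuming NumPy and an arbitrary 3x2 example matrix with independent columns:

      import numpy as np

      A = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [2.0, 3.0]])              # example 3x2 matrix, columns independent
      k = A.shape[1]

      # rank(A) == k exactly when the only solution of A @ x = 0 is x = 0
      print(np.linalg.matrix_rank(A) == k)    # True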
  • Daniel Celis Barrera:
    What happens if the column vectors of A are not L.I.? Is (A^T A) still invertible?
    What if, instead of taking the product A^T A, I take the product A A^T?

    By the way, thanks a lot Sal, I've learned so much from your videos.
    (3 votes)
  • Vinod P:
    In this video Sal mentions that the product of the transpose of a vector with the vector itself is equivalent to the dot product of the vector with itself, i.e., (y^T)y = y . y. This is definitely intuitive, but is there a formal proof, if at all, given in any other video?
    (2 votes)
    • InnocentRealist:
      Using the rules for matrix multiplication, what's the product of the 1 x n matrix A = [a1 a2 ... an] (a single row) and the n x 1 matrix A^T (a single column)?

      For a = (a1, a2, ..., an), what's a dot a?

      These calculations are just the direct applications of the definitions of matrix multiplication and dot product, which, as definitions, are not provable.

      If you're not sure how to calculate these, search "khan academy matrix multiplication" and "khan academy dot product" at DuckDuckGo for the videos.
      (1 vote)
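
      A tiny numerical check of this identity, assuming NumPy and an arbitrary example vector:

      import numpy as np

      y = np.array([[1.0], [-2.0], [3.0]])      # a k x 1 column vector
      print(y.T @ y)                             # [[14.]]  -- the matrix product (y^T)(y)
      print(np.dot(y.ravel(), y.ravel()))        # 14.0     -- the dot product y . y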
  • Adrian Diaz Fortich:
    Does (A^T A)^-1 represent any particular mathematical property or definition? Maybe covariance or dispersion? What do the individual elements mean?
    Thanks
    (2 votes)
    • InnocentRealist:
      I don't know. But since for any matrix B,

      rank(B) = rank(B^T) = dim(row space of B),

      both the columns and the rows of B = (A^T)A are linearly independent sets, and so

      both rref(B) and rref(B^T) are identity matrices, and the solution sets of "Bx = b" and "(B^T)x = c" are single fixed vectors, with no free variables; so in general they are not vector spaces (except when b or c is 0, in which case the solution set is the null space {0}).
      (1 vote)
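
      A small sketch of this claim, assuming SymPy is available and using an arbitrary 3x2 example matrix:

      import sympy as sp

      A = sp.Matrix([[1, 0],
                     [0, 1],
                     [2, 3]])        # columns are linearly independent
      B = A.T * A                    # B = (A^T)A, a 2x2 square matrix

      rref_B, pivots = B.rref()
      print(rref_B)                  # Matrix([[1, 0], [0, 1]]) -- the identity
      print(B.det() != 0)            # True, so B is invertible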
  • Tarun Akash:
    Heyyy, I just thought: in order for the column vectors to be linearly independent (L.I.) and for the row vectors also to be L.I., the number of columns and the number of rows have to at least be equal to each other, don't they? I mean, you at least need the same number of variables as the number of equations?
    (1 vote)
  • Elisa Warner:
    What is the utility of showing that A^T * A is invertible? In real life, is there ever a need to multiply a matrix by its transpose to make it invertible?
    (1 vote)

Video transcript

OK. I've got some matrix A. It's an n by k matrix. Let's say it's not just any n by k matrix. This matrix A has a bunch of columns that are all linearly independent. So a1, a2, all the way through ak, are linearly independent. They are linearly independent columns. Let me write that down: a1, a2, all the column vectors of A, all the way through ak, are linearly independent. Now, what does that mean? That means that the only solution to x1 times a1, plus x2 times a2, plus all the way to xk times ak, equaling the zero vector is that all of these x's are 0. So all the xi's must be equal to 0. That's what linear independence implies. Or another way to write it is that the only solution to A times the vector x1, x2, all the way down to xk, equaling the zero vector is the one where all of these entries are equal to 0. This is just another way of writing this right there. We've seen it multiple times. That's the zero vector right there. So if all of these have to be 0, that's like saying that the only solution to Ax equals the zero vector is x equal to the zero vector. Or another way to say it -- this is all coming out of the fact that this guy's columns are linearly independent. So linear independence of columns. Based on that, we can say, since the only solution to Ax equals 0 is x equal to the zero vector, we know that the null space of A must be the zero vector, or rather the set with just the zero vector in it. And that is all a bit of review. Now, A is n by k. We don't know how its dimensions compare. It may or may not be a square matrix, so we don't know, necessarily, whether it's invertible and all of that. But maybe we can construct an invertible matrix with it. So let's study A transpose times A. A is an n by k matrix, so A transpose will be a k by n matrix, and A transpose A is going to be a k by k matrix. So it's a square matrix. That's a nice place to start for an invertible matrix. So let's see if it is actually invertible. We don't know anything else about A. All we know is that its columns are linearly independent. Let's see if A transpose A is invertible. Essentially, to show that it's invertible, if we can show that all of its columns are linearly independent, then we'll know it's invertible. I'll get back to this at the end of the video, but if you have a square matrix with linearly independent columns -- remember, the linearly independent columns are all associated with pivot columns when you put the matrix in reduced row echelon form. So if you have a square matrix -- if it's a k by k matrix -- that means that the reduced row echelon form of the matrix will have k pivot columns and be a square k by k matrix. And there's only one k by k matrix with k pivot columns, and that's the identity matrix, the k by k identity matrix. And if, when you reduce something to reduced row echelon form, you get the identity matrix, that means that your matrix is invertible. I could have left that to the end of the video, but I just want to show you. We already know that A transpose A is a square matrix. If we can show that, given that A has linearly independent columns, A transpose times A also has linearly independent columns -- and given that the columns are linearly independent and it's a square matrix -- that tells us that when we put it into reduced row echelon form, we'll get the identity matrix.
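
A quick numerical sketch of this setup, assuming NumPy and an arbitrary 4 by 2 example matrix with independent columns:

import numpy as np

n, k = 4, 2
A = np.array([[1.0,  0.0],
              [0.0,  1.0],
              [1.0,  1.0],
              [2.0, -1.0]])            # an n x k matrix with linearly independent columns

print(np.linalg.matrix_rank(A) == k)   # True: the only solution of A @ x = 0 is x = 0
AtA = A.T @ A
print(AtA.shape)                       # (2, 2) -- A^T A is k x k, so it is square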
And that tells us that this thing would be invertible. Let's see if we can prove that all of this guy's columns are linearly independent. So let's say I have some vector v, and let's say my vector v is a member of the null space of A transpose A. That means that if I take A transpose A times my vector v, I'm going to get the zero vector. Fair enough? Now, what happens if I multiply both sides of the equation by the transpose of this guy? So I'll get a v transpose -- actually, let me just do it right here. I multiply by v transpose on this side, and v transpose on this side. You could view this as a matrix-vector product, right? Or, in general, if you take a row vector times a column vector, it's essentially their dot product. So the right-hand side of the equation -- you dot anything with the zero vector, and that is just going to be the zero vector. Now what is the left-hand side of this going to be? We've seen this before. Even though this is the transpose of a vector -- it is a row vector -- you can also view it as a matrix. Right? Let's say v is a k by 1 matrix; then v transpose will be a 1 by k matrix. We've seen before that the transpose of a product is the reverse product of the transposes -- if we take the product of two things and transpose it, that's the same thing as taking the transposes of those two matrices and multiplying them in the reverse order. So given that, we can replace v transpose times A transpose right here with the transpose of the vector Av, and we're multiplying it by Av, this vector right here. And that is going to be equal to the zero vector. Now, what is this? Remember, even though I have a matrix-vector product right here, when I multiply a matrix times this vector, it results in another vector. So this is a vector, and this is a vector right here. And if I take some vector y and I multiply its transpose times that same vector -- we've seen this before -- that is the same thing as y dot y. These two statements are identical. So this thing right here is the same thing as Av dot Av. And so what does the right-hand side equal? The right-hand side is going to be equal to 0. Actually, let me just make a correction up here. When I take v transpose times the zero vector, v transpose is going to have k elements, and the zero vector is also going to have k elements. And when I take this product, that's like dotting them: you're taking the dot product of v and 0. So this is the dot product of v with the zero vector, which is equal to zero, the scalar zero. So this right here is the scalar zero. I want to make sure I clarify that; it wouldn't make sense otherwise. So the right-hand side, when I multiply the zero vector by the transpose of v, is just the number zero -- no zero vector there. So this Av dot Av is going to be equal to 0. Or we could say that the magnitude, or the length, of Av, squared, is equal to 0. And that tells us that Av has to be equal to 0, because the only vector whose length is 0 is the zero vector. So Av -- let me switch colors, I'm using that one a little bit too much. So we know that Av must be equal to 0, to the zero vector. This must be equal to the zero vector since its length is 0. Now, we started off by saying v is a member of the null space of A transpose A. v can be any member of the null space of A transpose A. But then from that assumption, it turns out that v also has to be a member of the null space of A, because Av is equal to 0.
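
A hedged numerical illustration of this step, assuming NumPy; the matrix below is chosen with dependent columns on purpose, so that the null space is nontrivial:

import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])     # second column = 2 * first column, so columns are dependent
AtA = A.T @ A

v = np.array([2.0, -1.0])      # a vector in the null space of A^T A
print(AtA @ v)                 # [0. 0.]      -- (A^T A) v = 0
print(A @ v)                   # [0. 0. 0.]   -- and A v = 0 as well
print(np.dot(A @ v, A @ v))    # 0.0          -- ||Av||^2 = v^T A^T A v = 0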
Let's write that down. If v is a member of the null space of A transpose A, then v is a member of the null space of A. Now, our null space of A, because A's columns are linearly independent, only contains one vector: it only contains the zero vector. So if this guy is a member of the null space of A transpose A, and he has to be a member of the null space of A, there's only one thing he can be. There's only one entry there. So v has to be equal to the zero vector. Or another way to say that is, any v that's in the null space of A transpose A has to be the zero vector. So the null space of A transpose A is equal to the null space of A, which is just the zero vector sitting there. Now, what does that do for us? That tells us that the only solution to A transpose A times some vector x equaling the zero vector is x equal to the zero vector. Right? Because the null space of A transpose A is the same as the null space of A, and that just has the zero vector in it. The null space is just the set of solutions to this equation. So if the only solution is the zero vector, that means that the columns of A transpose A are linearly independent. You could, essentially, write all of the linear combinations of the columns with the entries of x as the weights. We actually did that at the beginning; it's the same argument we used up here. So all of its columns are linearly independent -- and I said it over here: A transpose A has linearly independent columns -- and it's a square matrix, which followed from how we constructed it. So we now know that -- let me do it this way -- the reduced row echelon form of A transpose A is going to be equal to the k by k identity matrix, which tells us that A transpose A is invertible. Which is a pretty neat result. I started with a matrix that has linearly independent columns. So it wasn't just any matrix, not just any run-of-the-mill matrix. It did have linearly independent columns, but it might have weird dimensions; it's not necessarily a square matrix. But I could construct a square matrix, A transpose A, with it. And we now know that it also has linearly independent columns, and it's a square matrix, and therefore it is invertible.
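
A closing sketch of the result, again just an assumed NumPy example: A itself is not square, so it has no inverse of its own, but A^T A does.

import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 3.0]])                     # 3x2, columns linearly independent

AtA = A.T @ A                                  # 2x2
AtA_inv = np.linalg.inv(AtA)                   # would raise LinAlgError if AtA were singular
print(np.allclose(AtA_inv @ AtA, np.eye(2)))   # True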