Another least squares example

Name: Another least squares example
Uploaded: 2011-02-20T16:47:21Z
Description: Using least squares approximation to fit a line to points

Using least squares approximation to fit a line to points. Created by Sal Khan.

Leandro Aldair Izquierdo Vallejos
Posted 5 years ago.
At 12:00 it is y = (2/5)x + (4/5), since m* = (2/5) and b* = (4/5).
(23 votes)
CTP
Posted 8 years ago.
So, if I am correct, this is how a calculator takes a set of data points and creates a linear regression? Because it seems as if the "closest solution" to the line is easily translatable to "best fit line". If so, would any other type of regression just be non-linear transformations?
(9 votes)
- Tejas
  Posted 8 years ago.
  No, most calculators actually use the statistics approach, not the linear algebra approach. The way that works is it first calculates the correlation coefficient by averaging the products of the x-coordinate's z-score and the y-coordinate's z-score. It then multiplies the correlation coefficient by the standard deviation of the y's and divides by the standard deviation of the x's. That number is the slope. From there, it is trivial to find the y-intercept given the fact that the line must pass through the grand mean, the point whose x-coordinate is the x-mean and whose y-coordinate is the y-mean.
  
  The reason calculators don't use the Linear Algebra method is because that entails finding the inverse of a potentially very large matrix. That takes a long time, even for a calculator. Of course, it does still give the same answer.
  (13 votes)
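As a rough check of the statistics recipe described in the reply above, here is a minimal Python sketch applied to the four points from the video. The helper names (`mean`, `sample_std`) are made up for illustration, and the sketch assumes the sample standard deviation (dividing by n - 1) is used consistently for both the z-scores and the correlation coefficient. For a straight-line fit, the matrix A^T A in the linear algebra approach is only 2 by 2 no matter how many points there are, and both routes should land on the same slope and intercept.

```python
from math import sqrt, isclose

# The four points used in the video.
xs = [-1, 0, 1, 2]
ys = [0, 1, 2, 1]
n = len(xs)

def mean(values):
    return sum(values) / len(values)

def sample_std(values):
    # Sample standard deviation (divides by n - 1).
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = sample_std(xs), sample_std(ys)

# Correlation coefficient: sum of z-score products, divided by n - 1.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)) / (n - 1)

# Slope from r and the two standard deviations.
slope = r * s_y / s_x
# The line must pass through the point (x_bar, y_bar).
intercept = y_bar - slope * x_bar

print(slope, intercept)  # approximately 0.4 and 0.8, i.e. 2/5 and 4/5
```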
Saad Taame
Posted 13 years ago.
This is linear regression, right?
(10 votes)
Ellen Elizabeth Collins
Posted 6 years ago.
Why did he write 2/5 for b and graph it when [m_star, b_star] = [2/5, 4/5]? He starts writing it at the 12:05 mark. I do not know what I am missing here, please help me!
(6 votes)
- Frank Baird
  Posted 5 years ago.
  You didn't miss a thing. In fact, you caught something Sal missed. I understand that he doesn't want to remake the video, but they should add an annotation stating that the LSE fit line should be y = (2/5)x + 4/5.
  (5 votes)
trisha.panka
Posted 12 years ago.
In the video "Regression Line Example" you use the least squares method with equations for m and b. Couldn't you use that strategy for this example too? Are the least squares solutions the same as they are in the regression line? Mainly, my question is, what is the difference between this video and "Regression Line Video"? Thanks.
(3 votes)
- alphabetagamma
  Posted 12 years ago.
  You are right.
  (1 vote)
leibo
Posted 11 years ago.
I understand the technique Sal used, but I cannot understand something more fundamental: shouldn't the solution set of Ax=b remain the same (i.e. no solution) when we multiply both sides by A_transpose? It seems like multiplying both sides by the same quantity somehow produces a new solution.
(3 votes)
- Seth Paulo Bangerter
  Posted 9 years ago.
  Sal explains how he got that one clip earlier.
  (1 vote)
Pramoth Viswan
Posted 9 years ago.
Why can't we take the line equation as "ax+by=1" instead of "y=mx+b" and find the values of a and b using the least squares method? I tried doing so, but I am arriving at a different answer.
(3 votes)
- Erwin
  Posted 7 years ago.
  In the previous video, Sal did it the way you suggest (ax+by=c, for the three lines of the triangle). However, this doesn't optimize for the distance y, it optimizes for c, which has dubious value.
  By first converting the lines to mx+b=y, we can now optimize (via least-squared distance) for y.
  (1 vote)
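The difference discussed in this thread can be checked numerically. The sketch below fits both forms to the four points from the video using exact fractions. It is only an illustration: `solve_2x2` is a hypothetical helper using Cramer's rule, and the y-coefficient in "ax+by=1" is renamed `c` here so it does not clash with the intercept `b`.

```python
from fractions import Fraction as F

points = [(-1, 0), (0, 1), (1, 2), (2, 1)]

def solve_2x2(a11, a12, a21, a22, r1, r2):
    # Cramer's rule for a 2x2 system; assumes a nonzero determinant.
    det = a11 * a22 - a12 * a21
    return F(r1 * a22 - a12 * r2, det), F(a11 * r2 - r1 * a21, det)

n = len(points)
sx = sum(x for x, _ in points)
sy = sum(y for _, y in points)
sxx = sum(x * x for x, _ in points)
sxy = sum(x * y for x, y in points)
syy = sum(y * y for _, y in points)

# Model 1: y = m*x + b. Rows of A are [x, 1], right-hand side is y.
m, b = solve_2x2(sxx, sx, sx, n, sxy, sy)

# Model 2: a*x + c*y = 1. Rows of A are [x, y], right-hand side is 1.
a, c = solve_2x2(sxx, sxy, sxy, syy, sx, sy)

# Rewrite a*x + c*y = 1 as y = (-a/c)*x + 1/c so the two lines can be compared.
m2, b2 = -a / c, 1 / c

print(m, b)    # 2/5 4/5
print(m2, b2)  # 1/4 5/4
```

The two fits minimize different quantities (vertical error in y versus error in the constant 1), so getting different lines is expected rather than a sign of an arithmetic mistake.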
Jungwhan Kim
Posted 11 years ago.
Is this the Gauss-Markov theorem?
(2 votes)
Lim.ChaiYong
Posted 9 years ago.
If Ax = b and b is not in C(A) and A^T*A is a singular matrix, how can we find the least square solution of Ax = b?
(1 vote)
- Tejas
  Posted 9 years ago.
  That is not possible. You can never have a vector b which is equal to Ax and yet not in C(A). The C(A) is, by definition, the space of all vectors b such that Ax = b.
  (2 votes)
Akshay Tiwary
Posted 8 years ago.
Suppose we want to find the least squares solution of Ax = b. Here, we solve for x by solving A^T * A * x = A^T * b. Since A has linearly independent columns, we can find a unique solution for x.
But what if the columns of A are not linearly independent? Then there may not be a unique solution for A^T A x = A^T b. How do we find x* then?
(1 vote)

Video transcript

So I've got four Cartesian coordinates here. This first one is minus 1, 0. I tried to draw them ahead of time. So minus 1, 0 is this point right there. Doing this in these new colors. The next point is a 0, 1, which is that point right there. Then the next point is 1, 2, which is that point right up there. And then the last point is 2, 1, which is that point there. Now my goal in this video is to find some line, y equals mx plus b, that goes through these points. Now the first thing I'd say is, hey Sal, there is not going to be any line that goes through these points, and you can see that immediately. You could find a line that maybe goes through these points, but it's not going to go through this point over here. If you try to make a line that goes through these two points, it's not going to go through those points there. So you're not going to be able to find a solution that goes through those points. Let's set up the equation that we know we can't find the solution to and maybe we can use our least squares approximation to find a line that almost goes through all these points. Or it's at least the best approximation for a line that goes through those points. So this first one, I can express my line, y equals mx plus b. Let me just express it as f of x is equal to mx plus b, or y is equal to f of x. We can write it that way. So our first point right there -- let me do it in that color, that orange -- that tells us that f of minus 1, which is equal to m times -- let me just write this way -- minus 1 times m, it's minus m plus b, that that is going to be equal to 0. That's what that first equation tells us. The second equation tells us that f of 0, which is equal to 0 times m, which is just 0 plus b is equal to 1. f of 0 is 1. This is f of x. The next one -- let me do it in this yellow color -- tells us that f of 1, which is equal to 1 times m, or just m, plus b, is going to be equal to 2.
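The claim that no single line passes through all four points can be checked directly. This is a small sketch, not part of the video: it takes the line through the first two points and tests the remaining points against it, using exact fractions to avoid rounding.

```python
from fractions import Fraction

points = [(-1, 0), (0, 1), (1, 2), (2, 1)]

# Line through the first two points: slope, then intercept.
(x0, y0), (x1, y1) = points[0], points[1]
m = Fraction(y1 - y0, x1 - x0)
b = y0 - m * x0

# Check which of the four points satisfy y = m*x + b exactly.
on_line = [m * x + b == y for x, y in points]

print(m, b, on_line)  # 1 1 [True, True, True, False]
```

The line y = x + 1 happens to hit the first three points but misses (2, 1), which is why the system has no exact solution.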
And then this last one down here tells us that f of 2, which is of course 2 times m plus b, that that is going to be equal to 1. These are the constraints. If we assume that our line can go through all of these points, then all of these things must be true. Now you could immediately, if you wish, try to solve this equation, but you'll find that you won't find a solution. We want to find some m's and b's that satisfy all of these equations. Or another way of writing this -- We want to write it as a matrix vector or a matrix equation. We could write it like this. Minus 1, 1, 0, 1, 1, 1, 2, 1, times the vector m, b has got to be equal to the vector 0, 1, 2, 1. These two systems, this system and this system right here, are equivalent statements, right? Minus 1 times m plus 1 times b has got to be equal to that 0. 0 times m plus 1 times b has got to be equal to that 1. That's equivalent to that statement right here. And this isn't going to have a solution. The solution would have to go through all of those points. So let's at least try to find a least squares solution. So if we call this a, if we call that x, and let's call this b, there is no solution to ax is equal to b. Now maybe we can find a least -- Well, we can definitely find a least squares solution. So let's find our least squares solution such that a transpose a times our least squares solution is equal to a transpose times b. Our least squares solution is the one that satisfies this equation. We proved it two videos ago. So let's figure out what a transpose a is and what a transpose b is, and then we can solve. So a transpose will look like this. Minus 1, 1, 0, 1, 1, 1, and then 2, 1. This first column becomes this first row; this second column becomes this second row. So we're going to take the product of a transpose and then a-- a is that thing right there --minus 1, 0, 1, 2, and we just get a bunch of 1's. So what does this equal to? We have a 2 by 4 times a 4 by 2. So we're going to have a 2 by 2 matrix.
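Written out, the matrix equation read aloud above, together with the normal equation used to find the least squares solution, looks like this. It is a typeset restatement of the transcript. The vector arrows and the star on the solution are notational choices, not something shown in the source.

```latex
\[
\underbrace{\begin{bmatrix} -1 & 1 \\ 0 & 1 \\ 1 & 1 \\ 2 & 1 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} m \\ b \end{bmatrix}}_{\vec{x}}
=
\underbrace{\begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix}}_{\vec{b}},
\qquad
A^{T} A \, \vec{x}^{*} = A^{T} \vec{b},
\qquad
A^{T} = \begin{bmatrix} -1 & 0 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\]
```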
So this is going to be -- Let's do it this way. Well, we're going to have minus 1 times minus 1, which is 1, plus 0 times 0, which is 0 -- so we're at 1 right now --plus 1 times 1. So that's 1 plus the other 1 up there, so that's 2, plus 2 times 2. 2 times 2 is 4, so we get 6. That's that row, dotted with that column, was equal to 6. Now let's take this row dotted with this column. So it's going to be negative 1 times 1, plus 0 times 1, so all of these guys times 1 plus each other. So minus 1 plus 0 plus 1 -- that's all 0's --plus 2. So it's going to get a 2. I just dotted that guy with that guy. Now I need to take the dot of this guy with this column. So it's just going to be 1 times minus 1 plus 1 times 0 plus 1 times 1 plus 1 times 2. Well, these are all 1 times everything, so it's minus 1 plus 0 plus 1, which is 0 plus 2. It's going to be 2. And then finally -- Well. I mean, I think you see some symmetry here. We're going to have to take the dot of this guy and this guy over here. So what is that? That's 1 times 1, which is 1, plus 1 times 1, which is 2, plus 1 times 1. So we're going to have 1 plus itself four times. So we're going to get that it's equal to 4. So this is a transpose a. And let's figure out what a transpose b looks like. Scroll down a little bit. So a transpose is this matrix again-- let me switch colors --minus 1, 0, 1, 2. We get all of our 1's just like that. And then the matrix b is 0, 1, 2, 1. We have a 2 by 4 times a 4 by 1, so we're just going to get a 2 by 1 matrix. So this is going to be equal to a 2 by 1 matrix. We have here, let's see, minus 1 times 0 is 0, plus 0 times 1 is still 0. Plus 1 times 2, which is 2, plus 2 times 1, which is 4, right? This is 2 plus 2, so it's going to be 4 right there. And then we have 1 times 0, plus 1 times 2, plus-- So 1 times all of these guys added up. So 0 plus 1 is 1, 1 plus 2 is 3, 3 plus 1 is 4. So this right here is a transpose b. 
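The two products worked out by hand above can be reproduced in a few lines of plain Python. This is a minimal sketch with hypothetical helper names (`transpose`, `matmul`); it uses nested lists rather than a linear algebra library so that it runs with the standard library alone.

```python
# A has one row [x, 1] per data point; b holds the y-values.
A = [[-1, 1], [0, 1], [1, 1], [2, 1]]
b = [0, 1, 2, 1]

def transpose(M):
    # Columns of M become rows.
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    # Each entry is a row of X dotted with a column of Y.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

At = transpose(A)
AtA = matmul(At, A)                                        # 2x4 times 4x2 gives 2x2
Atb = [sum(a * v for a, v in zip(row, b)) for row in At]   # 2x4 times 4x1 gives 2 entries

print(AtA)  # [[6, 2], [2, 4]]
print(Atb)  # [4, 4]
```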
So just like that, we know that the least squares solution will be the solution to this system. 6, 2, 2, 4, times our least squares solution, is going to be equal to 4, 4. Or we could write it this way. We could write it 6, 2, 2, 4, times our least squares solution, which I'll write-- Remember, the first entry was m. I'll write it as m star. That's our least square m, and this is our least square b, is equal to 4, 4. And I can do this as an augmented matrix or I could just write this as a system of two unknowns, which is actually probably easier. So let's do it that way. So this, if I were to write it as a system of equations, is 6 times m star plus 2 times b star, is equal to 4. And then I get 2 times m star plus 4 times b star is equal to this 4. So let me solve for my m stars and my b stars. So let's multiply this second equation, actually let's multiply that top equation by 2. This is just straight Algebra 1. So times 2, what do we get? We get 12m star plus 4b star is equal to 8. We just multiplied that top guy by 2. Now let's multiply this magenta one by negative 1. So this becomes a minus, this becomes a minus, that becomes a minus, and now we can add these two equations. So we get minus 2m star plus 12m star, that's 10m star. And then the minus 4b and the 4b cancel out, is equal to 4, or m star is equal to 4 over 10, which is equal to 2/5. Now we can just go and back-substitute into this. We can say 6 times m star-- This is just straight Algebra 1. So 6 times our m star, so 6 times 2 over 5, plus 2 times our b star is equal to 4. Enough white, let me use yellow. So we get 12 over 5 plus 2b star is equal to 4, or we could say 2b star-- let me scroll down a little bit --2b star is equal to 4, which is the same thing as 20 over 5, minus 12 over 5, which is equal to-- I'm just subtracting the 12 over 5 from both sides --which is equal to 8 over 5. And you divide both sides of the equation by 2, you get b star is equal to 4/5.
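The elimination and back-substitution steps above can be mirrored with exact fractions. This sketch follows the same order of operations as the transcript (double the first equation, subtract the second, then back-substitute); the list layout `[m coefficient, b coefficient, right-hand side]` is just a convenient assumption for the illustration.

```python
from fractions import Fraction

# Normal equations from the transcript:
#   6*m + 2*b = 4
#   2*m + 4*b = 4
eq1 = [Fraction(6), Fraction(2), Fraction(4)]
eq2 = [Fraction(2), Fraction(4), Fraction(4)]

# Multiply the first equation by 2 and subtract the second to eliminate b.
combined = [2 * p - q for p, q in zip(eq1, eq2)]  # 10*m + 0*b = 4
m_star = combined[2] / combined[0]

# Back-substitute into the first equation to solve for b.
b_star = (eq1[2] - eq1[0] * m_star) / eq1[1]

print(m_star, b_star)  # 2/5 4/5
```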
And just like that, we got our m star and our b star. Our least squares solution is equal to 2/5 and 4/5. So m is equal to 2/5 and b is equal to 4/5. And remember, the whole point of this was to find an equation of the line. y is equal to mx plus b. Now we can't find a line that went through all of those points up there, but this is going to be our least squares solution. This is the one that minimizes the distance between a times our vector and b. No vector, when you multiply times that matrix a-- that's not a, that's transpose a --no other solution is going to give us a closer solution to b than when we put our newly-found x star into this equation. This is going to give us our best solution. It's going to minimize the distance to b. So let's write it out. y is equal to mx plus b. So y is equal to 2/5 x plus 2/5. Let's graph that out. y is equal to 2/5 x plus 2/5. So its y-intercept is 2/5, which is about there . This is at 1. 2/5 is right about there. And then its slope is 2/5. Let's think of it this way: for every 2 and 1/2 you go to the right, you're going to go up 1. So if you go 1, 2 and 1/2, we're going to go up 1. We're going to go up 1 like that. So our line-- and obviously this isn't precise --but our line is going to look something like this. I want to do my best shot at drawing it because this is the fun part. It's going to look something like that. And that right there is my least squares estimate for a line that goes through all of those points. And you're not going to find a line that minimizes the error in a better way, at least when you measure the error as the distance between this vector and the vector a times our least squares estimate. Anyway, thought you would find that neat.
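Since b star was found to be 4/5, the fitted line is y = (2/5)x + 4/5; the intercept of 2/5 spoken while writing and graphing the line appears to be a slip. The sketch below compares the sum of squared vertical errors for both lines over the four points. The function name `squared_error` is made up for this illustration.

```python
from fractions import Fraction as F

points = [(-1, 0), (0, 1), (1, 2), (2, 1)]

def squared_error(m, b):
    # Sum of squared differences between each y and the line's prediction.
    return sum((y - (m * x + b)) ** 2 for x, y in points)

best = squared_error(F(2, 5), F(4, 5))   # least squares line
slip = squared_error(F(2, 5), F(2, 5))   # line as spoken in the video

print(best, slip)  # 6/5 46/25
```

The least squares line has the smaller total squared error (6/5 versus 46/25), which matches the statement that no other line minimizes the error better under this measure.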