Proof (Part 3) Minimizing Squared Error to Regression Line
- All right, so where we left off, we had simplified our
- algebraic expression for the squared error to the line from
- the n data points.
- We kind of visualized it.
- This expression right here would be a surface, I guess
- you could view it as a surface in three dimensions, where for
- any m and b there is going to be a point on that surface that
- represents the squared error for that line.
- Our goal is to find the m and the b, which would define an
- actual line, that minimize the squared error.
- The way that we do that, is we find a point where the partial
- derivative of the squared error with respect to m is 0,
- and the partial derivative with respect to b is also
- equal to 0.
- So it's flat with respect to m.
- So that means that the slope in this direction
- is going to be flat.
- Let me do it in the same color.
- So the slope in this direction, that's the partial
- derivative with respect to m, is going to be flat.
- It's not going to change in that direction.
- The partial derivative with respect to b
- is going to be flat.
- So it will be a flat point right over there.
- The slope at that point in that direction will also be 0,
- and that is our minimum point.
- So let's figure out the m and b's that give us this.
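For reference, the expression being differentiated can be reconstructed from the terms read out in this video. The on-screen form comes from the previous part, so treat this as a reconstruction rather than a quotation. The overline notation for a mean over the n data points is an assumption introduced here; the transcript only says "the mean of the x's" and so on.

```latex
\begin{aligned}
SE(m, b) &= n\,\overline{y^2} - 2mn\,\overline{xy} - 2bn\,\overline{y}
            + m^2 n\,\overline{x^2} + 2mbn\,\overline{x} + n b^2 \\[4pt]
\frac{\partial SE}{\partial m} &= 0, \qquad \frac{\partial SE}{\partial b} = 0
\end{aligned}
```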
- So if I were to take the partial derivative of this
- expression with respect to m.
- Well this first term has no m terms in it.
- So it's a constant from the point of view of m.
- Just as a reminder, partial derivatives, it's just like
- taking a regular derivative.
- You're just assuming that everything but the variable
- that you're doing the partial derivative with respect to,
- you're assuming everything else is a constant.
- So in this expression, all the x's, the y's, the b's, the
- n's, those are all constant.
- The only variable, when we take the partial derivative
- with respect to m, that matters is the m.
- So this is a constant.
- There's no m here.
- This term right over here, we're taking
- with respect to m.
- So the derivative of this with respect to m, it's kind of the
- coefficients on the m.
- So negative 2 times n times the mean of the xy's, that's
- the partial of this with respect to m.
- Then this term right here has no m's in it.
- So it's constant with respect to m.
- So its partial derivative with respect to m is 0.
- Then this term here, you have n times the mean of the x
- squared times m squared.
- So this is going to be-- we're talking about a partial
- derivative with respect to m-- so it's going to be 2 times n
- times the mean of the x squareds times m.
- The derivative of m squared is 2m, and then you just have
- this coefficient there as well.
- Now this term, you also have an m over there.
- So let's see, everything else is just kind of a
- coefficient on this m.
- So the derivative with respect to m is 2bn times
- the mean of the x's.
- If I took the derivative of 3m, the derivative is just 3.
- It's just the coefficient on it.
- Then finally, this is a constant with respect to m.
- So we don't see it.
- So this is the partial derivative with respect to m.
- That's that right over there.
- We want to set this equal to 0.
- Now let's do the same thing with respect to b.
- This term, once again, is a constant from the
- perspective of b.
- There's no b here.
- There's no b over here.
- So the partial derivatives of either of these with
- respect to b is 0.
- Then over here you have a negative 2n times the mean of
- y's as a coefficient on a b.
- So the partial derivative with respect to b is going to be
- minus 2n, or negative 2n, times the mean of the y's.
- Then there's no b over here.
- Then we do have a b over here.
- So it's plus 2mn times the mean of the x's.
- This is essentially the coefficient
- on the b over here.
- It was written in a mixed order, but all of these are
- constants from the point of view of b.
- They are the coefficient in front of the b.
- The partial derivative of that with respect to b is just
- going to be the coefficient.
- Then finally, the partial derivative of this with
- respect to b is going to be 2nb, or 2nb to the first you
- could even say.
- We want to set this equal to 0.
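One way to sanity-check the two partial derivatives just read off is to compare them with a numerical slope on a small data set. The sketch below is only an illustration: the data points, the helper names, and the test point (m0, b0) are all made up here and do not come from the video.

```python
# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 4.0, 5.0]
ys = [1.0, 3.0, 4.0, 7.0]
n = len(xs)

def mean(values):
    return sum(values) / len(values)

mean_x = mean(xs)
mean_y = mean(ys)
mean_xy = mean([x * y for x, y in zip(xs, ys)])
mean_x2 = mean([x * x for x in xs])

def squared_error(m, b):
    """Sum of squared vertical distances from the points to y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

def d_se_dm(m, b):
    """Partial derivative with respect to m, term by term as in the transcript."""
    return -2 * n * mean_xy + 2 * n * mean_x2 * m + 2 * b * n * mean_x

def d_se_db(m, b):
    """Partial derivative with respect to b, term by term as in the transcript."""
    return -2 * n * mean_y + 2 * m * n * mean_x + 2 * n * b

def central_difference(f, m, b, wrt, h=1e-6):
    """Numerical slope of f in the m direction or the b direction."""
    if wrt == "m":
        return (f(m + h, b) - f(m - h, b)) / (2 * h)
    return (f(m, b + h) - f(m, b - h)) / (2 * h)

m0, b0 = 0.7, -1.3  # arbitrary test point on the surface
num_dm = central_difference(squared_error, m0, b0, "m")
num_db = central_difference(squared_error, m0, b0, "b")
print(abs(num_dm - d_se_dm(m0, b0)) < 1e-4)
print(abs(num_db - d_se_db(m0, b0)) < 1e-4)
```

Because the squared error is quadratic in m and b, the central difference should agree with the formulas up to rounding, which is why a loose tolerance is enough here.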
- So it looks very complicated.
- But remember, we're just trying to solve for the m's
- and the b's.
- We have two equations with two unknowns here.
- We have the m's and then we have the b's.
- To simplify this, both of these equations, actually the
- top one and the bottom one, both sides are
- divisible by 2n.
- I mean 0 is divisible by anything.
- It'll be just 0.
- So let's divide the top equation by 2n and see
- what we get.
- If we divide the top equation by 2n, this'll become just 1.
- That goes away, and then those go away.
- You would just be left with the negative mean of the xy's plus
- m times the mean of the x squareds, plus b times the mean of
- the x's is equal to 0.
- That's this first expression when you divide both sides by 2n.
- For the second expression, when you divide it by 2n, this will
- go away, that will go away, and then those will go away.
- You're just left with the negative mean of the y's plus
- m times the mean of the x's plus b is equal to 0.
- So if we find the m and the b values that satisfy the system
- of equations, we have minimized the squared error.
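As a rough illustration of solving this system directly, here is a minimal sketch that uses Cramer's rule on the two equations. It is one traditional approach, not necessarily the one used later in the series. The function names and the sample data are assumptions made for the example, and the sketch assumes the x values are not all equal.

```python
def mean(values):
    return sum(values) / len(values)

def solve_normal_equations(xs, ys):
    """Solve  m*mean(x^2) + b*mean(x) = mean(xy)  and  m*mean(x) + b = mean(y)."""
    mean_x = mean(xs)
    mean_y = mean(ys)
    mean_xy = mean([x * y for x, y in zip(xs, ys)])
    mean_x2 = mean([x * x for x in xs])
    det = mean_x2 - mean_x ** 2  # determinant of the 2x2 system
    if det == 0:
        raise ValueError("all x values are equal, so no unique line exists")
    m = (mean_xy - mean_x * mean_y) / det
    b = (mean_x2 * mean_y - mean_x * mean_xy) / det
    return m, b

def squared_error(xs, ys, m, b):
    """Sum of squared vertical distances from the points to y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 4.0, 5.0]
ys = [1.0, 3.0, 4.0, 7.0]

m, b = solve_normal_equations(xs, ys)
best = squared_error(xs, ys, m, b)

# Compare against lines whose m and b are nudged slightly away from the solution.
nearby = [squared_error(xs, ys, m + dm, b + db)
          for dm in (-0.1, 0.0, 0.1) for db in (-0.1, 0.0, 0.1)]
print(m, b, best <= min(nearby))
```

For this data the solution is close to m = 1.3 and b = -0.15, and none of the nudged lines has a smaller squared error.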
- We could just solve it in a traditional way.
- But I want to rewrite this, because I think it's kind of
- interesting to see what these really represent.
- So let's add this mean of the xy's to both
- sides of this top equation.
- What do we get?
- We get m times the mean of the x squareds
- plus b times the mean of the x's is equal to, these are
- going to cancel out, is equal to the mean of the xy's.
- That's that top equation.
- This bottom equation, right here, let's add the mean of y
- to both sides of this equation.
- I do that so that that cancels out.
- And then we're left with m-- I'll do that in the blue color
- to show you the same equation-- we have m times the
- mean of the x's plus b is equal to the mean of the y's.
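Written out in symbols, the two rearranged equations are as follows. The overline notation for a mean over the data points is an assumption introduced here for compactness.

```latex
\begin{aligned}
m\,\overline{x^2} + b\,\overline{x} &= \overline{xy} \\
m\,\overline{x} + b &= \overline{y}
\end{aligned}
```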
- Now, I actually want to get both of these
- into mx plus b form.
- This is actually already there.
- Actually you can see, that if our best-fitting line is going
- to be y is equal to mx plus b-- we still have to find the
- m and the b-- but we see on that best-fitting line,
- because the m and the b that satisfy both of these
- equations are going to be the m and the b on that
- best-fitting line.
- So that best-fitting line actually contains the point,
- and we get this from the second equation right here.
- It contains the point.
- I should write it this way.
- The point (mean of x, mean of y) lies on the line.
- And you could see it right over here.
- If you put the mean of x in this for the optimal m and b,
- you are going to get the mean of the y.
- So that's interesting.
- This optimal line.
- Let's never forget what we're even trying to do.
- This optimal line is going to contain some point on it-- let
- me do that in a new color-- it's going to contain some
- point on it right here that is the mean of all of the x
- values and the mean of all the y values.
- That's just interesting.
- It kind of makes sense.
- It kind of makes intuitive sense.
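This claim can be checked without using the two equations at all. The sketch below finds the minimum of the squared error by plain gradient descent, which is a swapped-in numerical technique and not the method of the video, and then tests whether the resulting line passes through the point (mean of x, mean of y). The data, the step size, and the iteration count are assumptions tuned to this small example.

```python
# Hypothetical sample data, used only for illustration.
xs = [1.0, 2.0, 4.0, 5.0]
ys = [1.0, 3.0, 4.0, 7.0]
n = len(xs)

def gradient(m, b):
    """Partial derivatives of the squared error, computed from the residuals."""
    residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
    d_m = -2 * sum(r * x for r, x in zip(residuals, xs))
    d_b = -2 * sum(residuals)
    return d_m, d_b

# Plain gradient descent: keep stepping downhill until the surface is flat.
# The step size is an assumption that happens to be stable for this data.
m, b = 0.0, 0.0
step = 0.01
for _ in range(5000):
    d_m, d_b = gradient(m, b)
    m -= step * d_m
    b -= step * d_b

mean_x = sum(xs) / n
mean_y = sum(ys) / n

# If the mean point is on the line, this gap is zero up to rounding.
gap = (m * mean_x + b) - mean_y
print(f"m = {m:.4f}, b = {b:.4f}")
print(f"line height at mean of x, minus mean of y: {gap:.2e}")
```

For this data the descent settles at roughly m = 1.3 and b = -0.15, and the gap is zero up to rounding. A different data set may need a smaller step size for the loop to converge.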
- Now this other thing, just to kind of get it in the same
- point of view.
- Then it will actually become a kind of an easier way to solve
- the system.
- You could solve this a million different ways.
- But just to give us an intuition of what even is
- going on here, what's another point that's on the line?
- Because if you have two points on the line, you know what the
- equation of the line is going to be.
- Well the other point, we want this to be in mx plus b form.
- So let's divide both sides of this equation by this term
- right here, by the mean of the x's.
- If we do that, we get m times the mean of the x squareds
- divided by the mean of the x's plus b is equal to the mean of
- the xy's divided by the mean of the x's.
- So when you write it in this form, this is the exact same
- equation as that, I just divided both sides by the mean
- of the x's, you get another interesting point that will
- lie on this optimal fitting line, at least from the point
- of view of the squared distances.
- So another point that will lie on it, on this optimal line,
- the x value is going to be this, the mean of the x squareds
- divided by the mean of the x's.
- Then the y value is going to be the mean of the xy's
- divided by the mean of the x's.
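In symbols, and assuming the mean of the x's is not zero so that the division is allowed, the divided equation and the point it gives can be written as follows. The overline notation for a mean is an assumption introduced here.

```latex
m\,\frac{\overline{x^2}}{\overline{x}} + b = \frac{\overline{xy}}{\overline{x}}
\quad\Longrightarrow\quad
\left( \frac{\overline{x^2}}{\overline{x}},\ \frac{\overline{xy}}{\overline{x}} \right)
\text{ lies on } y = mx + b
```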
- I'll let you think about that a little bit more.
- But already, these are two points that lie on the
- line, so both of these are on the best-fitting line based on how
- we're measuring a good fit, which is the squared distance.
- These are on the line that minimizes
- that squared distance.
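Since two points determine a line, the two points found above are enough to recover m and b. The following is a minimal sketch of that idea on made-up data. The variable names are hypothetical, and it assumes the mean of the x's is not zero and the x values are not all equal, so that the two points are distinct.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 4.0, 5.0]
ys = [1.0, 3.0, 4.0, 7.0]

mean_x = mean(xs)
mean_y = mean(ys)
mean_xy = mean([x * y for x, y in zip(xs, ys)])
mean_x2 = mean([x * x for x in xs])

# First point on the best-fitting line: (mean of x, mean of y).
x1, y1 = mean_x, mean_y

# Second point: (mean of x squareds / mean of x, mean of xy / mean of x).
# This needs mean_x != 0.
x2, y2 = mean_x2 / mean_x, mean_xy / mean_x

# Line through the two points: rise over run, then back out the intercept.
slope = (y2 - y1) / (x2 - x1)
intercept = y1 - slope * x1

# Both equations from the transcript should hold for this slope and intercept.
eq1_residual = slope * mean_x2 + intercept * mean_x - mean_xy
eq2_residual = slope * mean_x + intercept - mean_y
print(slope, intercept)
print(abs(eq1_residual) < 1e-9, abs(eq2_residual) < 1e-9)
```

For this data the slope comes out close to 1.3 and the intercept close to -0.15, and both residuals are zero up to rounding.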
- What I'm going to do in the next video, and this is turning
- into like a six or seven video saga on trying to prove the
- best-fitting line or finding the formula for the
- best-fitting line.
- But it's interesting.
- There's all sorts of kind of neat little mathematical
- things to ponder over here.
- But in the next video, we can actually use this information.
- We could have just solved the system straight up.
- But we can actually use this information right here to
- solve for our m and b's.
- Maybe we'll do it both ways depending on my mood.