- [Instructor] What we're
going to do in this video is calculate a typical measure of how well the actual data points agree with a model, in
this case a linear model, and there are several names for it. We could consider this to
be the standard deviation of the residuals and that's essentially what
we're going to calculate. You could also call it
the root-mean-square error and you'll see why it's called this because this really describes
how we calculate it. So, what we're going to do is look at the residuals
for each of these points and then we're going to find
the standard deviation of them. So, just as a bit of review, the ith residual is going to
be equal to the ith Y value for a given X minus the predicted Y value for a given X. Now, when I say Y hat right over here, this just says what would
the linear regression predict for a given X? And this is the actual Y for a given X. So, for example, and we've
done this in other videos, this is all review, the residual here when X is equal to one, we have Y is equal to one, but what was predicted by the model is 2.5 times one minus two, which is 0.5. So this residual is equal to one minus 0.5, which is equal to 0.5, and it's a positive 0.5. If the actual point is above the model, you're going to have a positive residual. Now, for the residual over here, you also have the actual point
being higher than the model, so this is also going to
be a positive residual and once again, when X is equal to three, the actual Y is six, the predicted Y is 2.5 times three, which is 7.5 minus two which is 5.5. So, you have six minus 5.5, so here I'll write residual
is equal to six minus 5.5 which is equal to 0.5. So, once again you have
a positive residual. Now, for this point that
sits right on the model, the actual is the predicted: when X is two, the actual is three and what was predicted
by the model is three, so the residual here is
three minus three, which is equal to zero. And then last but not least, you have this other data point at X equals two, where the residual is
going to be the actual, which is two, minus the predicted. Well, when X is equal to two, you have 2.5 times two, which is equal to five,
minus two, which is equal to three. So, two minus three is
equal to negative one. And so, when your actual is
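The four residual calculations above can be sketched in a few lines of Python. This is only an illustration, assuming the four data points are (1, 1), (2, 3), (2, 2) and (3, 6) and the model is y hat = 2.5x - 2, as read off from the worked values; the names points and predict are made up for this sketch.

```python
# Data points read off from the worked example: (x, actual y).
points = [(1, 1), (2, 3), (2, 2), (3, 6)]


def predict(x):
    """Hypothetical helper: the linear model used above, y hat = 2.5x - 2."""
    return 2.5 * x - 2


# Each residual is the actual y minus the predicted y for that x.
residuals = [y - predict(x) for x, y in points]

for (x, y), r in zip(points, residuals):
    print(f"x={x}, actual={y}, predicted={predict(x)}, residual={r}")
```

The points above the line come out with positive residuals, the point on the line with zero, and the point below the line with a negative residual.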
below your regression line, you're going to have a negative residual, so this is going to be
negative one right over there. Now we can calculate
the standard deviation of the residuals. We're going to take this first residual which is 0.5, and we're going to square it, we're going to add it
to the second residual right over here, I'll use
this blue or this teal color, that's zero, gonna square that. Then we have this third
residual which is negative one, so plus negative one squared and then finally, we
have that fourth residual which is 0.5, so plus 0.5 squared. So once again, we took
each of the residuals, which you could view as the distance between the points and what
the model would predict, and we squared them. When you take a typical
standard deviation, you're taking the distance
between a point and the mean. Here we're taking the
distance between a point and what the model would have predicted, but we're squaring each of those residuals and adding them all up together, and just like we do with the
sample standard deviation, we are now going to divide by one less than the number of residuals
we just squared and added. So we have four residuals, and we're going to divide by four minus one, which is of course equal to three. You could view this part as
roughly a mean of the squared errors, and now we're gonna take
the square root of it. So, let's see, this is going
to be equal to the square root of: this is 0.25, this is just zero, this is going to be positive one, and then this 0.5 squared is going to be 0.25, all of that over three. Now, this numerator adds up to 1.5, so we have 1.5 over three, and so this is
going to be equal to, 1.5 is exactly half of three, so we could say this is equal to the square root of one half, which is one over the square root of two. One divided by the square root of two, if we round to the nearest thousandth, is roughly 0.707. And if you wanted to visualize that, one standard deviation of
the residuals below the line would look like this, and for any given X value, one standard deviation
of the residuals above the line would look something like that. And this is obviously just
a hand-drawn approximation but you do see that this does
seem to be roughly indicative of the typical residual. Now, it's worth noting, sometimes people will say
it's the average residual and it depends how you
think about the word average because we are squaring the residuals, so outliers, things that are
really far from the line, when you square them are going to have a disproportionate impact here. If you didn't want to have that behavior we could have done
something like find the mean of the absolute residuals, and that actually in some ways
would have been the simpler one, but this is a standard way of
people trying to figure out how much a model disagrees
with the actual data, and so you can imagine
the lower this number is the better the fit of the model.
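As a rough check on the arithmetic, here is a minimal Python sketch of the whole calculation, assuming the four residuals worked out earlier (0.5, 0, negative one and 0.5). It divides by one less than the number of residuals, as in the video, and also computes the mean of the absolute residuals for comparison. The variable names are made up for this illustration.

```python
import math

# Residuals from the worked example (actual minus predicted).
residuals = [0.5, 0.0, -1.0, 0.5]

# Sum of the squared residuals: 0.25 + 0 + 1 + 0.25 = 1.5.
sum_sq = sum(r ** 2 for r in residuals)

# Divide by one less than the number of residuals, then take the square root.
n = len(residuals)
std_resid = math.sqrt(sum_sq / (n - 1))

# The alternative mentioned above: the mean of the absolute residuals.
mean_abs = sum(abs(r) for r in residuals) / n

print(round(std_resid, 3))  # 0.707
print(mean_abs)             # 0.5
```

Squaring weights the residual of negative one more heavily than the others, which is why the first number comes out larger than the mean of the absolute residuals.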