AP.STATS: DAT‑1 (EU), DAT‑1.E (LO), DAT‑1.E.2 (EK), DAT‑1.F (LO), DAT‑1.F.1 (EK), DAT‑1.F.2 (EK)

Video transcript

What we're going to do in this video is talk about the idea of a residual plot for a given regression and the data that it's trying to explain. Right over here we have a fairly simple least squares regression: we're trying to fit four points, and in previous videos we actually came up with the equation of this least squares regression line. What I'm going to do now is plot the residuals for each of these points.

So what is a residual? Just as a reminder, your residual for a given point is equal to the actual minus the expected. How do I make that tangible? Well, what's the residual for this point right over here? For this point, the actual y when x equals 1 is 1, but the expected when x equals 1 for this least squares regression line is 2.5 times 1 minus 2, which is 0.5. So our residual is 1 minus 0.5: we have a positive 0.5 residual. For this point you have 0 residual; the actual is the expected. For this point right over here, the actual y when x equals 2 is 2, but the expected is 2.5 times 2 minus 2, which is 3. So this is going to be 2 minus 3, which equals a residual of negative 1. And then over here, our actual when x equals 3 is 6 and our expected when x equals 3 is 5.5, so 6 minus 5.5 is a positive 0.5.

So those are the residuals, but how do we plot them? Well, we would set up our axes. Let me do it right over here: 1, 2 and 3. The maximum residual here is positive 0.5 and the minimum one is negative 1, so this could be 0.5, 1, negative 0.5, negative 1. So this is negative 1, and this is positive 1 here. When x equals 1, what was the residual? The actual was 1 and the expected was 0.5; 1 minus 0.5 is 0.5, so we can plot right over here that the residual is 0.5. When x equals 2 we actually have two data points. First I'll do this one: when we have the point (2, 3),
the residual there is 0. So for one of them the residual is 0. Now for the other one, the residual is negative 1; let me do that in a different color. For the other one the residual is negative 1, so we would plot it right over here. And then for this last point the residual is positive 0.5, so it is just like that. This thing that I have just created, where for each x that has a corresponding point we plot the point above or below the line based on the residual, is called a residual plot.

Now, one question is why people even go through the trouble of creating a residual plot like this. The answer is that, regardless of whether the regression line is upward sloping or downward sloping, this gives you a sense of how good a fit it is and whether a line is good at explaining the relationship between the variables. The general idea is that if you see the points pretty evenly or randomly scattered above and below this line, and you don't really discern any trend, then a line is probably a good model for the data. But if you do see some type of trend, if the residuals had an upward trend like this, if they were curving up and then curving down, or if they had a downward trend, then you might say, hey, this line isn't a good fit, and maybe we would have to do a nonlinear model.

What are some examples of other residual plots? Let's try to analyze them a bit. Right here you have a regression line and its corresponding residual plot. Once again, you see here the residual is slightly positive: the actual is slightly above the line, and you see it right over there, slightly positive. This one's even more positive, and you see it there. But like the example we just looked at, it looks like these residuals are pretty evenly scattered above and below the line. There isn't any discernible trend, so I would say that a linear model, and in particular this regression line, is a good model for this data. But if we see something like this, a different picture emerges. When I
look at just the residual plot, it doesn't look like they're evenly scattered. It looks like there's some type of trend here: I'm going down here, but then I'm going back up. When you see something like this, where on the residual plot you're going below the x-axis and then above, then you might say, hey, a linear model might not be appropriate. Maybe some type of nonlinear model, some type of nonlinear curve, might better fit the data, or the relationship between the y and the x is nonlinear. Another way you could think about it is that when you have a lot of residuals that are pretty far away from the x-axis in the residual plot, you would also say this line isn't such a good fit. If you calculate the r value here, it would only be slightly positive, but it would not be close to 1.
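
The residual arithmetic from the four-point example at the start of the video can be sketched in a few lines of Python. This is only an illustration, not something shown in the video: the function names `fit_line` and `residuals` are made up here, and the four points are read off the worked example, which gives y values at x = 1, 2, 2 and 3. The sketch refits the least squares line to confirm it matches y = 2.5x - 2, then computes actual minus expected for each point.

```python
# Four data points from the worked example in the transcript.
points = [(1, 1), (2, 2), (2, 3), (3, 6)]


def fit_line(data):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(data, slope, intercept):
    """Residual for each point: actual y minus the y expected from the line."""
    return [(x, y - (slope * x + intercept)) for x, y in data]


slope, intercept = fit_line(points)  # the transcript's line is y = 2.5x - 2
for x, r in residuals(points, slope, intercept):
    print(f"x = {x}: residual = {r:+.1f}")
```

Plotting each residual against its x value, with a horizontal line at 0, gives the residual plot described above. The residuals here are 0.5, negative 1, 0 and 0.5, matching the values worked out by hand.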