### Course: AP®︎/College Statistics>Unit 5

Lesson 3: Residuals

# Residual plots

Creating and analyzing residual plots based on regression lines.

## Want to join the conversation?

• In the last example shown, can a situation be explained by two linear relationships? The first few data points, closer to the y-axis, seem to follow a negative linear relationship, and the ones to the right a positive linear relationship. Is this possible?
• Unfortunately, no.

In this lesson we always describe the trend in the data with a single pattern that best fits it. That pattern can be a line, a curve, etc.

Drawing one line with a negative slope and another with a positive slope would mean describing the same data set with two opposite linear trends. Here, bivariate data is summarized by one relationship (or by none at all), so data that falls and then rises is better described by a single curve than by two separate lines.

Hope it helped!
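
To see what a single line does with data that falls and then rises, here is a minimal sketch in plain Python. The data set is hypothetical (points on the curve y = (x - 3)^2, invented for illustration), and `fit_line` is a helper written for this example, not something from the lesson.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data that falls, then rises: y = (x - 3)^2
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [(x - 3) ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)

# Residual = actual y minus the y predicted by the line
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(slope)      # 0.0: the falling and rising halves cancel out
print(residuals)  # [5.0, 0.0, -3.0, -4.0, -3.0, 0.0, 5.0]
```

The residuals start positive, dip negative in the middle, and end positive again. That U shape in the residual plot is the sign that one straight line is missing a curved pattern.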
• Why does an evenly or randomly scattered residual plot indicate the line is a good line of best fit?
• Simple linear regression rests on a few assumptions, and we have to check them to make sure it is the right analysis to use. One of them is equal variance of the residuals, which we check with a residuals vs. fitted plot. If that plot shows a shape, or the spread of the residuals seems to change as the fitted values change, we have evidence against a linear relationship with equal variance. The results of the linear analysis are then likely to be less reliable, and other analyses should be considered. An even, random scatter around zero shows neither problem, which is why it suggests the line is a reasonable fit.
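
As a rough numeric stand-in for eyeballing the plot, the sketch below compares the spread of the residuals in the lower and upper halves of the fitted values. Both data sets are hypothetical, and `spread_ratio` is a helper made up for this example; it is not a formal statistical test.

```python
from statistics import pstdev

# Hypothetical (fitted value, residual) pairs, invented for illustration.
even_scatter = [(1, 0.5), (2, -0.4), (3, 0.6), (4, -0.5), (5, 0.4), (6, -0.6)]
fanning_out = [(1, 0.1), (2, -0.2), (3, 0.3), (4, -1.5), (5, 2.0), (6, -2.5)]

def spread_ratio(pairs):
    """Spread of residuals in the upper half of fitted values over the lower half."""
    ordered = sorted(pairs)  # sorts by fitted value first
    half = len(ordered) // 2
    low = [r for _, r in ordered[:half]]
    high = [r for _, r in ordered[half:]]
    return pstdev(high) / pstdev(low)

print(round(spread_ratio(even_scatter), 2))  # close to 1: similar spread throughout
print(round(spread_ratio(fanning_out), 2))   # well above 1: spread grows with fitted value
```

A ratio near 1 matches an evenly scattered plot. A ratio far from 1 matches a plot that fans out or narrows, which is the changing variance described above.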
• Is there a way how I can print the worksheet out?
• If you mean the practice exercises, take a screenshot of the question you want and then print it. If your intent is to mark up the page to help you visualize what's going on, I'd suggest using your screenshot editing tool (like Snip & Sketch on Windows) or a basic photo editor to draw on the image so you don't have to print it.

Hope this helps!😀
• In the second example, can we say that we have sine function trend so the line is not a good fit?
• In the second example, if the residual plot exhibits a sinusoidal trend (oscillating above and below the x-axis), it suggests that the linear regression model may not adequately capture the underlying relationship between the variables. This could indicate that the relationship is better described by a periodic function like a sine wave rather than a straight line. In such cases, a linear model would not be appropriate, and fitting a non-linear model, such as a sine function, might provide a better fit to the data.
• Y'know, I got a residual question on i-Ready, so my teacher told me to search it up. I did, and it made -32% sense... KA is so much better.
• So, if the dots are not close to the x axis, it is not a line of best fit?
• We're assuming here that the line is already the line of best fit. If the dots in the residual plot are far from the x-axis, the line's predictions are far from the actual values, so a line may not describe the data well. The data set may in fact have an exponential or sinusoidal form (among other things).

So a line of best fit doesn't work well for every data set: it will always be a line, and not all data sets can be described well by a line.

Hope this makes sense! (:
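• Could you do actual + expected for a residual plot?
• Plotting actual + expected wouldn't really give you anything of statistical relevance. A residual is actual minus expected, which measures how far the prediction missed; the sum has no such meaning.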
• We measure the residual from the x-axis viewpoint, or from the independent variable's perspective. But what about the y-axis? Shouldn't we measure the residual from the y-axis viewpoint, or the dependent variable's perspective? Something like (residual of x) = (actual value of x) - (expected value of x)?
• In the context of residual plots, residuals are typically measured from the y-axis viewpoint or dependent variable perspective. The residual for a specific data point is indeed calculated as the difference between the actual value of the dependent variable (y) and the predicted value of y based on the regression line. So, you're correct that the residual is essentially the vertical distance between the observed data point and the regression line. The x-axis is used to represent the independent variable, and the y-axis represents the dependent variable. So, while we analyze the distribution of residuals along the x-axis in a residual plot, it's ultimately to assess how well the regression line explains the variability in the dependent variable (y).
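
A small sketch of that calculation, assuming a hypothetical regression line ŷ = 2 + 0.5x (the intercept, slope, and data points are made up for illustration):

```python
def predicted_y(x, intercept=2.0, slope=0.5):
    """Hypothetical regression line: y-hat = 2 + 0.5x."""
    return intercept + slope * x

def residual(x, actual_y):
    """Vertical distance from the line: actual y minus predicted y."""
    return actual_y - predicted_y(x)

print(residual(4, 5.0))  # point (4, 5): predicted 4.0, so the residual is 1.0
print(residual(6, 4.0))  # point (6, 4): predicted 5.0, so the residual is -1.0
```

A positive residual means the point sits above the line, and a negative one means it sits below. The x value is only used to look up the prediction; the subtraction itself happens in y.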
• How do you find out the residual when x or y are not given?
• To read a residual plot, you don't need to know the precise value of each residual; the overall pattern is what matters.

If you were performing a regression analysis yourself, you would most likely use a programming language such as Python. From there you could write code to graph the residuals, or build a dataframe listing the residual for each point.