# Introduction to residuals

Build a basic understanding of what a residual is.
We run into a problem in stats when we're trying to fit a line to data points in a scatter plot. The problem is this: It's hard to say for sure which line fits the data best.
For example, imagine three scientists, $\text{\maroonD{Andrea}}$, $\text{\tealD{Jeremy}}$, and $\text{\purpleC{Brooke}}$, are working with the same data set. If each scientist draws a different line of fit, how do they decide which line is best?
If only we had some way to measure how well each line fit each data point...

## Residuals to the rescue!

A residual is a measure of how well a line fits an individual data point.
Consider this simple data set with a line of fit drawn through it
and notice how point $(2,8)$ is $\greenD4$ units above the line:
This vertical distance is known as a residual. For data points above the line, the residual is positive, and for data points below the line, the residual is negative.
For example, the residual for the point $(4,3)$ is $\redD{-2}$:
The closer a data point's residual is to $0$, the better the fit. In this case, the line fits the point $(4,3)$ better than it fits the point $(2,8)$.

## Try to find the remaining residuals yourself

What is the residual of the point $(6,7)$ in the graph above?
What is the residual of the point $(8,8)$ in the graph above?
What is the residual of the point $(1,2)$ in the graph above?