# Introduction to residuals

Build a basic understanding of what a residual is.
We run into a problem in stats when we're trying to fit a line to data points in a scatter plot. The problem is this: It's hard to say for sure which line fits the data best.
For example, imagine three scientists, Andrea, Jeremy, and Brooke, are working with the same data set. If each scientist draws a different line of fit, how do they decide which line is best?
If only we had some way to measure how well each line fits each data point...

## Residuals to the rescue!

A residual is a measure of how well a line fits an individual data point.
Consider this simple data set with a line of fit drawn through it, and notice how the point (2, 8) is 4 units above the line.
This vertical distance is known as a residual. For data points above the line, the residual is positive, and for data points below the line, the residual is negative.
For example, the residual for the point (4, 3) is −2.
The closer a data point's residual is to 0, the better the fit. In this case, the line fits the point (4, 3) better than it fits the point (2, 8).
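To make the arithmetic concrete, here is a minimal Python sketch that computes a residual as the observed y-value minus the y-value the line predicts. The line y = 0.5x + 3 is an assumption: it is the straight line consistent with the two residuals stated above (height 4 at x = 2 and height 5 at x = 4), not an equation the text gives. The function names `predict` and `residual` are also just illustrative.

```python
def predict(x, slope=0.5, intercept=3.0):
    """Height of the line of fit at x (slope and intercept are assumed values)."""
    return slope * x + intercept


def residual(x, y, slope=0.5, intercept=3.0):
    """Observed y minus predicted y: positive above the line, negative below."""
    return y - predict(x, slope, intercept)


# The two points discussed above.
points = [(2, 8), (4, 3)]
residuals = {p: residual(*p) for p in points}

for (x, y), r in residuals.items():
    print(f"point ({x}, {y}): residual = {r:+g}")

# The point whose residual is closest to 0 is the one the line fits best.
best = min(points, key=lambda p: abs(residuals[p]))
print("best-fit point:", best)
```

Comparing absolute values is what "closer to 0" means here: a residual of −2 is a better fit than a residual of 4, even though −2 is the smaller number.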

## Try to find the remaining residuals yourself

What is the residual of the point (6, 7) in the graph above?

What is the residual of the point (8, 8) in the graph above?

What is the residual of the point (1, 2) in the graph above?