
R-squared intuition

AP.STATS: DAT‑1 (EU), DAT‑1.G (LO), DAT‑1.G.4 (EK)
When we first learned about the correlation coefficient, r, we focused on what it meant rather than how to calculate it, since the computations are lengthy and computers usually take care of them for us.
We'll do the same with r² and concentrate on how to interpret what it means.
In a way, r² measures how much prediction error is eliminated when we use least-squares regression.

Predicting without regression

We use linear regression to predict y given some value of x. But suppose that we had to predict a y value without a corresponding x value.
Without using regression on the x variable, our most reasonable estimate would be to simply predict the average of the y values.
Here's an example, where the prediction line is simply the mean of the y data:
Notice that this line doesn't seem to fit the data very well. One way to measure the fit of the line is to calculate the sum of the squared residuals—this gives us an overall sense of how much prediction error a given model has.
So without least-squares regression, our sum of squares is 41.1879.
Would using least-squares regression reduce the amount of prediction error? If so, by how much? Let's see!
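To make the baseline concrete, here is a minimal Python sketch of mean-only prediction. The data points are invented for illustration, since the article's actual data set isn't listed in the text:

```python
# Mean-only prediction: without an x value, predict the average of y
# for every point. The data below is made up for illustration only.
ys = [2.0, 3.5, 3.0, 5.5, 4.0, 6.5]

y_bar = sum(ys) / len(ys)                     # the flat prediction line
ss_mean = sum((y - y_bar) ** 2 for y in ys)   # sum of squared residuals
```

Every residual here is the vertical gap between a point and the flat line y = ȳ; squaring and summing those gaps gives the model's overall prediction error.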

Predicting with regression

Here's the same data with the corresponding least-squares regression line and summary statistics:
Equation: ŷ = 0.5x + 1.5,  r = 0.8160,  r² = 0.6659
This line seems to fit the data pretty well, but to measure how much better it fits, we can look again at the sum of the squared residuals:
Using least-squares regression reduced the sum of the squared residuals from 41.1879 to 13.7627.
So using least-squares regression eliminated a considerable amount of prediction error. How much though?
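The fitting step can be sketched in a few lines of Python using the standard slope and intercept formulas. The x and y values below are made up, since the article does not publish its actual points:

```python
# Fit a least-squares line and compare its squared error against the
# mean-only baseline. Data is invented for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 3.5, 3.0, 5.5, 4.0, 6.5]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# slope = Sxy / Sxx; the least-squares line always passes through (x̄, ȳ)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

ss_mean = sum((y - y_bar) ** 2 for y in ys)          # mean-only error
ss_line = sum((y - (slope * x + intercept)) ** 2     # regression error
              for x, y in zip(xs, ys))
# ss_line can never exceed ss_mean: the least-squares line fits at
# least as well as the flat mean line.
```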

R-squared measures how much prediction error we eliminated

Without using regression, our model had an overall sum of squares of 41.1879. Using least-squares regression reduced that down to 13.7627.
So the total reduction there is 41.1879 − 13.7627 = 27.4252.
We can represent this reduction as a percentage of the original amount of prediction error:
(41.1879 − 13.7627) / 41.1879 = 27.4252 / 41.1879 ≈ 66.59%
If you look back up above, you'll see that r² = 0.6659.
R-squared tells us what percent of the prediction error in the y variable is eliminated when we use least-squares regression on the x variable.
As a result, r² is also called the coefficient of determination.
Many formal definitions say that r² tells us what percent of the variability in the y variable is accounted for by the regression on the x variable.
It seems pretty remarkable that simply squaring r gives us this measurement. Proving this relationship between r and r² is pretty complex, and is beyond the scope of an introductory statistics course.
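The arithmetic above can be checked directly in Python using the two sums of squares from the article:

```python
# Verify that the fractional reduction in squared error matches the
# reported r² (up to rounding), and that its square root matches r.
ss_mean = 41.1879   # sum of squared residuals, predicting the mean
ss_line = 13.7627   # sum of squared residuals, least-squares line

reduction = (ss_mean - ss_line) / ss_mean   # fraction of error eliminated
r = reduction ** 0.5
# reduction ≈ 0.6659 and r ≈ 0.8160, matching the summary table
```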

Want to join the conversation?

  • ivan08urbieta:
    Which parameter is better for evaluating the fit of a line to a data set: the correlation coefficient (r) or the coefficient of determination (r^2)?
    (22 votes)
    • Nahuel Prieto:
      The short answer is this: In the case of the Least Squares Regression Line, according to traditional statistics literature, the metric you're looking for is r^2.

      Longer answer:
      IMHO, neither r nor r^2 is the best for this. In the case of r, it is calculated using the standard deviation, which is itself a statistic that has long been questioned because it squares numbers just to remove the sign and then takes a square root AFTER having added those numbers, which resembles a Euclidean distance more than a good dispersion statistic (it introduces an error into the result that is never fully removed). Here is a paper about that topic presented at the British Educational Research Association Annual Conference in 2004: https://www.leeds.ac.uk/educol/documents/00003759.htm .

      If we used the MAD (mean absolute deviation) instead of the standard deviation to calculate both r and the regression line, then the line, as well as r as a metric of its effectiveness, would be more realistic, and we would not even need to square r at all.

      This is a very extensive subject and there are still lots of different opinions out there, so I encourage other people to complement my answer with what they think.

      Hope you found my answer helpful or at least interesting.

      Cheers!
      (3 votes)
  • morecmy:
    What's the difference between R-squared and the total sum of squared residuals?
    (8 votes)
  • Brown Wang:
    How do we predict the sum of squares for the regression line?
    (3 votes)
    Default Khan Academy avatar avatar for user
    • 347231:
      Tbh, you really cannot get around squaring every number. I guess if you have decimals, you could round them off, but other than that, there's no shortcut. It is difficult to predict because the powers have to be applied to each and every number. You could always do a bit of mental math and round things off into easier numbers, but it's not always reliable.
      (4 votes)
  • Maryam Azmat:
    If you have two models of a set of data, a linear model and a quadratic model, and you have worked out the R-squared value through linear regression, and are then asked to explain what the R-squared value of the quadratic model is, without using any figures, what would this explanation be?
    (2 votes)
    • Ian Pulizzotto:
      A quadratic model has one extra parameter (the coefficient on x^2) compared to a linear model. Therefore, the quadratic model is either as accurate as, or more accurate than, the linear model for the same data. Recall that the stronger the correlation (i.e. the greater the accuracy of the model), the higher the R^2. So the R^2 for the quadratic model is greater than or equal to the R^2 for the linear model.

      Have a blessed, wonderful day!
      (3 votes)
  • Neel Kumar:
    Can I get the exact data set from which this dot plot was created?
    (3 votes)
  • Suni Sam:
    How do you calculate r^2?
    (2 votes)
  • Shannon Hegewald:
    They lost me at the squares
    (2 votes)
  • Amanda Pang:
    What about the definition for 1-r^2? I see some formal definitions say it's "the remaining variation left in the residuals", so what does this really mean?
    (2 votes)
    • deka:
      it's the remaining variation in the y data after fitting it to a regression line

      in the example above, (1 − r^2) × 41.1879 ≈ 13.7627
      1. the 41.1879 on the left says how far the y data points are from their average (the flat mean line)
      2. the 13.7627 on the right says how far they are from the regression line
      3. and the regression line comes from the assumption that the variable x affects, or at least correlates with, the variable y

      in sum, r^2 is the share of the variation in y that a linear model on x explains, and 1 − r^2 is the share left unexplained
      (1 vote)
  • Griff:
    Can you predict it without solving for it
    (1 vote)
  • Jeff:
    r^2 is a prediction of error removed from what previous model?
    (1 vote)