DAT‑1 (EU)
DAT‑1.G (LO)

Video transcript

In other videos, we've done linear regressions by hand, but we mentioned that most regressions are actually done using some type of computer or calculator. So what we're going to do in this video is look at an example of the output that we might see from a computer, learn not to be intimidated by it, and see how it gives us the equation for the regression line and some of the other data it gives us.

So here it tells us: Cheryl Dixon is interested to see if students who consume more caffeine tend to study more as well. She randomly selects 20 students at her school and records their caffeine intake in milligrams and the number of hours spent studying. A scatter plot of the data showed a linear relationship. This is a computer output from a least-squares regression analysis on the data.

So we have these things called the predictor and the coefficient, and then we have these other things: the standard error of the coefficient, T, and P, and then all of these things down here. How do we make sense of this in order to come up with an equation for our linear regression?

So let's just get straight on our variables. Let's say that y is the thing that we're trying to predict, so this is the hours spent studying. And then let's say x is what we think explains the hours studying, or one of the things that explains the hours studying, and this is the amount of caffeine ingested, so this is caffeine consumed in milligrams. And so our regression line would have the form ŷ = mx + b. The hat tells us this is a linear regression; it's trying to estimate the actual y values for given x's. Now, how do we figure out what m and b are based on this computer output?

When you look at this table here, this first column says Predictor, and it says Constant and it has Caffeine. All this is saying is that when you're trying to predict the number of hours studying, when you're trying to predict y, there are essentially two inputs: there is the constant value, and there is your variable, in this case caffeine, that you're using to predict the amount that you study. And so this tells you the coefficients on each. The coefficient on a constant is the constant itself; you could view this as the coefficient on the x-to-the-zeroth term. So the coefficient on the constant, that is the constant, is 2.544. And then the coefficient on the caffeine, well, we just said that x is the caffeine consumed, so that coefficient is 0.164.

So just like that, we actually have the equation for the regression line. That is why these computer outputs are useful. We can just write it out: ŷ = 0.164x + 2.544. So that's the regression line.

What does this other information give us? Well, I won't give you a very satisfying answer, because all of this is actually useful for inferential statistics, to think about things like: what is the probability that it is just chance that we got something to fit this well?

This right over here is the r-squared. If you wanted to figure out r from this, you would just take the square root. Here we could say that r is going to be equal to the square root of 0.60032, depending on how much precision you have. But you might say, well, how do we know if r is the positive square root or the negative square root of that? r can take on values between negative one and positive one. And the answer is you would look at the slope. Here we have a positive slope, which tells us that r is going to be positive. If we had a negative slope, then we would take the negative square root.

Now, this right here is the adjusted r-squared, and we really don't have to worry about it too much when we're thinking about just bivariate data; we're talking about caffeine and hours studying in this case. If we started to have more variables that tried to explain the hours studying, then we would care about adjusted r-squared, but we're not going to do that just yet.
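
The two steps described so far, reading the line off the coefficient column and recovering r from r-squared using the sign of the slope, can be sketched in a few lines of Python. This is only an illustration: the three numbers come from the output discussed in the video, while the function names and the 10 mg input are made up for the example.

```python
import math

# Values read from the computer output discussed in the video.
INTERCEPT = 2.544    # coefficient on the "Constant" row
SLOPE = 0.164        # coefficient on the "Caffeine" row
R_SQUARED = 0.60032  # r-squared, written as a proportion


def predict_hours(caffeine_mg):
    """Return y-hat, the predicted hours studying, for a caffeine intake in mg."""
    return SLOPE * caffeine_mg + INTERCEPT


def correlation_from_r_squared(r_squared, slope):
    """Return r: the square root of r-squared, carrying the sign of the slope."""
    return math.copysign(math.sqrt(r_squared), slope)


# 10 mg is an arbitrary input chosen for illustration, not a value from the data.
print(f"predicted hours at 10 mg: {predict_hours(10.0):.3f}")
print(f"r: {correlation_from_r_squared(R_SQUARED, SLOPE):.3f}")
```

Because the slope here is positive, the function returns the positive root; with a negative slope the same r-squared would give the negative root.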
Last but not least, this S variable. This is the standard deviation of the residuals, which we study in other videos. And why is that useful? Well, that's a measure of how well the regression line fits the data. It's a measure of, we could say, the typical error.

So, big takeaway: computers are useful. They'll give you a lot of data, and the key thing is how you pick out the things that you actually need, because if you know how to do it, it can be quite straightforward.
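
To make the S value more concrete, here is a minimal sketch of how the standard deviation of the residuals is usually computed for a least-squares line: square each residual, add them up, divide by n − 2, and take the square root. The video does not show the 20 data points, so the four points below are toy values invented for this example, and the function name is hypothetical. The slope 1.8 and intercept 1.3 are the least-squares fit to those toy points.

```python
import math


def residual_std_dev(xs, ys, slope, intercept):
    """Standard deviation of the residuals, using the usual n - 2 divisor."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    sum_squared = sum(e * e for e in residuals)
    return math.sqrt(sum_squared / (len(xs) - 2))


# Toy data invented for illustration; NOT the caffeine data from the video.
toy_x = [0.0, 1.0, 2.0, 3.0]
toy_y = [1.5, 2.5, 5.5, 6.5]

# The least-squares line for the toy data is y-hat = 1.8x + 1.3.
s = residual_std_dev(toy_x, toy_y, slope=1.8, intercept=1.3)
print(f"S for the toy data: {s:.3f}")
```

A smaller S means the points sit closer to the line, in the units of y (here, hours).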
AP® is a registered trademark of the College Board, which has not reviewed this resource.