What I want to do in this video is introduce you to the idea of the covariance between two random variables. It's defined as the expected value of the product of the distances of each random variable from its mean, or from its expected value. Let me write that down: it's the expected value of the random variable X minus the expected value of X — you could view that expected value as the population mean of X — times the random variable Y minus its expected value, the population mean of Y:

Cov(X, Y) = E[(X − E[X]) (Y − E[Y])]

If that doesn't make a lot of intuitive sense yet, you can always play around with some numbers, but the reality is it's saying how much the two variables vary together. You always take an x and a y that go together — each data point is a coordinate pair that you put into this formula. So let's say X is above its mean when Y is below its mean. Suppose in one instantiation of the random variables — one sample from the universe — you get X = 1 and Y = 3, and suppose you knew ahead of time that the expected value of X is 0 and the expected value of Y is 4. Now, we can't compute the entire covariance from one sample; we'll just calculate what's inside the expected value for this one data point. You'll have (1 − 0) times (3 − 4), which is 1 times negative 1, or negative 1.

What is that telling us? At least for this one sample of the random variables, X was above its expected value when Y was below its expected value. If this kept happening across the entire population, it would make sense for them to have a negative covariance: when one goes up, the other goes down. If they both tended to go up together, or both go down together, they would have a positive covariance. And the degree to which they move together tells you the magnitude of the covariance. Hopefully that gives you a little bit of intuition about what the covariance is trying to tell us.

The more important thing I want to do in this video is connect this definition of covariance to everything we've been doing with least-squares regression. It's partly just a fun math exercise to show you all of these connections, but the definition of covariance really becomes useful — and I really do think it's motivated to a large degree — by where it shows up in regression. This is all stuff that we've seen before; you're just going to see it in a different way.

So for this whole video I'm just going to rewrite this definition of covariance. Multiplying the two binomials inside the expected value gives

E[(X − E[X]) (Y − E[Y])] = E[ XY − X·E[Y] − E[X]·Y + E[X]·E[Y] ]
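Before continuing with the algebra, the sign intuition described above can be made concrete with a quick sketch in Python. The data here are made up for illustration; the first pair matches the X = 1, Y = 3 example, and the sample means stand in for the expected values.

```python
# Hypothetical sample where x tends to sit above its mean exactly
# when y sits below its mean, so we expect a negative covariance.
xs = [1.0, -1.0, 2.0, -2.0]
ys = [3.0, 5.0, 2.0, 6.0]

mean_x = sum(xs) / len(xs)  # 0.0, standing in for E[X]
mean_y = sum(ys) / len(ys)  # 4.0, standing in for E[Y]

# Each term is (x - mean_x) * (y - mean_y), the quantity inside
# the expected value in the definition of covariance.
products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
sample_cov = sum(products) / len(products)

print(products)    # [-1.0, -1.0, -4.0, -4.0]
print(sample_cov)  # -2.5: negative, as the intuition suggests
```

Every per-point product is negative here, so their mean is too; in a real sample the products would typically have mixed signs and the mean would tell you which tendency wins.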
That first negative sign comes from the minus E[Y] in the second binomial, and the second from the minus E[X] — we're just applying the distributive property twice. In the last term the two negatives cancel, leaving plus E[X] times E[Y]. And of course the expected value is taken of this entire thing.

Now let's see if we can rewrite this. The expected value of a sum (or difference) of random variables is just the sum (or difference) of their expected values. Remember, in a lot of contexts you can view the expected value as the arithmetic mean, or, for a continuous distribution, as a probability-weighted sum or probability-weighted integral; either way, it's nothing we haven't seen before. So we can rewrite this as

E[XY] − E[ X·E[Y] ] − E[ E[X]·Y ] + E[ E[X]·E[Y] ]

This might look confusing with all of the nested expected values, but one way to think about it is that the things that are already expected values can be viewed as known numbers, so we can take them out of the outer expected value. That works because the expected value of an expected value is just the expected value itself — let me write that down to remind ourselves:

E[ E[X] ] = E[X]

Think of it this way: E[X] is the population mean of the random variable, a known quantity that's just out there in the universe, so the expected value of it is itself. If the population mean of X is 5, this is like saying the expected value of 5 — and the expected value of 5 is 5, which is the same thing as E[X]. Hopefully that makes sense; we're going to use it in a second.

So: the first term is just the expected value of the product of the two random variables, E[XY]; I'll leave it as it is. In the second term, E[Y] is just a number, so we can factor it out of the expected value — just as the expected value of 3X is 3 times the expected value of X — and rewrite the term as minus E[Y] times E[X]. Same thing for the third term: factor out E[X] to get minus E[X] times E[Y]. And the final term is the expected value of a product of two expected values, which is just the product of those two expected values: plus E[X] times E[Y].

Now look at what we have. We are subtracting E[Y]·E[X], subtracting E[X]·E[Y] — the exact same thing, just written in a different order — and then adding E[X]·E[Y] once. Subtracting it twice and adding it once, one of the subtractions cancels with the addition (you could pick either one). What we have left is the final result:

Cov(X, Y) = E[XY] − E[X]·E[Y]

the expected value of the product, minus the product of the expected values. Now, you can calculate these expected values exactly if you know everything about the probability distributions or density functions of the random variables, or if you have the entire population that you're sampling from whenever you take an instantiation of them.
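On a finite, equally weighted population, expected values are just arithmetic means, so the identity we just derived can be checked directly. A minimal sketch, with made-up numbers:

```python
# Treat a small made-up list of (x, y) pairs as an entire population
# in which every outcome is equally likely, so E[...] is just a mean.
pairs = [(1.0, 3.0), (2.0, 5.0), (4.0, 4.0), (5.0, 8.0)]

def expected(values):
    """Expected value under equal weighting: the arithmetic mean."""
    return sum(values) / len(values)

ex = expected([x for x, _ in pairs])  # E[X] = 3.0
ey = expected([y for _, y in pairs])  # E[Y] = 5.0

# The definition: expected product of the distances from the means.
cov_definition = expected([(x - ex) * (y - ey) for x, y in pairs])

# The rewritten form we just derived: E[XY] - E[X] * E[Y].
cov_shortcut = expected([x * y for x, y in pairs]) - ex * ey

print(cov_definition, cov_shortcut)  # 2.25 2.25 -- the two forms agree
```

The shortcut form needs only one pass over the products plus the two means, which is why it shows up so often in computation.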
But let's say you just have a sample of these random variables — a bunch of (x, y) data points. How could you estimate these expected values? (I think you'll start to see how this relates to what we did with regression.) The expected value of XY can be approximated by the sample mean of the products: take each of your (x, y) pairs, take the product, and take the mean of all of those products. The expected value of Y can be approximated by the sample mean of the y's, and the expected value of X by the sample mean of the x's. So the covariance of the two random variables can be approximated by the mean of the products from your sample, minus the mean of your sample y's times the mean of your sample x's.

And this should start looking familiar, because this is exactly the numerator when we were trying to figure out the slope of the regression line. Let me just rewrite that formula here to remind you: the slope was the mean of the products of each of our data points (the xy's), minus the mean of the y's times the mean of the x's, all of that over the mean of the x squareds minus the mean of the x's, squared — and you could even view that last piece as the mean of the x's times the mean of the x's.
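That slope formula can be written out directly from the sample means it mentions. A small sketch, with made-up data points:

```python
# Slope of the least-squares regression line in the mean-of-products
# form: (mean(xy) - mean(y)*mean(x)) / (mean(x^2) - mean(x)^2).
xs = [1.0, 2.0, 3.0, 4.0]  # illustrative data, not from the video
ys = [2.0, 3.0, 5.0, 6.0]
n = len(xs)

mean_x = sum(xs) / n                              # 2.5
mean_y = sum(ys) / n                              # 4.0
mean_xy = sum(x * y for x, y in zip(xs, ys)) / n  # 11.75
mean_x_sq = sum(x * x for x in xs) / n            # 7.5

# Numerator: the estimated covariance of x and y.
# Denominator: mean of the x squareds minus the mean of x, squared.
slope = (mean_xy - mean_y * mean_x) / (mean_x_sq - mean_x ** 2)
print(slope)  # 1.4
```

Note that both the numerator and the denominator are built from nothing but four sample means, which is what makes the covariance connection below fall out so cleanly.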
This is how we figured out the slope of our regression line. Or, maybe a better way to think about it: if we assume that the points we have were sampled from an entire universe of possible points, then we are approximating the slope of the population's regression line. You might see a little hat notation over the slope in a lot of books — don't be confused; it's saying that you're estimating the population regression line from a sample of it.

Now, from everything we've just learned, the numerator here is the covariance of X and Y — or rather, an estimate of it. And what is the denominator? You could rewrite it as the mean of x times x (which is the same thing as the mean of x squared) minus the mean of x times the mean of x. But that is just the covariance of X with itself! And we've actually seen — many videos ago, when we first learned about it — that the covariance of a random variable with itself is just the variance of that random variable. You can verify it for yourself: change the Y in the definition to an X, and it becomes the expected value of (X minus the expected value of X) times (X minus the expected value of X) — that is, the expected value of (X − E[X]) squared, which is the definition of the variance.

So another way of thinking about the slope of our regression line: it can literally be viewed as the covariance of our two random variables over the variance of X (you can think of X as the independent random variable):

slope = Cov(X, Y) / Var(X)

Anyway, I thought that was interesting, and I wanted to make connections between things you see in different parts of statistics and show you that they really are connected.
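This final identity is easy to check numerically using the sample versions of both quantities. A sketch with arbitrary illustrative data (not from the video):

```python
# Check that the regression slope equals the (sample) covariance of
# x and y divided by the (sample) variance of x, using the
# mean-of-products shortcuts derived above.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]

def mean(values):
    return sum(values) / len(values)

mx, my = mean(xs), mean(ys)

cov_xy = mean([x * y for x, y in zip(xs, ys)]) - mx * my
var_x = mean([x * x for x in xs]) - mx ** 2  # covariance of x with itself

slope = cov_xy / var_x
print(cov_xy, var_x, slope)  # slope is cov/var, about 0.8 here
```

Swapping `ys` for a second copy of `xs` makes `cov_xy` equal to `var_x` and the slope exactly 1, which is the covariance-of-X-with-itself observation in miniature.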