# Deriving the variance of the difference of random variables

AP.STATS: VAR‑5 (EU), VAR‑5.E (LO), VAR‑5.E.2 (EK), VAR‑5.E.3 (EK)

## Video transcript

What I want to do in this video is build up some tools in our tool kit for dealing with sums and differences of random variables. So let's say that we have two random variables, x and y, and they are completely independent. They are independent random variables. And I'm just going to go over a little bit of notation here. If we talk about the expected value of this random variable x, that is the same thing as the mean value of this random variable x. If we talk about the expected value of y, that is the same thing as the mean of y.

If we talk about the variance of the random variable x, that is the same thing as the expected value of the squared distance between our random variable x and its mean. So it is the expected value of these squared differences, and you could also use the notation sigma squared for the random variable x. This is just a review of things we already know, but I just want to reintroduce it because I'll use this to build up some of our tools. So you do the same thing with random variable y. The variance of random variable y is the expected value of the squared difference between our random variable y and the mean of y, or the expected value of y. And that's the same thing as sigma squared of y. That is the variance of y.

Now you may or may not already know these properties of expected values and variances, but I will reintroduce them to you. And I won't go into some rigorous proof-- actually, I think they're fairly easy to digest. So one is that if I have some third random variable, let's say z, that is defined as being the random variable x plus the random variable y. Let me stay with my colors just so everything becomes clear. The random variable x plus the random variable y. What is the expected value of z going to be? The expected value of z is going to be equal to the expected value of x plus y.
And this is a property of expected values-- I'm not going to prove it rigorously right here-- but that is equal to the expected value of x plus the expected value of y. Or another way to think about this is that the mean of z is going to be the mean of x plus the mean of y.

Now let's say I have some other random variable. I'm running out of letters here. Let's say I have the random variable a, and I define random variable a to be x minus y. So what's its expected value going to be? The expected value of a is going to be equal to the expected value of x minus y, which is equal to-- you could either view it as the expected value of x plus the expected value of negative y, or the expected value of x minus the expected value of y, which is the same thing as the mean of x minus the mean of y. So this is what the mean of our random variable a would be equal to. And all of this is review, and I'm going to use this when we start talking about distributions that are sums and differences of other distributions.

Now let's think about what the variance of random variable z is and what the variance of random variable a is. And just to kind of always focus back on the intuition, the expected value result makes sense. If I have some random variable that is the sum of the two, then the expected value of that new variable is going to be the sum of the expected values of the other two. If my expected value here is 5 and my expected value here is 7, it's completely reasonable that my expected value here is 12. That part holds whether or not x and y are independent; independence is what we need for the variance. So what is the variance of my random variable z? And once again, I'm not going to do a rigorous proof here, this is really just a property of variances. But I'm going to use this to establish what the variance of our random variable a is.
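
In symbols, the definitions and expected-value properties reviewed so far can be written as follows. The symbols $\mu$ and $\sigma^2$ follow the spoken "mean" and "sigma squared"; the layout itself is just one way to write it down.

```latex
\begin{align*}
E[X] &= \mu_X, \qquad E[Y] = \mu_Y \\
\operatorname{Var}(X) &= E\big[(X - \mu_X)^2\big] = \sigma_X^2 \\
\operatorname{Var}(Y) &= E\big[(Y - \mu_Y)^2\big] = \sigma_Y^2 \\
E[Z] &= E[X + Y] = E[X] + E[Y] = \mu_X + \mu_Y \\
E[A] &= E[X - Y] = E[X] - E[Y] = \mu_X - \mu_Y
\end{align*}
```
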
So if this squared distance on average is some variance, and this one is completely independent, and its squared distance on average is some variance, then the variance of their sum is actually going to be the sum of their variances. So this is going to be equal to the variance of random variable x plus the variance of random variable y. Or another way of thinking about it is that the variance of z, which is the same thing as the variance of x plus y, is equal to the variance of x plus the variance of y. Hopefully that makes some sense. I'm not proving it to you rigorously. And you'll see this in a lot of statistics books.

Now what I want to show you is that the variance of random variable a is actually this exact same thing. And that's the interesting thing, because you might say, hey, why wouldn't it be the difference? We had the differences over here. So let's experiment with this a little bit. The variance of random variable a is the same thing as the variance of x minus y, which is equal to-- you could view it this way-- the variance of x plus negative y. These are equivalent statements. So, just using the property over here, it's going to be equal to the sum of these two variances: the variance of x plus the variance of negative y.

Now what I need to show you is that the variance of negative y, the variance of the negative of that random variable, is going to be the same thing as the variance of y. So what is the variance of negative y? The variance of negative y is equal to the expected value of the squared distance between negative y and the expected value of negative y. That's all the variance actually is. Now what is the expected value of negative y right over here? Actually, even better, let me factor out a negative 1.
So what's in the parentheses right here, negative y minus the expected value of negative y, all squared, is the exact same thing as negative 1 squared times y plus the expected value of negative y, all squared. So everything in magenta here is everything in magenta there, and the variance is the expected value of that thing. Now what is the expected value of negative y? The expected value of the negative of a random variable is just the negative of the expected value of that random variable. So we can re-write this-- I'll give myself a little bit more space-- the variance of negative y is the expected value of-- negative 1 squared is just 1, and over here you have y, and instead of plus the expected value of negative y, you can write minus the expected value of y. So you have that, and then all of that squared. Now notice, this is the exact same thing, by definition, as the variance of y.

So what we just showed is that the variance of the difference of two independent random variables is equal to the sum of the variances. It's equal to the variance of the first one plus the variance of the negative of the second one. And we just showed that that variance is the same thing as the variance of the positive version of that variable, which makes sense. It doesn't matter whether you're taking the positive or the negative of the variable; you just care about the absolute distance from the mean. So it makes complete sense that that quantity and that quantity are going to be the same thing.
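
Written out, the spoken steps above might look like this. It is a sketch of the same argument rather than a rigorous proof, and it assumes, as the transcript does, that x and y are independent so that the variance of a sum is the sum of the variances.

```latex
\begin{align*}
\operatorname{Var}(-Y) &= E\big[(-Y - E[-Y])^2\big] \\
  &= E\big[(-1)^2\,(Y + E[-Y])^2\big] \\
  &= E\big[(Y - E[Y])^2\big] \qquad \text{since } E[-Y] = -E[Y] \\
  &= \operatorname{Var}(Y) \\[6pt]
\operatorname{Var}(X - Y) &= \operatorname{Var}\big(X + (-Y)\big) \\
  &= \operatorname{Var}(X) + \operatorname{Var}(-Y) \\
  &= \operatorname{Var}(X) + \operatorname{Var}(Y)
\end{align*}
```
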
Now the whole reason why I went through this exercise, the important takeaway here, is that the mean of the difference of the random variables is the same thing as the difference of their means. And then the other important takeaway, and I'm going to build on this in the next few videos, is that if I define a new random variable as the difference of two other independent random variables, the variance of that random variable is actually the sum of the variances of the two random variables. So these are the two important takeaways that we'll use to build on in future videos. Anyway, hopefully that wasn't too confusing. If it was, you can kind of just accept these at face value and just assume that these are tools that you can use.
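
One way to check the two takeaways numerically is a small simulation. The sketch below is not from the video: the function name `simulate_difference`, the sample size, the seed, and the two distributions (a normal x and a uniform y) are all assumptions picked for illustration. Any pair of independent random variables with finite variances should behave similarly.

```python
import random
import statistics


def simulate_difference(n=100_000, seed=0):
    """Draw independent x and y samples and summarize x, y, and x - y."""
    rng = random.Random(seed)
    # Assumed distributions, chosen only for illustration:
    # x ~ Normal(mean 5, sd 2), so Var(x) = 4
    # y ~ Uniform(0, 6),        so Var(y) = 36 / 12 = 3
    xs = [rng.gauss(5.0, 2.0) for _ in range(n)]
    ys = [rng.uniform(0.0, 6.0) for _ in range(n)]
    # a = x - y, the difference discussed in the transcript
    diffs = [x - y for x, y in zip(xs, ys)]
    return {
        "mean_x": statistics.fmean(xs),
        "mean_y": statistics.fmean(ys),
        "mean_diff": statistics.fmean(diffs),
        "var_x": statistics.pvariance(xs),
        "var_y": statistics.pvariance(ys),
        "var_diff": statistics.pvariance(diffs),
    }


result = simulate_difference()
# Mean of the difference vs. difference of the means
print(result["mean_diff"], result["mean_x"] - result["mean_y"])
# Variance of the difference vs. SUM of the variances
print(result["var_diff"], result["var_x"] + result["var_y"])
```

The two numbers on the first printed line should agree up to floating-point rounding, while the two on the second line should be close but not identical, because sample variances only approximate the true ones. Under these assumed distributions, both should land near 4 + 3 = 7 rather than 4 - 3 = 1.
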
AP® is a registered trademark of the College Board, which has not reviewed this resource.