# Intuition for why independence matters for variance of sum

## Want to join the conversation?

• Would someone put into words what is being measured by Var(X + Y)? I understand that of all the people in our sample, for both random variables, there was an average spread from the mean of 2 hours. What does it mean when we add these?
(6 votes)
• idk may be because that is how it was supposed to be.
(2 votes)
• I want to ask 2 things here.
Is the variance of dependent variables 0 because they depend on each other, so that if one changes the other changes too and there is no change in variance? Do I understand that correctly?
And what about the mean of dependent variables?
(2 votes)
• In this particular case 𝑋 + 𝑌 is a constant, which is why Var(𝑋 + 𝑌) = 0.

This isn't always the case, though, and besides it's not very relevant.
What Sal wanted to show is that the equation
Var(𝑋 ± 𝑌) = Var(𝑋) + Var(𝑌) doesn't necessarily hold up if 𝑋 and 𝑌 are dependent.
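
For reference, the gap between the two sides can be written down exactly. The identity below is a standard result that the video does not state, so treat it as added context; Cov denotes covariance, which is 0 whenever 𝑋 and 𝑌 are independent.

$$
\operatorname{Var}(X \pm Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) \pm 2\operatorname{Cov}(X, Y)
$$

In the sleep example, $Y = 24 - X$, so $\operatorname{Cov}(X, Y) = -\operatorname{Var}(X) = -4$, and with the variances used in the video:

$$
\operatorname{Var}(X + Y) = 4 + 4 + 2(-4) = 0
$$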

– – –

For your second question, since the outcome of 𝑌 depends on the outcome of 𝑋, then the mean of 𝑌 depends on the mean of 𝑋.

In this case 𝜇(𝑋) is the number of hours that the average person slept yesterday, while 𝜇(𝑌) is the number of hours the average person was awake yesterday.
That gives us 𝜇(𝑌) = 24 − 𝜇(𝑋)
(7 votes)
• The key concept I do not understand here is how to combine two random variables. In the last video we summed or subtracted the extreme values of X and Y. Why don't we do that here, and if we did, would we get variability? What is the rule for combining two random variables?
(2 votes)
• Why doesn't Var(X + Y) = 8 hours² make sense?
(2 votes)
• So the only way to calculate the variance of the sum of two dependent variables is to sum the individual data points to form a new variable and then apply the variance formula?
What if the two variables have different sizes?
(1 vote)

## Video transcript

- [Narrator] So in previous videos we talked about the claim that if I have two random variables, x and y, that are independent, then the variance of the sum of those two random variables, or of the difference of those two random variables, is going to be equal to the sum of the variances. So if you have independent random variables, your variation is going to increase when you take a sum or a difference. And we've built a little bit of intuition there. What I wanna do in this video is build even more intuition, to get a gut feeling for why this independence is important for making this claim.

And to get that intuition, let's look at two random variables that are definitely random variables but that are definitely not independent. So let's let x be equal to the number of hours that the next person you meet, so I'll say a random person, slept yesterday. And let's say that y is equal to the number of hours that same person was awake yesterday. And appreciate why these are not independent random variables. One of them is gonna completely determine the other. If I slept eight hours yesterday then I would have been awake for 16 hours. Or if I slept for 16 hours then I would have been awake for eight hours.

Even though they're random variables, and there could be variation in x and there could be variation in y, for any given person (remember, these are still based on that same person) x plus y is always going to be equal to 24 hours. So these are not independent. If you're given one of the variables, it completely determines what the other variable is. The probability of getting a certain value for one variable is going to be very different, given what value you got for the other variable. So they're not independent at all.
So in this situation, let's just say for the sake of argument that the variance of x is equal to, I don't know, let's say it's equal to four. The units for variance would be squared hours, so four hours squared. We could say that the standard deviation for x in this case would be two hours. And let's say that the standard deviation of y is also equal to two hours. Then the variance of y, well, it would be the square of the standard deviation, and so it would be four, and hours squared would be our units.

So if we just tried to blindly say, "Oh, I'm just gonna apply this little expression, this claim we have," without thinking about the independence, we would try to say, "Well then, the variance of x plus y must be equal to the sum of their variances." So it would be four plus four. So is it equal to eight hours squared? Well, that doesn't make any sense, because we know that a random variable that is equal to x plus y is always going to be 24 hours. In fact, it's not going to have any variation. So for these two random variables, because they are so connected, because they are not independent at all, this is actually going to be zero. There is zero variance here. X plus y is always going to be 24. At least on Earth, where we have a 24-hour day. I guess if someone lived on another planet or something it could be slightly different. And we're assuming that we have an exactly 24-hour day on Earth.

So this is to give you a gut sense of why independence matters for making this claim. And if you have things that are not independent, it gives you a good sense for why this claim doesn't necessarily hold up.
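
The contrast described in the transcript can be checked with a small simulation. This is only a sketch under assumptions that are not in the video: hours slept are drawn from a normal distribution with mean 8 and standard deviation 2 (so the variance is about 4 hours squared, matching the example), and the "independent" case pairs each person's sleep with the waking hours of an unrelated, separately drawn person. The names `slept`, `awake_same_person`, `awake_other_person`, and `variance_of_sum` are made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

N = 50_000

# Assumed model (not from the video): hours slept ~ Normal(mean=8, sd=2).
# A normal draw can fall outside 0..24; that is ignored in this sketch.
slept = [random.gauss(8, 2) for _ in range(N)]

# Dependent case: the same person's waking hours are fully determined by x.
awake_same_person = [24 - x for x in slept]

# Independent case: waking hours of an unrelated person, drawn separately.
awake_other_person = [24 - random.gauss(8, 2) for _ in range(N)]


def variance_of_sum(xs, ys):
    """Population variance of the pairwise sums x + y."""
    return statistics.pvariance([x + y for x, y in zip(xs, ys)])


var_x = statistics.pvariance(slept)
var_y = statistics.pvariance(awake_same_person)
var_dependent = variance_of_sum(slept, awake_same_person)
var_independent = variance_of_sum(slept, awake_other_person)

print(f"Var(x)                    = {var_x:.3f}")
print(f"Var(y)                    = {var_y:.3f}")
print(f"Var(x + y), same person   = {var_dependent:.3f}")
print(f"Var(x + y), other person  = {var_independent:.3f}")
```

With the same person, every sum is 24 up to floating-point rounding, so the variance of the sum is essentially zero even though x and y each have a variance near 4. With unrelated people, the variance of the sum lands close to 4 + 4 = 8, which is the case the sum-of-variances claim actually covers.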