## Statistics and probability

© 2023 Khan Academy

# Example: Analyzing distribution of sum of two normally distributed random variables

Finding the probability that the total of some random variables exceeds an amount by understanding the distribution of the sum of normally distributed variables.

## Want to join the conversation?

- So, I tried solving this problem on my own. I figured out that 25 liters is 2 standard deviations away from the mean. Using the 68-95-99.7 rule, I calculated that the chance that Shinji uses between 15 and 25 liters of fuel is 95 percent. So there is a 5 percent chance he uses either between 0 and 15 or between 25 and infinity. I divide 5% by 2 because I am only interested in 25 to infinity, and get 2.5%, which is close to what Sal got but wrong. Can someone explain the flaw in my logic? (18 votes)
- The 68-95-99.7 rule says that *approximately* 95% of a normal distribution lies within 2 standard deviations of the mean, so there's no flaw in your logic; it's just that the percentages given by the rule aren't exact. (19 votes)
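
To see how far the rule's rounded figure is from the exact one, here is a minimal sketch using `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The variable names are illustrative and not part of the original discussion.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Exact share of the distribution within 2 standard deviations of the mean.
within_two_sd = z.cdf(2) - z.cdf(-2)

# Upper tail estimated from the rounded 95% figure, versus the exact tail.
tail_from_rule = (1 - 0.95) / 2
tail_exact = 1 - z.cdf(2)

print(f"within 2 sd:  {within_two_sd:.4f}")   # about 0.9545, not exactly 0.95
print(f"tail (rule):  {tail_from_rule:.4f}")  # 0.0250
print(f"tail (exact): {tail_exact:.4f}")      # about 0.0228
```

The gap between 2.5% and 2.28% comes entirely from rounding 95.45% down to 95%.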

- What if the two variables aren't independent? What would the variance of the sum be? (4 votes)
- If two variables, X and Y, aren't independent, then the variance of their sum is Var(X) + Var(Y) + 2*Cov(X,Y), where "Cov" means covariance. For further information, see https://www.probabilitycourse.com/chapter5/5_3_1_covariance_correlation.php (8 votes)
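
The identity in the reply above can be checked numerically. This is a small sketch on made-up paired data (the values in `x` and `y` are hypothetical, chosen only for illustration), using population variance and covariance computed by hand.

```python
def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Population covariance of two equal-length samples; cov(a, a) is the variance."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)

# Hypothetical paired observations that are clearly not independent.
x = [9.0, 10.5, 8.0, 12.0, 10.5]
y = [11.0, 12.5, 7.0, 13.0, 6.5]
total = [p + q for p, q in zip(x, y)]

var_of_sum = cov(total, total)
formula = cov(x, x) + cov(y, y) + 2 * cov(x, y)

print(var_of_sum, formula)  # the two values agree up to floating-point rounding
```

When the covariance term is zero, as it is for independent variables, the formula reduces to Var(X) + Var(Y).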

- At 4:54 you find the probability for values within twice the standard deviation. So it includes the probability of the fuel ranging from (mean - 2 sd) to (mean + 2 sd). Subtracting the probability from the table from 1 will give us P(fuel consumed < 15 L) + P(fuel consumed > 25 L). What we want is just P(fuel consumed > 25 L). So, shouldn't you divide the final answer by two? (2 votes)
- Actually, Sal is right. We are used to the 68-95-99.7 rule, which tells us the percentage within ±1 sd, ±2 sd, and so on, so it leaves area on both sides. But the z-table does not give us that. It gives us the percentage that lies below whatever z-score you are at. So when you subtract that value (0.97...) from 1, you are left with only the small area on the right-hand side. (3 votes)

- Why can we add variances, but not standard deviations? (2 votes)
- Standard deviations do combine, just not by simple addition. The formula comes from adding the variances, which is why he shows that step.

In general, for independent X and Y:

$\sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$ (2 votes)

- If the amount of fuel he uses follows a normal distribution, wouldn't there be a small but positive chance that he uses a negative amount of fuel, since the normal distribution extends to infinity in both directions? (2 votes)
- At 2:05, he says you can't just add the standard deviations.

Basically, you have to sum the two variances and then take the square root to get the standard deviation, as used at 4:20. (1 vote)

- In the video at 4:56, the first row of the table runs from Z to .09. What do these numbers indicate? (1 vote)
- Is there a proof anywhere on KA or anywhere else for

E(A+B) = E(A) + E(B)

and

Var(A+B) = Var(A) + Var(B)

for independent variables

and that summing two normally distributed variables gives you another normal distribution? Could someone link me to some good sources to find answers for these? Even if I don't understand them, at least I'll have a source to look forward to once I get better in math. (1 vote)
- I found a good one for the sum of variances thing from someone else's comment, if anyone's interested: https://apcentral.collegeboard.org/courses/ap-statistics/classroom-resources/why-variances-add-and-why-it-matters (1 vote)
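
For the first two identities, a short sketch of the usual argument is given here, using only linearity of expectation and the definition of covariance. The symbols $\mu_A = E(A)$ and $\mu_B = E(B)$ are shorthand introduced for this sketch. The claim that a sum of independent normal variables is again normal needs a separate argument and is not shown.

```latex
\begin{align*}
E(A+B) &= E(A) + E(B)
  && \text{(linearity of expectation; no independence needed)} \\
\mathrm{Var}(A+B) &= E\big[\big((A-\mu_A) + (B-\mu_B)\big)^2\big] \\
  &= E\big[(A-\mu_A)^2\big] + E\big[(B-\mu_B)^2\big]
     + 2\,E\big[(A-\mu_A)(B-\mu_B)\big] \\
  &= \mathrm{Var}(A) + \mathrm{Var}(B) + 2\,\mathrm{Cov}(A,B)
\end{align*}
```

If A and B are independent, then Cov(A,B) = 0 and the last line reduces to Var(A+B) = Var(A) + Var(B).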

- Are there any **Normal Dist CDF** formulas out there? (1 vote)
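
One standard way to write the normal CDF is in terms of the error function: Φ(x) = ½ · [1 + erf((x − μ) / (σ√2))]. Below is a minimal sketch using `math.erf` from Python's standard library. The function name `normal_cdf` is made up for this example.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ Normal(mu, sigma), computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# The z-table entry for z = 2.00 used in the video.
print(round(normal_cdf(2.0), 4))  # 0.9772
```

There is no closed form in elementary functions, which is why z-tables and functions like `erf` are used in practice.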

## Video transcript

- [Instructor] Shinji commutes to work and he worries about running out of fuel. The amount of fuel he uses follows a normal distribution for each part of his commute, but the amount of fuel he uses on the way home varies more. The amounts of fuel he uses for each part of the commute are also independent of each other. Here are summary statistics for the amount of fuel Shinji uses for each part of his commute. So when he goes to work he uses a mean of 10 liters of fuel, with a standard deviation of 1.5 liters. And on the way home, he also has a mean of 10 liters, but there is more variation. There is more spread. He has a standard deviation of two liters. Suppose that Shinji has 25 liters of fuel in his tank and he intends to drive to work and back home. What is the probability that Shinji runs out of fuel?

All right, this is really interesting. We have the distributions for the amount of fuel he uses to work and to home, and they say that these are normal distributions. They say that right over here, follows a normal distribution. But here we're talking about the total amount of fuel he has to go to work and to go home. So what we wanna do is come up with a total distribution, home and back, I guess you could say. We could say, call this work plus home. Home and back.

If you have two random variables that can be described by normal distributions and you were to define a new random variable as their sum, the distribution of that new random variable will still be a normal distribution and its mean will be the sum of the means of those other random variables. So the mean here, I'll say the mean of work plus home is going to be equal to 20 liters. He will use a mean of 20 liters in the roundtrip.

Now for the standard deviation, from home plus work, you can't just add the standard deviations going and coming back. But because the amount of fuel going to work and the amount of fuel coming home are independent random variables, because they are independent of each other, we can add the variances. And only because they are independent can we add the variances. So what you can say is that the variance of the combined trip is equal to the variance of going to work plus the variance of going home. So what's the variance of going to work? Well, 1.5 squared is, so this will be 1.5 squared, and what's the variance coming home? Well, this is going to be two squared, two squared. Well, this is 2.25 plus four, which is equal to 6.25. So the variance on the roundtrip is equal to 6.25. If I were to take the square root of that, which is equal to 2.5, we can now describe the normal distribution of the roundtrip and use that to answer the question.

So we have this normal distribution that might look something like this. We know its mean is 20 liters. So this is 20 liters. And we want to know what is the probability that Shinji runs out of fuel. Well, to run out of fuel, he would need to require more than 25 liters of fuel. So if 25 liters of fuel is right over here, so this is 25 liters of fuel, the scenario where Shinji runs out of fuel is right over here, this is where he needs more than 25 liters. He actually has 25 liters in his tank.

So how do we figure out that area right over there? Well, we could use a z-table. We could say how many standard deviations above the mean is 25 liters? Well, it is five liters above the mean, so let me write this down. So the Z here, the Z is equal to 25 minus the mean, minus 20, divided by the standard deviation for, I guess you could say this combined normal distribution. This is two standard deviations above the mean or a z-score of plus two. So if we look at a z-table and we look exactly two standard deviations above the mean, that will give us this area, the cumulative area below two standard deviations above the mean. And then if we subtract that from one, we will get the area that we care about.

So let's get our z-table out. We care about a z-score of exactly two, so 2.00 is right over here, .9772. So that tells us that this area right over here is 0.9772, and so that blue area, the probability that Shinji runs out of fuel is going to be one minus 0.9772, and what is that going to be equal to? Let's see, this is going to be equal to 0.0228. Did I do that right? I think I did that right. Yes, 0.0228 is the probability that Shinji runs out of fuel. If you want to think of it as a percent, 2.28% chance that he runs out of fuel.
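
The calculation in the transcript can be reproduced in a few lines. This is a sketch, not part of the video: it uses `statistics.NormalDist` from Python's standard library (Python 3.8 or later) in place of a z-table, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics from the problem (liters).
mean_work, sd_work = 10.0, 1.5
mean_home, sd_home = 10.0, 2.0
tank = 25.0

# Means add. Variances add because the two legs are independent.
mean_total = mean_work + mean_home      # 20.0
var_total = sd_work**2 + sd_home**2     # 2.25 + 4.0 = 6.25
sd_total = math.sqrt(var_total)         # 2.5, not 1.5 + 2.0 = 3.5

# z-score of the tank capacity, and the upper-tail probability.
z_score = (tank - mean_total) / sd_total
p_run_out = 1 - NormalDist(mean_total, sd_total).cdf(tank)

print(f"z = {z_score:.2f}")                     # z = 2.00
print(f"P(runs out of fuel) = {p_run_out:.4f}")  # 0.0228
```

The result matches the z-table answer of 0.0228 to four decimal places.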