© 2023 Khan Academy
Combining normal random variables
When we add or subtract independent random variables that each follow a normal distribution, the result is also normally distributed. This lets us answer interesting questions about the resulting distribution.
Example 1: Total amount of candy
Each bag of candy is filled at a factory by 4 machines. The first machine fills the bag with blue candies, the second with green candies, the third with red candies, and the fourth with yellow candies. The amount of candy each machine dispenses is normally distributed with a mean of 50 g and a standard deviation of 5 g. Also, assume that the amount dispensed by any given machine is independent from the other machines.
Let T be the total weight of candy in a randomly selected bag.
Find the probability that a randomly selected bag contains less than 178 g of candy.
Let's solve this problem by breaking it into smaller pieces.
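One way to check the pieces numerically is the sketch below. It is not the page's own worked solution: it assumes Python's standard-library `statistics.NormalDist`, and the variable names are made up for illustration. Because the four machines are independent, the means add and the variances add, so T should have a mean of 4 × 50 = 200 g and a standard deviation of √(4 × 5²) = 10 g.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement (grams).
n_machines = 4
machine_mean = 50
machine_sd = 5

# Independent amounts: means add, and variances (not standard deviations) add.
total_mean = n_machines * machine_mean          # 200
total_sd = sqrt(n_machines * machine_sd ** 2)   # sqrt(100) = 10

# T is normal because it is a sum of independent normal variables.
T = NormalDist(mu=total_mean, sigma=total_sd)

# P(T < 178) is the area under the curve to the left of 178.
p_under_178 = T.cdf(178)
print(round(p_under_178, 4))
```

Here 178 g sits 2.2 standard deviations below the mean of T, so the probability comes out to roughly 0.014.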
Example 2: Difference in bowling scores
Adam and Mike go bowling every week. Adam's scores are normally distributed with a mean of 175 pins and a standard deviation of 30 pins. Mike's scores are normally distributed with a mean of 150 pins and a standard deviation of 40 pins. Assume that their scores in any given game are independent.
Let A be Adam's score in a random game, M be Mike's score in a random game, and D be the difference between Adam's and Mike's scores where D = A - M.
Find the probability that Mike scores higher than Adam in a randomly selected game.
Let's solve this problem by breaking it into smaller pieces.
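The same approach can be sketched for the difference, again assuming Python's standard-library `statistics.NormalDist` and illustrative variable names. For independent scores the mean of D is the difference of the means, but the variances still add, so D should have a mean of 175 - 150 = 25 pins and a standard deviation of √(30² + 40²) = 50 pins. Mike scores higher than Adam exactly when D < 0.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement (pins).
adam_mean, adam_sd = 175, 30
mike_mean, mike_sd = 150, 40

# D = A - M: means subtract, but variances ADD for independent variables.
diff_mean = adam_mean - mike_mean              # 25
diff_sd = sqrt(adam_sd ** 2 + mike_sd ** 2)    # sqrt(2500) = 50

D = NormalDist(mu=diff_mean, sigma=diff_sd)

# Mike beats Adam when A - M < 0.
p_mike_higher = D.cdf(0)
print(round(p_mike_higher, 4))
```

Since 0 is half a standard deviation below the mean of D, the probability is about 0.31.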
Want to join the conversation?
- In Example 2, the hint says P(D < 0). Why does the difference between the two scores have to be less than 0?(6 votes)
- We have D = A - M. If D < 0, then it can happen only when M > A, which means Mike scores higher than Adam.
P(D < 0) means probability of an event where Mike scores higher than Adam.
Hope that helps.(18 votes)
- In example 2 the number of pins is discrete, how could you represent that using a density curve ?(7 votes)
- Very good question! It turns out that, if Mike and Adam play a large number of games the distribution of their scores will be very well approximated by a normal distribution (even if their scores are discrete variables!). This is a consequence of something called the "Central Limit Theorem". Here is a video of Sal talking about it from the AP/ College Statistics series: https://www.khanacademy.org/math/ap-statistics/sampling-distribution-ap/what-is-sampling-distribution/v/central-limit-theorem(2 votes)
- In the Practice quiz they keep having an absolute value probability question. How does one go about solving that? For instance, one example is: Sam's mean time for washing cars is 20 minutes with a standard deviation of 6.4 minutes. Taylor's mean time for washing the interior of cars is 18 minutes with a standard deviation of 4.8 minutes.
Then it says find the probability that a randomly selected time of Sam and Taylor falls within 10 minutes of each other and gives the equation find P(|D| < 10). So I do D = S - T and I get mean of D is 2 minutes and the standard deviation is 8 minutes. So far so good, but after that I always go wrong somehow. When I click on the explanation it says to do two z-scores, one for -10 and one for 10, and then calculate between them. But why would I do -10 and 10? It says within ten minutes of each other, so wouldn't that mean you would do ten above and ten below the mean of D?(6 votes)
- The mean and standard deviation describe the shape of the curve and tell us what percentage of the area lies above and below certain points. However, the question asks whether they finish within 10 minutes of each other. Since Taylor is, on average, 2 minutes quicker than Sam, the curve for D is centered at 2 rather than at 0. The point where they both have the same finish time is D = S - T = 0. So find the z-scores for 10 minutes below and 10 minutes above 0 (the place where their times are equal, not the mean of D) and take the area between them to find the probability that they finish within 10 minutes of each other.(1 vote)
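The calculation described in this answer can be sketched as follows, using the numbers quoted in the question and assuming Sam's and Taylor's times are independent and normally distributed (the question implies this but does not say so). It relies on Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import hypot
from statistics import NormalDist

# Numbers quoted in the question (minutes).
sam_mean, sam_sd = 20, 6.4
taylor_mean, taylor_sd = 18, 4.8

# D = S - T, assuming independence: means subtract, variances add.
d_mean = sam_mean - taylor_mean      # 2
d_sd = hypot(sam_sd, taylor_sd)      # sqrt(6.4**2 + 4.8**2), about 8

D = NormalDist(mu=d_mean, sigma=d_sd)

# "Within 10 minutes of each other" means -10 < D < 10,
# measured from 0 (equal times), not from the mean of D.
p_within_10 = D.cdf(10) - D.cdf(-10)
print(round(p_within_10, 4))
```

The bounds -10 and 10 correspond to z-scores of (-10 - 2)/8 = -1.5 and (10 - 2)/8 = 1, which is why the interval is not symmetric around the mean. The area between them is about 0.77.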
- Hiya Sal and everyone at Khan, thank you for all your hard work. It would be really nice if we could get a worked example of a probability of an absolute value as that is something that comes up in the practice questions but wasn't covered in the videos leading up to it. Like P(X |5|) or something like that.(0 votes)
- P(X = |5|) = P(X = 5)
I think you meant something different, since you can always just replace |5| with 5.
If you meant to ask something like
P(|X| > 5)
this would be (I think)
P(|X| > 5) = P(X < -5) + P(X > 5)(2 votes)
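A quick numeric check of this identity is sketched below. The distribution is purely hypothetical (X normal with mean 0 and standard deviation 5, chosen only for illustration and not taken from any exercise), and the code assumes Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

# Hypothetical distribution, for illustration only.
X = NormalDist(mu=0, sigma=5)

# P(|X| > 5) splits into two tails: X < -5 or X > 5.
p_left_tail = X.cdf(-5)        # P(X < -5)
p_right_tail = 1 - X.cdf(5)    # P(X > 5), using the complement rule
p_abs_greater_5 = p_left_tail + p_right_tail

# Equivalent form: 1 - P(-5 <= X <= 5).
p_check = 1 - (X.cdf(5) - X.cdf(-5))
print(round(p_abs_greater_5, 4), round(p_check, 4))
```

With these assumed parameters both expressions give about 0.32. The right-tail step also shows how a "greater than" probability such as P(X > 14) is found: take 1 minus the area to the left of the cutoff.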
- For Example 2, I am confused about why we are finding P(D < 0). Since Adam and Mike are bowling, shouldn't the difference between the two scores be a discrete whole number? Then D can't take any value between -1 and 0. Also, 0 isn't part of the solution space. So shouldn't we be trying to find P(D <= -1)?
<= means less than or equal to.(1 vote)
- The random variable is D = A - M. Mike has more pins than Adam only when M is greater than A, which makes D negative.
Therefore, for our question, D should be a negative number. That's why the hint uses D < 0.(0 votes)
- What happens when the probability is for a value greater than a given cutoff? For example, P(X > 14).(0 votes)
- Again, it would make more sense to define D = M - A, since the probability we want is that of M scoring more than A.(0 votes)