Combining random variables

Effect on mean, standard deviation, and variance

We can form new distributions by combining random variables. If we know the mean and standard deviation of the original distributions, we can use that information to find the mean and standard deviation of the resulting distribution.
We can combine means directly, but we can't do this with standard deviations. We can combine variances as long as it's reasonable to assume that the variables are independent.
 | Mean | Variance
Adding: T = X + Y | μ_T = μ_X + μ_Y | σ_T^2 = σ_X^2 + σ_Y^2
Subtracting: D = X − Y | μ_D = μ_X − μ_Y | σ_D^2 = σ_X^2 + σ_Y^2
Here are a few important facts about combining variances:
  • Make sure that the variables are independent or that it's reasonable to assume independence, before combining variances.
  • Even when we subtract two random variables, we still add their variances; subtracting two variables increases the overall variability in the outcomes.
  • We can find the standard deviation of the combined distributions by taking the square root of the combined variances.
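
The rules in the list above can be checked numerically. The following is a minimal sketch, not part of the original lesson: it simulates two independent normal variables with hypothetical parameters (means 10 and 4, standard deviations 3 and 4), and the function name `simulate` is made up for illustration.

```python
import random
import statistics

def simulate(n=100_000, seed=0):
    """Draw independent X and Y, then summarize X + Y and X - Y."""
    rng = random.Random(seed)
    # Hypothetical parameters, chosen only for illustration.
    xs = [rng.gauss(10, 3) for _ in range(n)]  # sigma_X = 3, variance 9
    ys = [rng.gauss(4, 4) for _ in range(n)]   # sigma_Y = 4, variance 16
    sums = [x + y for x, y in zip(xs, ys)]
    diffs = [x - y for x, y in zip(xs, ys)]
    return {
        "mean_sum": statistics.fmean(sums),        # theory: 10 + 4 = 14
        "mean_diff": statistics.fmean(diffs),      # theory: 10 - 4 = 6
        "var_sum": statistics.pvariance(sums),     # theory: 9 + 16 = 25
        "var_diff": statistics.pvariance(diffs),   # theory: 9 + 16 = 25
    }

result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Both simulated variances should land close to 25, so the standard deviation of either the sum or the difference is close to 5, not 3 + 4 = 7 and not 4 − 3 = 1.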

Example 1: Establishing independence

To combine the variances of two random variables, we need to know, or be willing to assume, that the two variables are independent.
QUESTION A (Example 1)
For which pairs of variables would it be reasonable to assume independence?

Example 2: SAT scores

Approximately 1.7 million students took the SAT in 2015. Each student received a critical reading score and a mathematics score.
Here are summary statistics for each section of the test in 2015:
Section | Mean | Standard deviation
Critical reading | μ_CR = 495 | σ_CR = 116
Mathematics | μ_M = 511 | σ_M = 120
Total | μ_T = ? | σ_T = ?
Suppose we choose a student at random from this population.
Question A (Example 2)
What is the mean of the sum of a student’s critical reading and mathematics scores?

Question B (Example 2)
What is the standard deviation of the sum of a student’s critical reading and mathematics scores?
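
As a rough sketch of the arithmetic behind these two questions: the mean of the total follows directly from the two section means, but the standard deviation only follows from the table if the two scores are independent. That is doubtful for two scores earned by the same student, so the second figure below is labeled as a hypothetical, not as the answer. The variable names are made up for illustration.

```python
import math

mu_cr, sigma_cr = 495, 116   # critical reading
mu_m, sigma_m = 511, 120     # mathematics

# Means add whether or not the two scores are independent.
mu_total = mu_cr + mu_m

# ASSUMPTION (likely false here): the two scores are independent.
# Only under that assumption do the variances simply add.
sigma_total_if_independent = math.sqrt(sigma_cr**2 + sigma_m**2)

print(mu_total)                              # 1006
print(round(sigma_total_if_independent, 1))  # 166.9
```

Without information about how the two scores vary together, the summary statistics alone do not determine the standard deviation of the total.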

Example 3: Item inspections

Each item at a certain factory is inspected by 4 employees. The amount of time it takes each employee to inspect the item has a mean of 30 seconds and a standard deviation of 6 seconds. Furthermore, the amount of time it takes a given employee to inspect an item is not affected by how long it takes another employee to inspect that item.
Let T be the total amount of time it takes 4 employees to inspect a randomly selected item.
Question A (Example 3)
What is the mean total amount of time it takes 4 employees to inspect a randomly selected item?

Question B (Example 3)
What is the standard deviation of the total amount of time it takes 4 employees to inspect a randomly selected item?
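
The calculation for this example can be sketched as follows. It relies on the independence stated in the problem (one employee's inspection time does not affect another's); the variable names are made up for illustration.

```python
import math

n_employees = 4
mu_each, sigma_each = 30, 6   # seconds, per employee

# Means add: 30 + 30 + 30 + 30.
mu_total = n_employees * mu_each

# Independent inspection times, so variances add: 36 + 36 + 36 + 36.
var_total = n_employees * sigma_each**2
sigma_total = math.sqrt(var_total)

print(mu_total, sigma_total)  # 120 12.0
```

Note that the standard deviation is 12 seconds, not 4 × 6 = 24 seconds: it is the variances, not the standard deviations, that add.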

Example 4: Difference in heights

A sociologist took a large sample of military members and looked at the heights of the men and women in the sample. The summary statistics for the heights of the people in the study are shown below.
Suppose that we choose a random man and a random woman from the study and look at the difference between their heights. Let M represent the man's height, W represent the woman's height, and D represent the difference between their heights (D = M − W).
 | Mean | Standard deviation
Man | μ_M = 178 cm | σ_M = 7 cm
Woman | μ_W = 164 cm | σ_W = 6 cm
Difference | μ_D = ? | σ_D = ?
Question A (Example 4)
What is the mean of the difference between the two heights?

Question B (Example 4)
What is the standard deviation of the difference between the two heights?
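
A minimal sketch of the arithmetic for this example, assuming the randomly chosen man and the randomly chosen woman have independent heights (the variable names are made up for illustration):

```python
import math

mu_m, sigma_m = 178, 7   # men's heights, cm
mu_w, sigma_w = 164, 6   # women's heights, cm

# The mean of a difference is the difference of the means.
mu_d = mu_m - mu_w

# ASSUMPTION: M and W are independent, so the variances ADD
# even though the variables are being subtracted.
var_d = sigma_m**2 + sigma_w**2
sigma_d = math.sqrt(var_d)

print(mu_d)               # 14
print(round(sigma_d, 2))  # 9.22
```

The result is larger than either 7 cm or 6 cm alone, which reflects the point made earlier: subtracting two independent variables increases the overall variability.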

Want to join the conversation?

  • atung.tx
    I do not agree with the explanation of Example 2: "... In fact, we should suspect such scores to not be independent." Why would the reading and math scores be correlated with each other? Plenty of people are good at only one.
    (64 votes)
  • Michael
    In the examples, we only added two means and two variances. Can we add more than two means or variances?
    (7 votes)
    • Chuck B
      Yes, and it is not difficult to demonstrate this: Given

      E(X + Y) = E(X) + E(Y) [Eq. 1]

      Now let's say we have Y = S + T. Earlier in the course we saw that

      E(Y) = E(S + T)

      And so by Eq. 1 we have

      E(Y) = E(S + T) = E(S) + E(T).

      Substituting back into Eq. 1 we get

      E(X + (S + T)) = E(X) + (E(S) + E(T))

      or

      E(X + S + T) = E(X) + E(S) + E(T)

      Notice that this approach can be extended to any number of terms.
      (3 votes)
  • Prashant Kumar
    In Example 2, the two random variables are dependent. Thus the mean of the sum of a student’s critical reading and mathematics scores must be different from just the sum of the expected values of the first RV and the second RV. But the answer says the mean is equal to the sum of the means of the 2 RVs, even though they are dependent.
    (3 votes)
  • sharadsharmam
    I have understood that E(T=X+Y) = E(X)+E(Y) when X and Y are independent.

    But I am unable to understand it in my gut because, check the Example 3 Question A.

    If each employee on average requires 30 seconds to inspect a randomly selected item, and T is the time it takes 4 employees to inspect a randomly selected item, how can the mean time for 4 employees to inspect a randomly selected item, E(T), be 120 seconds?

    I know substituting in the formula gives 120 sec but this is exactly opposite of what my intuition says.
    (0 votes)
    • Alexzandria S.
      I'm not sure if this will help any, but I think when they are talking about adding the total time an item is inspected by the employees, it's being inspected by each employee individually and the times are added up, instead of the employees simultaneously inspecting it.

      So the item starts with Employee A, who inspects it for 30 seconds, and then it's passed to Employee B, who inspects it for 30 seconds, and so forth. So 30 + 30 + 30 + 30 = 120, so the item spends a total of 120 seconds being inspected by the employees.

      Does that help clarify it for you?
      (9 votes)
  • Sec Ar
    Still not feeling the intuition that subtracting random variables means adding up the variances. Why should the difference between men's heights and women's heights lead to an SD of ~9 cm?
    (2 votes)
    • Bryan
      Var(X-Y) = Var(X + (-Y)) = Var(X) + Var(-Y)

      But variance of -Y is the same as the variance of Y, since -Y is just the reflection of Y over the y-axis.
      So,
      Var(X) + Var(-Y) = Var(X) + Var(Y) = Var(X-Y)
      (4 votes)
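
      The step Var(−Y) = Var(Y) in the reply above can also be written out from the definition of variance. This is only a sketch, using μ_Y as shorthand for E(Y) (notation not used in the reply itself):

      ```latex
      \begin{aligned}
      \operatorname{Var}(-Y) &= E\big[(-Y - E(-Y))^2\big] \\
                             &= E\big[(-(Y - \mu_Y))^2\big] \\
                             &= E\big[(Y - \mu_Y)^2\big] = \operatorname{Var}(Y)
      \end{aligned}
      ```

      Squaring removes the minus sign, so flipping the sign of Y leaves its spread unchanged. With X and Y independent, it follows that Var(X − Y) = Var(X) + Var(Y).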
  • 23yaa02
    When would you include something in the squaring? For example, in 3b, we did sqrt(4(6)^2) or sqrt(4x36) for the SD. Is there any situation (whether it be in the given question or not) that we would do sqrt((4x6)^2) instead?
    (2 votes)
    • daniella
      When combining variances of independent random variables, each variance represents the squared variability of its respective variable. Adding these variances together accounts for the total variability of the combined random variables. Since variance is a squared measure, taking the square root of the sum gives the standard deviation, which provides a measure of variability on the original scale of the data. However, when an entire expression is squared and then square rooted, it typically serves a different purpose unrelated to combining variances of independent random variables.
      (1 vote)
  • N N
    "Subtracting two variables increases the overall variability in the outcomes."

    I'd like to understand this comment intuitively.
    Why does the standard deviation (variance) increase when subtracting two variables?

    Is it because the number of samples decreases when subtracting variables? (Thus, variability increases?)
    (1 vote)
    • Jerry Nilsson
      The only intuition I can give is that the range of 𝑋 − 𝑌 is
      (𝑋 − 𝑌)max − (𝑋 − 𝑌)min = (𝑋max − 𝑌min) − (𝑋min − 𝑌max)
      = 𝑋max − 𝑋min + 𝑌max − 𝑌min = Range(𝑋) + Range(𝑌)

      So, 𝑋 − 𝑌 has a wider range than 𝑋 or 𝑌, and thus should have greater variance.
      (1 vote)
  • Kevin Eldurson
    In example 3, why can't we scale the random variable? I thought the standard deviation would be 4 times the original.
    (1 vote)
    • daniella
      Scaling a random variable by a constant factor directly affects the spread or variability of the data. However, when summing independent random variables, the variability doesn't scale linearly with the number of observations. Instead, it scales with the square root of the number of observations. Therefore, while the mean scales linearly with the number of observations, the standard deviation scales with the square root, reflecting the non-linear relationship between the number of observations and variability. In the context of Example 3, multiplying the standard deviation by 4 doesn't accurately represent the increase in variability caused by summing multiple independent random variables.
      (1 vote)
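
      To make the contrast concrete, here is a sketch comparing the two cases with the numbers from Example 3. The labels X_1, ..., X_4 for the four employees' independent inspection times are introduced here only for illustration:

      ```latex
      \begin{aligned}
      \text{One time scaled by 4:}\quad
        \sigma_{4X} &= \sqrt{4^2 \cdot 6^2} = 4 \cdot 6 = 24 \\
      \text{Four independent times added:}\quad
        \sigma_{X_1 + X_2 + X_3 + X_4} &= \sqrt{6^2 + 6^2 + 6^2 + 6^2} = \sqrt{4 \cdot 36} = 12
      \end{aligned}
      ```

      Scaling would describe one employee's time counted four times, so a slow inspection is slow four times over. With four separate employees, a slow inspection by one is often offset by a faster one from another, so the total varies less.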
  • N N
    Example 2: SAT scores

    Is the mean of the sum of two random variables different from the mean of two random variables?

    Assuming a case like the one below:
    Critical Reading: {498, 495, 492}, mean = 495
    Mathematics: {512, 502, 519}, mean = 511

    The mean of the sum of a student’s critical reading and mathematics scores = 495 + 511 = 1006
    The mean of a student’s critical reading and mathematics scores = 503, which is not 1006

    What is "the mean of the sum of two random variables"? I cannot understand it intuitively.
    What kind of situation do we actually use "the mean of the sum of two random variables" in statistics?
    (0 votes)
    • Jerry Nilsson
      𝑋 = {498, 495, 492} ⇒ 𝜇(𝑋) = (498 + 495 + 492)∕3 = 495
      𝑌 = {512, 502, 519} ⇒ 𝜇(𝑌) = (512 + 502 + 519)∕3 = 511

      𝑋 + 𝑌 = {498 + 512, 495 + 502, 492 + 519} = {1010, 997, 1011}
      ⇒ 𝜇(𝑋 + 𝑌) = (1010 + 997 + 1011)∕3 = 1006

      𝜇(𝑋) + 𝜇(𝑌) = 495 + 511 = 1006

      – – –

      Let's say we wanted to know how many hours the average person spends at work per week.

      One way to conduct the survey would be to choose five random Mondays, five random Tuesdays, and so on. That would give us a total of 35 days.
      Then on each of these days we call 25 random people and ask them how many hours they spent working yesterday.

      For each of the 35 days we can calculate the average amount of hours for the 25 people we called that day:
      𝜇(Day 1)
      𝜇(Day 2)

      𝜇(Day 35)

      Let's say Day 1-5 are Mondays, Day 6-10 are Tuesdays, and so on.
      𝜇(Monday) = (𝜇(Day 1) + 𝜇(Day 2) + ... + 𝜇(Day 5))∕5
      𝜇(Tuesday) = (𝜇(Day 6) + 𝜇(Day 7) + ... + 𝜇(Day 10))∕5

      𝜇(Sunday) = (𝜇(Day 31) + 𝜇(Day 32) + ... + 𝜇(Day 35))∕5

      Finally,
      𝜇(Week) = 𝜇(Monday + Tuesday + ... + Sunday)
      = 𝜇(Monday) + 𝜇(Tuesday) + ... + 𝜇(Sunday)
      (2 votes)
  • Muhammad Junaid
    Exercise 4:
    Since the average height of men is greater than that of women, should we not expect that the two heights are dependent on each other?
    (0 votes)
    • sam.farrington93
      No. It is true that height is dependent on gender, but the height of one gender has no impact on the height of the other.

      You can make no end of analogies. The size of shoe people wear is highly dependent on the size of their feet. The size of my feet, however, is not remotely dependent on the size of yours.
      (0 votes)