
Deriving the variance of the difference of random variables

Sal derives the variance of the difference of random variables. Created by Sal Khan.

Want to join the conversation?

  • Portia
    At , shouldn't the formula for the variance be divided by n?
    (29 votes)
    • JMGClark
      Good question! The variance of a random variable is E[(X - mu)^2], as Sal mentions above. What you're thinking of is the formula for the variance of a population [sigma^2 = the sum of the squared deviations from the mean divided by N, the population size] or the formula for estimating the variance from a sample [s^2 = the sum of the squared deviations from the mean divided by n - 1, where n is the sample size].

      See here for more details: http://en.wikipedia.org/wiki/Variance
      (16 votes)
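Written out, the formulas JMGClark describes above are (with mu the mean, N the population size, n the sample size, and x-bar the sample mean):

```latex
% Variance of a random variable: the expected squared deviation from its mean
\operatorname{Var}(X) = E\big[(X - \mu)^2\big]

% Population variance for data x_1, ..., x_N (divide by N)
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2

% Sample variance estimate (divide by n - 1)
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```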
  • p1cony
    How did he factor to get (-1)^2 for the variance, at around ?
    (7 votes)
    • heba
      Hi p1cony,
      To answer your question, let me first make sure you have the basic idea of what he did.
      If we have an algebraic expression (numbers and variables multiplied together forming terms, with the terms separated by + or - signs), we can take out a common factor by dividing each term by that factor.
      For example: (5a + 10b) = 5(a + 2b)

      If the algebraic expression is squared, we can still take out a common factor, but it has to be squared too.
      For example: (5a + 10b)^2 = (5^2)(a + 2b)^2

      Another basic idea is that if we take (-1) out as a common factor, the sign of every term in the expression changes (+ becomes - and - becomes +), because that is the same thing as dividing each term by (-1).
      For example: (5a + 10b) = -1(-5a - 10b)

      So what he did is apply all of these basics in one step: he took a common factor of (-1)^2 out of a squared algebraic expression:
      E((-Y - E(-Y))^2) = (-1)^2 * E((Y + E(-Y))^2)
      (10 votes)
  • Tomas Santiago Mosquera
    What is the video (if there is one) in which they define the variance of X as the expected value of (X - μ_x)^2? It is not in the expected value video.
    (7 votes)
  • Elizabeth Topczewski
    At , Sal defines Z=X+Y. What does it mean to add random variables? What are you adding together?

    Thanks!
    Beth
    (4 votes)
    • Bryan
      This honestly confused me at first too, but a few weeks after learning this it seems so obvious :P

      You are literally taking the result of one random variable, then adding it to the result of another. You can imagine this as rolling two dice and then summing the results, or rolling a die and then adding either 7 or 65 to the result based on whether a coin that you flipped was heads or tails, etc.
      It's the individual results of two random events, converted into numbers, then added together.

      The probability distribution of the first random variable tells you the probability of each of the possible values of that variable, and the probability distribution of the second random variable does the same. But when you sum two random variables, you take an individual instance of both random variables (e.g. X = 2, Y = -33), and then literally sum them (X + Y, in this case, would be 2 + (-33) = -31).

      I'm only spending so much time elaborating this because I was confused by it too. If anyone else is still confused, you'll get it eventually ( -.-)b
      (4 votes)
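A small simulation of the two-dice example Bryan describes above (a sketch only; the dice setup is just one illustration of summing two independent random variables):

```python
import random

# X and Y are two independent random variables: the results of two die rolls.
# Z = X + Y is a new random variable: roll both dice, then add the numbers.
def roll_die():
    return random.randint(1, 6)

samples = []
for _ in range(10_000):
    x = roll_die()          # one instance of X
    y = roll_die()          # one instance of Y
    samples.append(x + y)   # the corresponding instance of Z = X + Y

# Z takes values 2 through 12, with its own probability distribution.
mean_z = sum(samples) / len(samples)
print(f"Estimated E(Z) = {mean_z:.2f}  (the exact value is 7 = 3.5 + 3.5)")
```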
  • anzatzi
    Is there some reason we are not dividing by n or (n - 1) in the computation of variance, like we have been doing for the entire playlist? If it's because E() implies this division, why are we not explicitly citing that?
    (5 votes)
    • Nathan Hoffmann
      Expected value divides by n, assuming we're looking at a real dataset of n observations. But we might not be. For example, if a random variable x takes the value 1 in 30% of the population, and the value 0 in 70% of the population, but we don't know what n is, then E(x) = .3(1) + .7(0) = .3. No n is necessary if we have a probability mass function like this (or probability distribution function, for continuous random variables).
      (3 votes)
  • poli987789
    How come the variance is now E((X - mu)^2)? Didn't Sal say in a previous video that the variance was the sum of P(x)(x - mu)^2 over all values of x?
    Most of the Statistics & Probability course is really confusing so far... I haven't struggled this much since the Integral course.
    (4 votes)
  • CV.AndrewLeong
    Can someone show me this with numbers please?

    If you had two independent random variables X and Y, with X having a standard deviation of 19 and Y having a standard deviation of 6, how do you work out the standard deviation of the difference X - Y?

    Is it: 361 + 36 = 397

    Then: sqrt(397) ≈ 19.9 ?
    (3 votes)
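A quick numeric check of the arithmetic in this question (a sketch; it assumes, as stated above, that X and Y are independent with standard deviations 19 and 6):

```python
import math

sd_x = 19.0   # standard deviation of X (from the question)
sd_y = 6.0    # standard deviation of Y (from the question)

# For independent X and Y: Var(X - Y) = Var(X) + Var(Y)
var_diff = sd_x**2 + sd_y**2        # 361 + 36 = 397
sd_diff = math.sqrt(var_diff)       # standard deviation of X - Y

print(var_diff)             # 397.0
print(round(sd_diff, 2))    # 19.92
```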
  • Jon Gendron
    Starting at ~, it is announced that E(-y) = -E(y). Can you please prove this or point to a proof?
    (2 votes)
  • ben.cox.07
    What was supposed to come before this video? I just watched through Hypothesis testing with one sample, and figured this one would follow directly from the ones before, but this Var stuff just sort of comes out of nowhere. A little framing would be nice.
    (2 votes)
  • martinique
    How do you solve this variance equation when the data set is
    2, 2, 2, 2, 2, 2, 2

    s^2 = [ ∑(x_i^2) - (∑x_i)^2 / N ] / (N - 1)
    (2 votes)
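For reference, a minimal sketch applying that formula to the data set in the question (every value is 2, so the numerator works out to zero and the variance is 0):

```python
data = [2, 2, 2, 2, 2, 2, 2]
N = len(data)

sum_x = sum(data)                      # 14
sum_x_sq = sum(x**2 for x in data)     # 28

# s^2 = [ sum(x_i^2) - (sum x_i)^2 / N ] / (N - 1)
s_squared = (sum_x_sq - sum_x**2 / N) / (N - 1)
print(s_squared)   # 0.0 -- every value equals the mean, so there is no spread
```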

Video transcript

What I want to do in this video is build up some tools in our tool kit for dealing with sums and differences of random variables. So let's say that we have two random variables, x and y, and they are completely independent. They are independent random variables. And I'm just going to go over a little bit of notation here. If we talk about the expected value of this random variable x, that is the same thing as the mean value of this random variable x. If we talk about the expected value of y, that is the same thing as the mean of y. If we talk about the variance of the random variable x, that is the same thing as the expected value of the squared distance between our random variable x and its mean. So it's the expected value of these squared differences, and you could also use the notation sigma squared for the random variable x. This is just a review of things we already know, but I just want to reintroduce it because I'll use this to build up some of our tools. You do the same thing with random variable y. The variance of random variable y is the expected value of the squared difference between our random variable y and the mean of y, or the expected value of y, squared. And that's the same thing as sigma squared of y. There is the variance of y.

Now you may or may not already know these properties of expected values and variances, but I will reintroduce them to you. And I won't go into a rigorous proof-- actually, I think they're fairly easy to digest. One is that if I have some third random variable, call it z, that is defined as the random variable x plus the random variable y-- let me stay with my colors just so everything becomes clear, the random variable x plus the random variable y-- what is the expected value of z going to be? The expected value of z is going to be equal to the expected value of x plus y. And this is a property of expected values-- I'm not going to prove it rigorously right here-- it equals the expected value of x plus the expected value of y. Another way to think about this is that the mean of z is going to be the mean of x plus the mean of y. Or, to view it another way, let's say I have some other random variable-- I'm running out of letters here-- let's say I have the random variable a, and I define random variable a to be x minus y. So what's its expected value going to be? The expected value of a is going to be equal to the expected value of x minus y, which is equal to-- you could either view it as the expected value of x plus the expected value of negative y, or the expected value of x minus the expected value of y-- which is the same thing as the mean of x minus the mean of y. So this is what the mean of our random variable a would be equal to. All of this is review, and I'm going to use it when we start talking about the distributions that are sums and differences of other distributions.

Now let's think about what the variance of random variable z is and what the variance of random variable a is. So the variance of z-- and just to always focus back on the intuition, it makes sense: if x is completely independent of y, and I have some random variable that is the sum of the two, then the expected value of that new variable is going to be the sum of the expected values of the other two, because they are unrelated.
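Written in symbols, the notation and the expected-value properties described so far are:

```latex
E(X) = \mu_X, \qquad \operatorname{Var}(X) = E\big[(X - \mu_X)^2\big] = \sigma_X^2

Z = X + Y \;\Rightarrow\; E(Z) = E(X) + E(Y)

A = X - Y \;\Rightarrow\; E(A) = E(X) - E(Y)
```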
If my expected value here is 5 and my expected value here is 7, it's completely reasonable that my expected value here is 12, assuming that they're completely independent. Now, what is the variance of my random variable z? Once again, I'm not going to do a rigorous proof here; this is really just a property of variances. But I'm going to use it to establish what the variance of our random variable a is. If this one's squared distance from its mean is, on average, some variance, and this other, completely independent one's squared distance from its mean is, on average, some other variance, then the variance of their sum is actually going to be the sum of their variances. So this is going to be equal to the variance of random variable x plus the variance of random variable y. Or, another way of thinking about it, the variance of z, which is the same thing as the variance of x plus y, is equal to the variance of x plus the variance of random variable y. Hopefully that makes some sense. I'm not proving it to you rigorously, and you'll see this in a lot of statistics books.

Now what I want to show you is that the variance of random variable a is actually this exact same thing. And that's the interesting thing, because you might say, hey, why wouldn't it be the difference? We had the differences over here. So let's experiment with this a little bit. The variance-- so I'll just write this-- the variance of random variable a is the same thing as the variance of-- I'll write it like this-- x minus y, which is equal to-- you could view it this way-- the variance of x plus negative y. These are equivalent statements. So you could view this as being equal to-- just using this over here-- the sum of these two variances: it's going to be equal to the variance of x plus the variance of negative y. Now what I need to show you is that the variance of negative y, the negative of that random variable, is going to be the same thing as the variance of y.

So what is the variance of negative y? The variance of negative y is equal to the expected value of the squared distance between negative y and the expected value of negative y. That's all the variance actually is. Now, what is the expected value of negative y right over here? Actually, even better, let me factor out a negative 1. What's in the parentheses right here is the exact same thing as negative 1 times y plus the expected value of negative y, so when we square it we get negative 1 squared times y plus the expected value of negative y, squared. So everything in magenta here is the same as everything in magenta there, and we are taking the expected value of that thing. Now, what is the expected value of negative y? The expected value of negative y-- I'll do it over here-- the expected value of the negative of a random variable is just the negative of the expected value of that random variable. So if you look at this, we can rewrite it-- I'll give myself a little bit more space-- the variance of negative y is the expected value of-- negative 1 squared is just 1-- and over here you have y, and instead of plus the expected value of negative y we can write minus the expected value of y. So you have that, and then all of that squared. Now notice, this is by definition the exact same thing as the variance of y. So this is the variance of y.
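Written out, the two variance facts from this part of the derivation are:

```latex
% For independent X and Y, the variance of the sum is the sum of the variances:
\operatorname{Var}(Z) = \operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)

% The variance of -Y equals the variance of Y (the factoring step from the video):
\operatorname{Var}(-Y) = E\big[(-Y - E(-Y))^2\big]
                       = E\big[(-1)^2\,\big(Y + E(-Y)\big)^2\big]
                       = E\big[(Y - E(Y))^2\big]
                       = \operatorname{Var}(Y)
```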
So what we just showed is that the variance of the difference of two independent random variables is equal to the sum of the variances. You can definitely believe this: it's equal to the sum of the variance of the first one plus the variance of the negative of the second one, and we just showed that that variance is the same thing as the variance of the positive version of that variable, which makes sense. It doesn't matter whether you're taking the positive or the negative of the variable; you just care about the absolute distance from the mean. So it makes complete sense that that quantity and that quantity are going to be the same thing.

Now, the whole reason why I went through this exercise-- the important takeaways here-- is, first, that the mean of the difference of the random variables is the same thing as the difference of their means. And the other important takeaway, which I'm going to build on in the next few videos, is that if I define a new random variable as the difference of two other random variables, the variance of that random variable is actually the sum of the variances of the two random variables. These are the two important takeaways that we'll use to build on in future videos. Anyway, hopefully that wasn't too confusing. If it was, you can just accept these at face value and assume that these are tools that you can use.
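A small simulation sketch of the two takeaways (the normal distributions and the parameters 10, 3, 4, 2 below are illustrative choices, not from the video):

```python
import random
import statistics

# Two independent random variables, simulated as normal draws.
random.seed(0)
n = 100_000
x = [random.gauss(10, 3) for _ in range(n)]   # E(X) = 10, Var(X) = 9
y = [random.gauss(4, 2) for _ in range(n)]    # E(Y) = 4,  Var(Y) = 4

diff = [xi - yi for xi, yi in zip(x, y)]

# Takeaway 1: the mean of the difference is the difference of the means (about 10 - 4 = 6).
print(statistics.fmean(diff))

# Takeaway 2: the variance of the difference is the SUM of the variances (about 9 + 4 = 13).
print(statistics.variance(diff))
```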