# Variance of differences of random variables

Sal derives the variance of the difference of random variables. Created by Sal Khan.

Video transcript

What I want to do in this video
is build up some tools in our tool kit for dealing with
sums and differences of random variables. So let's say that we have two
random variables, x and y, and they are completely
independent. They are independent
random variables. And I'm just going to
go over a little bit of a notation here. If we wanted to know the
expected, or if we talked about the expected value of this
random variable x, that is the same thing as the
mean value of this random variable x. If we talk about the expected
value of y, that is the same thing as the mean of y. If we talk about the variance of
the random variable x, that is the same thing as the
expected value of the squared distances between our random
variable x and its mean. And that right there squared. So the expected value of these
squared differences, and that you could also use the notation
sigma squared for the random variable x. This is just a review of things
we already know, but I just want to reintroduce it
because I'll use this to build up some of our tools. So you can do the same thing
with random variable y. The variance of random variable
y is the expected value of the squared difference
between our random variable y and the
mean of y, or the expected value of y, squared. And that's the same thing
as sigma squared of y. That is the variance of y. Now you may or may not already
know these properties of expected values and variances,
but I will reintroduce them to you. And I won't go into some
rigorous proof-- actually, I think they're fairly
easy to digest. So one is that if I have some
third random variable, let's say I have some third
random variable that is defined as being the random
variable x plus the random variable y. Let me stay with my
colors just so everything becomes clear. The random variable x plus
the random variable y. What is the expected value
of z going to be? The expected value of z is
going to be equal to the expected value of x plus y. And this is a property of
expected values-- I'm not going to prove it rigorously
right here-- but that is equal to the expected value of x plus the expected
value of y, or another way to think about this is that the
mean of z is going to be the mean of x plus the mean of y. Or another way to view it is if
I wanted to take, let's say I have some other
random variable. I'm running out of
letters here. Let's say I have the random
variable a, and I define random variable a
to be x minus y. So what's its expected
value going to be? The expected value of a is
going to be equal to the expected value of x minus y,
which is equal to-- you could either view it as the expected
value of x plus the expected value of negative y, or the
expected value of x minus the expected value of y, which is
the same thing as the mean of x minus the mean of y. So this is what the mean of
our random variable a would be equal to. And all of this is review and
I'm going to use this when we start talking about the
distributions that are sums and differences of other
distributions. Now let's think about what the
variance of random variable z is and what the variance of
random variable a is. So the variance of z-- and just
to kind of always focus back on the intuition,
it makes sense. If x is completely independent
of y and if I have some random variable that is the sum of
the two, then the expected value of that variable, of that
new variable, is going to be the sum of the expected
values of the other two because they are unrelated. If my expected value here is 5
and my expected value here is 7, completely reasonable that my
expected value here is 12, assuming that they're completely
independent. Now, what is the variance of my
random variable z? And once again, I'm not going to do
a rigorous proof here, this is really just a property
of variances. But I'm going to use this to
establish what the variance of our random variable a is. So if this squared distance on
average is some variance, and this one is completely
independent, its squared distance on average is some
variance, then the variance of their sum is actually going to
be the sum of their variances. So this is going to be equal
to the variance of random variable x plus the variance
of random variable y. Or another way of thinking about
it is that the variance of z, which is the same thing
as the variance of x plus y, is equal to the variance of x
plus the variance of random variable y. Hopefully that makes
some sense. I'm not proving it to
you rigorously. And you'll see this in a lot
of statistics books. Now what I want to show you is
that the variance of random variable a is actually this
exact same thing. And that's the interesting
thing, because you might say, hey, why wouldn't it
be the difference? We had the differences
over here. So let's experiment with
this a little bit. The variance-- so I'll just
write this-- the variance of random variable a is the same
thing as the variance of-- I'll write it like this-- as x
minus y, which is equal to-- you could view it this way--
which is equal to the variance of x plus negative y. These are equivalent
statements. So you could view this as being
equal to-- just using this over here, the sum of these
two variances, so it's going to be equal to the sum of
the variance of x plus the variance of negative y. Now what I need to show you is
that the variance of negative y, of the negative of that
random variable, is going to be the same thing as
the variance of y. So what is the variance
of negative y? The variance of negative y is
equal to the expected value of the squared distance between negative y
and the expected value of negative y. That's all the variance
actually is. Now what is the expected value
of negative y right over here? Actually, even better let me
factor out a negative 1. So what's in the parentheses
right here, squared, is the exact same thing as negative 1 squared
times the quantity y plus the expected value of negative y, squared. So that's the exact
same thing as what's in the parentheses, squared. So everything in magenta is
everything in magenta here, and it is the expected
value of that thing. Now what is the expected
value of negative y? The expected value of negative
y-- I'll do it over here-- the expected value of the negative
of a random variable is just a negative of the expected value
of that random variable. So if you look at this we can
re-write this-- I'll give myself a little bit more space--
we can re-write this as the expected value-- the
variance of negative y is the expected value--
this is just 1. Negative 1 squared is just 1. And over here you have y, and
instead of writing plus the expected value of negative y, we can write
minus the expected value of y, which is the same thing. So you have that, and then
all of that squared. Now notice, this is the exact
same thing by definition as the variance of y. So
this is the variance of y. So what we just showed you is that
the variance of the difference of two independent random
variables is equal to the sum of the variances. You could definitely believe
this, it's equal to the sum of the variance of the first one
plus the variance of the negative of the second one. And we just showed that that
variance is the same thing as the variance of the positive
version of that variable, which makes sense. Your distance from the mean is
going to be-- it doesn't matter whether you're taking the
positive or the negative of the variable. You just care about
absolute distance. So it makes complete sense that
that quantity and that quantity are going to
be the same thing. Now the whole reason why I went
through this exercise, kind of the important takeaways
here is that the mean of the difference right over
here-- so I could re-write it as the mean of the difference of the random
variables-- is the same thing as the difference
of their means. And then the other important
takeaway, and I'm going to build on this in the next few
videos, is that the variance of the difference-- if I define
a new random variable as the difference of two other
independent random variables, the variance of that random variable is
actually the sum of the variances of the two
random variables. So these are the two important
takeaways that we'll use to build on in future videos. Anyway, hopefully that
wasn't too confusing. If it was, you can kind of
just accept these at face value and just assume
that these are tools that you can use.