## Statistics and probability

© 2023 Khan Academy

# Difference of sample means distribution

Sal walks through the difference of sample means distribution. Created by Sal Khan.

## Want to join the conversation?

- At 7:15: "We saw this in the last video".

Where is the last video?

I can't find this content in the previous videos.

So I don't understand the lecture content at 7:20.

Please help me. (23 votes)
- I think it's this video: Variance of differences of random variables, under the Random Variables section of Statistics and Probability. (16 votes)

- A general statistical question - so much of the emphasis is placed on the mean as being representative of a given population, but what is the use of the population mean, and indeed the sampling distribution of the sample mean, when our population is not normally distributed?(6 votes)
- The population mean of X will be equal to the mean of the sampling distribution of X whether the population is normally distributed or not.

When our sample size is large and we gather a lot of samples, we can plot the mean of each sample on a graph, and the mean of all of these means will still be the population mean. The central limit theorem adds that these sample means will be approximately normally distributed.

Sal uses an app as a great visual demonstration of this here

http://www.khanacademy.org/math/statistics/v/sampling-distribution-of-the-sample-mean(11 votes)
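A minimal simulation sketch of the claim in this answer, using only the Python standard library. The exponential population, the sample size of 30, and the function name `mean_of_sample_means` are illustrative assumptions, not taken from the video.

```python
import random
import statistics


def mean_of_sample_means(population, sample_size, num_samples, seed=0):
    """Draw many samples (with replacement) and average their means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]
    return statistics.fmean(means)


# A clearly non-normal (right-skewed) population: exponential with rate 1.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

pop_mean = statistics.fmean(population)
estimate = mean_of_sample_means(population, sample_size=30, num_samples=2_000)

# The two numbers should be close even though the population is skewed.
print(round(pop_mean, 3), round(estimate, 3))
```

The population here is far from normal, yet the average of the sample means should still land very near the population mean.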

- I am struggling to differentiate between when the variance of a sample is sigma squared divided by n and when it is sigma squared divided by (n-1). I know that dividing by (n-1) gives a better estimator, and is actually unbiased, but why would Sal be using sigma squared divided by just "n" in this video? Am I missing a crucial point? (5 votes)
- The variance of a set of numbers is Σ(x - x̄)²/n. You use this when you know every number in the set. If you take a sample, then this is how you calculate the variance of that sample.

However, if you want to estimate the variance of the population based on a sample, then it is Σ(x - x̄)²/ (n-1) for every x in the sample. This is because you don't know every x in the whole population. In this video Sal is talking in abstract terms, so assuming you know every value in a sample.

This video covers it: http://www.khanacademy.org/math/statistics/v/statistics--sample-variance(7 votes)
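A small sketch that separates the quantities being discussed, assuming a made-up data set (the six values below are hypothetical). It also shows a third quantity: the variance of the sample mean, which is the "divided by n" that appears in the video, where n is the sample size rather than a divisor inside a sum of squares.

```python
import statistics

data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]  # hypothetical values
n = len(data)

# Divide the sum of squared deviations by n:
# the variance of this exact set of numbers.
variance_of_set = statistics.pvariance(data)

# Divide by n - 1: the unbiased estimate of the variance of the
# larger population the sample was drawn from.
estimated_population_variance = statistics.variance(data)

# A different idea: the variance of the sample MEAN is the population
# variance divided by the sample size n. Here the population variance
# is replaced by its estimate, which is an assumption of this sketch.
estimated_variance_of_mean = estimated_population_variance / n

print(variance_of_set, estimated_population_variance, estimated_variance_of_mean)
```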

- Thanks for the fab videos,

I notice that they have been reordered since being recorded. In this one Sal often refers to "in the last video", but he is not referring to the one before it in this sequence of videos. It would be great if you had a pointer to the video Sal is referring to.

kind regards and many thanks

Barry (5 votes)

- Saying Z = x̄ - ȳ after many videos using the Z distribution was very confusing lol (4 votes)
- Is there a video somewhere about paired differences? I would love to see those worked out!

Thank you for your videos, I love them! (4 votes)

- I have a question about a question I am doing for homework. You don't have to answer the question itself, if someone could just clarify it: "samples are taken from a normal population, will the distribution of the sample means also be normal?" What does "distribution of the sample means" mean? I realize this is probably obvious but... (3 votes)
- For anyone having a hard time digesting what on earth "a distribution of the sample means" is, here's a slightly weird derivation:

1. get a population distribution

1) say you have 13 cats

2) they have 13 weights

3) you plot them on a graph

> this is a population distribution (of their weights)

2. get a sample (not sampling distribution!)

1) you pick 3 cats among 13 at random

2) plot their weights

3) you got 1 sample distribution of n=3 of your cats from the population distribution above

3. get a sampling distribution

1) you do 2 above ten times with the same n=3

2) plot their means on a graph (only 10 means of 3 cats, not any real weights of each cats!)

3) now you get your sampling distribution

When we are talking about the "distribution of the sample means", we mean 3-3) rather than 2-3). It literally means how the 10 means of your 10 samples, each of sample size 3, drawn from a population of 13 cats, are distributed!

Hope this sweeps away the fog in your head (as it did for mine). (2 votes)
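The three steps above can be sketched in a few lines of Python. The 13 weights are invented for illustration (assumed to be in kilograms); only the counts (13 cats, samples of 3, repeated 10 times) come from the comment.

```python
import random
import statistics

rng = random.Random(42)

# Step 1: the population distribution (13 hypothetical cat weights, kg).
cat_weights = [3.1, 3.8, 4.0, 4.2, 4.4, 4.5, 4.7,
               5.0, 5.2, 5.5, 5.9, 6.3, 7.0]

# Step 2: one sample of n = 3 cats picked at random (not a sampling distribution).
one_sample = rng.sample(cat_weights, 3)

# Step 3: repeat step 2 ten times and keep only the MEAN of each sample.
# These 10 means are a (small) picture of the sampling distribution
# of the sample mean for n = 3.
sample_means = [statistics.fmean(rng.sample(cat_weights, 3)) for _ in range(10)]

print("one sample:", one_sample)
print("ten sample means:", [round(m, 2) for m in sample_means])
```

Plotting `sample_means` rather than `one_sample` is what gives the "distribution of the sample means". With far more than 10 repetitions the picture becomes smoother.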

- If the means of X and Y are sufficiently far enough apart could the distribution diagram have two vertices?(3 votes)
- If you mean the distribution of the difference (mean_X - mean_Y), no, there is only one peak.

In fact, where the two distributions sit on the number line doesn't change the shape of the graph of their difference. The difference of the means sets its center, and the standard deviations set its spread. (1 vote)
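A compact way to write the result this answer relies on, assuming the two samples are independent and that the sample sizes n and m are large enough for the normal approximation to hold. The symbols μ and σ² for the population means and variances are standard notation, not the commenter's.

```latex
\bar{X} - \bar{Y} \;\overset{\text{approx.}}{\sim}\;
\mathcal{N}\!\left(\mu_X - \mu_Y,\; \frac{\sigma_X^2}{n} + \frac{\sigma_Y^2}{m}\right)
```

A normal curve has a single peak, located here at μ_X − μ_Y, so moving the two population means further apart shifts that peak but does not create a second one.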

- Do you recommend ck12 as the best resource for practicing problems related to the last third of the Inferential Statistics videos? The last exercise is z-scores 3 and I really would like to practice the later concepts. Thanks!(3 votes)
- Are there any practice questions? (2 votes)

## Video transcript

I want to build on what
we did in the last video a little bit. Let's say we have two
random variables. So I have random variable x. And let me draw its probability
distribution. And actually, it doesn't
have to be normal. But I'll just draw it as
a normal distribution. So this is the distribution
of random variable x. This is the mean. The population mean of
random variable x. And then it has some type
of standard deviation. Actually, let me just focus
on the variance. So it has some variance right
here for random variable x. This is x, the distribution
for x. Let's say we have another
random variable. Random variable y. Let's do the same
thing for it. Let's draw its distribution. And let me draw the parameters
for that distribution. So it has some true mean, some
population mean for the random variable y. And it has some variance
right over here. And I've drawn it
roughly normal. Once again, we don't have to
assume that it's normal. Because we're going to assume,
when we go to the next level, that when we take the samples,
we're taking enough samples that the central limit theorem
will actually apply. But with that said, let's
think about the sampling distributions of each of
these random variables. So let's think about the
sampling distribution of the sample mean of x. Let's say the sample size
over here is going to be equal to n. So what is that going
to look like? Well it's going to be
some distribution. And we're assuming that n is
a fairly large number. So this is going to be a
normal distribution. Or it can be approximated with
a normal distribution. Let me shift it over
a little bit. I'm going to draw it a
little bit narrow. Let me draw the mean. So the population mean of the
sampling distribution is going to be denoted with this x bar,
that tells us the distribution of the means when the
sample size is n. And we know that this is going
to be the same thing as the population mean for that
random variable. And we know from the central
limit theorem that the variance of the sampling
distribution or, often called the standard error of the mean,
is going to be equal to the population variance
divided by this n right over here. And if you wanted the standard
deviation of this, you just take the square root
of both sides. Let's do the same thing
for random variable y. Let's take the sampling
distribution of the sample mean. But here, we're talking about
y, random variable y. And let's just say it has
a different sample size. It doesn't have to be
a different one. But it just shows you that it
doesn't have to be the same. So it has a sample size of m. Let me draw its distribution
right over here. Once again, it'll be a narrower
distribution than the population distribution. And it will be approximately
normal, assuming that we have a large enough sample size. And the mean of the sampling
distribution of the sample mean is going to be the same
thing as the population mean. We've seen that multiple
times. And its variance for the sample
means, or the standard error of the mean. Actually, this isn't
the standard error. Standard error would be the
square root of this. So if I called this standard
error of the mean, that's wrong. The standard error of the mean
is the square root of this. It's the standard deviation. This is the variance
of the mean. Don't want to confuse you. So the variance of the mean here
is going to be the exact same thing. It's going to be the variance
of the population divided by our sample size. And everything we've done so
far is complete review. It's a little different, because
I'm actually doing it with two different
random variables. And I'm doing it with
two different random variables for a reason. Because now I'm going to define
a new random variable. We could just call it z. But z is equal to
the difference of our sample means. It's equal to the x sample mean
minus the y sample mean. So what does that really mean? Well, to get a sample mean,
or at least for this distribution, you're taking
n samples from this population over here. Maybe n is 10. You're taking 10 samples
and finding its mean. That sample mean is
a random variable. Let's say you take 10 samples
from here and you get 9.2 when you find their mean. That 9.2 can be viewed as a
sample from this distribution right over here. Same thing if this
right here is m. Or if m right here is 12. You're taking 12 samples,
taking its mean. And that sample mean, maybe it's
15.2, could be viewed as a sample from this
distribution. As a sample from the sampling
distribution. So what z is, z is a random
variable where you're taking n samples from this distribution
up here, this population distribution, taking its mean. Then you're taking m samples
from this population distribution up here,
taking its mean. And then finding the difference
between that mean and that mean. So it's another random
variable. But what is the distribution
of the z? So let's draw it. Well there's a couple
of things we immediately know about z. And we kind of came up with
this in the last video. Instead of writing z, I'm just
going to write the mean of x bar, which is a sample from the
sampling distribution of x, or the sample mean of x,
minus the sample mean of y. We saw this in the last video. In fact, I think I still
have the work up here. Yeah, I still have the
work right up here. The mean of the difference
is going to be the difference of the means. The mean of the difference
is the same thing as the difference of the means. So the mean of this new
distribution right over here is going to be the same thing as
the mean of our sample mean of x minus the mean of our
sample mean of y. And this might seem a little
abstract in this video. In the next video, we're
actually going to do this with concrete numbers. And hopefully it'll make a
little bit more sense. And just so you know where we're
going with this, the whole point of this is so that
we can eventually do some inferential statistics about
differences of means. How likely is a difference of
means of two samples, random chance or not random chance? Or what is a confidence
interval of the difference of means? That's what this is all
building up to. So anyway, we know
the mean of this distribution right over here. And what's the variance
of this distribution? We came up with that result
in the last video. If we're taking essentially the
difference of two random variables, the variance is going
to be the sum of the variances of those two random variables. And the whole point of that
video is to show that it's not the difference of the
variances, it's the sum of the variances. The variance of this new
distribution-- and I haven't drawn the distribution yet--
The variance of this new distribution, I'll just write x
bar minus y bar, is going to be equal to the sum of the
variances of each of these distributions. The variance of x bar plus
the variance of y bar. Actually, let me just
draw this here. Just so we can visualize
another distribution. Although, all I'm going to
draw is another normal distribution. Let me scroll down
a little bit. So the mean over here, the mean
of x bar minus y bar, is going to be equal to
the difference of these means over here. I don't have to rewrite it. Let me draw the curve. And notice, I'm drawing a fatter
curve than either one. And why am I doing that? Because the variance here is the
sum of the variances here. So we're going to have
a fatter curve. It's going to have a bigger
variance, or a bigger standard deviation than either
of these. So then we have some variance
here, variance of x bar minus y bar. Now what are these, in terms
of the original population distribution? We came up with those results
right over here. We know what the standard
deviation is. We know that this thing is the
same thing as the variance of the population distribution
divided by n. We've done this multiple,
multiple times. What's this going
to be equal to? This right here is the same
thing as the variance of our population distribution. And the x just means this is
for random variable x. But there's no bar on top. This is the actual population
distribution, not the sampling distribution of the
sample mean. So that divided by n. And then if we want the variance
of the sampling distribution for y, let me do
that in a different color. I'll use blue, because that was
what we were using for the y random variable. That's going to be equal to
this thing over here. And we've done this
multiple times. Same exact logic as this. The variance of the population distribution
for y divided by m. And so once again, I'll just
write this out front. This is the variance
of the differences of the sample means. And now if you wanted the
standard deviation of the differences of the sample means,
you just have to take the square root of both
sides of this. You take the square root of
this, you get the standard deviation of the difference of
the sample means is equal to the square root of the
population distribution of x. Or the variance of the
population distribution of x divided by n plus the variance
of the population distribution of y divided by m. And this is just neat. Because it kind of looks
a little bit like a distance formula. I'll throw that out there as we
get more sophisticated with our statistics and try to
visualize what all of this kind of stuff means in
more advanced topics. But the whole point of this is,
now we can make inferences about a difference of means. If we have two samples, and we
take the means of both of those samples and we find some
difference, we can make some conclusions about
how likely that difference was just by chance. And we're going to do that
in the next video.
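The two results in this transcript (the mean of the difference is the difference of the means, and the variance of the difference is the sum σ_x²/n + σ_y²/m) can be checked with a rough simulation. This is only a sketch: the sample sizes n = 10 and m = 12 echo the examples in the transcript, but the population means and standard deviations below are invented, the populations are assumed normal and independent, and `simulate_differences` is a hypothetical helper name.

```python
import math
import random
import statistics


def simulate_differences(mu_x, sigma_x, n, mu_y, sigma_y, m, trials, seed=0):
    """Return `trials` simulated values of (x sample mean) - (y sample mean)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(trials):
        x_bar = statistics.fmean([rng.gauss(mu_x, sigma_x) for _ in range(n)])
        y_bar = statistics.fmean([rng.gauss(mu_y, sigma_y) for _ in range(m)])
        diffs.append(x_bar - y_bar)
    return diffs


# Assumed population parameters (hypothetical values).
mu_x, sigma_x, n = 9.0, 2.0, 10
mu_y, sigma_y, m = 15.0, 3.0, 12

diffs = simulate_differences(mu_x, sigma_x, n, mu_y, sigma_y, m, trials=20_000)

# Theory from the transcript.
theory_mean = mu_x - mu_y
theory_var = sigma_x**2 / n + sigma_y**2 / m
theory_sd = math.sqrt(theory_var)

print("mean:", round(statistics.fmean(diffs), 3), "theory:", theory_mean)
print("variance:", round(statistics.pvariance(diffs), 3), "theory:", round(theory_var, 3))
print("std dev:", round(statistics.pstdev(diffs), 3), "theory:", round(theory_sd, 3))
```

The simulated variance should be larger than either σ_x²/n or σ_y²/m alone, which matches the "fatter curve" drawn in the video.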