# Sampling distribution of the sample mean (part 2)

AP.STATS: UNC‑3 (EU), UNC‑3.H (LO), UNC‑3.H.2 (EK), UNC‑3.H.3 (EK), UNC‑3.H.5 (EK)

More on the Central Limit Theorem and the Sampling Distribution of the Sample Mean. Created by Sal Khan.

## Want to join the conversation?

• At 4:43 Sal says that it is impossible to get 7.5. Why is that? I agree we cannot pick a 7 and an 8, but we can pick a 9 and a 6 and get an average of 7.5. Am I wrong?
• No, you're right. He should've picked something like 6.5, which you can never get: 9+4, 8+5 and 7+6 are all impossible.
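
The arithmetic in this exchange can be checked by listing every mean a sample of size 2 can produce. The sketch below is only an illustration: the values `1, 6, 9` are a made-up population, chosen so that 6 and 9 can be drawn while 4, 5, 7 and 8 cannot. The actual values used in the video are not given in this thread.

```python
from itertools import combinations_with_replacement

# Hypothetical population values (an assumption, not the video's distribution):
# 6 and 9 can be drawn, while 4, 5, 7 and 8 cannot.
values = [1, 6, 9]

# Every mean reachable with a sample of size 2, drawing with replacement.
possible_means = sorted(
    {(a + b) / 2 for a, b in combinations_with_replacement(values, 2)}
)

print(possible_means)
print(7.5 in possible_means)  # True: 9 and 6 average to 7.5
print(6.5 in possible_means)  # False: it would need 9+4, 8+5 or 7+6
```

With these assumed values, 7.5 is reachable and 6.5 is not, which matches the reply above.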
• If you take a sample size of 1, do you get exactly the same distribution for the sampling distribution of the sample mean as the original discrete probability distribution?
• What's the point of the central limit theorem if it doesn't give you the actual population distribution? For example, in this video the population distribution was in reality totally different from the normal distribution. So what is the importance of this concept?
• Each separate sample we take from the population will be different: they will have different scores and different sample means. So how do we tell which sample gives us the best description of the population? Can we even predict how well a sample describes the population it is drawn from?

By using the distribution of sample means we have the ability to predict the characteristics of the sample. And one of the basic reasons behind taking a sample is to use the sample data to answer questions about the larger population.

The Central Limit Theorem helps us to describe the distribution of sample means by identifying the basic characteristics of the samples - shape, central tendency and variability. So the distribution of sample means helps us to find the probability associated with each specific sample.

And because there's always some discrepancy or error between a sample statistic and the corresponding population parameter, the CLT enables us to calculate how much error to expect on average.
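
To make the point about shape, center and variability concrete, here is a small simulation sketch. The population below is invented for illustration (it is deliberately non-normal), and the names `population` and `sample_means` are assumptions of this example, not anything from the video. It compares the mean and standard deviation of many sample means with the values the theory suggests, namely µ and σ/√n.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical, clearly non-normal population: mostly small values, some large.
population = [1] * 50 + [2] * 20 + [9] * 30

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation


def sample_means(n, repeats=5_000):
    """Draw `repeats` samples of size n (with replacement); return their means."""
    return [
        statistics.fmean(random.choices(population, k=n)) for _ in range(repeats)
    ]


for n in (2, 10, 50):
    means = sample_means(n)
    print(
        f"n={n:>2}  mean of sample means={statistics.fmean(means):.3f} "
        f"(mu={mu:.3f})  SD of sample means={statistics.pstdev(means):.3f} "
        f"(sigma/sqrt(n)={sigma / n ** 0.5:.3f})"
    )
```

In runs like this the mean of the sample means should stay close to µ for every n, while their spread should shrink roughly like σ/√n. That spread is the "how much error to expect" part of the answer above.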
• Could you please give a practical example of the utility of the sampling distribution of the sample mean (SDSM) when used with a NON-NORMAL distribution? It would seem that a non-normal distribution generated by some process would mean that the process was "out of control", that multiple processes were going on, or some such. If that's the case, of what utility is the SDSM when it does not describe the scattered output of said process? Thanks much.
• Why is the mean of the sampling distribution of sample means always equal to the population mean?
• In formulas:
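$$E[X] = \mu$$

$$E[\bar{x}] = E\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] = \frac{1}{n}\,E\left[\sum_{i=1}^{n} x_i\right] = \frac{1}{n}\sum_{i=1}^{n} E[X] = \frac{1}{n}(n\mu) = \mu$$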

Logically, it makes sense this should be the case. If some variable has mean µ, that means we expect a given value to be µ. There'll be some variation around that, but that's what we expect, on average. So, we're expecting the average to be µ. Then, if we get a lot of such sample means (that is: the sampling distribution), we're getting a whole lot of values which we expect to all be µ. The average of a lot of things that are all µ or very close to it, should also be µ.
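
The same result can be checked exactly for a small case by enumerating every possible sample. This is a sketch under assumptions: the distribution `dist` is made up for the example, and `expected_sample_mean` is a hypothetical helper name. Fractions are used so the comparison with µ is exact rather than approximate.

```python
from fractions import Fraction
from itertools import product

# Hypothetical discrete distribution: value -> probability (not from the video).
dist = {1: Fraction(1, 2), 2: Fraction(1, 5), 9: Fraction(3, 10)}

# Population mean: E[X] = sum of value * probability.
mu = sum(x * p for x, p in dist.items())


def expected_sample_mean(n):
    """Exact E[xbar] for n independent draws, by enumerating every sample."""
    total = Fraction(0)
    for sample in product(dist, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= dist[x]  # independent draws: probabilities multiply
        total += prob * Fraction(sum(sample), n)
    return total


for n in (1, 2, 3, 4):
    print(n, expected_sample_mean(n), expected_sample_mean(n) == mu)
```

For each sample size tried, the expected value of the sample mean equals µ exactly, as the formula says it should.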
• What if the sample size = population size? What happens to the standard deviation of the sampling distribution of the sample mean?
• I know my explanation can be pretty long, but do hear me out. :)

To give you a head start, you should know that the standard deviation of the sample mean measures how far the mean of a sample typically falls from the true mean of the sample's population.

If sample size = population size (and you sample without replacement), the mean of your sample will equal the true mean of your population, since you take every single observation/individual and your sample becomes the population. Hence your standard deviation will be 0, because your sample mean does not deviate from the population mean (sample mean = population mean, so the difference between them is 0).
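
A quick simulation sketch of this, assuming sampling without replacement from a small made-up population of 20 values (the population and the helper name `sd_of_sample_means` are inventions of this example):

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical finite population of 20 values.
population = [random.randint(1, 100) for _ in range(20)]


def sd_of_sample_means(n, repeats=5_000):
    """SD of the sample mean when drawing n items WITHOUT replacement."""
    means = [
        statistics.fmean(random.sample(population, n)) for _ in range(repeats)
    ]
    return statistics.pstdev(means)


for n in (5, 10, 19, 20):
    print(f"n={n:>2}  SD of sample means = {sd_of_sample_means(n):.4f}")
```

When n equals the population size, every "sample" is the whole population, so every sample mean is the population mean and the spread collapses to 0. Note that this relies on drawing without replacement. With replacement, a sample of 20 could repeat some values and miss others, so the spread would not be 0.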
• Is there a difference between taking samples of 10 a total of 1000 times (as in the video) and taking samples of 1000 a total of 10 times?
What about taking a single sample of 10000?
...Or taking a sample of 1 a total of 10000 times?
• Yes, there is a difference. The bigger the sample size, the narrower the normal distribution you will get.

For example,

| | Sample size = 25, number of iterations = 5 | Sample size = 5, number of iterations = 25 |
|---|---|---|
| Mean | 13.74 | 14.32 |
| Median | 13.00 | 15.00 |
| SD | 1.19 | 2.99 |
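
The combinations asked about in the question can be tried directly. The sketch below uses an invented skewed population, not the data behind the figures above, and `spread_of_means` is a hypothetical helper. The idea is that the sample size controls how narrow the distribution of sample means is, while the number of iterations only controls how well that distribution is filled in.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

# Hypothetical skewed population; not the one behind the figures above.
population = [1] * 50 + [2] * 20 + [9] * 30
sigma = statistics.pstdev(population)


def spread_of_means(sample_size, iterations):
    """SD of the sample means over `iterations` samples of `sample_size`."""
    means = [
        statistics.fmean(random.choices(population, k=sample_size))
        for _ in range(iterations)
    ]
    return statistics.pstdev(means)


for sample_size, iterations in ((10, 1000), (1000, 10), (10_000, 1), (1, 10_000)):
    observed = spread_of_means(sample_size, iterations)
    expected = sigma / sample_size ** 0.5
    print(
        f"sample size={sample_size:>6}  iterations={iterations:>6}  "
        f"observed SD={observed:.3f}  sigma/sqrt(n)={expected:.3f}"
    )
```

With a single iteration there is only one sample mean, so the observed SD is 0 and says nothing about the spread. With a sample size of 1, each "mean" is just one draw, so the spread should be close to the population's own σ.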
• Actually, if you pulled a 9 and a 6, you would get 7 and a half.