# Sampling distribution of the sample mean

Take a sample from a population, calculate the mean of that sample, put everything back, and do it over and over. No matter what the population looks like, those sample means will be roughly normally distributed given a reasonably large sample size (a common rule of thumb is at least 30). This is the main idea of the Central Limit Theorem: the sampling distribution of the sample mean is approximately normal for "large" samples. Created by Sal Khan.
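The idea can be illustrated with a quick simulation (a sketch using only Python's standard library; the exponential population and the sizes chosen here are arbitrary, not from the video):

```python
import random
import statistics

random.seed(42)

# A clearly non-normal population: exponentially distributed values
# (heavily skewed to the right, population mean about 1.0).
population = [random.expovariate(1.0) for _ in range(100_000)]

# Draw many samples of size n WITH replacement ("put everything back"),
# and record each sample's mean.
n = 30
sample_means = [
    statistics.mean(random.choices(population, k=n))
    for _ in range(5_000)
]

# Despite the skewed population, the sample means cluster symmetrically
# around the population mean, with spread close to sigma / sqrt(n).
print(round(statistics.mean(sample_means), 2))   # near 1.0
print(round(statistics.stdev(sample_means), 2))  # near 1 / sqrt(30), about 0.18
```

Plotting `sample_means` as a histogram would show the familiar bell shape even though the population itself is strongly skewed.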

## Want to join the conversation?

• If we know the mean and the standard deviation of the population, then why are we taking samples, if we already have the data?

• Learning statistics can be a little strange. It almost seems like you're trying to lift yourself up by your own bootstraps. Basically, you learn about populations working under the assumption that you know the mean/stdev, which is silly, as you say, but later you begin to drop these assumptions and learn to make inferences about populations based on your samples.

Once you have some version of the Central Limit Theorem, you can start answering some interesting questions, but it takes a lot of study just to get there!
• Is there any difference if I take 1 "sample" with 100 "instances", or I take 100 "samples" with 1 "instance"?
(By sample I mean the S_1 and S_2 and so on. With instances I mean the numbers, [1,1,3,6] and [3,4,3,1] and so on.)
• There is a difference. Your "samples" (random selections of values) are made up of "instances" (the n individual observations in each sample), and the sample means are the building blocks of the sampling distribution of the sample mean. The sample size n determines the spread of that distribution: its standard deviation is the original population's standard deviation divided by the square root of n.
For example, if you take 1 "sample" with 100 "instances" [1,1,3,6,3,6,3,1,1,1,1,1...], you get only one piece of data: a single mean of 100 items. The sampling distribution's standard deviation would be (the original S.D.)/(the square root of 100), but that hardly matters, because you'd have only one sample, and its mean will likely be very close to the original data's mean.
Now if you take 100 samples with 1 instance each [3], you get many pieces of data but no reduction in spread: (the original S.D.)/(the square root of 1) is just the original standard deviation. Functionally, with enough samples taken like this, you re-create your original dataset, because each "sample mean" is simply the single value drawn. With 100 "samples" of 1 "instance", you're randomly picking 100 values of x and re-plotting them, not building a useful sampling distribution of the sample mean.
I hope that helps.
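The contrast above can be made concrete with a small simulation (a sketch; the toy population below is made up for illustration, not taken from the video):

```python
import random
import statistics

random.seed(0)
population = [1, 1, 1, 1, 2, 3, 3, 4, 6, 6]  # arbitrary toy population, mean 2.8

# 100 "samples" of 1 "instance" each: each sample mean IS the raw value drawn,
# so the distribution of these means just re-creates the population's shape.
means_n1 = [statistics.mean(random.choices(population, k=1)) for _ in range(100)]

# 1 "sample" of 100 "instances": a single mean, likely close to the
# population mean, but only one data point for a sampling distribution.
mean_n100 = statistics.mean(random.choices(population, k=100))

print(statistics.stdev(means_n1))  # close to the population's own spread
print(mean_n100)                   # close to the population mean, 2.8
```

The spread of `means_n1` matches the population's standard deviation (division by sqrt(1) changes nothing), while `mean_n100` is tightly concentrated but gives you only one observation of the sampling distribution.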
• So if every distribution approaches normal, when do I employ, say, a Poisson, uniform, or Bernoulli distribution? I suppose it's a concept I haven't broached yet, but how do I know when or which distribution to employ so I can appropriately analyze the data? End goal = solve real-world problems!
• Not every distribution goes to the normal. The distribution of the sample mean does, but only as the sample size increases. With smaller sample sizes, assuming normality of either the data or the sample mean may be wholly inappropriate.

In terms of identifying the distribution, sometimes it's a matter of considering the nature of the data (e.g., we might think "Poisson" if the data collected are a count of events per some unit or interval), and sometimes it's a matter of doing some exploratory data analysis (histograms, boxplots, some numerical summaries, and the like).

For actually analyzing data: I would suggest hiring someone with more extensive training in statistics. Taking one course in stats, which is basically what Khan Academy covers, isn't really enough to prepare someone to be a data analyst. I see the primary goal of taking one or two stats courses as giving you enough information to understand the results of statistical analyses: you can better tell the statistician what you want in his/her own terms, and you can better understand what s/he gives back to you.
• Do your sample sizes have to be the same size? E.g., in the video there are a bunch of samples with a sample size of four. Would it mess up any calculations if you took a sample of four and then, say, a sample of ten?
• Yes, the sample sizes should be the same. The sample size is not considered to be a variable, it's considered to be a constant. The sampling distribution of the sample mean can be thought of as "For a sample of size n, the sample mean will behave according to this distribution." Any random draw from that sampling distribution would be interpreted as the mean of a sample of n observations from the original population.
• What is the difference between "sample distribution" and "sampling distribution"?
• The sample distribution is what you get directly from taking a sample. You plot the value of each item in the sample to get the distribution of values across that single sample. When Sal took a sample in the previous video and got S1 = {1, 1, 3, 6}, and graphed the values that were sampled, that was a sample distribution. The 2nd graph in the video above is a sample distribution because it shows the values that were sampled from the population in the top graph.

The sampling distribution is what you get when you compare the results from several samples. You plot the mean of each sample (rather than the value of each item sampled). Sal did that in the previous video when he plotted the mean of each sample. The 3rd and 4th graphs above are sampling distributions because each shows a distribution of means from the many samples of a particular size.

http://www.psychstat.missouristate.edu/introbook/SBK19.htm also has an explanation.
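The distinction can also be sketched in code (a hedged illustration; the population below is hypothetical, and the sample size of 4 mirrors the S_i examples from the video):

```python
import random
import statistics

random.seed(1)
population = [1, 1, 3, 4, 6, 6, 7, 9]  # hypothetical population

# Sample distribution: the raw values inside ONE sample.
sample = random.choices(population, k=4)  # one sample, like S1 = {1, 1, 3, 6}
sample_distribution = sample              # plot these 4 raw values

# Sampling distribution: one mean PER sample, collected over many samples.
sampling_distribution = [
    statistics.mean(random.choices(population, k=4))
    for _ in range(1_000)
]

print(len(sample_distribution))    # 4 raw values from one sample
print(len(sampling_distribution))  # 1000 means, one per sample
```

Histogramming `sample_distribution` gives the 2nd-graph kind of picture; histogramming `sampling_distribution` gives the 3rd/4th-graph kind.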
• Is it possible to determine the sample variance without the population variance? I have an assignment that requires me to show the sampling distribution of the mean with only a population proportion and sample size.
• If a question talks about a "population proportion" then you are dealing with a binomial distribution, except that you divide by the sample size to get the sample proportion rather than the sample count. If the population proportion is p, then the mean value of sample proportions will also be p (as usual, the mean of the sampling distribution is the same as for the whole population), and the variance will be p(1 - p)/n, where n is the size of the sample. You can read about this distribution here (note they use the letter pi for the population proportion; it does NOT mean 3.14159...):
http://onlinestatbook.com/2/sampling_distributions/samp_dist_p.html
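These two facts (mean p, variance p(1 - p)/n) can be checked with a simulation; the values p = 0.3 and n = 50 below are arbitrary choices for illustration:

```python
import random
import statistics

random.seed(7)
p, n = 0.3, 50  # hypothetical population proportion and sample size

# Each sample: n Bernoulli(p) draws; record the sample PROPORTION, not the count.
proportions = [
    sum(random.random() < p for _ in range(n)) / n
    for _ in range(10_000)
]

print(round(statistics.mean(proportions), 2))      # close to p = 0.3
print(round(statistics.variance(proportions), 4))  # close to p*(1-p)/n = 0.0042
```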
• Why can we say that the sampling distribution of the mean follows a normal distribution for a large enough sample size, even though the population may not be normally distributed?
• Properly, the sampling distribution APPROXIMATES a normal distribution for a sufficiently large sample (sometimes cited as n > 30). A coin flip is not normally distributed, it is either heads or tails. But 30 coin flips will give you a binomial distribution that looks reasonably normal (at least in the middle).
• It was said that even for single samples the central limit theorem holds. That is not so; the central limit theorem applies only to sample MEANS. For example, if out of a population of 5000 I take one sample of n=50, the central limit theorem does NOT apply to the values in that sample. It applies only when I have taken (e.g.) 40 samples of n=50 and look at their means. However, this is as per my understanding. Please correct me if I am wrong.