# Margin of error 2

Finding the 95% confidence interval for the proportion of a population voting for a candidate. Created by Sal Khan.

## Want to join the conversation?

• I don't understand why the confidence interval doesn't take into account the size of the total population. It is interesting to me that the margin of error would have been 10% even if the population were 105 people, in which case a sample of 100 is much more powerful and precise than the same sample out of a population of 100 million. Can anyone please help clarify this concept?
• You are right that population size matters somewhat, although in many real-world examples the sample size is a tiny fraction of the population. As sample size gets to be 10% or more of the population, a "correction factor" can be used to scale the confidence interval to account for extra precision. More details here: <http://www.childrensmercy.org/stats/size/population.asp>. Also, in your example with a big relative sample size, margin of error depends on the sampling method (with or without replacement, i.e., whether you allow picking the same sample/unit/person multiple times): http://en.wikipedia.org/wiki/Simple_random_sample
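The correction factor mentioned above can be sketched in Python. This is only a minimal sketch, assuming simple random sampling without replacement and the standard finite population correction sqrt((N - n) / (N - 1)). The function names are made up for illustration; the standard error of 0.05 and the multiplier of 2 are the values from the video.

```python
import math

def finite_population_correction(population_size, sample_size):
    """Standard FPC for sampling without replacement (assumed setting)."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

def corrected_margin(standard_error, population_size, sample_size, z=2.0):
    """Margin of error scaled down by the finite population correction."""
    fpc = finite_population_correction(population_size, sample_size)
    return z * standard_error * fpc

# Sample of 100 out of 105 people vs. 100 out of 100 million people
small_population = corrected_margin(0.05, 105, 100)
huge_population = corrected_margin(0.05, 100_000_000, 100)

print(f"N = 105:         +/- {small_population:.3f}")
print(f"N = 100 million: +/- {huge_population:.3f}")
```

Under these assumptions the margin shrinks a lot when the sample is nearly the whole population, and is essentially unchanged from 0.1 when the population is huge.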
• At : Why is there 2 sigma of the sampling mean?
• Good question. Sal did something different here than in previous videos. Previously, when finding the 95% confidence interval, he looked up the Z-score on a Z-table. Since Z-tables are organized by percentile (the entire area to the left of the confidence limit), he first had to say, "A 95% interval corresponds to the 95 / 2 + 50 = 97.5th percentile."

Then he looked up that percentile, 0.9750, on the Z-table and got a Z-score of 1.96. Finally, he multiplied the Z-score by the standard deviation of the sampling distribution, sigma(x-bar). If you do that here, you get (1.96)(0.05) = 0.098. That is the more exact margin of error for a 95% confidence interval.

But in this video, Sal used a rule of thumb that says 95% confidence is approximately equal to 2 standard deviations around the mean. So he used an approximate Z-score of 2 instead of the actual Z-score of 1.96. Doing this, he got a margin of error of 0.1 rather than the more exact 0.098.

It's a good rule of thumb, but to be strictly accurate, you should just remember that 1.96 is ALWAYS the Z-score for a 95% confidence interval (unless you have a small sample size and are using a t-table).
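The comparison above can be checked with a short Python sketch. It assumes the standard error of 0.05 quoted in this answer and uses `statistics.NormalDist` from the standard library (Python 3.8+) in place of a printed Z-table; the helper names are just for illustration.

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Z-score that leaves (1 - confidence) / 2 in each tail of a standard normal."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

def margin_of_error(z, standard_error):
    return z * standard_error

standard_error = 0.05  # sigma(x-bar), as quoted in the answer above

z_exact = z_for_confidence(0.95)                      # table lookup equivalent
moe_exact = margin_of_error(z_exact, standard_error)
moe_rule_of_thumb = margin_of_error(2, standard_error)

print(f"z for 95%: {z_exact:.2f}")
print(f"margin with z = 1.96: {moe_exact:.3f}")
print(f"margin with z = 2:    {moe_rule_of_thumb:.3f}")
```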
• Can I say that if I take a large number of samples, 95% of the means of those samples will fall within the range of the confidence interval? My teacher emphasized that we can't say the population mean has a 95% chance of being in that interval, because the population mean is a constant: it is either in that range or not. My interpretation is that 95% of the time our sample means will fall within the range we construct around the true population mean in the sampling distribution. Is that correct?
• If we had the population standard deviation (sigma), which I don't think we ever do, then it seems to me that everything you say is the correct way to look at it. But since we only have our sample standard deviation (s), doesn't the 95% carry a little extra uncertainty? I think the SEM uses s/sqrt(n), while the central limit theorem uses sigma/sqrt(n).
• What's the difference between a frequentist confidence interval and a Bayesian credible interval?
• How come there's a greater probability for candidate A to win even though more people are voting for candidate B? I mean, I get the calculation and everything but how is this possible?
• There is not a greater chance that candidate A will win. Candidate B is most likely to win but Sal is only trying to make the point that candidate A still could win although it is unlikely.
• Am I correct that the margin of error is INcorrect at the end of this video? 0.43 +/- 0.1 equates to a margin of error of 0.1/0.43 ~ 23%. This will give the proper range, 0.33 to 0.53.
• The range mentioned in the video itself is 33% and 53% which is the same as 0.33 to 0.53 (just in percentage instead). The margin of error is 10% which is +/- 0.1 (again just in percentage).

Your margin of error comes from the estimated standard deviation of the sampling distribution, and nothing else. As such, I am not really sure why you are dividing 0.1 by 0.43 to get 23%.
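To make the two readings concrete, here is a small Python sketch using only the numbers from this thread (estimate 0.43, margin 0.1). The variable names are illustrative. It shows that the 10% is an absolute margin in percentage points, while the 23% is the same margin expressed relative to the estimate; both describe the same range.

```python
estimate = 0.43   # sample proportion from the video
margin = 0.10     # margin of error, as an absolute amount

lower = estimate - margin
upper = estimate + margin
relative_margin = margin / estimate   # the "23%" in the question

print(f"interval: {lower:.2f} to {upper:.2f}")
print(f"margin in percentage points: {margin * 100:.0f}")
print(f"margin relative to the estimate: {relative_margin:.1%}")
```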
• What does "sampling distribution of the sample means" say that "distribution of the sample means" doesn't? And does "sampling distribution" denote anything in particular (that is, is the term self-explanatory, without rote memorization)? Does it mean "distribution of statistics from different samples around their corresponding population statistics"?
• > "What does "sampling distribution of the sample means" say that "distribution of the sample means" doesn't?"

Nothing, they are equivalent. The second is just slightly less of a mouthful to say/write.

> "And, does "sampling distribution" denote anything in particular ... Does it mean "distribution of statistics from different samples around their corresponding population statistics"?"

Your question is answered by your "guess" of the answer. A sampling distribution is the distribution of a statistic over many repeated samples. Hopefully, the corresponding population parameter will be in the middle of that sampling distribution.

Also: Note that a statistic corresponds to the sample, a parameter corresponds to the population. So, e.g., s² is a statistic, σ² is a parameter.
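A sampling distribution can also be built by simulation. The sketch below is only an illustration: it assumes a hypothetical population in which 43% of voters favor the candidate (`TRUE_P` is an assumed parameter, not something we would know in practice), draws many samples of 100, and looks at how the sample proportions are distributed.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_P = 0.43        # hypothetical population parameter (assumed for this demo)
SAMPLE_SIZE = 100
NUM_SAMPLES = 5000

def sample_proportion(p, n):
    """Draw n voters and return the fraction who favor the candidate (a statistic)."""
    return sum(random.random() < p for _ in range(n)) / n

# One statistic per repeated sample: together they form the sampling distribution
proportions = [sample_proportion(TRUE_P, SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

center = statistics.mean(proportions)
spread = statistics.stdev(proportions)
theoretical_spread = (TRUE_P * (1 - TRUE_P) / SAMPLE_SIZE) ** 0.5

print(f"center of sampling distribution: {center:.3f} (parameter: {TRUE_P})")
print(f"spread of sampling distribution: {spread:.3f} (theory: {theoretical_spread:.3f})")
```

The center should land close to the parameter, and the spread close to the theoretical standard deviation of the sample proportion.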
• I don't understand why P(x bar is within two standard deviations of mu) is equal to P(mu is within two standard deviations of x bar). Mu is an unknown constant that won't change, but as Sal said, x bar is just one sample mean, and we could have thousands of sample means like it located within two standard deviations. In other words, x bar is a random variable, but mu is a constant. I can easily understand P(x bar is within two standard deviations of mu), but I don't see why P(mu is within two standard deviations of x bar) is equal to it. It seems to me they could only be equal if both x bar and mu were constants. Moreover, I don't know how to calculate P(mu is within two standard deviations of x bar). What did I miss? Thanks for the answer.
• This is what we know from the central limit theorem: sample means have an approximately normal distribution around mu.
This means that there is a 68% probability that mu and our sample mean are within sigma/sqrt(n) of each other, right? (If our sample mean is x from mu, then how far is mu from our sample mean? If there is a 68% probability that our sample mean is within x of mu, what's the probability that mu is within x of our sample mean?).
We use (sample standard deviation)/sqrt(n) (called SEM) as an approximation to sigma/sqrt(n) for the standard deviation, since we don't have sigma.
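The "within x of each other" argument can be checked by simulation. This is a rough sketch under stated assumptions: a hypothetical true mean `MU` of 0.43, samples of 100 yes/no votes, and the true sigma/sqrt(n) used as the standard deviation of x bar (which we can only do because the simulation knows sigma). It counts, for the same samples, how often x bar is within two standard deviations of mu and how often mu is within two standard deviations of x bar.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

MU = 0.43                                 # hypothetical true proportion (assumed)
N = 100                                   # sample size
SIGMA_XBAR = (MU * (1 - MU) / N) ** 0.5   # sigma / sqrt(n) for a yes/no variable
TRIALS = 10000

xbar_near_mu = 0   # times x bar lands within 2 sigma of mu
mu_near_xbar = 0   # times mu lands inside x bar +/- 2 sigma

for _ in range(TRIALS):
    xbar = sum(random.random() < MU for _ in range(N)) / N
    if abs(xbar - MU) <= 2 * SIGMA_XBAR:
        xbar_near_mu += 1
    if xbar - 2 * SIGMA_XBAR <= MU <= xbar + 2 * SIGMA_XBAR:
        mu_near_xbar += 1

print(f"x bar within 2 sigma of mu: {xbar_near_mu / TRIALS:.3f}")
print(f"mu within 2 sigma of x bar: {mu_near_xbar / TRIALS:.3f}")
```

The two counts match on every run because both conditions describe the same event: distance is symmetric, so whenever x bar is close to mu, mu is equally close to x bar. The randomness is in x bar in both cases; mu never moves.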
• How come "x bar is within 2 sigma x bar of the mean mu x bar" is the same as "mu x bar is within 2 sigma x bar of x bar"? How are they interchangeable? Can anyone please clarify this concept mathematically or graphically? Intuitively it looks OK.
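One way to see it mathematically is to rewrite the inequality step by step. The sketch below uses $\bar{x}$ for the sample mean, $\mu_{\bar{x}}$ for the mean of the sampling distribution, and $\sigma_{\bar{x}}$ for its standard deviation; this notation is chosen here to match the wording of the question.

```latex
\begin{aligned}
\left|\bar{x} - \mu_{\bar{x}}\right| \le 2\sigma_{\bar{x}}
  &\iff -2\sigma_{\bar{x}} \le \bar{x} - \mu_{\bar{x}} \le 2\sigma_{\bar{x}} \\
  &\iff -2\sigma_{\bar{x}} \le \mu_{\bar{x}} - \bar{x} \le 2\sigma_{\bar{x}} \\
  &\iff \bar{x} - 2\sigma_{\bar{x}} \le \mu_{\bar{x}} \le \bar{x} + 2\sigma_{\bar{x}}
\end{aligned}
```

Every line describes the same event, so the probabilities are equal. In both statements the random quantity is $\bar{x}$; the second form just describes the event as an interval centered on $\bar{x}$ that happens to cover the fixed value $\mu_{\bar{x}}$.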