# Confidence interval example

Sal calculates a 99% confidence interval for the proportion of teachers who felt computers are an essential tool. Created by Sal Khan.

## Want to join the conversation?

• This video got me confused. In the introductory video on confidence intervals, Sal solves a very similar problem. In both problems we're trying to estimate the standard deviation of the sampling distribution of the sample mean. In the introductory video, Sal defines the standard error of p-hat as:
`SE_p-hat = √(p-hat·(1 - p-hat)/n)`
and says that it is an unbiased estimator for standard deviation of sampling distribution.

In this video, he calculates:
`σ_p-hat = σ/√n`
`σ = √(p-hat·(1 - p-hat)·n/(n - 1))`
`σ_p-hat = √(p-hat·(1 - p-hat)/(n - 1))`
Clearly, we're getting a different estimate than what we would've got by calculating the standard error. So, is the standard error not, in fact, an unbiased estimator? Or is there some mistake in this video?

• Why can't we use p(1-p) for the sample variance? When I do the calculations it works out the same if rounded. Then the formula for the variance of the sampling distribution of the sample mean would be p(1-p)/n, which is much easier to remember.

• Why did we not straight off consider the distribution of the sample proportion as a binomial distribution and proceed to find the standard error using `√(p-hat·(1 - p-hat)/n)`?

• this has gotta be one of the most unorganized topics in khanacademy

• So I am reviewing stats for grad school and my school provides a brief review. On the section on confidence intervals it says this:

You can calculate a confidence interval with any level of confidence although the most common are 95% (z*=1.96), 90% (z*=1.65) and 99% (z*=2.58).

This confused me a bit. Maybe I am doing something wrong but these numbers don't seem to match up with a z-score chart. Can anyone shed some light on what might be happening here?

• For confidence intervals based on the normal distribution, the critical value is chosen such that P( -z <= Z <= z ) = 0.95. That is, we want an interval that is symmetric about the mean. The middle part, inside of the critical values, must be the confidence level. The two tails must combine to be α, so each tail is α/2.

Hence, for a 95% confidence interval, instead of looking up 0.05 or 0.95, we want to look up 0.025 or 0.975 in the Z-table, and get the Z critical values from those. Doing so, we would obtain the values your review noted.
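
The same lookup can be done without a printed table. This is only a minimal sketch using Python's standard library; `z_critical` is a made-up helper name, not something from the video or the review sheet. It looks up the 1 - α/2 quantile of the standard normal distribution:

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value z* with P(-z* <= Z <= z*) = confidence."""
    alpha = 1.0 - confidence          # total area left in the two tails
    upper_tail_point = 1.0 - alpha / 2.0  # e.g. 0.975 for a 95% interval
    return NormalDist().inv_cdf(upper_tail_point)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_critical(level):.3f}")
```

The three values come out at roughly 1.645, 1.960 and 2.576, which round to the 1.65, 1.96 and 2.58 quoted in the review.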
• I do not understand why there is a -1 in the denominator when calculating the variance.

• So for the sampling distribution of the sample mean here, we seem to be assuming a normal distribution as usual, that is to say it extends forever in both directions. Doesn't this cause problems if, say, our p is very close to 0 or 1? For example, if 99% of the teachers in our sample had been in favour of computers, we would end up concluding that the population mean is just as likely to be over 1 as under 0.98, which is clearly impossible. How do you correct that?

• When dealing with proportions, there's a general rule that we need to check.
``n*p > 5`` and ``n*(1-p) > 5``

Though note that sometimes the 5 is replaced with 10. When both of these conditions are satisfied, it's generally reasonable to assume that the sampling distribution of the sample proportion (the sample mean of data that takes values 0 or 1) is approximately Normal. So say p was 99%, then we'd have:
``n*p = 250*0.99 = 247.5`` and ``n*(1-p) = 250*0.01 = 2.5``

The second one is not larger than 5, so in such a case it would not be reasonable to assume a Normal distribution; we'd need the sample size to be much larger. This is related to the Central Limit Theorem, forcing the sample size to be large enough so that the approximation is reasonable.
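
As a rough sketch of that check (the function name `normal_approx_ok` and the `threshold` parameter are just illustrative choices, with the threshold defaulting to the 5 used above):

```python
def normal_approx_ok(n: int, p: float, threshold: float = 5.0) -> bool:
    """Rule of thumb: both n*p and n*(1-p) must exceed the threshold."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes > threshold and expected_failures > threshold


# The case above: n = 250 and p = 0.99 gives n*(1-p) = 2.5, so the check fails.
print(normal_approx_ok(250, 0.99))   # False

# With a larger sample the same p passes: 1000 * 0.01 is about 10.
print(normal_approx_ok(1000, 0.99))  # True
```

Pass `threshold=10` if you prefer the stricter version of the rule.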

Though, there's always a possibility of still having extremely rare events (like some rare disease, where 1 in 10000 people have it) and so the raw proportion isn't a very useful measure. Sometimes instead of the proportion, people will think about the "odds," defined as p / (1-p), and the natural log of this quantity is generally assumed to be normally distributed.
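
To make the odds idea concrete, here is a small sketch (the helper name `log_odds` is hypothetical) that converts a proportion into odds and log-odds for the 1 in 10000 case:

```python
import math


def log_odds(p: float) -> float:
    """Natural log of the odds p / (1 - p); requires 0 < p < 1."""
    return math.log(p / (1.0 - p))


p = 1 / 10000            # the rare-disease proportion from the example
odds = p / (1 - p)       # just slightly larger than p itself
print(f"odds = {odds:.6f}, log-odds = {log_odds(p):.3f}")
```

Unlike the proportion, the log-odds is not confined to the range from 0 to 1: it is 0 when p = 0.5 and can be arbitrarily negative or positive as p approaches 0 or 1.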
• Where did the .495 come from?