
# Confidence interval example

Sal calculates a 99% confidence interval for the proportion of teachers who felt computers are an essential tool. Created by Sal Khan.

## Want to join the conversation?

- This video got me confused. In the introductory video on confidence intervals:

https://www.khanacademy.org/math/statistics-probability/confidence-intervals-one-sample/introduction-to-confidence-intervals/v/confidence-intervals-and-margin-of-error

Sal solves a very similar problem. In both problems we're trying to estimate the standard deviation of the sampling distribution of the sample mean. And in the introductory video, Sal defines standard error of p-hat as: `SE_p-hat = √(p-hat·(1 - p-hat)/n)`

and says that it is an unbiased estimator for standard deviation of sampling distribution.

In this video, he calculates: `σ_p-hat = σ/√n`

`σ = √(p-hat·(1 - p-hat)·n/(n - 1))`

`σ_p-hat = √(p-hat·(1 - p-hat)/(n - 1))`

Clearly, we're getting a different estimate than what we would've got by calculating standard error. So, is standard error not, in fact, an unbiased estimator? Or is there some mistake in this video?(37 votes)
- I just cut to the chase from the question and took the square root of (0.568*0.432)/250 and got the same SE answer as him (0.031). I am not sure why he had to treat it as a Bernoulli at first and add in extra steps.(15 votes)
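As a quick numeric check of the two formulas discussed in this thread, here is a minimal Python sketch. It uses the survey numbers from the video (142 of 250); the variable names are illustrative only and not from the video.

```python
import math
import statistics

n = 250          # teachers sampled (from the video)
successes = 142  # teachers who answered "essential"
p_hat = successes / n

# Standard error from the proportion formula: sqrt(p_hat * (1 - p_hat) / n)
se_proportion = math.sqrt(p_hat * (1 - p_hat) / n)

# The video's route: treat the answers as 0/1 data, take the sample
# standard deviation (n - 1 in the denominator), then divide by sqrt(n).
data = [1] * successes + [0] * (n - successes)
se_bernoulli = statistics.stdev(data) / math.sqrt(n)

print(f"p_hat           = {p_hat:.3f}")
print(f"SE, n version   = {se_proportion:.4f}")
print(f"SE, n-1 version = {se_bernoulli:.4f}")
```

The two values differ only in the fourth decimal place, so both round to 0.031 for a sample of this size. The gap between dividing by n and by n - 1 would matter more for small samples.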

- Why can't we use p(1-p) for the sample variance? When I do the calculations it works out the same if rounded. Then the formula for variance of sample distribution of the sample mean would be p(1-p)/n which is much easier to remember.(18 votes)
- I think he was just using the sample mean of the Bernoulli trials, which made sense to him, and then saw that through. I agree with you that when dealing with proportions, use p(1-p)/n.(4 votes)

- Why did we not straight off consider the distribution of the sample proportion as binomial distribution and proceed to find the standard error using, sq rt[ (sample proportion * (1 - sample proportion))/n ]?(18 votes)
- So I am reviewing stats for grad school and my school provides a brief review. On the section on confidence intervals it says this:

You can calculate a confidence interval with any level of confidence although the most common are 95% (z*=1.96), 90% (z*=1.65) and 99% (z*=2.58).

This confused me a bit. Maybe I am doing something wrong but these numbers don't seem to match up with a z-score chart. Can anyone shed some light on what might be happening here?(5 votes)
- For confidence intervals based on the normal distribution, the critical value is chosen such that P( -z <= Z <= z ) = 0.95. That is, we want an interval that is symmetric about the mean. The middle part, inside of the critical values, must be the confidence level. The two tails must *combine* to be α, so each tail is α/2.

Hence, for a 95% confidence interval, instead of looking up 0.05 or 0.95, we want to look up 0.025 or 0.975 in the Z-table, and get the Z critical values from those. Doing so, we would obtain the values your review noted.(18 votes)
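The tail-splitting idea in this reply can be checked without a printed table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `critical_z` is made up for illustration.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Return z* such that P(-z* <= Z <= z*) equals the confidence level."""
    alpha = 1 - confidence  # total area in the two tails
    # Each tail holds alpha / 2, so look up the point with 1 - alpha / 2 below it.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z* = {critical_z(level):.3f}")
```

The 95% and 99% values round to 1.96 and 2.58. The 90% value comes out near 1.645, which the review quoted in the question rounds to 1.65.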

- I do not understand why there is -1 in denominator while calculating Variance(10 votes)
- But he did not use (n-1) in any of sampling distribution (earlier sections)(4 votes)

- So for the sampling distribution of the sample mean here, we seem to be assuming a normal distribution as usual, that is to say it extends forever in both directions. Doesn't this cause problems if say, our p is very close to 0 or 1, for example if 99% teachers in our sample had been in favour of the computers, we would end up calculating the population mean would be just as likely to be over 1 as under 0.98, which is clearly impossible. How do you correct that?(5 votes)
- When dealing with proportions, there's a general rule that we need to check:

`n*p > 5`

`n*(1-p) > 5`

Though note that sometimes the 5 is replaced with 10. When both of these conditions are satisfied, then it's generally reasonable to assume that the sampling distribution of the sample proportion (the sample mean of data that takes values 0 or 1) is approximately Normal. So say p was 99%, then we'd have:

`n*p = 250*0.99 = 247.5`

`n*(1-p) = 250*0.01 = 2.5`

The second one is not larger than 5, so in such a case it would not be reasonable to assume a Normal distribution; we'd need the sample size to be much larger. This is related to the Central Limit Theorem, forcing the sample size to be large enough so that the approximation is reasonable.

Though, there's always a possibility of still having extremely rare events (like some rare disease, where 1 in 10000 people have it) and so the raw proportion isn't a very useful measure. Sometimes instead of the proportion, people will think about the "odds," defined as p / (1-p), and the natural log of this quantity is generally assumed to be normally distributed.(12 votes)
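The rule of thumb from this reply is easy to express as a small check. This is only a sketch: the function name `normal_approx_ok` and its `threshold` parameter are hypothetical, with the threshold defaulting to 5 and switchable to 10 as the reply mentions.

```python
def normal_approx_ok(n, p, threshold=5):
    """Check the rule of thumb n*p > threshold and n*(1-p) > threshold."""
    return n * p > threshold and n * (1 - p) > threshold

print(normal_approx_ok(250, 0.568))                # the sample proportion from the video
print(normal_approx_ok(250, 0.99))                 # the hypothetical p = 99% case
print(normal_approx_ok(250, 0.568, threshold=10))  # stricter version of the rule
```

With p = 0.99 the second condition fails because 250 * 0.01 = 2.5, matching the arithmetic in the reply.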

- Where did the .495 come from? at 10:35(7 votes)
- We want to be 99% confident, i.e. with probability 0.99 the sample mean lies in the confidence interval. Since the confidence interval is symmetric about the mean of the sampling distribution of the sample mean, we want 0.99/2 = 0.495 probability on each side of the mean. That is where 0.495 comes from. As happy 2332 said, if you look at confidence interval 1, Sal tells you why you divide by 2: you only want the area between the mean and your z-score, whereas the z-table tells you everything to the left of the z-score. Only then can you multiply to find the interval for your z-score.(7 votes)

- this has gotta be one of the most unorganized topics in khanacademy(8 votes)
- could I assign the (1-p) as 2 and not zero? and what topic should I look up into regarding this assigning of 1 and 0?(5 votes)
- why use n-1 to find variance?(3 votes)
- If we are using a **sample** to estimate the variance of some population, we use n-1 instead of n, because using n will actually give an underestimate (on average). If you are finding the true variance of an **entire population**, then you would use n. In this problem, the whole population is 6250 teachers, but they only asked a sample of 250, so you should use n-1. Sal has some videos about why we use n-1, because this really is a common confusion for a lot of students.

https://www.khanacademy.org/math/probability/descriptive-statistics/variance-std-deviation/v/review-and-intuition-why-we-divide-by-n-1-for-the-unbiased-sample-variance

https://www.khanacademy.org/math/probability/descriptive-statistics/variance-std-deviation/v/simulation-showing-bias-in-sample-variance(7 votes)
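The "underestimate on average" claim in this reply can be illustrated exactly on a toy case. The sketch below is not the survey data: it assumes a made-up population of five 0/1 values and enumerates every equally likely sample of size 3 drawn with replacement, then averages both versions of the variance.

```python
import math
from itertools import product
from statistics import pvariance, variance

population = [1, 1, 1, 0, 0]           # hypothetical tiny population, p = 0.6
true_variance = pvariance(population)  # divides by N: the whole population

sample_size = 3
# Every equally likely ordered sample drawn with replacement.
samples = list(product(population, repeat=sample_size))

# Average the two estimators over all possible samples.
avg_divide_by_n = sum(pvariance(s) for s in samples) / len(samples)
avg_divide_by_n_minus_1 = sum(variance(s) for s in samples) / len(samples)

print(f"true variance           = {true_variance:.4f}")
print(f"average, divide by n    = {avg_divide_by_n:.4f}")
print(f"average, divide by n-1  = {avg_divide_by_n_minus_1:.4f}")
```

Dividing by n averages out to (n - 1)/n of the true variance, while dividing by n - 1 averages out to the true variance itself. Sampling with replacement is an assumption made here to keep the enumeration simple.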

## Video transcript

In a local teaching district,
a technology grant is available to teachers in order
to install a cluster of four computers in their classroom. From the 6,250 teachers in the
district, 250 were randomly selected and asked if they felt
that computers were an essential teaching tool
for their classroom. Of those selected, 142 teachers
felt that the computers were an essential
teaching tool. And then they ask us, calculate
a 99% confidence interval for the proportion of
teachers who felt that the computers are an essential
teaching tool. So let's just think about
the entire population. We weren't able to survey all
of them, but the entire population, some of them fall
in the bucket, and we'll define that as 1, they thought
it was a good tool. They thought that the computers
were a good tool. And we'll just define a
0 value as a teacher that says not good. And some proportion of the total
teachers think that it is a good learning tool. So that proportion is p. And then the rest of them think
it's a bad learning tool, 1 minus p. We have a Bernoulli Distribution
right over here, and we know that the mean of
this distribution or the expected value of this
distribution is actually going to be p. So it's actually going to be a
value, it's neither 0 or 1, so not an actual value that you
could actually get out of a teacher if you were
to ask them. They cannot say something in
between good and not good. The actual expected value
is something in between. It is p. Now what we do is we're taking
a sample of those 250 teachers, and we got that 142
felt that the computers were an essential teaching tool. So in our survey, so we had 250
sampled, and we got 142 said that it is good, and we'll
say that this is a 1. So we got 142 1's, or we sampled
1, 142 times from this distribution. And then the rest of the time,
so what's left over? There's another 108 who said
that it's not good. So 108 said not good, or you
could view them as you were sampling a 0, right? 108 plus 142 is 250. So what is our sample
mean here? We have 1 times 142, plus 0
times 108 divided by our total number of samples,
divided by 250. It is equal to 142 over 250. You could even view this as
the sample proportion of teachers who thought that
the computers were a good teaching tool. Now let me get a calculator
out to calculate this. So we have 142 divided by
250 is equal to 0.568. So our sample proportion
is 0.568, or 56.8%, either one. So 0.568. Now let's also figure out our
sample variance because we can use it later for building
our confidence interval. Our sample variance here--
so let me draw a sample variance-- we're going to take
the weighted sum of the square differences from the mean and
divide by this minus 1. So we can get the best estimator of the true variance. So it's 1 times-- no, it's the
other way actually around-- we have 142 samples that were 1
minus 0.568 away from our sample mean, or we're this far
from the sample mean 142 times, and we're going to
square those distances. Plus the other 108 times we got
a 0, so we were 0 minus 0.568 away from the
sample mean. And then we are going to divide
that by the total number of samples minus 1. That minus 1 is our adjuster
so that we don't underestimate. So 250 minus 1. Let's get our calculator
out again. And so we have 100-- we put
a parentheses around everything-- I have 142 times
1 minus 0.568 squared, plus 108 times 0 minus-- and you
could obviously do parts of this in your head, but I'm just
going to write the whole thing out-- minus 0.568 squared,
and then all of that divided by 250 minus 1 is 249. So our sample variance is--
well, I'll just say 0.246. It is equal to-- it is our
sample variance-- I'll write it over here-- our sample
variance is equal to 0.246. If you were to take the square
root of that our actual sample standard deviation is going to
be, let's take the square root of that answer right over there,
and we get 0.496. I'll just round that
up to 0.50. So that is our sample
standard deviation. Now this interval, let's think
of it this way, we are sampling from some sampling
distribution of the sample mean. So it looks like this
over here, it looks that over there. And it has some mean, and so
the mean of the sampling distribution of the sample mean
is actually going to be the same thing as this mean over
here-- it's going to be the same mean value-- which
is the same thing as our population proportion. We've seen this multiple
times. And the sampling distribution's
standard deviation, so the standard
deviation of the sampling distribution, so we could view
that as one standard deviation right over there. So the standard deviation of
the sampling distribution, we've seen multiple times,
is equal to the standard deviation-- let me do this in
a different color-- is equal to the standard deviation of
our original population divided by the square root
of the number of samples. So it's divided by the square root of 250. Now we do not know this
right over here. We do not know the actual
standard deviation in our population. But our best estimate of that,
and that's why we call it confident, we're confident that
the real mean or the real population proportion, is going
to be in this interval. We're confident, but we're not
100% sure because we're going to estimate this over here, and
if we're estimating this we're really estimating
that over there. So if this can be estimated it's
going to be estimated by the sample standard deviation. So then we can say this is going
to be approximately, or if we didn't get a weird,
completely skewed sample, it actually might not even be
approximately if we just had a really strange sample. But maybe we should write
confident that-- we are confident that the standard
deviation of our sampling distribution is going to be
around, instead of using this we can use our standard
deviation of our sample, our sample standard deviation. So 0.50 divided by the square
root of 250, and what's that going to be? That is going to be-- so we
have this value right over here, and actually I don't have
to round it, divided by the square root of 250. We get 0.031. So this is equal to
0.031 over here. So that's one standard
deviation. Now they want a 99% confidence
interval. So the way I think about it is
if I randomly pick a sample from the sampling distribution,
what's the 99% chance, or how many-- let
me think of it this way. How many standard deviations
away from the mean do we have to be that we can be 99%
confident that any sample from the sampling distribution will
be in that interval? So another way to think about
it: think about how many standard deviations we need to
be away from the mean, so we're going to be a certain
number of standard deviations away from the mean such that any
sample, any mean that we sample from here, any sample
from this distribution has a 99% chance of being plus or
minus that many standard deviations. So it might be from
there to there. So that's what we want. We want a 99% chance that if
we pick a sample from the sampling distribution of the
sample mean, it will be within this many standard deviations
of the actual mean. And to figure that out let's
look at an actual Z-table. So we want 99% confidence. So another way to think about it
if we want 99% confidence, if we just look at the upper
half right over here, that orange area should be 0.475,
because if this is 0.475 then this other part's going to be
0.475, and we will get to our-- oh sorry, we want
to get to 99%, so it's not going to be 0.475. We're going to have to go
to 0.495 if we want 99% confidence. So this area has to be 0.495
over here, because if that is, that over here will also be. So that their sum will
be 99% of the area. Now if this is 0.495, this value
on the z table right here will have to be 0.5,
because all of this area, if you include all of this
is going to be 0.5. So it's going to be
0.5 plus 0.495. It's going to be 0.995. Let me make sure I
got that right. 0.995. So let's look at our Z-table. So where do we get 0.995 on our Z-table? 0.995 is pretty close, just to have
a little error, it will be right over here--
this is 0.9951. So another way to think about it
is 99-- so this value right here gives us the whole
cumulative area up to that, up to our mean. So if you look at the entire
distribution like this, this is the mean right over here. This tells us that at 2.5
standard deviations above the mean, so this is 2.5 standard
deviations above the mean. So this is 2.5 times the
standard deviation of the sampling distribution. If you look at this whole area,
this whole area over here, if you look at the
Z-table, is going to be 0.9951, which tells us that just
this area right over here is going to be 0.4951, which
tells us that this area plus the symmetric area of that many
standard deviations below the mean, if you combine
them, 0.4951 times 2 gets us to 0.9902. So this whole area right
here is about 99.02%. So if we look at the area 2.5
standard deviations above and below the mean-- oh,
let me be careful. This isn't just 2.5,
we have to add another digit of precision. This is 2.5, and the next digit
of precision is given by this column over here. So we have to look all the way
up into the second to the last column, and we have to add
a digit of 8 here. So this is 2.58 standard
deviations. We have 2.5 over here, and then
we get the next digit 8 from the column. 2.58 standard deviations above
and below the mean encompasses a little
over 99% of the total probability. So there's a little over a 99%
chance that any sample mean that I select from the sampling
distribution of the sample mean will fall
within this many standard deviations of the mean. So let me put it this way. There is a 99-- it's actually,
what, a 99.2% chance, right? If you multiply this times 2
you get 0.99-- actually you get 0.9902. So we'll say there is roughly a 99% chance
that a random sample mean is within
2.58 standard deviations of the mean
of the sampling distribution of the sample mean, which is
the same thing as our actual population mean, which is the
same thing as our population proportion. So of p. And we know what this
value is right here. At least we have a decent
estimate for this value. We don't know exactly what this
is, but our best estimate for this value is
this over here. So we could re-write this, so
we could say that we are confident because we are really
using an estimator to get this value here. We are confident that there is
a 99% chance that a random x, a random sample mean, is
within-- and let's figure out this value right here
using a calculator. So it is 2.58 times our best
estimate of the standard deviation of the sampling
distribution, so times 0.031 is equal to 0.0-- well let's
just round this up because it's so close to 0.08-- is
within 0.08 of the population proportion. Or you could say that you're
confident that the population proportion is within 0.08
of your sample mean. That's the exact
same statement. So if we want our confidence
interval, our actual number that we got for there,
our actual sample mean we got was 0.568. So we could replace this, and
actually let me do it. I can delete this right here. Let me clear it. I can replace this, because we
actually did take a sample. So I can replace this
with 0.568. So we could be confident that
there's a 99% chance that 0.568 is within 0.08 of the
population proportion, which is the same thing as the
population mean, which is the same thing as the mean of the
sampling distribution of the sample mean, so forth
and so on. And just to make it clear we can
actually swap these two. It wouldn't change
the meaning. If this is within 0.08
of that, then that is within 0.08 of this. So let me switch this
up a little bit. So we could put a p is within
of-- let me switch this up-- of 0.568. And now linguistically it sounds
a little bit more like a confidence interval. We are confident that there's a
99% chance that p is within 0.08 of the sample
mean of 0.568. So what would be our confidence
interval? It will be 0.568 plus
or minus 0.08. And what would that be? If you add 0.08 to this right
over here, at the upper end you're going to have 0.648. And at the lower end of our
range, so this is the upper end, the lower end. If we subtract 8 from
this we get 0.488. So we are 99% confident that the
true population proportion is between these two numbers. Or another way, that the true
percentage of teachers who think those computers are good
ideas is between-- we're 99% confident-- we're confident that
there's a 99% chance that the true percentage of teachers
that like the computers is between
48.8% and 64.8%. Now we answered the first
part of the question. The second part, how could the
survey be changed to narrow the confidence interval,
but to maintain the 99% confidence interval? Well, you could just
take more samples. If you take more samples then
our estimate of the standard deviation of this distribution
will go down because this denominator will be higher. If the denominator is higher
then this whole thing will go down. So if the standard deviations
go down here, then when we count the standard deviations,
when we do the plus or minus on the range, this value
will go down and will narrow our range. So you just take more samples.