
## Confidence intervals for proportions


# Determining sample size based on confidence and margin of error

## Video transcript

- [Instructor] We're
told Della wants to make a one-sample z interval to
estimate what proportion of her community members
favor a tax increase for more local school funding. She wants her margin
of error to be no more than plus or minus 2% at the 95% confidence level. What is the smallest sample size required to obtain the desired margin of error? So let's just remind
ourselves what the confidence interval will look like
and what part of it is the margin of error, and then we can think about
what sample size she would need. So she wants to estimate the true population proportion that favors a tax increase. She doesn't know what this is, so she's going to take a sample of size n, and in fact this question is all about what n she needs in order to have the desired margin of error. Well, whatever sample she takes, she's going to calculate
a sample proportion. And then the confidence
interval that she's going to construct is going to
be that sample proportion plus or minus a critical value, z star, times the standard error of her statistic. This critical value is based on the confidence level, and we'll talk in a second about what z star corresponds to a 95% confidence level. In this case the standard error of her sample proportion is the square root of the sample proportion times one minus the sample proportion, all of that over her sample size. Now she wants the margin of
error to be no more than 2%. So the margin of error is
this part right over here. So this part right over there, she wants to be no more than 2%, has to be less than or equal to 2%, that green color is kind of too shocking. It's unpleasant, all right. (laughing) Less than or equal to 2% right over here. So how do we figure that out? Well the first thing let's
just make sure we incorporate the 95% confidence level. So we could look at a z-table. Remember, if we have a normal distribution here, a 95% confidence level refers to the number of standard deviations we need to go above and below the mean in order to capture 95% of the area right over here. So this would be 2.5% that
is unshaded at the top right over there, and
then this would be 2.5% right over here. And we could look up in a z-table, and if you were to look up in a z-table, you would not look up 95%. You would look up the
percentage that would leave 2.5% unshaded at the top. So you would actually look up 97.5%. But it's good to know
in general that at a 95% confidence level, you're looking at a
critical value of 1.96. And that's just something good to know. We could of course look
it up on a z-table. So this is 1.96. And so this is going to
be 1.96 right over here. But what about p hat? We don't know what p
hat is until we actually take the sample, but
this whole question is, how large of a sample should we take. Well remember we want
this stuff right over here that I'm now circling or
squaring in this less, less bright color,
(laughing) this blue color. We want this thing to be
less than or equal to 2%. This is our margin of error. And so what we could do is we could pick a sample proportion, we don't know if that's
what it's going to be, that maximizes this right over here. Because if we maximize this, we know that we're
essentially figuring out the largest thing that
this could end up being, and then we'll be safe. So if you wanna maximize p hat times one minus p hat, you could do some trial and error here. This is a fairly simple quadratic, and its maximum is at p hat equal to 0.5, where the product is 0.25. I wanna emphasize we don't know what p hat is. She hasn't even taken the random sample and calculated the sample proportion yet, but we wanna figure out what n to take, and so to be safe she says okay, well what sample proportion would maximize my margin of error? And so let me just
assume that and then let me calculate n. So let me set up an inequality here. We want 1.96, that's our critical value, times the square root of 0.5 times one minus 0.5, all of that over n, to be less than or equal to 2%. We're just assuming 0.5 for our sample proportion, although of course we don't know what it is until we actually take the sample. We don't want our margin of error to be any larger than 2%. Let me just write this as a decimal, 0.02. And now we just have to
do a little bit of algebra to calculate this. So let's see how we could do this. We could divide both sides by 1.96. On the left-hand side we'd have the square root of all of this, but the square root of 0.5 times 0.5 is just 0.5, so that's 0.5 over the square root of n. On the right-hand side, 0.02 is the same thing as two over 100, and two over 100 times one over 1.96 is two over 196. So 0.5 over the square root of n needs to be less than or equal to two over 196. Let me scroll down a little bit. This is fancier algebra than we typically do in statistics, or at least in an introductory statistics class. All right, so let's see, we could take the reciprocal of both sides. On the left we'd have the square root of n over 0.5, and on the right 196 over two. Now what's 196 divided by two? That is going to be 98. And because both sides are positive, if we take the reciprocal of both sides, then you're gonna swap the inequality. So it's gonna be greater than or equal to: the square root of n over 0.5 needs to be greater than or equal to 98. Let's see, I could multiply both sides of this by 0.5. I said 0.5 but my fingers wrote down 0.4. Let's see, 0.5. And so there we get that the square root of n needs to be greater than or equal to 49, or n needs to be greater
than or equal to 49 squared. And what's 49 squared? Well you know 50 squared is 2,500, so you know it's going
to be close to that, so you can already make
a pretty good estimate that it's going to be D. But if you wanna multiply it out we can. 49 times 49, nine times nine is 81. Nine times four is 36 plus eight is 44. Four times nine, 36. Four times four is 16 plus three, we have 19. And then you add all of that together, and you indeed do get, so that's 10, and so this is a 14. You do indeed get 2,401. So that's the minimum sample
size that Della should take if she genuinely wanted
her margin of error to be no more than 2%. Now it might turn out that when she actually takes the sample of size 2,401, her sample proportion is less than 0.5 or greater than 0.5, and then her margin of error will be less than this. But she just wanted it to be no more than that. Another important thing to appreciate is that the math all worked out very nicely just now, where I got our n to be a whole number. But if I got 2,401.5, then you would have to round up to the next whole number, because your sample size is always going to be a whole number value. So I will leave you there.
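
The calculation in the video can be checked with a few lines of Python. This is only a sketch under stated assumptions, not part of the original lesson: the function names `critical_value` and `min_sample_size` are made up for illustration, and the default of 0.5 for the sample proportion is the worst-case guess discussed above. It solves z* times the square root of p̂(1 − p̂)/n ≤ margin for n and rounds up.

```python
import math
from fractions import Fraction
from statistics import NormalDist


def critical_value(confidence):
    """Return z* leaving (1 - confidence) / 2 in each tail of the standard normal."""
    upper_tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - upper_tail)


def min_sample_size(z_star, margin, p_hat=0.5):
    """Smallest whole n with z* * sqrt(p_hat * (1 - p_hat) / n) <= margin."""
    # Exact rational arithmetic, so a result that is exactly whole (2401 here)
    # is not pushed up to the next integer by floating-point noise.
    z, e, p = (Fraction(str(x)) for x in (z_star, margin, p_hat))
    # Rearranged inequality: n >= z^2 * p * (1 - p) / margin^2, then round up.
    return math.ceil(z * z * p * (1 - p) / (e * e))


print(round(critical_value(0.95), 2))   # 1.96
print(min_sample_size(1.96, 0.02))      # 2401
```

Passing a different `p_hat` shows why 0.5 is the safe choice: any other value gives a smaller required n, so planning for 0.5 can never leave the margin of error larger than intended.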