## AP®︎/College Statistics

### Unit 10: Lesson 2

Confidence intervals for proportions

© 2023 Khan Academy

# Determining sample size based on confidence and margin of error

AP.STATS: UNC‑4 (EU), UNC‑4.C (LO), UNC‑4.C.1 (EK), UNC‑4.C.2 (EK), UNC‑4.C.3 (EK), UNC‑4.C.4 (EK)

Determining sample size based on confidence level and margin of error.

## Want to join the conversation?

- Can someone help me walk through how Sal determined that 0.5 will maximize p(1-p)?(1 vote)
- Think of it this way: p(1 - p) = p - p^2. If you graph this, you get a parabola with roots at 0 and 1, so the vertex is halfway between them, at p = 0.5. Since the coefficient of p^2 is negative, the parabola opens downward and the vertex is a maximum point. Plug in 0.5, and you get 0.5 - 0.5^2 = 0.25. You will never get a value that is larger than that.(28 votes)
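
As a quick numeric check of the vertex argument above, here is a minimal sketch that scans candidate proportions on a grid. The grid spacing of 0.01 and the variable names are illustrative assumptions, not part of the original answer.

```python
from fractions import Fraction

# Candidate proportions 0.00, 0.01, ..., 1.00, kept as exact fractions
# so that no floating-point rounding affects the comparison.
candidates = [Fraction(k, 100) for k in range(101)]

# Pick the candidate p that makes p * (1 - p) as large as possible.
best_p = max(candidates, key=lambda p: p * (1 - p))
best_value = best_p * (1 - best_p)

print(best_p, best_value)  # 1/2 1/4
```

The scan agrees with the algebra: the product peaks at p = 0.5 with value 0.25.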

- Why do we need to maximize the term "p_hat(1 - p_hat)"?(13 votes)
- You maximize this term to handle the worst-case scenario, in which the margin of error is largest. Once you have the largest possible margin of error, you can choose n so that the margin of error is guaranteed not to exceed 2%, whatever the sample proportion turns out to be.(2 votes)

- When we maximize the term, p_hat(1-p_hat), is that value, 0.5, always the same?(5 votes)
- Yes. `p_hat = 0.5` always maximizes `p_hat(1-p_hat)`, and the maximum value is `0.25`.(3 votes)

- Don't z-scores tell you how many standard deviations from the mean you are? Why is the z-score 1.96 and not 2 for 95% confidence?(2 votes)
- Yes, z is the number of standard deviations from the mean. In a normal distribution, **approximately** 95% of the data is within 2 standard deviations of the mean.

So for 95% confidence, 2 is an **approximation** of the z-score. However, 1.96 is a more precise approximation of this z-score.(6 votes)
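
To see the difference between 2 and 1.96 numerically, here is a small sketch using Python's standard-library `statistics.NormalDist`. The variable names are illustrative assumptions. It computes the area within 2 standard deviations and the critical value that leaves 2.5% in the upper tail.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area within 2 standard deviations of the mean (the empirical rule's "95%").
area_within_2 = standard_normal.cdf(2) - standard_normal.cdf(-2)

# Critical value that leaves 2.5% in the upper tail, i.e. 97.5% below it.
z_star = standard_normal.inv_cdf(0.975)

print(f"{area_within_2:.4f}")  # 0.9545
print(f"{z_star:.2f}")         # 1.96
```

So 2 standard deviations captures slightly more than 95% of the area, which is why the critical value for exactly 95% is a little smaller than 2.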

- Doesn't the empirical rule say that 95% is two standard deviations? That means that the z* critical value is two, not 1.96. Am I right?(3 votes)
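- No. The "2 standard deviations" in the empirical rule is a rounded figure: about 95.45% of a normal distribution lies within 2 standard deviations of the mean. The critical value z* is the point that leaves exactly 2.5% in each tail. It is found from the inverse of the normal cumulative distribution (an integral, which is why tables or software are used), and it comes out to about 1.96.(3 votes)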

- How do we interpret the 2% margin of error here? Does it mean that if Della's survey yields, let's say, that 30% of the sample supports the tax increase, then we can say that we are 95% sure that between 28% and 32% of the entire population support the tax increase?(4 votes)
- Is there a formula to find the minimum sample size?(2 votes)
- You can obtain a formula by solving for `n` without plugging in numbers.(2 votes)
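
A sketch of that derivation, assuming the margin of error of a one-sample z interval for a proportion; the symbol E for the desired margin of error is introduced here for illustration and does not appear in the original.

```latex
z^{*}\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}} \le E
\quad\Longleftrightarrow\quad
\sqrt{n} \ge \frac{z^{*}\sqrt{\hat{p}\,(1-\hat{p})}}{E}
\quad\Longleftrightarrow\quad
n \ge \left(\frac{z^{*}}{E}\right)^{2}\hat{p}\,(1-\hat{p})
```

With the worst-case value of p_hat = 0.5, the right-hand side becomes (z*/(2E))^2, and the result is rounded up to the next whole number.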

- You are interested in estimating the mean weight of the local adult population of female white-tailed deer (does). From past data, you estimate the standard deviation of the weight of all adult female white-tailed deer in this region to be 25 pounds. What sample size would you need in order to estimate the mean weight of all female white-tailed deer, with a 96% confidence level, to within 9 pounds of the actual weight?(2 votes)
- A 96% confidence level leaves a probability of [(100 - 96)/2]% = 2% in each of the two tails.

From the normal table, the z-score associated with a right tail of 2% (cumulative probability 98%) is 2.05. By symmetry, the z-score associated with a left tail of 2% is -2.05.

So the margin of error (the distance from either endpoint of the confidence interval to the center) is 2.05 times the standard deviation of the sample mean.

The standard deviation of the sample mean is 25/sqrt(n) pounds.

Since the margin of error needs to be 9 pounds, we have

2.05*25/sqrt(n) = 9

1/sqrt(n) = 9/(2.05*25)

sqrt(n) = 2.05*25/9

n = (2.05*25/9)^2 = 32.43.

To make sure not to exceed the needed margin, we round up. So we need sample size at least n = 33.

Have a blessed, wonderful day!(2 votes)
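
The steps in this answer can be sketched in a few lines of Python. The function name `sample_size_for_mean` and its parameters are hypothetical, chosen for illustration. The sketch uses the unrounded critical value from `statistics.NormalDist` rather than the table value 2.05, so the intermediate number differs slightly while the rounded-up result is the same.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n such that z* * sigma / sqrt(n) <= margin."""
    tail = (1 - confidence) / 2               # area in each tail
    z_star = NormalDist().inv_cdf(1 - tail)   # about 2.05 for 96% confidence
    return math.ceil((z_star * sigma / margin) ** 2)

n = sample_size_for_mean(sigma=25, margin=9, confidence=0.96)
print(n)  # 33
```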

- If the population is 650, how large should the sample be to be significant?(2 votes)
- If I remember correctly, the sample should be less than 10% of the population (so fewer than 65 here) but at least 30 if you want to use a z-score. You also need np >= 10 and n(1-p) >= 10.(1 vote)

- I see him use 2 different formulas, one involving the square root of variance over samples, and the other involving the square root of p times 1-p over samples. Can anyone tell me what the differences between the 2 formulas are? I need help.(1 vote)

## Video transcript

- [Instructor] We're told Della wants to make a one-sample z interval to estimate what proportion of her community members favor a tax increase for more local school funding. She wants her margin of error to be no more than plus or minus 2% at the 95% confidence level. What is the smallest sample size required to obtain the desired margin of error?

So let's just remind ourselves what the confidence interval will look like and what part of it is the margin of error, and then we can think about what sample size she would need. She wants to estimate the true population proportion that favors a tax increase. She doesn't know what this is, so she's going to take a sample of size n, and in fact this question is all about what n she needs in order to have the desired margin of error.

Well, whatever sample she takes, she's going to calculate a sample proportion. And then the confidence interval that she's going to construct is going to be that sample proportion plus or minus a critical value, z star, which is based on the confidence level (we'll talk about that in a second), times the standard error of her statistic. In this case that is the standard error of her sample proportion, which is the square root of the sample proportion times one minus the sample proportion, all of that over her sample size.

Now she wants the margin of error to be no more than 2%. So the margin of error is this part right over here. So this part right over there, she wants to be no more than 2%, has to be less than or equal to 2%. That green color is kind of too shocking. It's unpleasant, all right. (laughing) Less than or equal to 2% right over here. So how do we figure that out?

Well, the first thing, let's just make sure we incorporate the 95% confidence level. So we could look at a z-table. Remember, if we have a normal distribution here, a 95% confidence level means the number of standard deviations we need to go above and below this in order to capture 95% of the area right over here. So this would be 2.5% that is unshaded at the top right over there, and then this would be 2.5% right over here. And if you were to look up in a z-table, you would not look up 95%. You would look up the percentage that would leave 2.5% unshaded at the top. So you would actually look up 97.5%. But it's good to know in general that at a 95% confidence level, you're looking at a critical value of 1.96. And that's just something good to know. We could of course look it up on a z-table. So this is going to be 1.96 right over here.

But what about p hat? We don't know what p hat is until we actually take the sample, but this whole question is how large of a sample we should take. Well, remember we want this stuff right over here that I'm now circling or squaring in this less bright color, (laughing) this blue color. We want this thing to be less than or equal to 2%. This is our margin of error. And so what we could do is pick a sample proportion, we don't know if that's what it's going to be, that maximizes this right over here. Because if we maximize this, we know that we're essentially figuring out the largest thing that this could end up being, and then we'll be safe.

So if you wanna maximize p hat times one minus p hat, you could do some trial and error here. This is a fairly simple quadratic. It's actually going to be p hat is 0.5, and I wanna emphasize we don't know. She didn't even perform the sample yet. She didn't even take the random sample and calculate the sample proportion, but we wanna figure out what n to take, and so to be safe she says, okay, well, what sample proportion would maximize my margin of error? And so let me just assume that and then let me calculate n.

So let me set up an inequality here. We want 1.96, that's our critical value, times the square root of, we're just going to assume 0.5 for our sample proportion, although of course we don't know what it is yet until we actually take the sample. So that's our sample proportion. That's one minus our sample proportion. All of that over n needs to be less than or equal to 2%. We don't want our margin of error to be any larger than 2%. Let me just write this as a decimal, 0.02.

And now we just have to do a little bit of algebra to calculate this. So let's see how we could do this. We could divide both sides by 1.96, that is, multiply by one over 1.96. On the left-hand side we'd have the square root of all of this, but the square root of 0.5 times 0.5 is just 0.5, so that would be 0.5 over the square root of n. On the right-hand side, actually let me write it this way. 0.02 is the same thing as two over 100. So two over 100 times one over 1.96 is two over 196. So 0.5 over the square root of n needs to be less than or equal to two over 196. Let me scroll down a little bit. This is fancier algebra than we typically do in statistics, or at least in introductory statistics class.

All right, so let's see, we could take the reciprocal of both sides. We could say the square root of n over 0.5, and 196 over two. Now let's see, what's 196 divided by two? That is going to be 98. So this would be 98. And if we take the reciprocal of both sides, then you're gonna swap the inequality. So it's gonna be greater than or equal to. Let's see, I could multiply both sides of this by 0.5. So 0.5, I said 0.5 but my fingers wrote down 0.4. Let's see, 0.5. And so there we get the square root of n needs to be greater than or equal to 49, or n needs to be greater than or equal to 49 squared.

And what's 49 squared? Well, you know 50 squared is 2,500, so you know it's going to be close to that, so you can already make a pretty good estimate that it's going to be D. But if you wanna multiply it out we can. 49 times 49, nine times nine is 81. Nine times four is 36 plus eight is 44. Four times nine, 36. Four times four is 16 plus three, we have 19. And then you add all of that together, so that's 10, and so this is a 14. You do indeed get 2,401.

So that's the minimum sample size that Della should take if she genuinely wanted her margin of error to be no more than 2%. Now it might turn out that when she actually takes the sample of size 2,401, if her sample proportion is less than 0.5 or greater than 0.5, well, then she's going to be in a situation where her margin of error is less than this. But she just wanted it to be no more than that.

Another important thing to appreciate is that the math just all worked out very nicely just now, where I got our n to be actually a whole number. But if I got 2,401.5, then you would have to round up to the nearest whole number, because your sample size is always going to be a whole number value. So I will leave you there.
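
The calculation in the transcript can be reproduced with a short Python sketch. The function name `min_sample_size` and its parameter names are hypothetical, introduced only for illustration. Exact fractions are used so that the result 2,401 is not disturbed by floating-point rounding before it is rounded up.

```python
import math
from fractions import Fraction

def min_sample_size(z_star, margin, p_guess=Fraction(1, 2)):
    """Smallest whole n with z* * sqrt(p(1 - p) / n) <= margin.

    p_guess defaults to 0.5, the worst case that maximizes p(1 - p).
    """
    n_exact = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n_exact)  # round up: n must be a whole number

# Della's problem: 95% confidence (z* = 1.96), margin of error 0.02.
n = min_sample_size(Fraction("1.96"), Fraction("0.02"))
print(n)  # 2401
```

If Della had a reasonable prior guess for the proportion, passing it as `p_guess` would give a smaller required sample, but without one the worst case of 0.5 is the safe choice described in the video.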