# Determining sample size based on confidence and margin of error

AP.STATS: UNC‑4 (EU), UNC‑4.C (LO), UNC‑4.C.1 (EK), UNC‑4.C.2 (EK), UNC‑4.C.3 (EK), UNC‑4.C.4 (EK)

## Video transcript

- [Instructor] We're told Della wants to make a one-sample z interval to estimate what proportion of her community members favor a tax increase for more local school funding. She wants her margin of error to be no more than plus or minus 2% at the 95% confidence level. What is the smallest sample size required to obtain the desired margin of error?

So let's just remind ourselves what the confidence interval will look like and what part of it is the margin of error, and then we can think about what sample size she would need. She wants to estimate the true population proportion that favors a tax increase. She doesn't know what this is, so she's going to take a sample of size n, and in fact this question is all about what n she needs in order to have the desired margin of error. Whatever sample she takes, she's going to calculate a sample proportion. And then the confidence interval that she's going to construct is going to be that sample proportion plus or minus a critical value, z star, times the standard error of her statistic. The critical value is based on the confidence level, and we'll talk in a second about what z star corresponds to a 95% confidence level. The standard error of her sample proportion is the square root of the sample proportion times one minus the sample proportion, all of that over her sample size.

Now she wants the margin of error to be no more than 2%. The margin of error is this part right over here, the critical value times the standard error, so this part has to be less than or equal to 2%. That green color is kind of too shocking. It's unpleasant, all right. (laughing) Less than or equal to 2% right over here. So how do we figure that out? Well, the first thing, let's just make sure we incorporate the 95% confidence level. So we could look at a z-table.
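
For readers following along without the video, the interval and the margin-of-error condition described in words here can be written out as a sketch. The symbols are shorthand assumed for this note: p-hat for the sample proportion, z* for the critical value, and n for the sample size.

```latex
\hat{p} \;\pm\; z^{*}\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}},
\qquad
\underbrace{z^{*}\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}}_{\text{margin of error}} \;\le\; 0.02
```

The term after the plus-or-minus sign is the margin of error, and it is the only part of the interval that the sample-size question constrains.
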
Remember, a 95% confidence level means that if we have a normal distribution here, we need to know the number of standard deviations we have to go above and below the mean in order to capture 95% of the area right over here. So this would be 2.5% that is unshaded at the top right over there, and then this would be 2.5% right over here. If you were to look this up in a z-table, you would not look up 95%. You would look up the percentage that would leave 2.5% unshaded at the top, so you would actually look up 97.5%. But it's good to know in general that at a 95% confidence level, you're looking at a critical value of 1.96. That's just something good to know, and we could of course look it up on a z-table. So this is going to be 1.96 right over here.

But what about p hat? We don't know what p hat is until we actually take the sample, but this whole question is how large of a sample we should take. Well, remember we want this stuff right over here that I'm now circling or squaring in this less bright color, (laughing) this blue color. We want this thing to be less than or equal to 2%. This is our margin of error. And so what we could do is pick a sample proportion, we don't know if that's what it's going to be, that maximizes this right over here. Because if we maximize this, we know that we're figuring out the largest thing that this could end up being, and then we'll be safe. So if you wanna maximize p hat times one minus p hat, you could do some trial and error here. This is a fairly simple quadratic. The maximum is actually at p hat equals 0.5. And I wanna emphasize we don't know what p hat is. She hasn't even taken the sample yet.
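
The two facts used here, that the 95% critical value is about 1.96 and that p hat times one minus p hat is largest at 0.5, can be checked numerically. This is a minimal sketch using Python's standard library in place of a z-table; the variable names and the grid of candidate proportions are illustrative choices, not part of the video.

```python
from statistics import NormalDist

confidence = 0.95
# Leave half of the remaining 5% in each tail, so look up the 97.5th percentile.
upper_tail = (1 - confidence) / 2
z_star = NormalDist().inv_cdf(1 - upper_tail)
print(round(z_star, 2))  # 1.96

# Trial and error over candidate sample proportions 0.00, 0.01, ..., 1.00:
# which one makes p * (1 - p) largest?
candidates = [i / 100 for i in range(101)]
best_p = max(candidates, key=lambda p: p * (1 - p))
print(best_p)  # 0.5
```

Assuming p hat is 0.5 therefore gives the largest margin of error that any sample of size n could produce, which is why it is the safe choice when planning.
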
She hasn't taken the random sample and calculated the sample proportion, but we wanna figure out what n to take, and so to be safe she says okay, well what sample proportion would maximize my margin of error? Let me just assume that, and then let me calculate n.

So let me set up an inequality here. We want 1.96, that's our critical value, times the square root of, we're just going to assume 0.5 for our sample proportion, although of course we don't know what it is until we actually take the sample. So that's our sample proportion, 0.5, times one minus our sample proportion, which is also 0.5, all of that over n. That needs to be less than or equal to 2%. We don't want our margin of error to be any larger than 2%. Let me just write this as a decimal, 0.02.

And now we just have to do a little bit of algebra. We could divide both sides by 1.96, which is the same as multiplying by one over 1.96. On the left-hand side we'd have the square root of all of this, which is the square root of 0.5 times 0.5 over the square root of n, so that would just be 0.5 over the square root of n. That needs to be less than or equal to 0.02 over 1.96. Actually, let me write it this way. 0.02 is the same thing as two over 100, so the right-hand side is two over 100 times one over 1.96, which is two over 196. Let me scroll down a little bit. This is fancier algebra than we typically do in statistics, or at least in an introductory statistics class.

All right, so we could take the reciprocal of both sides. On the left we get the square root of n over 0.5, and on the right 196 over two. What's 196 divided by two? That is going to be 98. And because both sides are positive, taking the reciprocal of both sides swaps the inequality, so it's gonna be greater than or equal to: the square root of n over 0.5 is greater than or equal to 98. Then I could multiply both sides of this by 0.5. I said 0.5 but my fingers wrote down 0.4. Let's see, 0.5.
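
Since the algebra is hard to follow without the on-screen work, here is one way to write out the steps narrated in this passage. It is a reconstruction from the spoken description, not a copy of what appears in the video.

```latex
\begin{aligned}
1.96\,\sqrt{\frac{(0.5)(0.5)}{n}} &\le 0.02 \\[4pt]
\frac{0.5}{\sqrt{n}} &\le \frac{0.02}{1.96} = \frac{2}{196} \\[4pt]
\frac{\sqrt{n}}{0.5} &\ge \frac{196}{2} = 98 \\[4pt]
\sqrt{n} &\ge 98 \times 0.5 = 49
\end{aligned}
```

The inequality flips in the third line because both sides are positive and reciprocals reverse the order of positive numbers.
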
And so there we get the square root of n needs to be greater than or equal to 49, or n needs to be greater than or equal to 49 squared. And what's 49 squared? Well, you know 50 squared is 2,500, so it's going to be close to that, and you can already make a pretty good estimate that it's going to be D. But if you wanna multiply it out, we can. 49 times 49: nine times nine is 81. Nine times four is 36, plus eight is 44. Four times nine, 36. Four times four is 16, plus three, we have 19. And then you add all of that together, so that's 10, and so this is a 14, and you do indeed get 2,401.

So that's the minimum sample size that Della should take if she genuinely wants her margin of error to be no more than 2%. Now, when she actually takes the sample of size 2,401, if her sample proportion turns out to be less than 0.5 or greater than 0.5, then her margin of error will be less than 2%. But she just wanted it to be no more than that.

Another important thing to appreciate is that the math all worked out very nicely just now, where n came out to be a whole number. But if I had gotten 2,401.5, then you would have to round up to the next whole number, because your sample size is always going to be a whole number value. So I will leave you there.
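
The whole calculation, including the round-up rule, can be sketched as a small function. The names `min_sample_size` and `margin_of_error` are hypothetical helpers made up for this illustration. Inputs are passed as decimal strings so that `Fraction` can represent them exactly, an assumption made here so the round-up step is not affected by floating-point error.

```python
from fractions import Fraction
from math import ceil, sqrt


def min_sample_size(z_star: str, max_margin: str, p_guess: str = "0.5") -> int:
    """Smallest whole n with z* * sqrt(p(1-p)/n) <= max_margin."""
    z = Fraction(z_star)
    m = Fraction(max_margin)
    p = Fraction(p_guess)
    # Solving z * sqrt(p(1-p)/n) <= m for n gives n >= z^2 * p(1-p) / m^2.
    n_exact = z**2 * p * (1 - p) / m**2
    # A sample size must be a whole number, so round up.
    return ceil(n_exact)


def margin_of_error(z_star: float, p_hat: float, n: int) -> float:
    """Margin of error for an observed sample proportion p_hat."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)


n = min_sample_size("1.96", "0.02")
print(n)  # 2401

# If the observed proportion is not 0.5, the margin comes in under 2%.
print(margin_of_error(1.96, 0.3, n) < 0.02)  # True
```

With a 3% target instead, the exact bound is not a whole number (about 1067.1), so the function rounds up and `min_sample_size("1.96", "0.03")` returns 1068.
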