© 2023 Khan Academy

# Sample size for a given margin of error for a mean

Calculate the approximate sample size required to obtain a desired margin of error in a confidence interval for a mean.

## Want to join the conversation?

- I tried t* times SE/square-root(n) with all the answers, and only answers C (approximately 8.69) and D (approximately 6.82) produced a margin of error of no more than 10. Using sample sizes of 5 and 7 would result in margins of error of approximately 14.30 and 11.02, respectively. As Sal Khan said in some previous videos, if we don't know the standard deviation of the true population and instead use the standard deviation from a sample (aka Standard Error) to calculate the margin of error with the formula z* times sigma/square-root(n), the result will not reach the desired confidence level and will often underestimate it. Here he did the exact same thing he said before we shouldn't be doing... And the results I calculated also showed that the answer should be C (the smallest sample size to get a margin of error <= 10) instead of B... I am lost now... can anyone help?(9 votes)
- A correction of terminology:

sample SD is just sample SD

SE of the mean is equal to (sample SD)/sqrt(n)

The thing is that here, we don't have a sample standard deviation. But we do have a pilot study that tells us that the *population* SD is 15 km, and so hopefully, we can trust that. And if we use the population SD, we use z*, not t*.

It is still possible that the n Sal calculated will not give a ME that is less than 10. Maybe the pilot study was completely wrong, or the sample SD that we get once we actually start the experiment ends up grossly overestimating the population SD.(4 votes)

- Why did Khan choose to use the z-score instead of the t-table for this problem, given that they want to take a sample?(4 votes)
- You need to know the degrees of freedom (df), which requires you to know the sample size, to use a t-table. Since the sample size is what we are trying to find out, we cannot use a t-table.(5 votes)

- I share the other commentators' sentiment. If we use the sample size n=7 and apply the appropriate t critical value for df=6, we'll see that the margin of error is about 11 which is 10% higher than the target 10. It is obvious that using z score for such a small sample is not wise.

Wouldn't it be more prudent to calculate the ME for each of the presented values (only 4 of them, after all) and choose one which *actually* satisfies the condition? At least as far as this example goes, the solution provided in the video is unsatisfactory.(5 votes)

- My TI-84 Plus calculator doesn't have the tail function so I get a z-value of 1.28 as a result. Furthermore, I checked the z-table and 90% is 1.28. How come you got 1.64 for it?(3 votes)
- The probability of being inside the interval is 90%, but the z-table takes the probability of being inside or below the interval, which is 95% ⇒ 𝑧 ≈ 1.64(3 votes)
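
The distinction in the reply above can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_star` is made up for illustration and is not from the video or the calculator.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_star(confidence: float) -> float:
    """Critical z value that leaves `confidence` of the area in the center."""
    # A central area of 90% leaves 5% in each tail, so the cumulative
    # (left-tail) area up to the upper cutoff is 0.90 + 0.05 = 0.95.
    cumulative_area = (1.0 + confidence) / 2.0
    return STANDARD_NORMAL.inv_cdf(cumulative_area)


# Reading 0.90 as a left-tail area gives the 90th percentile, about 1.28.
left_tail_90 = STANDARD_NORMAL.inv_cdf(0.90)

# Reading 0.90 as a central area gives the critical value, about 1.645.
center_90 = z_star(0.90)

print(round(left_tail_90, 3), round(center_90, 3))
```

A calculator without a tail option behaves like the left-tail call, so entering 0.95 instead of 0.90 should give the critical value for a 90% interval.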

- please judge this:

if continue with t then min n = 9

more details:

using the inequality expression that

t*(15/n^0.5) </= 10 leads to

n^0.5 >/= 1.5 t*

we need the min n relative to max t* and also satisfies the above inequality.

looking at t table, will find min n=9

for n=7 , 7^0.5=2.65 while 1.5*1.943= 2.9, not satisfying the inequality.

for n=8, 8^0.5=2.828 while 1.5(1.895)= 2.842 not satisfying

BUT @ n=9

9^0.5 = 3 and 1.5*1.860=2.79 satisfying the inequality(3 votes)

- Can't we get an even better estimate if we take our result of n = 7 that we initially got and then calculate a t critical value based on that? So, we know that n = 7 and therefore our df = 6. Plugging into invT on the TI-84 calculator, we get that the t critical value is approximately 1.943, and solving the inequality 1.943 * 15 / sqrt(n) <= 10 gives n >= 8.496, leading to answer choice C(3 votes)
- 1.943 is the T-critical value for cumulative probability of 95%, specifically for when n = 7. The shape of T Distribution changes according to the size of n. So 1.943 is NOT the T-critical value for cumulative probability of 95% for any other amount of 'n', say 10.

You said to solve 1.943 * 15 / sqrt(n) <= 10 for 'n' to get the best estimate. The problem is that 1.943 is the T-critical value for cumulative probability of 95% for 'n=7', and it is NOT the T-critical value for cumulative probability of 95% for 'n=8.496'; if you use the TI-84 tcdf function, you will see that the cumulative probability for a T value of 1.943 when 'n=8.496' (with df of 7.496) is 0.955, which is a cumulative probability of 95.5%. {Sample Mean +/- 1.943 * 15 /8.496} is a Confidence Interval of 93.25% while {Sample Mean +/- 1.943 * 15 /8.496} is a Confidence Interval of 95%, which is what we want.(0 votes)
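
The t-based check that several commenters describe can be sketched in a few lines. This is only an illustration, under two assumptions: the pilot-study value of 15 km is plugged in where the sample standard deviation would go (as the commenters do), and the critical values are typed in by hand from a standard t-table (0.05 in the upper tail) rather than computed. The names `T_STAR_90`, `margin_of_error`, and `smallest_n` are made up for this sketch.

```python
import math

# Upper-tail 0.05 critical values from a standard t-table, keyed by degrees
# of freedom; these give a 90% two-sided interval. Hand-copied, not computed.
T_STAR_90 = {4: 2.132, 5: 2.015, 6: 1.943, 7: 1.895, 8: 1.860, 9: 1.833}


def margin_of_error(n: int, sd: float = 15.0) -> float:
    """t* times sd / sqrt(n), with t* looked up for df = n - 1."""
    return T_STAR_90[n - 1] * sd / math.sqrt(n)


def smallest_n(target: float = 10.0, sd: float = 15.0):
    """Smallest tabulated n whose margin of error is at most `target`."""
    for n in sorted(df + 1 for df in T_STAR_90):
        if margin_of_error(n, sd) <= target:
            return n
    return None  # no tabulated n is large enough


for n in sorted(df + 1 for df in T_STAR_90):
    print(n, round(margin_of_error(n), 2))
print("smallest n:", smallest_n())
```

Because t* depends on n, the inequality cannot be solved in a single algebraic step the way the z version can; stepping through candidate values of n, as here, sidesteps that.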

- What if we don't have the population variance, and only have the sample variance, and we can't use the t-interval with the sample variance to find 'n', what to do?(3 votes)
- You can test each value of n from the available options. This would be a lot faster than the alternative method, which is expanding t* to its algebraic form and then solving the inequality.(0 votes)

- When do we use the Z-score and t-test?

If we are using the sample standard deviation in z score, then what is the point using t-test? Can't we use only z score to calculate the confidence interval?

Also, can both tests be used interchangeably if we have the degrees of freedom and the sample standard deviation?(2 votes)

- Can someone explain to me why he listed X +/- at the beginning and then didn't use it? In what case would I use that?(2 votes)
- So what's the difference between the pilot study and the actual study in a real life scenario? Can we estimate the population mean from a pilot study?

And how can we estimate the sample size for the pilot study?(2 votes)

## Video transcript

- [Instructor] Nadia wants to create a confidence interval to estimate the mean driving range for her company's new electric vehicle. She wants the margin of error to be no more than 10 kilometers at a 90% level of confidence. A pilot study suggests that the driving ranges for this type of vehicle have a standard deviation of 15 kilometers. Which of these is the smallest approximate sample size required to obtain the desired margin of error? So pause this video and see if you can think about this on your own.

So the traditional way that we would construct a confidence interval is we take a sample, and from that sample we calculate the mean, and then we add or subtract a margin of error around that to construct the confidence interval. And the way we've done that, since we're dealing with means, is we say, alright, if we don't know the standard deviation of the population, it's appropriate to use the t statistic. So our critical value we denote as t star, and you'd multiply that times the sample standard deviation divided by the square root of your sample size.

Now, this question is all
about what is an appropriate sample size, given that we wanna have a 90% level of confidence. And what's tricky here is, when you're using a t table, not only do you need to know the 90% level of confidence, you also need to know the degrees of freedom. And the degrees of freedom is gonna be n minus one. But we don't know what that's gonna be without knowing n, so how would we determine an n? Similarly, you don't know what your sample standard deviation is going to be until you actually take some samples.

So instead of that, we know that another legitimate way to construct a confidence interval and the margin of error is to say, alright, I can take my sample mean and I can add or subtract a z score, a critical value this time using a z table, which I multiply times the true population standard deviation and divide by the square root of n. Now, you might say, well, I don't know the true population standard deviation.

But they tell us a pilot study
suggests that the driving ranges for this type of vehicle have a standard deviation of 15 kilometers. So we could use this as an estimate of our true population standard deviation. So this is 15 kilometers right over here. And then the good thing about a z table is you don't have to think about degrees of freedom. You could just look up your confidence level.

And so then we could just say that z star times 15 kilometers over the square root of n, this right over here is our margin of error, and that has to be no more than 10 kilometers. So that has to be less than or equal to 10. And we can figure out what z star needs to be for a 90% confidence level, and then we just solve for n. So let's do that.

Now, to figure out z star
I could use the z table, but just for a diversity of methodology let's use a calculator here. So to figure out the z value that would give us a 90% confidence interval, I can use a function called inverse norm. And you can see that right over here, that's choice three. Let me just select that. And what it'll do is, you give it the area that you want under the normal curve. You can even specify the mean and the standard deviation, although you want the mean to be zero and the standard deviation to be one if you really want to figure out a z score here. And so what it'll do is, it will give you the z score that will give you that corresponding area.

And so I want, and actually it's already selected, the center area to be 90%. So I could say .9 right over here. If I use the left tail, then, well, having 90% in the center means I would have 5% in either tail. So instead of doing .9 in the center, I could've done .05 and used the left tail, or used .05 and used the right tail. But this is exactly what I want. So let me just go and paste this. And so this should give me the appropriate z value.

So there you go. If I want this middle 90%, the center 90%, I have to go roughly 1.645 standard deviations below the mean and that same amount above the mean. So our critical value here is approximately 1.645. So we have 1.645 times 15 over the square root of n is going to be less than or equal to 10.

And so now there's a couple of
ways that you could do this. We could do a little bit of algebra to simplify this inequality, and I encourage you to do so, or you could even try out some values here and see which of these n's would make this true, and we want the smallest possible one. I'll do it the algebraic way, because if you're actually doing this in the real world, Nadia would not have a multiple choice right over here. She'd have to figure out the sample size in order to conduct her study. So let's do that.

So let's see. If I divide both sides by 1.645 and 15, what do I get? I get one over the square root of n is less than or equal to 10 over 1.645, over 15. And if I take the reciprocal of both sides, I get the square root of n is greater than or equal to, since I'm taking the reciprocal of both sides, 1.645 times 15, all of that over 10. See, 15 over 10 is just 1.5, so let me just write that as 1.5 right over here. And then if I square both sides, I would get that n needs to be greater than or equal to 1.645 times 1.5, and then all of that squared.

And so let's get our calculator back. We are going to have 1.645 times 1.5, and then we want to square it, and we get approximately 6.09. So n has to be greater than or equal to 6.09. And of course our sample size needs to actually be a whole number. So what's the smallest whole number that is larger than 6.09? Well, that's going to be seven. So that would be this choice right over here. This is the smallest approximate sample size required to obtain the desired margin of error.

And of course we won't really know until we actually conduct the study. We obviously here used an estimate of the population standard deviation. And we used a z table, but it will be interesting when Nadia actually conducts the study to see if her margin of error is indeed no more than 10 kilometers with a 90% level of confidence.
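
The calculation in the transcript can be reproduced with a short script. This is a minimal sketch, assuming the pilot-study standard deviation is treated as the population value, as the video does; the function name `min_sample_size` and its parameters are made up for illustration.

```python
import math
from statistics import NormalDist


def min_sample_size(sigma: float, max_margin: float, confidence: float) -> int:
    """Smallest whole n with z* * sigma / sqrt(n) <= max_margin."""
    # Central area `confidence` corresponds to cumulative area (1 + confidence) / 2.
    z_star = NormalDist().inv_cdf((1.0 + confidence) / 2.0)
    # Rearranging the inequality gives n >= (z* * sigma / max_margin) ** 2.
    n_lower_bound = (z_star * sigma / max_margin) ** 2
    # A sample size must be a whole number, so round up.
    return math.ceil(n_lower_bound)


# Nadia's problem: sigma = 15 km, margin of error at most 10 km, 90% confidence.
n = min_sample_size(sigma=15.0, max_margin=10.0, confidence=0.90)
print(n)
```

Rounding up rather than to the nearest whole number matters here: a bound of about 6.09 rounds to 6 by ordinary rounding, but n = 6 would not satisfy the inequality.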