## Estimating a population mean

# Conditions for valid t intervals

AP.STATS: UNC‑4 (EU), UNC‑4.P (LO), UNC‑4.P.1 (EK)

## Video transcript

- [Instructor] Flavia wanted
to estimate the mean age of the faculty members
at her large university. She took an SRS, or simple random sample, of 20 of the approximately
700 faculty members, and each faculty member in
the sample provided Flavia with their age. The data were skewed to the
right with a sample mean of 38.75. She's considering using her data to make a confidence interval
to estimate the mean age of faculty members at her university. Which conditions for
constructing a t interval have been met? So pause this video and
see if you can answer this on your own. Okay, now let's try to
answer this together. So there's 700 faculty members over here. She's trying to estimate the
population mean, the mean age. She can't talk to all 700,
so she takes a sample, a simple random sample of 20,
so the n is equal to 20 here. From this 20, she calculates
a sample mean of 38.75. Now ideally, she wants to
construct a t interval, a confidence interval,
using the t statistic and so that interval would
look something like this. It would be the sample mean
plus or minus the critical value times the sample standard deviation divided by the square root of n. And we use a t statistic
like this and a t table and a t distribution when
we are trying to create confidence intervals for means
where we don't have access to the standard deviation of
the sampling distribution, but we can compute the
sample standard deviation. Now in order for this to hold
true, there's three conditions just like what we saw when
we thought about z intervals. The first is that our sample is random. Well, they tell us here that she took a simple
random sample of 20, and so we know that we are
meeting that constraint, and that's actually choice A, the data is a random sample
from the population of interest, so we can circle that in. So the next condition
is the normal condition. Now the normal condition
when we're doing a t interval is a little bit more involved
because we do need to assume that the sampling distribution
of the sample means is roughly normal. Now there's a couple of
ways that we can get there. Either our sample size is
greater than or equal to 30. The central limit theorem tells us that then our sampling distribution, regardless of what the
distribution is in the population, that the sampling distribution
actually would then be approximately normal. She didn't meet that
constraint right over here. Here, her sample size is only 20, so, so far, this isn't looking good. Now that's not the only way
to meet the normal condition. Another way to meet the normal condition, if we have a smaller
sample size, smaller than 30, is if the original
distribution of ages is normal, so original distribution normal, or even if it's roughly
symmetric around the mean, so approximately symmetric. But if we look at this, they tell us that it has right skew. They say the data were skewed to the right with the sample mean of 38.75. So that tells us that the data set that we're getting in our
sample is not symmetric, and the original distribution
is unlikely to be normal. Think about it. You could have faculty
members who are 30 years older than this, 68 and three quarters, but you're very unlikely
to have faculty members who are 30 years younger than this, and that's actually what's
causing that skew to the right. So this one does not meet
the normal condition. We can't feel good that
our sampling distribution of the sample means is going to be normal, so I'm not gonna fill that one in. Choice C: Individual
observations can be considered independent. So there's two ways to
meet this constraint. One is if we sample with replacement. Every faculty member we look at, after asking them their age, we say, "Hey, go back into the pool
and we might pick 'em again," until we get our sample of 20. It does not look like she did that. It doesn't look like she
sampled with replacement, but even if you're
sampling without replacement, the 10% rule says that, "Look,
as long as this is less than or equal
to 10% of the population, then we're good," and
10% of this population is 70; 70 is 10% of 700,
and so this is definitely less than or equal to 10%,
and so it can be considered independent, and so we can
actually meet that constraint as well. So the main issue where our t
interval might not be so good is that our sampling distribution, we can't feel so confident that
that is going to be normal.
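
The three checks walked through above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original exercise: the function name `t_interval_conditions` and its parameter names are made up here, and the only numbers used are the ones given in the problem (a sample of 20 from about 700 faculty members, taken as an SRS, with right-skewed data).

```python
def t_interval_conditions(n, population_size, is_random_sample,
                          roughly_symmetric, with_replacement=False):
    """Report which of the three t interval conditions are met.

    Hypothetical helper for illustration; the rules follow the transcript:
    random sample, normal condition, and independence (10% rule).
    """
    # Random condition: the data come from a random sample.
    random_ok = is_random_sample

    # Normal condition: n >= 30 (central limit theorem), or the original
    # distribution is normal / roughly symmetric.
    normal_ok = n >= 30 or roughly_symmetric

    # Independence: sampling with replacement, or the sample is at most
    # 10% of the population (written as 10 * n <= N to stay in integers).
    independent_ok = with_replacement or 10 * n <= population_size

    return {"random": random_ok, "normal": normal_ok,
            "independent": independent_ok}


# Flavia's situation: SRS of 20 out of about 700, data skewed to the right.
flavia = t_interval_conditions(n=20, population_size=700,
                               is_random_sample=True,
                               roughly_symmetric=False)
print(flavia)  # {'random': True, 'normal': False, 'independent': True}
```

The result matches the reasoning in the video: the random and independence conditions are met, and the normal condition is the one that fails.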