### Course: AP®︎/College Statistics > Unit 11

Lesson 2: Setting up a test for a population mean

# Conditions for a t test about a mean

Example showing how to check which conditions have been met for a t test about a mean.

## Want to join the conversation?

- I don't get how you would do this type of problem where the standard deviation is from the population and not the sample:

"The amount of money collected each week by a city’s parking meter staff has been recorded for a decade and has an average of $35,800 and a standard deviation of $720. In February, a supervisor noticed that the weekly collection appeared to be smaller than usual. For the most recent 9 weeks, the average weekly collection was $35,200. Is there significant evidence that someone among the meter staff has been skimming some of the collection into their own pockets? Assume that the data for the most recent 9 weeks can be viewed as a random sample and that the amount of money collected each week has a normal distribution. Test at level α = 0.05." (2 votes)

- Since we know the population standard deviation, we can perform a z-test.

First of all, the conditions are met:

9 weeks is less than 10% of a decade.

We can view the 9 weeks as a random sample.

The population is normally distributed.

𝑧 = (35200 − 35800)∕(720∕√9) = −2.5,

which gives us the probability of a random sample mean being $35,200 or less as 𝑝 = 0.0062 < 𝛼 = 0.05,

meaning that the difference between the population mean and our sample mean is most likely not due to random chance, and we have significant evidence that someone is skimming off the top.

(Note that we immediately jump to the conclusion that someone is skimming, though the decrease in money could be due to other reasons.

Maybe the parking company lowered the fee, or there was a virus outbreak that caused people to stay home and not use public parking places as much.)(4 votes)
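The arithmetic in the answer above can be reproduced with a short Python sketch. It assumes the one-sided (lower-tail) alternative implied by the question; the function name `lower_tail_z_test` is made up for illustration and is not from the original thread.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_z_test(sample_mean, pop_mean, pop_sd, n):
    """One-sample z test with alternative 'true mean < pop_mean'.

    Hypothetical helper: assumes the population standard deviation is known
    and the sampling distribution of the sample mean is normal.
    """
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # P(Z <= z) under the standard normal distribution
    p_value = NormalDist().cdf(z)
    return z, p_value


# Numbers taken from the parking meter question
z, p = lower_tail_z_test(sample_mean=35200, pop_mean=35800, pop_sd=720, n=9)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these inputs the standard error is 720/3 = 240, so z = −600/240 = −2.5, matching the value in the answer.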

- Sal seems to use the expressions "significance test" and "hypothesis testing" interchangeably. However, according to https://stats.stackexchange.com/a/16227/82135, it seems that these terms actually refer to different tests. So, is Sal using the terms interchangeably in this video, or am I missing that he is really talking about a significance test rather than a hypothesis test? (1 vote)
- Yes, I think so too; it's a mistake on his part. He should have clarified the difference. (1 vote)

- (0:12) What if you send 100 messages every day and everybody says, "Stop, do not do that"? (0 votes)

## Video transcript

- [Instructor] Sunil and his friends have been using a group messaging app for over a year to chat with each other. He suspects that, on average, they send more than 100 messages per day. Sunil takes a random sample of seven days from their chat history and records how many messages were sent on those days. The sample data are strongly skewed to the right with a mean of 125 messages and a standard deviation of 44 messages. He wants to use these sample data to conduct a t test about the mean. Which conditions for performing this type of significance test have been met?

So let's just think about what's going on here. Sunil might have some type of a null hypothesis. Maybe he got this 100 from a magazine article that says that, on average, teenagers send 100 text messages per day. And so maybe the null hypothesis is that the mean number of messages per day that he and his friends send, which is signified by mu, is 100, that they're no different than all other teenagers. And his alternative hypothesis would be what he suspects, and they actually say it right over here: that they send more than 100 text messages per day.

And so what he does is he takes a sample from the population of days, and there are over 365. They said they've been using the group messaging app for over a year. And he takes seven of those days. So n is equal to seven. And from that he calculates sample statistics. He calculates the sample mean, which is trying to estimate the true population mean. And he also is able to calculate a sample standard deviation.

And what you do in a significance test is you say, well, what is the probability of getting this sample mean or something even more extreme, assuming the null hypothesis is true? And if that probability is below a preset threshold, then you would reject the null hypothesis and it would suggest the alternative. But in order to feel good about that significance test and be able to even calculate that p-value with confidence, there are conditions for performing this type of significance test.

The first is that this is truly a random sample. And that's known as the random condition. And you've seen this before when we did significance tests with proportions; here we're doing it with means. Population mean, sample mean; in the past we did it with population proportion and sample proportion. Well, the random condition, it says it right here: Sunil takes a random sample of seven days from their chat history. They don't say how they did it, but we'll just take their word for it that it was a random sample.

The next condition is sometimes known as the independence condition, and that's that the individual observations in our sample are roughly independent. One way that they would be independent for sure is if Sunil is sampling with replacement. They don't say that. But another way where you could feel it's roughly independent is if your sample size is less than or equal to 10% of the population. Now in this situation, he took a sample size of seven. And then the population of days, it says that they've been using the group messaging app for over a year. So they've been using it for over 365 days. So seven is for sure less than or equal to 10% of 365, which would be 36.5. So we meet this condition, which allows us to meet the independence condition.

Now the last condition is often known as the normal condition. And this is to feel good that the sampling distribution of the sample means is approximately normal. And this is going to be a little bit different than what we saw with significance tests when we dealt with proportions. There are a few ways to feel good that the sampling distribution of the sample means is normal.

One is, is the underlying parent population normal? Now they don't tell us anything that says there's actually a normal distribution for the number of messages that they send on a given day. So we don't know this one for sure. But sometimes you might.

Another way is to feel good that our sample size is greater than or equal to 30. And this comes from the central limit theorem, that then our sampling distribution is going to be roughly normal. But we see very clearly our sample size is not greater than or equal to 30, so we don't meet that constraint either.

Now the third way that we could feel good that our sampling distribution of our sample mean is roughly normal is if our sample is symmetric and there are no outliers, or maybe say no significant outliers. Now is this the case? Well, it says right over here, the sample data are strongly skewed to the right with a mean of 125 messages and a standard deviation of 44 messages. So this is strongly skewed to the right; it's clearly not symmetric sample data. And so we don't meet any of these sub-conditions for the normal condition. And so we do not meet the normal condition for our significance test.
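
The three checks walked through in the transcript can be sketched in Python. This is only a minimal illustration, not something shown in the video: the function name `check_t_test_conditions` and its parameters are hypothetical, and the population size of 365 days is an assumed lower bound, since the problem only says "over a year".

```python
def check_t_test_conditions(
    n,
    population_size,
    random_sample,
    sampled_with_replacement=False,
    population_normal=False,
    sample_symmetric_no_outliers=False,
):
    """Return which conditions for a one-sample t test appear to be met.

    Hypothetical helper that encodes the rules of thumb from the transcript.
    """
    # Independence: sampling with replacement, or n <= 10% of the population
    independence = sampled_with_replacement or n <= 0.10 * population_size
    # Normal: parent population normal, or n >= 30, or a roughly symmetric
    # sample with no significant outliers
    normal = population_normal or n >= 30 or sample_symmetric_no_outliers
    return {
        "random": random_sample,
        "independence": independence,
        "normal": normal,
    }


# Sunil's situation: 7 days sampled at random from at least 365 days,
# parent distribution unknown, sample strongly skewed to the right.
result = check_t_test_conditions(
    n=7,
    population_size=365,  # assumed lower bound ("over a year")
    random_sample=True,
    population_normal=False,  # not stated in the problem, so not assumed
    sample_symmetric_no_outliers=False,
)
print(result)
```

For Sunil's data this reports the random and independence conditions as met and the normal condition as not met, in line with the conclusion of the transcript.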