# Conditions for a z test about a proportion

AP.STATS: VAR‑6 (EU), VAR‑6.F (LO), VAR‑6.F.1 (EK)

## Video transcript

- [Instructor] Jules works on a small team of 40 employees. Each employee receives an annual rating, the best of which is "exceeds expectations." Management claimed that 10% of employees earn this rating, but Jules suspected it was actually less common. She obtained an anonymous random sample of 10 ratings for employees on her team. She wants to use the sample data to test her null hypothesis that the true proportion is 10% versus her alternative hypothesis that the true proportion is less than 10%, where p is the proportion of all employees on her team who earned "exceeds expectations." Which conditions for performing this type of test did Jules' sample meet?

And when they're saying which conditions, they are talking about the three conditions we've seen before: the random condition, the normal condition, and the independence condition. So I will let you pause the video now and try to figure this out on your own, and then we will review each of these conditions and think about whether Jules' sample meets the conditions that we need to feel good about our significance test.

All right, now let's work
through this together. So let's just remind ourselves what we're going to do in a significance test. We have our null hypothesis. We have our alternative hypothesis. What we do is we look at the population. The population size here is 40: there are 40 employees on Jules' team. We take a sample, in Jules' case a sample of size 10, and then we calculate a sample statistic, in this case a sample proportion, which is equal to, let's just call it p hat sub one.

And then we want to calculate a p-value. Just as a bit of review, a p-value is the probability of getting a result at least as extreme as this one if we assume our null hypothesis is true. In this particular case, because she suspects that fewer than 10% are getting the "exceeds expectations" rating, this would be the probability of your sample statistic being less than or equal to the one that you just calculated for a sample size of n equals 10, given that your null hypothesis is true. If this p-value is less than your predetermined significance level, maybe that's 5% or 10%, but you'd want to decide that ahead of time, then you would reject your null hypothesis, because the probability of getting this result seems pretty low, which would suggest the alternative. But if the p-value is not less than the significance level, then you wouldn't be able to reject the null hypothesis.

But the key thing, and this is what this question is all about, is that in order to feel good about this calculation, we need to make some assumptions about the sampling distribution. We have to assume that it's reasonably normal, so that it can actually be used to calculate this probability, and that's where these conditions come into play.

The first is the random condition, and that's that the data
points in this sample were truly randomly selected. So pause this video. Did she meet the random condition? Well, it says she obtained an anonymous random sample of 10 ratings of employees on her team. They don't say how she did it, but we'll take their word for it that it was an anonymous random sample, so she meets the random condition.

Now what about the normal condition? The normal condition tells
us that the expected number of successes, which would be our sample size times the true proportion, n times p, and the expected number of failures, sample size times one minus p, each need to be at least 10. So they need to be greater than or equal to 10. Now what are they for this particular scenario? Well, n is equal to 10, and our true proportion, remember, when we do the significance test we assume the null hypothesis is true, and the null hypothesis tells us that our true proportion is 0.1. So p is 0.1, and one minus p is one minus 0.1, which is 0.9. Well, 10 times 0.1 is one, so that's not greater than or equal to 10. So just off of that, we don't meet the normal condition. And for the second one, 10 times 0.9 is nine. That's also not greater than or equal to 10, so we don't meet this normal condition. We can't feel good that the sampling distribution is roughly normal, which we normally assume when we're trying to make this type of calculation.

And then last but not least, independence. The independence condition lets us feel good
that the data points in your sample are independent of one another: whether one is a success or a failure doesn't affect the others. Now if she was sampling these people with replacement, you would definitely meet this independence condition. She didn't do it with replacement, but there's another way to go about it. You could use your 10% rule. If your sample size is less than 10% of the population size, then it's considered roughly okay that you didn't do it with replacement. But her sample of 10 is 25% of the population of 40, clearly greater than 10%, and so she does not meet the independence condition either.

And so if she went and tried to calculate this p-value, assuming a sampling distribution that is roughly normal, I would not feel so good about her results, 'cause she failed to meet two of these three conditions.
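
The three checks walked through in the video can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original lesson: the function name `check_conditions` and its parameters are hypothetical, and the random condition is passed in as a flag because it depends on how the data were collected rather than on any arithmetic.

```python
def check_conditions(n, population_size, p0, is_random_sample,
                     with_replacement=False):
    """Check the three conditions for a z test about a proportion.

    n                 -- sample size
    population_size   -- size of the population being sampled
    p0                -- proportion assumed under the null hypothesis
    is_random_sample  -- True if the sample was randomly selected
    with_replacement  -- True if sampling was done with replacement
    (All names here are assumptions made for this sketch.)
    """
    # Normal condition: expected successes and failures are both at least 10,
    # computed using the null-hypothesis proportion.
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    normal = expected_successes >= 10 and expected_failures >= 10

    # Independence condition: sampling with replacement, or the 10% rule
    # (sample size less than 10% of the population, as stated in the video).
    sample_fraction = n / population_size
    independent = with_replacement or sample_fraction < 0.10

    return {
        "random": is_random_sample,
        "normal": normal,
        "independent": independent,
        "expected_successes": expected_successes,
        "expected_failures": expected_failures,
        "sample_fraction": sample_fraction,
    }


# Jules' scenario: 10 ratings sampled without replacement from 40 employees,
# with a null-hypothesis proportion of 0.1.
result = check_conditions(n=10, population_size=40, p0=0.1,
                          is_random_sample=True)
for name in ("random", "normal", "independent"):
    print(f"{name} condition met: {result[name]}")
```

For Jules' numbers, the expected counts are 1 and 9 and the sample is 25% of the population, so only the random condition comes back as met, which matches the conclusion in the video.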