# Conditions for a z test about a proportion

Examples showing how to check whether or not the conditions have been met for doing a z test about a proportion.

## Want to join the conversation?

• In my stats class we were taught that in order to be normal the pn and qn had to be greater than or equal to 5. Which way is standard?
(21 votes)
• Where were these three conditions initially introduced and explained in this series of videos?
I went through the list of content covered under "Statistics and probability" topics, but I couldn't find any. Could anyone send a link please?
(6 votes)
• Is it a rule that np,n(1-p)>=10? [For the normal condition to be true]
(5 votes)
• When the p-value is being elaborated on, shouldn't it be the population statistic being less than or equal to the one we calculated for the sample? I guess we are trying to make assumptions about all of the workers?
(3 votes)
• Firstly, to clarify, it's called 'statistic' for samples, but 'parameter' for a population.

Secondly, p-value is the probability that a sample statistic will be at least as extreme as the sample statistic we actually measured, on the condition that our null hypothesis is true. At this stage we are comparing sample statistics, real and expected, not a population parameter.
(2 votes)
• If we had known what Jules's sample proportion would've been, then when we're testing the normal condition we would have used n*(p_hat) and n*(1 - p_hat) instead of np and n(1-p), right??

From my understanding, we're looking at the conditions for Jules's samples (and thus the sample proportion p-hat which may not necessarily be the same as p), not the population so in the first place it seemed weird to me we were using p in our calculations
(1 vote)
• conduct a hypothesis test to determine if the proportion of business students who were involved in some sort of cheating is less than that of nonbusiness students
(1 vote)
• what is the title "z test" about? thanks.
(1 vote)

## Video transcript

- [Instructor] Jules works on a small team of 40 employees. Each employee receives an annual rating, the best of which is "exceeds expectations." Management claimed that 10% of employees earn this rating, but Jules suspected it was actually less common. She obtained an anonymous random sample of 10 ratings for employees on her team. She wants to use the sample data to test her null hypothesis that the true proportion is 10% versus her alternative hypothesis that the true proportion is less than 10%, where p is the proportion of all employees on her team who earned "exceeds expectations." Which conditions for performing this type of test did Jules' sample meet?

When they say "which conditions," they are talking about the three conditions we've seen before: the random condition, the normal condition, and the independence condition. So I will let you pause the video now and try to figure this out on your own, and then we will review each of these conditions and think about whether Jules' sample meets the conditions that we need to feel good about our significance test.

All right, now let's work through this together. So let's just remind ourselves what we're going to do in a significance test. We have our null hypothesis. We have our alternative hypothesis. What we do is we look at the population. The population size here is 40, since there are 40 employees on Jules' team. We take a sample, in Jules' case a sample of size 10, and then we calculate a sample statistic, in this case a sample proportion, which let's just call p hat sub one. And then we want to calculate a p-value. And just as a bit of review, a p-value is the probability of getting a result at least as extreme as this one if we assume our null hypothesis is true.
And in this particular case, because she suspects that fewer than 10% are getting "exceeds expectations," this would be the probability of your sample statistic being less than or equal to the one that you just calculated for a sample size of n equals 10, given that your null hypothesis is true. And if this p-value is less than your predetermined significance level, maybe that's 5% or 10%, but you'd want to decide that ahead of time, then you would reject your null hypothesis, because the probability of getting this result seems pretty low, which would suggest the alternative. But if the p-value is not less than the significance level, then you wouldn't be able to reject the null hypothesis.

But the key thing, and this is what this question is all about: in order to feel good about this calculation, we need to make some assumptions about the sampling distribution. We have to assume that it's reasonably normal, so that it can actually be used to calculate this probability, and that's where these conditions come into play.

The first is the random condition, and that's that the data points in this sample were truly randomly selected. So pause this video. Did she meet the random condition? Well, it says she obtained an anonymous random sample of 10 ratings for employees on her team. They don't say how she did it, but we'll take their word for it that it was an anonymous random sample, so she meets the random condition.

Now what about the normal condition? The normal condition tells us that the expected number of successes, which would be our sample size times the true proportion, and the expected number of failures, sample size times one minus p, need to be at least 10. So they each need to be greater than or equal to 10. Now what are they for this particular scenario?
Well, n is equal to 10, and our true proportion, remember, when we do the significance test we assume the null hypothesis is true, and the null hypothesis tells us that our true proportion is 0.1. So p is 0.1, and one minus p is one minus 0.1, which is 0.9. Well, 10 times 0.1 is one, so that's not greater than or equal to 10. So just off of that, we don't meet the normal condition. And the second one fails too: 10 times 0.9 is nine. That's also not greater than or equal to 10, so we don't meet this normal condition. We can't feel good that the sampling distribution is roughly normal, which we normally assume when we're trying to make this type of calculation.

And then last but not least, independence. The independence condition is about feeling good that the data points in your sample are independent of each other: whether one is a success or a failure doesn't affect the others. Now if she had sampled these ratings with replacement, you would definitely meet this independence condition. She didn't sample with replacement, but there's another way to go about it. You could use the 10% rule: if your sample size is less than 10% of the population size, then it's considered roughly okay that you didn't sample with replacement. But her sample here is 10 out of 40, which is 25% of the population, clearly greater than 10%, and so she does not meet the independence condition either.

And so if she went and tried to calculate this p-value, assuming a sampling distribution that is roughly normal, I would not feel so good about her results, 'cause she didn't meet two of these three conditions.
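The three checks from the transcript can be sketched in a few lines of Python. This is only an illustration, not part of the lesson: the function name `check_z_test_conditions` and its parameters are hypothetical, the random condition is passed in as a flag because it depends on how the data were collected rather than on any arithmetic, and the thresholds (at least 10 expected successes and failures, sample smaller than 10% of the population) are the ones stated in the video.

```python
def check_z_test_conditions(n, p0, population_size,
                            random_sample, with_replacement=False):
    """Check the three conditions for a one-sample z test about a proportion.

    n                -- sample size
    p0               -- proportion assumed under the null hypothesis
    population_size  -- size of the population being sampled
    random_sample    -- True if the sample was randomly selected (taken on trust)
    with_replacement -- True if sampling was done with replacement
    """
    # Normal condition: expected successes and failures are each at least 10,
    # computed with the null-hypothesis proportion p0.
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    normal = expected_successes >= 10 and expected_failures >= 10

    # Independence condition: sampling with replacement, or the 10% rule
    # (sample size is less than 10% of the population size).
    sampling_fraction = n / population_size
    independent = with_replacement or sampling_fraction < 0.10

    return {
        "random": random_sample,
        "normal": normal,
        "independent": independent,
        "expected_successes": expected_successes,
        "expected_failures": expected_failures,
        "sampling_fraction": sampling_fraction,
    }


# Jules' scenario: 10 ratings sampled without replacement from a team of 40,
# with a null-hypothesis proportion of 0.1.
jules = check_z_test_conditions(n=10, p0=0.1, population_size=40,
                                random_sample=True)
for name in ("random", "normal", "independent"):
    print(name, "condition met:", jules[name])
```

For Jules' numbers the expected counts are 1 success and 9 failures, and the sample is 25% of the population, so only the random condition is met, which matches the conclusion in the video.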