## Confidence intervals for the difference between two proportions

# Examples identifying conditions for inference on two proportions

AP Stats: UNC‑4 (EU), UNC‑4.J (LO), UNC‑4.J.1 (EK)

## Video transcript

- [Instructor] A sociologist suspects that men are more likely to have received a ticket for speeding than women are. The sociologist wants to sample people and create a two-sample Z interval (in other videos we
introduced what that idea is) to estimate the difference between the proportion of men who have received a speeding ticket and the proportion of women who have received a speeding ticket. Which of the following are conditions for this type of interval? Choose all answers that apply. So, like always, pause this video and see if you can answer it on your own. All righty, let's review our
conditions for inference. So, you have your random condition, and these are the same ones that we have talked about when we were dealing with one sample, but now we just have to make sure that they apply to both samples, that we feel
good that both samples are randomly selected. The second one is the normal condition and this is to feel good that the sampling distribution
of the sample proportion for each of the samples is roughly normal and so, what you have to do is you take the sample size of the first sample times
the sample proportion of the first sample and that needs to be
greater than or equal to 10. You take the sample
size of the first sample times one minus the sample
proportion of the first sample, that should also be greater
than or equal to 10. Another way to think about it is your best sense of the
expected number of successes and failures should be
greater than or equal to 10 and then you do this
with the second sample. So, the sample size of the second sample, these don't have to be the same,
times the sample proportion of the second sample should be greater than
or equal to 10 as well and the sample size of the
second sample times one minus the sample proportion
of the second sample, that needs to be greater
than or equal to 10. All of this needs to be true, and then the final one is
the independence condition. And we meet that if
individual observations in these samples are sampled with replacement or, even if they're not
sampled with replacement, if each sample is no more
than 10% of its population, then we meet the independence condition and once again, you've seen this before, we're now just doing it with two samples. So, let's see. Which of the following are conditions for this type of interval? So, the samples both
include at least 10 people who have received a speeding ticket and at least 10 people who haven't. Yeah, that's right. This is, you could view
this as the expected number of people who have
received a speeding ticket and this is the expected number of people who haven't received a speeding ticket or an estimate of the expected number because we're using the sample proportion instead of the true proportion. So, these need to be
greater than or equal to 10 in both samples. So, this is absolutely true. The people in each sample can be considered independent. Yeah, we have that independence condition. Either they're sampled with replacement or we are sampling no more
than 10% of the population, so this is important. And then last but not least they take separate random samples of men and women. Yeah, that's the random
condition right over here. So, they have all three
of them right over here. We have our normal condition, our independence condition and our random condition. Let's do another example. A biologist is studying a certain disease affecting oak trees in a forest. They are curious if there's a difference in the proportion of trees that are infected in the
North and South sections of the forest. They want to take a sample of trees from each section and do a two-sample Z test to test their hypotheses. Which of the following are conditions for this type of test? So, pause the video again and see if you can answer this. Okay, so we've already
reviewed our conditions for inference, so let's see which of these are the actual conditions for inference. So, both samples include at least 30 trees. So, this might have been tempting because this 30 number shows up when we're thinking about conditions for inference when
we're dealing with means but this does not come up when we're dealing with proportions. Both samples do not need
to include at least 30 trees, so this would not be one of our choices. They sample an equal number of trees from each region of the forest. This is a very common misconception that when you're doing a two-sample Z test or when you're doing a
two-sample Z interval or confidence interval that both samples have to
have the same sample size but that is actually not the case. So, we can rule that one out. They observe at least 10 trees with the disease and at least 10 trees without the disease in each sample. Yes, this is the normal condition that we just looked at, so this would be our only
choice and we're done.
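
The checks reviewed in the transcript can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the video: the function names, the population sizes, and the sample counts below are all made up for the example. Note that the sample size times the sample proportion is just the observed number of successes, so the normal condition reduces to counting successes and failures. The random condition is about how the data were collected, so it cannot be checked from the numbers alone.

```python
def large_counts_ok(successes, n, threshold=10):
    """Normal condition for one sample: at least `threshold` observed
    successes and at least `threshold` observed failures."""
    failures = n - successes
    return successes >= threshold and failures >= threshold


def ten_percent_ok(n, population_size):
    """Independence when sampling without replacement: the sample is
    no more than 10% of the population (integer arithmetic, no rounding)."""
    return 10 * n <= population_size


def check_two_proportion_conditions(x1, n1, x2, n2, pop1, pop2):
    """Check the normal and independence conditions for BOTH samples.
    Sample sizes n1 and n2 do not have to be equal."""
    return {
        "normal": large_counts_ok(x1, n1) and large_counts_ok(x2, n2),
        "independence": ten_percent_ok(n1, pop1) and ten_percent_ok(n2, pop2),
    }


# Hypothetical data (assumed for illustration): 12 infected trees out of 40
# sampled in the North, 8 infected out of 50 sampled in the South, with an
# assumed 1000 trees in each section.
result = check_two_proportion_conditions(12, 40, 8, 50, 1000, 1000)
print(result)  # {'normal': False, 'independence': True}
```

With these made-up numbers the normal condition fails, because the second sample has only 8 trees with the disease, even though both samples are well under 10% of their assumed populations.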

AP® is a registered trademark of the College Board, which has not reviewed this resource.