AP®︎/College Statistics
© 2023 Khan Academy
Examples identifying conditions for inference on two proportions
Examples identifying conditions for confidence intervals and tests about two proportions.
- Why don't the two samples have to have the same size?
- I think it has to do with the fact that we are using proportions and not any other statistic. If you think about it, 3/4 = 0.75, but so is 6/8. There are two different sample sizes, but the same proportion of successes.
Edit: However, if I were you I would just double-check with my teacher or consult Google. :)
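One way to see why equal sample sizes are not required is to look at the standard error used in a two-sample z interval for the difference of proportions. The symbols here are assumptions of this note rather than notation from the thread: $\hat{p}_1, \hat{p}_2$ are the two sample proportions and $n_1, n_2$ are the two sample sizes. Each sample contributes its own term, so each sample size enters separately and nothing forces $n_1 = n_2$.

```latex
SE_{\hat{p}_1 - \hat{p}_2}
  = \sqrt{\frac{\hat{p}_1\,(1 - \hat{p}_1)}{n_1}
        + \frac{\hat{p}_2\,(1 - \hat{p}_2)}{n_2}}
```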
- In example one it said the following: "A sociologist suspects that men are more likely to have received a ticket for speeding than women are. The sociologist wants to sample people and create a two-sample Z interval, in other videos we introduced what that idea is, to estimate the difference between the proportion of men who have received a speeding ticket and the proportion of women who have received a speeding ticket. Which of the following are conditions for this type of interval? Choose all answers that apply." My question is this: why is A the answer when it says you choose 10 people who have gotten a ticket for speeding and 10 who have not? I feel that this is a little biased, because they are divided into groups instead of chosen at random as C says. Can someone explain it a little better for me? Thanks.
- The samples are chosen at random; answer A uses the word 'include', not 'choose'. What answer A is trying to say is that AFTER sampling, there need to be at least 10 successes and at least 10 failures in each sample. If there aren't, we may need to keep doing random sampling until the condition is met.
- A biologist is studying a certain disease affecting oak trees in a forest. They are curious if there's a difference in the proportion of trees that are infected in the North and South sections of the forest. They want to take a sample of trees from each section and do a two-proportion z-test to test their hypotheses.
Which of the following are required for this type of test? Select all that apply.
The trees that are sampled are selected randomly.
They observe at least 10 trees with the disease and 10 trees without the disease in each sample.
They sample an equal number of trees from each region of the forest.
- I'm taking this chapter in school right now, and I am familiar with the Random Condition and the Normal (Large Counts) Condition, but I have never heard of the Independence Condition. Can someone explain what it is? (at 3:05)
Video transcript
- [Instructor] A sociologist suspects that men are more likely to have received a ticket for speeding than women are. The sociologist wants to sample people and create a two-sample Z interval (in other videos we
introduced what that idea is) to estimate the difference between the proportion of men who have received a speeding ticket and the proportion of women who have received a speeding ticket. Which of the following are conditions for this type of interval? Choose all answers that apply. So, like always, pause this video and see if you can answer it on your own. All righty, let's review our
conditions for inference. So, you have your random condition, and these are the same ones that we talked about when we were dealing with one sample, but now we just have to make sure that they apply to both samples, that we feel
good that both samples are randomly selected. The second one is the normal condition, and this is to feel good that the sampling distribution
of the sample proportion for each of the samples is roughly normal and so, what you have to do is you take the sample size of the first sample times
the sample proportion of the first sample and that needs to be
greater than or equal to 10. You take the sample
size of the first sample times one minus the sample
proportion of the first sample, that should also be greater
than or equal to 10. Another way to think about it is your best sense of the
expected number of successes and failures should be
greater than or equal to 10 and then you do this
with the second sample. So, the sample size of the second sample, these don't have to be the same,
times the sample proportion of the second sample should be greater than
or equal to 10 as well and the sample size of the
second sample times one minus the sample proportion
of the second sample, that needs to be greater
than or equal to 10. All of this needs to be true, and then the final one is
the independence condition. And we meet that if
individual observations in these samples are done with replacement or, even if they're not
done with replacement, if the samples are no more
than 10% of the population, then we meet the independence condition. And once again, you've seen this before, we're now just doing it with two samples. So, let's see. Which of the following are conditions for this type of interval? So, the samples both
include at least 10 people who have received a speeding ticket and at least 10 people who haven't. Yeah, that's right. This is, you could view
this as the expected number of people who have
received a speeding ticket and this is the expected number of people who haven't received a speeding ticket or an estimate of the expected number because we're using the sample proportion instead of the true proportion. So, these need to be
greater than or equal to 10 in both samples. So, this is absolutely true. The people in each sample can be considered independent. Yeah, we have that independence condition. Either they're sampled with replacement or we are sampling no more
than 10% of the population, so this is important. And then last but not least they take separate random samples of men and women. Yeah, that's the random
condition right over here. So, they have all three
of them right over here. We have our normal condition, our independence condition and our random condition. Let's do another example. A biologist is studying a certain disease affecting oak trees in a forest. They are curious if there's a difference in the proportion of trees that are infected in the
North and South sections of the forest. They want to take a sample of trees from each section and do a two-sample Z test to test their hypotheses. Which of the following are conditions for this type of test? So, pause the video again and see if you can answer this. Okay, so we've already
reviewed our conditions for inference, so let's see which of these are the actual conditions for inference. So, both samples include at least 30 trees. So, this might have been tempting because this 30 number shows up when we're thinking about conditions for inference when
we're dealing with means but this does not come up when we're dealing with proportions. Both samples do not need
to include at least 30 trees, so this would not be one of our choices. They sample an equal number of trees from each region of the forest. This is a very common misconception, that when you're doing a two-sample Z test or when you're doing a
two-sample Z interval or confidence interval that both samples have to
have the same sample size but that is actually not the case. So, we can rule that one out. They observe at least 10 trees with the disease and at least 10 trees without the disease in each sample. Yes, this is the normal condition that we just looked at, so this would be our only
choice and we're done.
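The three conditions reviewed in this transcript can be sketched as a small check. This is a minimal sketch and not part of the video: the function name `check_conditions`, its parameters, and the example counts are all hypothetical. The random condition describes how the data were collected, so it cannot be computed from the counts and is simply taken on the caller's word.

```python
def check_conditions(successes1, n1, successes2, n2,
                     population1, population2,
                     randomly_sampled=True, with_replacement=False):
    """Check the random, normal, and independence conditions for two samples.

    Hypothetical helper for illustration; all counts are observed sample counts.
    """
    failures1 = n1 - successes1
    failures2 = n2 - successes2

    # Random condition: a fact about the sampling design, supplied by the caller.
    random_ok = bool(randomly_sampled)

    # Normal condition: n * p_hat is the observed number of successes and
    # n * (1 - p_hat) is the observed number of failures; all four counts
    # must be at least 10.
    normal_ok = min(successes1, failures1, successes2, failures2) >= 10

    # Independence condition: sampling with replacement, or each sample is
    # no more than 10% of its population (written as 10 * n <= N).
    independence_ok = bool(
        with_replacement
        or (10 * n1 <= population1 and 10 * n2 <= population2)
    )

    return {
        "random": random_ok,
        "normal": normal_ok,
        "independence": independence_ok,
    }


# Made-up example: the two sample sizes differ (60 and 45), which is allowed.
result = check_conditions(18, 60, 12, 45, population1=5000, population2=4000)
print(result)
```

Note that the check never compares `n1` with `n2`, and no threshold of 30 appears anywhere, which matches the two choices ruled out in the second example.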