# Comparing population proportions 2

Sal continues the election example for population proportions. Created by Sal Khan.

## Want to join the conversation?

• I understood almost everything, but the last sentence still sounds obscure to me. Why should we conclude, from the results we get, that men would vote for candidate one and not zero?
• That's not quite the conclusion Sal reached. There's only one candidate of interest in this case, and we conclude that men appear more likely to vote for that person than women are. Recall that p is the proportion who vote for the candidate, and that p(sub 1) minus p(sub 2) is the difference between the proportion of men who vote for the candidate and the proportion of women who do likewise. If p(sub 1) minus p(sub 2) is positive (between 0.008 and 0.094), it suggests that men are more likely than women to vote for the candidate.
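For readers who want to check the 0.008 to 0.094 interval themselves, here is a minimal Python sketch of the usual large-sample interval for a difference of two proportions. The men's figure (642 of 1000) is quoted elsewhere in this thread; the women's figure (591 of 1000) is an assumption, not stated here, chosen because it reproduces the quoted interval. The function name `diff_proportion_ci` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample (Wald) confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Standard error of the difference between two independent sample proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# 642 of 1000 men is quoted in the thread; 591 of 1000 women is an assumption.
low, high = diff_proportion_ci(642, 1000, 591, 1000)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

Under those assumed counts, the whole interval sits above zero, which is what the answer above relies on.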
• At the end Sal says that men are definitely more likely to vote for the candidate. OK, I get that. But these numbers are so small (0.8% to 9.4%), and you aren't really sure (because it's a confidence interval). Can you really say that it's a significant difference? The way I see it, it's still just a sample and the difference isn't really that big, so it doesn't tell us much.
• Another way to put it is that Sal is 95% confident the difference between the sexes is between 0.8% and 9.4%. So when the vote happens, we don't know exactly how big the difference is going to be (it may be just 1% or so), but we are 95% confident it is going to be between 0.8% and 9.4%.
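What "95%" means here can be explored with a small simulation. The sketch below is only illustrative: the true proportions (0.62 and 0.57), the sample size, and the number of trials are all made-up assumptions, and the helper names are hypothetical. It draws many pairs of samples, builds the interval each time, and counts how often the interval contains the true difference.

```python
import random
from math import sqrt


def wald_interval(x1, n1, x2, n2, z=1.96):
    """Approximate 95% interval for p1 - p2 from two sample counts."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - z * se, (p1 - p2) + z * se


def coverage(p1, p2, n, trials, seed=0):
    """Share of simulated intervals that contain the true difference p1 - p2."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Simulate n voters per group; each votes for the candidate with prob. p.
        x1 = sum(rng.random() < p1 for _ in range(n))
        x2 = sum(rng.random() < p2 for _ in range(n))
        low, high = wald_interval(x1, n, x2, n)
        hits += low <= p1 - p2 <= high
    return hits / trials


# Hypothetical true proportions, chosen only for the demonstration.
rate = coverage(p1=0.62, p2=0.57, n=500, trials=1000)
print(f"share of intervals containing the true difference: {rate:.3f}")
```

The printed share should land close to 0.95. Any single interval either contains the true difference or it does not; the 95% describes how often the procedure succeeds across repeated samples.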
• If you ended up with a partially negative confidence interval, would the result still be statistically significant if a larger portion of the interval was on the positive side (thus showing men favoured the "1" candidate)? E.g. from -0.04 to 0.065. At what point does it lose statistical significance?
• Remember what the confidence interval represents here. It tells us that the DIFFERENCE between the percentage of men and the percentage of women voting for candidate X lies, with 95% confidence, between two values (e.g. -0.04 and 0.065, as you suggested). A negative difference just means that the number you subtract is bigger than the number you subtract from. In our case that would mean the percentage of women voting for candidate X is bigger than the percentage of men doing so. The interval itself still makes sense, but because it contains 0 it no longer shows a statistically significant difference at the 5% level: no difference at all, or slightly more women than men (percentage-wise) voting for candidate X, is still inside the 95% confidence interval, so not that unlikely. Significance is lost as soon as the interval reaches 0, no matter how much of it lies on the positive side.
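One common way to read significance off an interval is the link between a 95% confidence interval and a two-sided test at the 5% level: the difference counts as significant only when 0 lies outside the interval. A minimal sketch of that check, with a hypothetical helper name, applied to the interval quoted in this thread and to the interval proposed in the question:

```python
def excludes_zero(low, high):
    """True when 0 lies outside [low, high], i.e. the difference is
    significant at the level matching the interval's confidence."""
    return low > 0 or high < 0


print(excludes_zero(0.008, 0.094))   # interval quoted in the thread
print(excludes_zero(-0.04, 0.065))   # hypothetical interval from the question
```

How much of the interval lies on the positive side does not enter the check; only whether 0 is inside it.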
• Why is Sal not using the "corrected standard deviation"? I expected him to multiply the variance by (1000/999).
• Can someone explain to me why he changed from 95% to 97.5% to find z?
• The variance presented in the video for the Bernoulli distribution is the population variance. However, what we have is only a sample, so shouldn't it be, for men for example, S^2_1 = (642(1-0.642)^2 + (1000-642)(0-0.642)^2)/999?
• is almost wrong. It is not that there is a 95% chance that the true population mean difference is within the calculated interval. It is that, if we took many more such samples and computed a CI each time, 95% of those CIs would contain the true population mean difference.
• If we are asked to draw a 95% confidence interval, why can we not just use the empirical rule to know that the mean must be within 2 standard deviations? Why use the z table at all?
• The Empirical Rule is an approximation. It's certainly useful, but if we're going to the trouble of making a confidence interval, we may as well be precise.

Additionally, the Empirical Rule corresponds to the Z distribution. Using this for the confidence interval means that you assume you know the population standard deviation. More often, we cannot assume this, and we need to use the t-distribution, for which there is no Empirical Rule.
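To see how close the Empirical Rule's "2 standard deviations" is to the exact value, and why a 95% interval leads to looking up 97.5% in a z-table, here is a short sketch using Python's standard library in place of a printed table. The helper name `two_sided_z` is made up for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def two_sided_z(confidence):
    # A two-sided interval leaves (1 - confidence) / 2 in each tail, so the
    # upper cutoff is the point with 1 - (1 - confidence) / 2 of the area below it.
    return std_normal.inv_cdf(1 - (1 - confidence) / 2)


z95 = two_sided_z(0.95)  # same as looking up 0.975 in a cumulative z-table
# Area within 2 standard deviations, which the Empirical Rule rounds to 95%.
within_2sd = std_normal.cdf(2) - std_normal.cdf(-2)

print(f"z for a 95% interval: {z95:.3f}")
print(f"area within 2 SD: {within_2sd:.4f}")
```

The exact cutoff is about 1.96 rather than 2, and 2 standard deviations actually cover slightly more than 95% of the area, which is the sense in which the Empirical Rule is an approximation.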
• Has Sal posted any videos on exactly how to read and use a z or t-table? If not, that would be very helpful.  