Bernoulli distributions and margin of error
Margin of Error 2: Finding the 95% confidence interval for the proportion of a population voting for a candidate.
- Where we left off in the last video I kind
- of gave you a question.
- Find an interval so that we're reasonably confident-- we'll
- talk a little bit more about why I have to give this kind
- of vague wording right here-- reasonably confident that
- there's a 95% chance that the true population mean, which is
- p, which is the same thing as the mean of the sampling
- distribution of the sample mean, is in that interval.
- So there's a 95% chance that the true mean-- and
- let me put this here.
- This is also the same thing as the mean of the sampling
- distribution of the sample mean-- is in that interval.
- And to do that let me just throw out a few ideas.
- What is the probability that if I take a sample and I were
- to take a mean of that sample, so the probability that a
- random sample mean is within two standard deviations of the
- mean of the sampling distribution?
- So what is this probability right over here?
- Let's just look at our actual distribution.
- So this is our distribution, and this right here is the
- mean of the sampling distribution.
- Maybe I should do it in blue because that's
- the color up here.
- This is the mean of the sampling distribution.
- And so what is the probability that a random sample mean is
- going to be within two standard deviations of it?
- Well, a random sample mean is a sample from this distribution.
- It is a sample from the sampling distribution of the
- sample mean.
- So it's literally what is the probability of finding a
- sample within two standard deviations of the mean?
- That's one standard deviation, that's another standard
- deviation right over there.
- In general, if you haven't committed this to memory
- already, it's not a bad thing to commit to memory, is that
- if you have a normal distribution the probability
- of taking a sample within two standard deviations is 95--
- and if you want to get a little bit more
- accurate it's 95.4%.
- But you could say it's roughly-- or maybe I could
- write it like this-- it's roughly 95%.
- And really that's all that matters because we have this
- little funny language here called reasonably confident,
- and we have to estimate the standard deviation anyway.
- In fact, we could say if we want, I could say that it's
- going to be exactly equal to 95.4%.
- But in general, two standard deviations, 95%, that's what
- people equate with each other.
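As a quick check on the 95.4% figure, here is a minimal sketch that computes the probability that a normally distributed value falls within k standard deviations of its mean, using the identity P(|Z| <= k) = erf(k / sqrt(2)). The function name `prob_within` is an illustrative choice, not something from the video.

```python
import math


def prob_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


# Two standard deviations gives roughly 95.4%, as stated in the video.
print(f"within 2 sd:    {prob_within(2.0):.4f}")
# An exact 95% corresponds to slightly fewer than two standard deviations.
print(f"within 1.96 sd: {prob_within(1.96):.4f}")
```

This is why "two standard deviations" and "95%" are commonly treated as interchangeable: the difference between 2 and 1.96 is small compared with the other approximations in the calculation.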
- Now this statement is the exact same thing as saying that the
- probability that the mean of the sampling distribution--
- not the sample mean, the mean of the
- sampling distribution-- is within two standard deviations
- of the sample mean x is also going to be the
- same number, is also going to be equal to 95.4%.
- These are the exact same statements.
- If x is within two standard deviations of this, then this,
- then the mean, is within two standard deviations of x.
- These are just two ways of phrasing the same thing.
- Now we know that the mean of the sampling distribution is the
- same thing as the mean of the population distribution, which
- is the same thing as the parameter p-- the proportion
- of people or the proportion of the population that is a 1.
- So this right here is the same thing as the population mean.
- So this statement right here we can switch this with p.
- So the probability that p is within two standard deviations
- of the sampling distribution of x is 95.4%.
- Now we don't know what this number right here is.
- But we have estimated it.
- Remember, this number is the
- true standard deviation of the population
- divided by the square root of n, which is 10.
- We can estimate the true standard deviation of the
- population with our sample standard deviation, which was
- 0.5, so that's 0.5 divided by 10.
- Our best estimate of the standard deviation of the
- sampling distribution of the sample mean is 0.05.
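The arithmetic behind the 0.05 estimate can be sketched as follows. The sample here is an assumption reconstructed from the numbers in the video (n = 100 and a sample mean of 0.43, so 43 ones and 57 zeros); the variable names are illustrative.

```python
import math
import statistics

# Assumed sample: 43 people answered 1 and 57 answered 0 (sample mean 0.43, n = 100).
sample = [1] * 43 + [0] * 57
n = len(sample)

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Estimated standard deviation of the sampling distribution of the sample mean.
standard_error = sample_sd / math.sqrt(n)

print(f"sample mean:    {sample_mean:.2f}")
print(f"sample sd:      {sample_sd:.2f}")
print(f"standard error: {standard_error:.2f}")
```

The sample standard deviation comes out just under 0.5, which is why dividing by 10 gives an estimate of about 0.05.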
- So now we can say-- and I'll switch colors-- the
- probability that the parameter p, the proportion of the
- population saying 1, is within two times 0.05-- remember, 0.05 is our best
- estimate of this right here-- of a sample mean that we
- take is equal to 95.4%.
- And so we could say the probability that p is within 2
- times 0.05, which is going to be
- 0.10, of our mean is equal to 95-- and actually let me be a
- little careful here.
- I can't say the equal now, because over here if we knew
- this, if we knew this parameter of the sampling
- distribution of the sample mean, we could
- say that it is 95.4%.
- We don't know it.
- We are just trying to find our best estimator for it.
- So actually what I'm going to do here is actually just say
- is roughly-- and just to show that we don't even have that
- level of accuracy, I'm going to say roughly 95%.
- We're reasonably confident that it's about 95% because
- we're using this estimator that came out of our sample,
- and if the sample is really skewed this is going to be a
- really weird number.
- So this is why we just have to be a little bit more careful
- about what we're claiming.
- But this is the tool for at least saying
- how good is our result.
- So this is going to be about 95%.
- Or we could say that the probability that p is within
- 0.10 of our sample mean that we actually got.
- So what was the sample mean that we actually got?
- It was 0.43.
- So if p is within 0.1 of 0.43, that means p is within
- 0.43 plus or minus 0.1, and again, roughly, we're
- reasonably confident that probability is about 95%.
- And I want to be very clear.
- Everything that I started all the way from up here in brown
- to yellow and all this magenta, I'm just restating
- the same thing inside of this.
- It became a little bit more loosey-goosey once I went from
- the exact standard deviation of the sampling distribution
- to an estimator for it.
- And that's why this is just becoming-- I kind of put the
- squiggly equal signs there to say we're reasonably
- confident-- and I even got rid of some of the precision.
- But we just found our interval.
- An interval that we can be reasonably confident that
- there's a 95% probability that p is within that, is going to
- be 0.43 plus or minus 0.1.
- Or an interval of-- we have a confidence interval.
- We have a 95% confidence interval of, and we could say,
- 0.43 minus 0.1 is 0.33.
- If we write that as a percent we could say 33% to-- and if
- we add the 0.1, 0.43 plus 0.1 we get 53%-- to 53%.
- So we are 95% confident.
- So we're not saying kind of precisely that the probability
- of the actual proportion is 95%, but we're 95% confident
- that the true proportion is between 33% and 53%.
- That p is in this range over here.
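Putting the pieces together, the interval is just the sample mean plus or minus two estimated standard errors. A minimal sketch using the numbers from the video (sample mean 0.43, estimated standard error 0.05); the variable names are illustrative.

```python
sample_mean = 0.43      # proportion of the sample answering 1
standard_error = 0.05   # estimated sd of the sampling distribution of the sample mean

# Two standard errors corresponds to roughly 95% confidence.
margin_of_error = 2 * standard_error
low = sample_mean - margin_of_error
high = sample_mean + margin_of_error

print(f"margin of error: {margin_of_error:.2f}")
print(f"approximate 95% confidence interval: {low:.0%} to {high:.0%}")
```

The interval runs from 33% to 53%. It is approximate because the standard error was estimated from the sample rather than known.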
- Or another way, and you'll see this in a lot of surveys that
- have been done, people will say we did a survey and we got
- 43% will vote for number one, and number one in this case is
- candidate B.
- And then the other side, since everyone else voted for
- candidate A, 57% will vote for A.
- And then they're going to put a margin of error on it.
- And you'll see this in any survey that you see on TV.
- They'll put a margin of error.
- And the margin of error is just another way of describing
- this confidence interval.
- And they'll say that the margin of error in this case
- is 10%, which means that there's a 95% confidence
- interval, if you go plus or minus 10% from that value
- right over there.
- And I really want to emphasize, you can't say with
- certainty that there is a 95% chance that the true result
- will be within 10% of this, because we had to estimate the
- standard deviation of the sampling distribution of the sample mean.
- But this is the best measure we can get with the information
- we have. If you're going to do a survey of 100 people,
- this is the best kind of confidence that we can get.
- And this number is actually fairly big.
- So if you were to look at this you would say, roughly there's
- a 95% chance that the true value of this number is
- between 33% and 53%.
- So there's actually still a chance that candidate B can
- win, even though only 43% of your 100 are
- going to vote for him.
- If you wanted to make it a little bit more precise you
- would want to take a larger sample.
- You can imagine.
- Instead of sampling 100 people, instead of n being 100, if you
- made n equal 1,000, then you would take this number over
- here and divide by the
- square root of 1,000 instead of the square root of 100.
- So you'd be dividing by about 31.6.
- And so then the size of the standard deviation of your
- sampling distribution will go down.
- And so the distance of two standard deviations will be a
- smaller number, and so then you will have a
- smaller margin of error.
- And maybe you want to get the margin of error small enough
- so that you can figure out decisively who's going to win
- the election.
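To see how the margin of error shrinks with sample size, here is a rough sketch that holds the estimated standard deviation at 0.5 (an assumption: a new, larger sample would give its own estimate) and varies n. The helper `margin_of_error` is a hypothetical name for illustration.

```python
import math


def margin_of_error(sd: float, n: int, k: float = 2.0) -> float:
    """k standard errors, where the standard error is sd / sqrt(n)."""
    return k * sd / math.sqrt(n)


# Assumes the sample standard deviation stays near 0.5 for every sample size.
for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: margin of error is about {margin_of_error(0.5, n):.3f}")
```

Because the margin scales with 1 / sqrt(n), a tenfold increase in sample size cuts the margin by a factor of about 3.16, not 10.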