Confidence intervals and margin of error

If we poll 100 people, and 54% of them support a candidate, we can use what we know about sampling distributions and margin of error to build a confidence interval to estimate the true value of the percentage in the population.

Want to join the conversation?

  • Remi Schwartz
    Where did the standard deviation formula for population proportion come from?
    (26 votes)
  • ju lee
    at , I don't get why the two sentences are equivalent.
    (15 votes)
    • Phil P
      Suppose you have three numbers: a, b and c.
      Then, "b is within c of a" means the same as "a is within c of b".
      Both of those statements are equivalent to "the absolute value of the difference between a and b is less than c".
      In the video, a = p, b = p̂, and c = 2σ.
      (20 votes)
  • ju lee
    Why can p hat (sample proportion) be used as a substitute for p (population proportion)? In reality, they have different values, right?
    (10 votes)
    • JorgeMercedes
      Yes, p hat and p are extremely unlikely to be the same. In reality, p is impossible to know (if the population size is huge) or difficult to find out. We only have p hat as an estimate of p. To get the best estimate of p, we would need to take more samples, find p hat for each of them, and find the mean of those p hats; this mean will be very close to p.

      The substitution of p with p hat is used to find the Standard Error. That is why it is called SE and not Standard Deviation.
      (21 votes)
  • roger.llrt
    Within the interval, in this case from 0.44 to 0.64, which kind of probability distribution is assumed for the true parameter? Is it a uniform distribution?
    (7 votes)
  • ughussain
    The statement that "There is 95% prob that p is within 2sigma_phat of phat" is awkward. This makes it seem that we are imagining another distribution with mean phat and standard deviation sigma_phat and saying that p lies within 2 sigma of that, which I don't think is correct. There is just one distribution for the various sample proportions phat with mean p, and for each value of phat the mean will lie within the confidence interval with probability 95%.
    (10 votes)
  • E M
    @ Why not N-1 instead of N?
    (8 votes)
    • |value|
      In unbiased estimation of population standard deviation, we have an n - 1 to partially correct for the fact that a sample is likely less spread out than the population. This is estimating a population statistic using a sample statistic.

      Here we are not trying to estimate population statistic. We know sample mean is a good estimator for population mean; we are just trying to quantify how good that estimator is, with the SE. The SE is a property of the sample. So no need to do n-1 as it does not have to do with the population.
      (2 votes)
  • omprakash.nekkanti
    If p hat is a good approximation of p, why bother creating an interval? Why can't we simply say the sample proportion is the population proportion?
    (8 votes)
    • xiaotingh1117
      P hat will be different each time we sample, so we cannot say that the sample proportion we get from a single sample is the true population proportion. But we can estimate a range for the true proportion based on the sample we get, which is the interval, and its width depends on the confidence level we choose.
      (1 vote)
  • Sher Jav
    why don't we use a 100% confidence interval?
    (4 votes)
  • Victor Gutierrez
    This seems so wrong.. That probability of 95% is when you have the true standard deviation. Since we are using a fake standard deviation (standard error) how come we can still use this probability of 95%?
    (5 votes)
  • 24tinat
    Isn't the standard error formula σ/n? How did we just suddenly say that SE= square root ((p(p-1)/n) ?
    (3 votes)
    • Ian Pulizzotto
      Interesting question! First, there are some corrections: the general standard error formula is σ/square root(n), and the SE for a proportion is square root (p(1-p)/n).

      In the context of proportions, σ can be thought of as the standard deviation of a Bernoulli random variable that has value 1 with probability p, and 0 with probability 1-p. The mean of this Bernoulli random variable is 0(1-p) + 1p = p, so the variance is (1-p)(0-p)^2 + p(1-p)^2 = p(1-p)(p+(1-p)) = p(1-p). Therefore, the standard deviation is σ = square root(p(1-p)).

      So the SE for a proportion is σ/square root(n) = square root(p(1-p))/square root(n) = square root(p(1-p)/n).

      Have a blessed, wonderful day!
      (4 votes)

Video transcript

- [Instructor] It is election season, and there is a runoff between candidate A and candidate B. And we are pollsters. And we're interested in figuring out, well, what's the likelihood that candidate A wins this election? Well, ideally, we would go to the entire population of likely voters right over here, let's say there are 100,000 likely voters, and we would ask every one of them, who do you support? And from that, we would be able to get the population proportion, which would be the proportion that support candidate A. But it might not be realistic. In fact, it definitely will not be realistic to ask all 100,000 people. So instead, we do the thing that we tend to do in statistics, which is that we sample this population, and we calculate a statistic from that sample in order to estimate this parameter. So let's say we take a sample right over here. So this sample size, let's say n equals 100. And we calculate the sample proportion that support candidate A. So out of the 100, let's say that 54 say that they're going to support candidate A. So the sample proportion here is 0.54. And just to appreciate that we're not always going to get 0.54, there could've been a situation where we sampled a different 100, and we would've maybe gotten a different sample proportion. Maybe in that one, we got 0.58. And we already have the tools in statistics to think about this, the distribution of the possible sample proportions we could get. We've talked about it when we thought about sampling distributions. So you could have the sampling distribution of the sample proportions. And this distribution's going to be specific to what our sample size is, for n equal to 100. And so we can describe the possible sample proportions we could get and their likelihoods with this sampling distribution. So let me do that. So it would look something like this.
Because our sample size is so much smaller than the population, it's way less than 10%, we can assume that each response is approximately independent of the others. Also, if we make the assumption that the true proportion isn't too close to zero or too close to one, then we can say that, well, look, this sampling distribution is roughly going to be normal. So we'll have this kind of bell curve shape. And we know a lot about the sampling distribution of the sample proportions. We know already, for example, and if this is foreign to you, I encourage you to watch the videos on this on Khan Academy, that the mean of this sampling distribution is going to be the actual population proportion. And we also know what the standard deviation of this is going to be. So maybe that's one standard deviation. This is two standard deviations. That's three standard deviations above the mean. That's one standard deviation, two standard deviations, three standard deviations below the mean. So this distance, let me do this in a different color, this standard deviation right over here, which we denote as the standard deviation of the sample proportions, for this sampling distribution, we've already seen the formula there. It's the square root of p times one minus p, all divided by our sample size n, where p is, once again, our population proportion. That's why it's specific for n equals 100 here. And so in this first scenario, let's just focus on this one right over here, when we took a sample size of n equals 100 and we got the sample proportion of 0.54, we could've gotten all sorts of outcomes here. Maybe 0.54 is right over here. Maybe 0.54 is right over here. And the reason why I have this uncertainty is we actually don't know what the real population parameter is, what the real population proportion is. But let me ask you maybe a slightly easier question.
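As a compact summary of the facts about the sampling distribution stated above, in standard notation (the symbols \mu_{\hat{p}} and \sigma_{\hat{p}} are labels assumed here for the mean and standard deviation of the sample proportions; the normal approximation holds only under the conditions just described):

```latex
\mu_{\hat{p}} = p,
\qquad
\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}},
\qquad
\hat{p} \;\text{approximately}\; \sim \mathcal{N}\!\left(p,\ \frac{p(1-p)}{n}\right)
\quad \text{for } n = 100 .
```

Note that \sigma_{\hat{p}} depends on the unknown p, which is exactly the difficulty the rest of the transcript works around.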
What is the probability that our sample proportion of 0.54 is within two standard deviations of p? Pause the video, and think about that. Well, that's just saying, look, if I'm gonna take a sample and calculate the sample proportion right over here, what's the probability that I'm within two standard deviations of the mean? Well, that's essentially going to be this area right over here. And we know, from studying normal curves, that approximately 95% of the area is within two standard deviations. So this is approximately 95%. 95% of the time that I take a sample of size 100 and I calculate this sample proportion, I'm going to be within two standard deviations. But if you take this statement, you can actually construct another statement that starts to feel a little bit more, I guess we could say, inferential. We could say there is a 95% probability that the population proportion p is within two standard deviations of p-hat, which is equal to 0.54. Pause this video. Appreciate that these two are equivalent statements. If there's a 95% chance that our sample proportion is within two standard deviations of the true proportion, well, that's equivalent to saying that there's a 95% chance that our true proportion is within two standard deviations of our sample proportion. And this is really, really interesting because if we were able to figure out what this value is, well, then we would be able to create what you could call a confidence interval. Now, you immediately might be seeing a problem here. In order to calculate this, our standard deviation of this distribution, we have to know our population parameter. So pause this video, and think about what we would do instead. If we don't know what p is here, if we don't know our population proportion, do we have something that we could use as an estimate for our population proportion? Well, yes, we calculated p-hat already.
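The equivalence of the two 95% statements above can be written out as a short sketch. It relies only on the symmetry of the absolute value: the distance from p-hat to p is the same as the distance from p to p-hat. The symbol \sigma_{\hat{p}} for the standard deviation of the sample proportions is an assumed label.

```latex
P\!\left(\,|\hat{p} - p| \le 2\sigma_{\hat{p}}\,\right) \approx 0.95
\quad\Longleftrightarrow\quad
P\!\left(\,\hat{p} - 2\sigma_{\hat{p}} \;\le\; p \;\le\; \hat{p} + 2\sigma_{\hat{p}}\,\right) \approx 0.95
```

Both sides describe the same event, so they have the same probability. The quantity that varies from sample to sample is \hat{p}, and with it the endpoints of the interval, while p stays fixed.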
We calculated our sample proportion. And so a new statistic that we could define is the standard error of our sample proportions. And we can define that as being equal to, since we don't know the population proportion, we're going to use the sample proportion, the square root of p-hat times one minus p-hat, all of that over n. In this case, of course, n is 100. We do know that. And it turns out, I'm not going to show it in this video, that this is a reasonable estimate of this right over here (strictly speaking it is very slightly biased, but the bias is negligible for a sample of this size). So this is going to be equal to the square root of 0.54 times one minus 0.54, so it's 0.46, all of that over 100. So we have the square root of .54 times .46 divided by 100, close my parentheses, Enter. So if I round to the nearest hundredth, it's going to be, actually, even if I round to the nearest thousandth, it's going to be approximately 5/100. So this is approximately 0.05. So another way to say all of these things is, we don't know exactly this, but now we have an estimate for it. So we could now make a statement with 95% confidence, and that will often be known as our confidence level right over here. We'd want to go two standard errors below our sample proportion that we just happened to calculate. So that would be 0.54 minus two times 5/100. So that would be 0.54 minus 10/100, which would be 0.44. And we'd also want to go two standard errors above the sample proportion. So that would be that plus 10/100, which would be 0.64. So with 95% confidence, between 0.44 and 0.64 of voters support A. And so this interval that we have right over here, from 0.44 to 0.64, this will be known as our confidence interval. And this will change, not just in the starting point and the end point, but the actual length of our confidence interval will change depending on what sample proportion we happened to get for that sample of 100.
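The arithmetic above can be reproduced with a minimal Python sketch. The function name `proportion_confidence_interval` and its parameters are hypothetical, chosen only for illustration, and the multiplier of 2 follows the video's "two standard errors" rule of thumb rather than the more precise 1.96.

```python
import math


def proportion_confidence_interval(successes, n, z=2.0):
    """Return (p_hat, standard_error, lower, upper) for a sample proportion.

    z is the number of standard errors on each side of p_hat; 2.0 mirrors
    the rule of thumb used in the transcript for roughly 95% confidence.
    """
    p_hat = successes / n
    # Standard error: the sample proportion stands in for the unknown p.
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat, standard_error, p_hat - margin, p_hat + margin


# The poll from the transcript: 54 of 100 sampled voters support candidate A.
p_hat, se, lower, upper = proportion_confidence_interval(54, 100)
print(f"p_hat = {p_hat:.2f}, SE = {se:.4f}, interval = ({lower:.2f}, {upper:.2f})")
```

The unrounded standard error is a little under 0.05, which is why the transcript's interval of 0.44 to 0.64 is itself a rounded result.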
A related idea to the confidence interval is this notion of margin of error. And for this particular case, for this particular sample, our margin of error, because we care about 95% confidence, would be two standard errors. So our margin of error here is two times our standard error, which would just be 0.1 or 0.10. And so we're going one margin of error above our sample proportion right over here and one margin of error below our sample proportion right over here to define our confidence interval. And as I mentioned, this margin of error is not going to be fixed every time we take a sample. Depending on what our sample proportion is, it's going to affect our margin of error because that is calculated, essentially, with the standard error. Another interpretation of this is that the method that we used to get this confidence interval, when we use it over and over, it will produce intervals, and the intervals won't always be the same. It's gonna be dependent on our sample proportion, but it will produce intervals which include the true proportion, which we might not know and often don't know. It'll include the true proportion 95% of the time. I'll cover that intuition more in future videos. We'll see how the interval changes, how the margin of error changes. But when you do this calculation over and over and over again, 95% of the time, your true proportion is going to be contained in whatever interval you happen to calculate that time. Now, another interesting question is, well, what if you wanted to tighten up the intervals on average? How would you do that? Well, if you wanted to lower your margin of error, the best way to lower the margin of error is to increase this denominator right over here. And increasing that denominator means increasing the sample size.
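Both ideas above, that the method captures the true proportion about 95% of the time and that a larger sample shrinks the margin of error, can be checked with a small simulation. This is a sketch under stated assumptions: the true proportion of 0.54 is a hypothetical value picked for the simulation (in a real poll it is unknown), and `simulate_coverage` is an illustrative helper, not something from the video.

```python
import math
import random


def simulate_coverage(true_p, n, trials=2000, z=2.0, seed=0):
    """Repeat the poll many times; return (coverage rate, mean margin of error).

    true_p is a hypothetical population proportion that only the simulation
    knows. Each trial draws n voters, builds p_hat +/- z standard errors,
    and records whether that interval contains true_p.
    """
    rng = random.Random(seed)
    covered = 0
    total_margin = 0.0
    for _ in range(trials):
        # Each sampled voter supports candidate A with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        total_margin += margin
        if p_hat - margin <= true_p <= p_hat + margin:
            covered += 1
    return covered / trials, total_margin / trials


# Quadrupling the sample size should roughly halve the margin of error.
for n in (100, 400, 1600):
    coverage, mean_margin = simulate_coverage(true_p=0.54, n=n)
    print(f"n = {n:4d}: coverage = {coverage:.3f}, mean margin = {mean_margin:.3f}")
```

Because the sample size sits under a square root, halving the margin of error takes about four times as many respondents, which is one reason polls rarely push the margin much below a few percentage points.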
And so one thing that you will often see when people are talking about election coverage is, well, we need to sample more people in order to get a lower margin of error. But I'll leave you there, and I'll see you in future videos.