Margin of error 1

Finding the 95% confidence interval for the proportion of a population voting for a candidate. Created by Sal Khan.

  • blobby green style avatar for user donald moon
    the variance you got doesn't match the variance calculated by sqrt [ p ( 1 - p ) ].
    I would like to know why ?
    (8 votes)
    • leaf blue style avatar for user Dr C
      There are two possible explanations:

      The relevant variance is p(1-p), your calculation of √p(1-p) is the standard deviation.

      If that's not the reason, then note that Sal is working by treating "successes" as a 1 and "failures" as a 0, and then applying the typical sample variance formula, including division by n-1. The p(1-p) formula assumes division by n. Using p(1-p) will get you 0.2451, whereas Sal got 0.2476.
      (8 votes)
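The difference between the two divisors can be checked with a short sketch. It assumes the poll data from the video (57 zeros and 43 ones); the variable names are made up for illustration, and only Python's standard `statistics` module is used.

```python
import statistics

# Assumed reconstruction of the poll: 57 votes coded 0, 43 votes coded 1.
sample = [0] * 57 + [1] * 43

p_hat = sum(sample) / len(sample)            # sample proportion of 1's
var_div_n = statistics.pvariance(sample)     # divides by n, equals p_hat * (1 - p_hat)
var_div_n1 = statistics.variance(sample)     # divides by n - 1, the sample variance

print(f"p_hat            = {p_hat:.4f}")
print(f"variance (n)     = {var_div_n:.4f}")
print(f"variance (n - 1) = {var_div_n1:.4f}")
```

The two results differ only by the factor n / (n - 1), which is why they are so close for n = 100.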
  • leaf green style avatar for user strider
    In my stats module the interpretation of the confidence interval is not that we estimate the true population mean to be in a certain interval, as the true population mean is not a variable and is not subject to probability statements.
    Rather, the confidence interval should be interpreted as saying, if I took a large amount of samples from the population and I used each sample mean as the center for my confidence interval, then the percentage of those intervals that will contain the true population mean will be the confidence level percentage. Is this correct?
    (7 votes)
    • leaf blue style avatar for user Dr C
      There are two things to note here:

      1. Yes, you are correct in your understanding of a confidence interval and its interpretation.

      2. The population mean can be subject to probability statements. For instance, it is perfectly valid to write:

      0.95 = P( Xbar - 1.96 σ/√n < µ < Xbar + 1.96 σ/√n )

      This is how we derive the formula for the confidence interval. However, this is only a valid probability statement when we are thinking of the sample mean, Xbar, as a random variable. The moment we use our sample to calculate the sample mean and plug it in, we have two actual numbers for the bounds of the interval, and hence there is nothing random anymore and we need to switch to the "confidence" interpretation. But before we plug in the observed sample mean to the formula, the "probability" interpretation is still valid.
      (8 votes)
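The "many samples" interpretation discussed in this thread can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true proportion of 0.43, the 2000 repetitions, the seed, and the function name `coverage` are all chosen here for illustration, and the interval uses the estimated standard deviation sqrt(p_hat(1 - p_hat)/n) rather than a known σ.

```python
import math
import random

def coverage(true_p, n, trials, z=1.96, seed=0):
    """Fraction of simulated intervals p_hat +/- z*se that contain true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample of n zero/one votes and count the ones.
        successes = sum(1 for _ in range(n) if rng.random() < true_p)
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
        if p_hat - z * se <= true_p <= p_hat + z * se:
            hits += 1
    return hits / trials

# Hypothetical "true" p; in a real survey this value is unknown.
rate = coverage(true_p=0.43, n=100, trials=2000)
print(f"fraction of intervals containing the true p: {rate:.3f}")
```

Each individual interval either contains the true p or it does not; the roughly 95% figure describes the long-run fraction of intervals that do.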
  • blobby green style avatar for user owen-k
    I am totally confused. What is s?
    s-squared is the variance of the sample. So if you take its square root, that's the standard deviation of the sample. Then why do you have to divide that by the square root of the sample size (n) to get the standard deviation? Why are there two standard deviations? What is the second standard deviation of?
    (7 votes)
  • leafers tree style avatar for user Darren Green
    I don't understand how the sample mean can be 0.43 when the sample standard deviation is 0.50. Would this not possibly result in one standard deviation to the left being a negative value?
    (2 votes)
    • leaf blue style avatar for user Dr C
      I can't watch the video right now, but even if it's impossible for the things being measured to be negative (e.g. rainfall), those sorts of numbers can come out. If there are a lot of data in some cluster around 0.4, but then a few pretty large numbers - outliers - then the standard deviation can get pretty inflated. Some non-normal distributions might exhibit something like this.
      (5 votes)
  • leafers sapling style avatar for user william237
    Can we also use the formula for "sampling distribution of sample proportion" here?

    sqrt(p(1-p)/n) = sqrt(0.43*0.57/100) = sqrt(0.002451) ≈ 0.0495 ≈ 0.05
    (4 votes)
  • leaf red style avatar for user connor chang
    Sal mentions 100 possible values. What do these values represent? And the distribution is the sampling distribution of the sample mean. Why is that, and how is it related to the 100 discrete values? Thanks.
    (3 votes)
    • blobby green style avatar for user John Thesing
      The values represent the people who answered the survey about which candidate they were going to vote for. If someone indicated that they would vote for person A, then their vote would be assigned a value of 1. Otherwise, they indicated that they would vote for B and thus their vote would be assigned a value of 0. Hope that helps.
      (3 votes)
  • blobby green style avatar for user Stephanie Dixon Ponteau
    None of that made sense to me and I didn't understand at all where exactly you explained margin of error. Is it possible that you could tell me what the margin of error is and how you calculate it?
    (3 votes)
    • starky seedling style avatar for user Zachary Carson
      Hi,
      The formula is ME (margin of error) = 2 times the square root of [P "hat" times (1 minus P "hat") divided by the number of people surveyed]. The 2 stands for two standard deviations, which corresponds to a 95% confidence interval. P hat is the result of the survey as a decimal.
      So I think margin of error works like this: you have a survey and, let's say, 100 people take it. You put 100 at the bottom. And then the survey says, for example, that 48% disliked English. So this is the "p hat". Then you solve the equation. 0.48 times (1 minus 0.48) equals 0.2496. Then you divide that by 100, which makes it 0.002496, then you take the square root of that, which is 0.049959984. Then you multiply by 2, and that is 0.099919968. Then you multiply by 100 to make it a percent, which gives us 9.99, which we round to 10%. That is the margin of error. I at least think it is, I am still learning all this stuff.
      (1 vote)
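The arithmetic in this answer can be written as a short function. This is a sketch of the rule of thumb described above (2 estimated standard deviations for roughly 95% confidence); the function name `margin_of_error` and its parameter names are made up for illustration.

```python
import math

def margin_of_error(p_hat, n, z=2.0):
    """z estimated standard deviations of the sample proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# The example from the answer: 48% of 100 people surveyed.
me = margin_of_error(0.48, 100)
print(f"margin of error: {me:.4f} (about {me * 100:.0f}%)")
```

Using z = 1.96 instead of 2 gives a slightly smaller margin; the factor 2 is a common rounding.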
  • blobby green style avatar for user trevorzacek
    Could there be a possibility that the sample mean would not equal the population mean? For example, if we surveyed only the people who were going to vote for the given candidate, the sample mean would not equal the population mean. Please correct me if I'm wrong.
    (2 votes)
    • leaf blue style avatar for user Dr C
      The sample mean will often not equal the population mean. That's somewhat the point of Statistics: different samples will give different results, and we want to use just one sample to be able to generalize to the population.
      (3 votes)
  • leaf green style avatar for user Ivan Haralamov
    Why does Sal write µ_x̄ instead of x̄ when he refers to the sampling distribution of the sample mean? Isn't that the same as the sample mean x̄?
    (3 votes)
  • duskpin ultimate style avatar for user Velociraptor89
    How does he know which value should equal 0 and which value should equal 1? He said that they could be switched, but if they were, the sample mean would be different, so how does that work?
    (1 vote)
    • leafers ultimate style avatar for user Phil P
      If you switched 0 and 1, the mean would indeed have a different value, because it would represent the proportion of people who vote for candidate A instead of the proportion who vote for candidate B. But it wouldn't affect the variance or standard deviation.

      Using the values from Sal's example:
      x-bar = (57 * 0 + 43 * 1) / 100 = 0.43
      s^2 = (57(0 - 0.43)^2 + 43(1 - 0.43)^2) / 99
      = (57 * 0.43^2 + 43 * 0.57^2) / 99
      = 0.2475

      If you switch 0 and 1, you get
      x-bar = (57 * 1 + 43 * 0) / 100 = 0.57
      s^2 = (57(1 - 0.57)^2 + 43(0 - 0.57)^2) / 99
      = (57 * 0.43^2 + 43 * 0.57^2) / 99
      = 0.2475

      So the sample variance is unchanged.

      Since, for a fixed sample size and confidence level, the size of a confidence interval only depends on the sample variance, the confidence interval bounds would be the same distance either side of the new mean as they were from the original mean.

      In the second part to this video, Sal concludes that the 95% confidence interval is from 33% to 53% (10% either side of x-bar = 43%). That represents the expected proportion of votes for candidate B, which means the expected proportion of votes for candidate A would be 47% (= 100% - 53%) to 67% (= 100% - 33%).

      After switching 0 and 1, you'd instead get a confidence interval of 47% to 67% (10% either side of x-bar = 57%), which represents the expected proportion of votes for candidate A. Thus, the expected proportion of votes for candidate B would be 33% (= 100% - 67%) to 53% (= 100% - 47%).

      So, ultimately, you get the same results whichever way you assign 0 and 1.
      (5 votes)
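The claim that swapping 0 and 1 changes the mean but not the variance can be checked numerically. A minimal sketch, assuming the 57/43 split from the video; the list names are invented for illustration.

```python
import statistics

# Coding 1: a vote for B is a 1 (as in the video).
b_is_one = [0] * 57 + [1] * 43
# Coding 2: the same people, but a vote for A is a 1.
a_is_one = [1 - v for v in b_is_one]

mean_b = statistics.mean(b_is_one)
mean_a = statistics.mean(a_is_one)
var_b = statistics.variance(b_is_one)   # divides by n - 1
var_a = statistics.variance(a_is_one)

print(f"means:     {mean_b:.2f} vs {mean_a:.2f}")
print(f"variances: {var_b:.4f} vs {var_a:.4f}")
```

The two means add up to 1, and the two sample variances agree, so an interval built around either mean has the same width.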

Video transcript

Say I live in a country of 100 million people and there's a presidential election coming up. And in that presidential election there are two candidates. There's candidate A, and candidate B. And there's some reality-- let's say I live in a very decisive country and everyone is going to vote for either-- and everyone participates in the election and everyone is going to vote for either candidate A or candidate B. And so there's some percentage, there's some reality there, that p-- let me write it over here-- maybe 1 minus p percent-- let me do the p first. There's some reality that maybe p percent will vote for B, and I could switch them around if I wanted. So p percent are going to vote for B, and the rest of the people are going to vote for A, so maybe 1 minus p percent are going to vote for A. And you might already recognize that this is a Bernoulli Distribution. There's one of two values for a sample I can get. And right here, the values I said you're either voting for candidate A or you're voting for candidate B. It's very hard to deal with those values. You can't calculate a mean between A and B and all of that-- those are letters, they're not numbers. So to make it manipulatable mathematically we're going to say sampling someone who's going to vote for A is equivalent to sampling a 0, and sampling someone who's going to vote for B is equivalent to sampling a 1. And if you do that with a Bernoulli Distribution, we learned in the video on Bernoulli Distributions, that the mean of this distribution right here is going to be equal to p. And it's a pretty straightforward proof for how we got that. So the mean of this distribution, which will actually be not a value that this distribution can take on, is going to be some place over here and it is going to be equal to p. Now my country has 100 million people. It is practically, or is definitely impossible for me to be able to go and ask all 100 million people who they are going to vote for.
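As a reminder of why the mean of this distribution is p, here is a short sketch of the calculation. The symbol X (the value of a single sampled vote, 0 for A and 1 for B) is introduced here for illustration; the variance line is the standard Bernoulli result, included for reference.

```latex
\begin{align*}
\mu = E[X] &= 0 \cdot (1 - p) + 1 \cdot p = p \\
\sigma^2 = \operatorname{Var}(X) &= (0 - p)^2 (1 - p) + (1 - p)^2 \, p \\
  &= p(1 - p)\bigl(p + (1 - p)\bigr) = p(1 - p)
\end{align*}
```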
So I won't be able to exactly figure out what these parameters are going to be. What my mean is, what p is going to be. But instead of doing that, what I'm going to do is do a random survey. I'm going to sample this population, look at that data, and then get an estimate of what p really is. Because this is what I really care about. I really care about p. So I'm going to try to estimate p with a sample, and then we're also going to think about how good of an estimate that is. So I am going to randomly survey, or sample, 100 people. And let's say I got the following results. Let's say that 57 people say that they were going to vote for person A. Let me write it this way. So 57 people say they're going to vote for A, or that's equivalent to getting 57 samples of 0. And then the rest of the people, once again, very decisive population, no one is undecided, the rest of the people, so 43 people say they're going to vote for B. Or that's the equivalent of sampling 43 1's. Now given this sample here, what is my sample mean and my sample variance? My sample mean right here, well that's just going to be the average of these 0's and 1's. So I've got 57 0's, so it's going to be 57 times 0 plus my 43 1's, so plus 43 times 1. That's the sum of all of my samples, over the total number of samples I took, over 100. So what does this get me? So 57 times 0 is 0. 43 times 1 divided by 100 is 0.43. That is my sample mean, the mean of just the 100 data points that I actually got. Now what is my sample variance? Sample variance is going to be equal to the sum of my squared distances to the mean divided by my samples minus 1. Remember, this is a sample variance, and we want to get the best estimator of the real variance of this distribution. And to do that you don't divide by 100, you're going to divide by 100 minus 1. We learned that many, many videos ago. So I have 57. So I had 57 samples of 0. We'll do it in that same yellow color-- 57 samples of 0.
And so each of those samples is 0 minus 0.43 away from the mean. Each of those samples is 0. You subtract 0.43-- this is the difference between 0 and 0.43. And if I want the squared distance, I square it-- that's how we calculate variance. There's 57 of those. And then there's 43 times that I sampled a 1 in my sample population-- 43 times I sampled a 1, and the 1 is 1 minus 0.43 away from the mean because that is the mean, and I want to square that distance. And then I don't want to just divide it by n. I don't want to just divide by 100-- remember, I'm trying to estimate the true population variance. In order for this to be the best estimator of that, and I gave you the intuition of why many, many videos ago, we divide by 100 minus 1 or 99. Let's get the calculator out to actually figure out our sample variance. So let me get the calculator out, and we have-- I'll do the numerator first. I have 57 times 0 minus 0.43 squared, plus 43 times 1 minus 0.43 squared. And then all of that divided by 100 minus 1, or 99-- divided by 99 is equal to 0.2475. So my variance, my sample variance, is equal to 0.2475. And if I want to figure out my sample standard deviation I just take the square root of that. My sample standard deviation is just going to be the square root of my sample variance. So I take the square root of that value that I just had, which is 0.497. So actually let me just round that up as 0.50. So my sample standard deviation is 0.50. Now if you just look at this, you say OK, well your best estimate of the percentage of people voting for A or B is really what you just saw here. Your best estimate or your best estimate of the mean is that 43% of people are going to vote for B and everyone else is going to vote for A. But an interesting question is how good of a sample is that? Let's take it to the next level.
Let's try to think of an interval around 43% for which we're reasonably confident, roughly 95% sure, that the real mean is in that interval. Let me make it very clear. Let me draw. So when we get our sample mean we are sampling from the sampling distribution of the sample mean. Let me draw that. The sampling distribution of the sample mean. So since we're sampling from a discrete distribution it's actually going to be a discrete distribution, but it's going to have 100 possible values. This can take on 100 different values here. Really anything between 0 and 1. But I'll draw it kind of continuous because it would be hard for me to draw 100 different bars. If I did, you'd have a bar there, you'd have a bar there. The odds that your sample mean would be 1, it would be a very low probability, and then you would have one more bar, a bar like that, a bar like that, but that takes forever to draw. So I'm just going to approximate it with this normal curve right over there. So the sampling distribution of the sample mean-- let me write it over here. So this is the sampling distribution of the sample mean. It has some mean here. It has a mean, and I can denote it with the mu sub x bar-- this tells us this is the mean of the sampling distribution. But we know from many, many videos that this is going to be the same thing as the mean of the population that we are sampling from, that each of these 100 samples comes from. So this is going to be equal to mu, which is going to be equal to p. Now this variance over here, the variance of this distribution-- let me draw it like this, or even better let's do the standard deviation of this distribution.
The standard deviation of this distribution, that distance right over here, the standard deviation of the sampling distribution of the sample mean-- we've seen it multiple times already-- it's going to be this standard deviation-- it's going to be the standard deviation of our population distribution. So that standard deviation is going to be that distance over there. So there's some standard deviation associated with this distribution. It's going to be that standard deviation divided by the square root of our sample size. And we saw many videos ago why that, at least experimentally makes sense, or why it intuitively makes sense. So it's going to be the square root of 100. So it's going to be this guy divided by 10. Now we do not know what this guy is. The only way to figure out what that guy is is to actually survey 100 million people, which would have been impossible. So to estimate the standard deviation of this, we will use our sample standard deviation as our best estimate for the population standard deviation. So we could say-- and remember, this is an estimate. We cannot come up with the exact number for this just from a sample. But we can estimate it. Because this is our best estimator for this standard deviation, and if we divide it by 10, we will have our best estimator for the standard deviation of the sampling distribution of the sample mean. So remember, this is just an estimate. It is just an estimate. So you kind of have to take everything after this point with a little bit of a grain of salt. So it's going to be roughly equal to or an estimate of it is going to be 0.5. And remember, every time we take a different sample from here this number is going to change. So this isn't like something set in stone. This is dependent on our sample. So it's going to wiggle around a little bit depending on what numbers we actually get in our sample. But it's going to be 0.50. This is the s right here, this 0.50 divided by 10, which is equal to 0.05.
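The chain of estimates up to this point (sample mean, sample variance, sample standard deviation, then division by the square root of the sample size) can be retraced in a few lines. This is only a sketch using the 57/43 survey counts from the video; the variable names are chosen here for illustration.

```python
import math

n = 100            # people surveyed
ones = 43          # votes for B, coded as 1
zeros = n - ones   # votes for A, coded as 0

x_bar = (zeros * 0 + ones * 1) / n
# Sample variance: squared distances to the mean, divided by n - 1.
s_squared = (zeros * (0 - x_bar) ** 2 + ones * (1 - x_bar) ** 2) / (n - 1)
s = math.sqrt(s_squared)
# Estimated standard deviation of the sampling distribution of the sample mean.
se = s / math.sqrt(n)

print(f"x_bar = {x_bar:.2f}, s = {s:.3f}, s / sqrt(n) = {se:.3f}")
```

Without rounding s up to 0.50 first, the result is slightly below 0.05, which is consistent with treating 0.05 as an approximation.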
So our best estimate of this standard deviation is 0.05, or you could even view it as 5%. Now what I want to do is come up with an interval around the sample mean where I'm reasonably confident using all my estimates and all that that there's a-- let me say I'm really confident that there's a 95% chance that the true mean is within two standard deviations-- or let me put it this way, there's a 95% chance that the true mean is in that interval. So let me write this down. I want to find an interval such that I am reasonably confident-- and I'm putting this kind of touchy-feely language over here because it's all around the fact that I don't know for a fact that the standard deviation is 0.05, I'm just estimating. But I'm reasonably confident that there is a 95% chance that the true mean of the population, which is the same thing as the proportion of the population who are going to vote for person B, or the proportion of the population that are going to be a 1, is in that interval. So this is also, we just have to remember that mu is equal to p. That there's a 95% chance that the true p is in that interval. And actually, since I've already gone 14 minutes into this video, I'm going to pause this video, I'm going to stop this video here, and maybe I'll even let you think about it just based on everything we've done so far. We figured out the sample mean-- sorry, we figured out the sample mean right over here. We've figured out an estimate for the-- and remember, this is just a sample mean. We don't know the true-- this is the mean of our sample. We don't know the true mean of the sampling distribution, and we also don't know the true standard deviation of the sampling distribution. But we were able to estimate it with the sample standard deviation.
Now everything that we have so far, and based on what we've seen before on confidence intervals and all that, how can we find an interval such that roughly-- and I'm saying roughly because we had to estimate the standard deviation-- that there's a 95% chance that the true mean of our population, or the p, the proportion of the population saying 1, is in that interval? And we're going to do that in the next video.