# Confidence intervals and margin of error

If we poll 100 people and 54% of them support a candidate, we can use what we know about sampling distributions and margin of error to build a confidence interval that estimates the true percentage in the population.

## Want to join the conversation?

- Where did the standard deviation formula for the population proportion come from? (28 votes)
- p(1-p) is the variance of a binary (Bernoulli) variable. If outcome A has probability p, then the other outcome B has probability 1-p, and the variance (hence the standard deviation) can be computed with the usual formulas. He should have explained this at the outset; it is a crucial point.

https://www.statlect.com/probability-distributions/Bernoulli-distribution (26 votes)

- At 5:54, I don't get why the two sentences are equivalent. (17 votes)
- Suppose you have three numbers: a, b and c.

Then, "b is within c of a" means the same as "a is within c of b".

Both of those statements are equivalent to "the absolute value of the difference between a and b is less than c".

In the video, a = p, b = p̂, and c = 2σ. (21 votes)
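Written out as inequalities, the equivalence can be sketched as follows. The notation σ with a p̂ subscript for the standard deviation of the sample proportion is an assumption made here for clarity; the chain uses only the symmetry of the absolute value.

```latex
\[
|\hat{p} - p| < 2\sigma_{\hat{p}}
\iff -2\sigma_{\hat{p}} < \hat{p} - p < 2\sigma_{\hat{p}}
\iff \hat{p} - 2\sigma_{\hat{p}} < p < \hat{p} + 2\sigma_{\hat{p}}
\]
```

The left-hand form reads "p̂ is within 2σ of p", and the right-hand form reads "p is within 2σ of p̂". They describe the same event.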

- Why can p hat (the sample proportion) be used as a substitute for p (the population proportion)? In reality, they have different values, right? (11 votes)
- Yes, p hat and p are extremely unlikely to be the same. In reality, p is impossible to know (if the population is huge) or difficult to find out, so we only have p hat as an estimate of p. To get a better estimate of p, we would need to take more samples, find p hat for each of them, and take the mean of those p hats; this mean will be very close to p.

The substitution of p hat for p is used to find the standard error. That is why it is called a standard error (SE) and not a standard deviation. (21 votes)
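A small simulation can illustrate the claim that the mean of many p hats lands close to p. This is only a sketch: the population proportion `TRUE_P`, the number of repeated samples, and the random seed are made-up values chosen for illustration, since in practice p is unknown.

```python
import random
import statistics

# Hypothetical values for illustration only; the true p is unknown in practice.
TRUE_P = 0.6        # assumed population proportion
N = 100             # sample size, as in the video
NUM_SAMPLES = 2000  # number of repeated samples (an arbitrary choice)

rng = random.Random(0)  # fixed seed so the run is repeatable


def sample_proportion(p, n, rng):
    """Draw n independent yes/no responses and return the fraction of yeses."""
    return sum(rng.random() < p for _ in range(n)) / n


p_hats = [sample_proportion(TRUE_P, N, rng) for _ in range(NUM_SAMPLES)]

mean_p_hat = statistics.mean(p_hats)
print(f"smallest p-hat: {min(p_hats):.2f}, largest p-hat: {max(p_hats):.2f}")
print(f"mean of p-hats: {mean_p_hat:.3f} (true p = {TRUE_P})")
```

Individual p hats scatter on both sides of p, while their average sits much closer to it.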

- Within the interval, in this case from 0.44 to 0.64, what kind of probability distribution is assumed for the true parameter? Is it a uniform distribution? (7 votes)
- The true parameter doesn't have a probability distribution, because it's not a random variable. It has an exact value, even if we can't actually measure it. (15 votes)
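One way to see this is to hold p fixed and let only the sample, and therefore the interval, vary. The sketch below assumes a hypothetical population proportion `TRUE_P` and an arbitrary number of repeated polls; it counts how often the two-standard-error interval built from each poll contains the fixed p.

```python
import math
import random

TRUE_P = 0.5     # hypothetical fixed population proportion (never random)
N = 100          # sample size
TRIALS = 10_000  # number of repeated polls (an arbitrary choice)

rng = random.Random(1)  # fixed seed so the run is repeatable


def interval_from_sample(p, n, rng):
    """Simulate one poll and return its (low, high) two-standard-error interval."""
    p_hat = sum(rng.random() < p for _ in range(n)) / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 2 * se, p_hat + 2 * se


hits = 0
for _ in range(TRIALS):
    low, high = interval_from_sample(TRUE_P, N, rng)
    if low <= TRUE_P <= high:
        hits += 1

coverage = hits / TRIALS
print(f"fraction of intervals containing p: {coverage:.3f}")
```

The fraction should come out close to, though not exactly, 95%. The randomness is entirely in the intervals; p stays the same in every trial.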

- At 5:53, the statement that "There is 95% prob that p is within 2sigma_phat of phat" is awkward. It makes it seem that we are imagining another distribution with mean phat and standard deviation sigma_phat and saying that p lies within 2 sigma of that, which I don't think is correct. There is just one distribution for the various sample proportions phat, with mean p, and for each value of phat the mean will lie within the confidence interval with probability 95%. (11 votes)
- @3:45 Why not N-1 instead of N? (8 votes)
- In unbiased estimation of the population standard deviation, we use n - 1 to partially correct for the fact that a sample is likely to be less spread out than the population. That is a case of estimating a population parameter using a sample statistic.

Here we are not trying to estimate a population parameter. We know the sample mean is a good estimator of the population mean; we are just trying to quantify how good that estimator is, with the SE. The SE is a property of the sample, so there is no need for n - 1, as it does not have to do with the population. (2 votes)

- If p hat is a good approximation of p, why bother creating an interval? Why can't we simply say the sample proportion is the population proportion? (8 votes)
- P hat will be different each time we sample, so we cannot say that the sample proportion we get from a single sample is the true population proportion. But based on the sample we have, we can estimate a range for the true proportion, which is the interval, and its width depends on the confidence level. (1 vote)

- Why don't we use a 100% confidence interval? (4 votes)
- Because that would span the entire distribution. It is not very informative to say with 100% confidence that the value can lie anywhere on the distribution. (1 vote)
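To make this concrete, here is a sketch that computes how many standard deviations are needed to capture a given central area under a normal curve, using Python's standard-library `statistics.NormalDist`. The list of confidence levels is an arbitrary choice for illustration, and the helper name `z_multiplier` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def z_multiplier(confidence_level):
    """Number of standard deviations that captures the given central area."""
    # The central area leaves (1 - level) / 2 in each tail.
    return standard_normal.inv_cdf((1 + confidence_level) / 2)


levels = [0.90, 0.95, 0.99, 0.999, 0.999999]
multipliers = {level: z_multiplier(level) for level in levels}

for level, z in multipliers.items():
    print(f"{level:>9.4%} confidence -> about {z:.2f} standard deviations")
```

The multiplier for 95% is about 1.96, which the video rounds to 2. It keeps growing as the level approaches 100%, and under the normal model there is no finite multiplier at exactly 100%.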

- This seems so wrong. That probability of 95% applies when you have the true standard deviation. Since we are using a fake standard deviation (the standard error), how can we still use this probability of 95%? (6 votes)
- Isn't the standard error formula σ/n? How did we just suddenly say that SE = square root(p(p-1)/n)? (3 votes)
- Interesting question! First, there are some corrections: the general standard error formula is σ/square root(n), and the SE for a proportion is square root (p(1-p)/n).

In the context of proportions, σ can be thought of as the standard deviation of a Bernoulli random variable that has value 1 with probability p, and 0 with probability 1-p. The mean of this Bernoulli random variable is 0(1-p) + 1p = p, so the variance is (1-p)(0-p)^2 + p(1-p)^2 = p(1-p)(p+(1-p)) = p(1-p). Therefore, the standard deviation is σ = square root(p(1-p)).

So the SE for a proportion is σ/square root(n) = square root(p(1-p))/square root(n) = square root(p(1-p)/n).

Have a blessed, wonderful day! (4 votes)
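The derivation above can be checked numerically. This sketch computes the Bernoulli variance straight from the definition and compares it with p(1-p), then builds the standard error as σ/square root(n). The function names are hypothetical, and p = 0.54 with n = 100 are taken from the video's example.

```python
import math


def bernoulli_variance(p):
    """Variance from the definition: sum of P(x) * (x - mean)^2 over x in {0, 1}."""
    mean = 0 * (1 - p) + 1 * p
    return (1 - p) * (0 - mean) ** 2 + p * (1 - mean) ** 2


def standard_error(p, n):
    """Standard deviation of the sample proportion: sigma / sqrt(n)."""
    sigma = math.sqrt(bernoulli_variance(p))
    return sigma / math.sqrt(n)


p, n = 0.54, 100  # the video's sample proportion and sample size
se = standard_error(p, n)
print(f"variance from definition: {bernoulli_variance(p):.4f}")
print(f"p * (1 - p):              {p * (1 - p):.4f}")
print(f"standard error:           {se:.4f}")
```

The first two printed values agree, and the standard error is a little under 0.05, which matches the rounding used in the video.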

## Video transcript

- [Instructor] It is election season, and there is a runoff between candidate A and candidate B. And we are pollsters. And we're interested in figuring out, well, what's the likelihood that candidate A wins this election? Well, ideally, we would go to the entire population of likely voters right over here, let's say there's 100,000 likely voters, and we would ask every one of them, who do you support? And from that, we would be able to get the population proportion, which would be the proportion that support candidate A. But it might not be realistic. In fact, it definitely will not be realistic to ask, well, all 100,000 people. So instead, we do the thing that we tend to do in statistics, which is that we sample this population, and we calculate a statistic from that sample in order to estimate this parameter.

So let's say we take a sample right over here. So this sample size, let's say n equals 100. And we calculate the sample proportion that support candidate A. So out of the 100, let's say that 54 say that they're going to support candidate A. So the sample proportion here is 0.54. And just to appreciate that we're not always going to get 0.54, there could've been a situation where we sampled a different 100, and we would've maybe gotten a different sample proportion. Maybe in that one, we got 0.58.

And we already have the tools in statistics to think about this, the distribution of the possible sample proportions we could get. We've talked about it when we thought about sampling distributions. So you could have the sampling distribution of the sample proportions. And this distribution's going to be specific to what our sample size is, for n is equal to 100. And so we can describe the possible sample proportions we could get and their likelihoods with this sampling distribution. So let me do that. So it would look something like this.

Because our sample size is so much smaller than the population, it's way less than 10%, we can assume that each person we're asking, that it's approximately independent. Also, if we make the assumption that the true proportion isn't too close to zero or not too close to one, then we can say that, well, look, this sampling distribution is roughly going to be normal. So we'll have a normal, this kind of bell curve shape.

And we know a lot about the sampling distribution of the sample proportions. We know already, for example, and if this is foreign to you, I encourage you to watch the videos on this on Khan Academy, that the mean of this sampling distribution is going to be the actual population proportion. And we also know what the standard deviation of this is going to be. So, let me, maybe that's one standard deviation. This is two standard deviations. That's three standard deviations above the mean. That's one standard deviation,
two standard deviations, three standard deviations below the mean. So this distance, let me do this in a different color, this standard deviation right over here, which we denote as the standard deviation of the sample proportions, for this sampling distribution, we've already seen the formula there. It's the square root of p times one minus p, where p is, once again, our population proportion, all of that divided by our sample size. That's why it's specific for n equals 100 here.

And so in this first scenario, let's just focus on this one right over here, when we took a sample size of n equals 100 and we got the sample proportion of 0.54, we could've gotten all sorts of outcomes here. Maybe 0.54 is right over here. Maybe 0.54 is right over here. And the reason why I have this uncertainty is we actually don't know what the real population parameter is, what the real population proportion is.

But let me ask you maybe a slightly easier question. What is the probability that our sample proportion of 0.54 is within two standard deviations of p? Pause the video, and think about that. Well, that's just saying, look, if I'm gonna take a sample and calculate the sample proportion right over here, what's the probability that I'm within two standard deviations of the mean? Well, that's essentially going to be this area right over here. And we know, from studying normal curves, that approximately 95% of the area is within two standard deviations. So this is approximately 95%. 95% of the time that I take a sample size of 100 and I calculate this sample proportion, I'm going to be within two standard deviations.

But if you take this statement, you can actually construct another statement that starts to feel a little bit more, I guess we could say, inferential. We could say there is a 95% probability that the population proportion p is within two standard deviations of p-hat, which is equal to 0.54. Pause this video. Appreciate that these two are equivalent statements. If there's a 95% chance that our sample proportion is within two standard deviations of the true proportion, well, that's equivalent to saying that there's a 95% chance that our true proportion is
within two standard deviations of our sample proportion. And this is really, really interesting because if we were able to figure out what this value is, well, then we would be able to create what you could call a confidence interval.

Now, you immediately might be seeing a problem here. In order to calculate this, our standard deviation of this distribution, we have to know our population parameter. So pause this video, and think about what we would do instead. If we don't know what p is here, if we don't know our population proportion, do we have something that we could use as an estimate for our population proportion? Well, yes, we calculated p-hat already. We calculated our sample proportion. And so a new statistic that we could define is the standard error, the standard error of our sample proportions. And we can define that as being equal to, since we don't know the population proportion, we're going to use the sample proportion, the square root of p-hat times one minus p-hat, all of that over n. In this case, of course, n is 100. We do know that. And it actually turns out, I'm not going to prove it in this video, that this actually is an unbiased estimator for this right over here.

So this is going to be equal to the square root of 0.54 times one minus 0.54, so it's 0.46, all of that over 100. So we have the square root of .54 times .46 divided by 100, close my parentheses, Enter. So if I round to the nearest hundredth, it's going to be, actually, even if I round to the nearest thousandth, it's going to be approximately 5/100. So this is approximately 0.05.

So another way to say all of these things is, instead, we don't know exactly this, but now we have an estimate for it. So we could now say with 95% confidence, and that will often be known as our confidence level right over here, with 95% confidence, between, and so we'd want to go two standard errors below our sample proportion that we just happened to calculate. So that would be 0.54 minus two times 5/100. So that would be 0.54 minus 10/100, which would be 0.44. And we'd also want to go two standard errors above the sample proportion. So that would be that plus 10/100. And 0.64 of voters support A. And so this interval that we have right over here, from 0.44 to 0.64, this will be known as our confidence interval. And this will change, not just in the starting point and the end point, but the actual length of our confidence interval will change depending on what sample proportion we happened to pick for that sample of 100.

A related idea to the confidence interval is this notion of margin of error. And for this particular case, for this particular sample, our margin of error, because we care about 95% confidence, so that would be two standard errors. So our margin of error here is two times our standard error, would just be 0.1 or 0.10. And so we're going one margin of error above our sample proportion right over here and one margin of error below our sample proportion right over here to define our confidence interval. And as I mentioned, this margin of error is not going to be fixed every time we take a sample. Depending on what our sample proportion is, it's going to affect our margin of error because that is calculated, essentially, with the standard error.

Another interpretation of this is that the method that we used to get this confidence interval, when we use it over and over, it will produce intervals, and the intervals won't always be the same. It's gonna be dependent on our sample proportion, but it will produce intervals which include the true proportion, which we might not know and often don't know. It'll include the true proportion 95% of the time. I'll cover that intuition more in future videos. We'll see how the interval changes, how the margin of error changes. But when you do this calculation over and over and over again, 95% of the time, your true proportion is going to be contained in whatever interval you happen to calculate that time.

Now, another interesting question is, well, what if you wanted to tighten up the intervals on average? How would you do that? Well, if you wanted to lower your margin of error, the best way to lower the margin of error is if you increase this denominator right over here. And increasing that denominator means increasing the sample size. And so one thing that you will often see when people are talking about election coverage is, well, we need to sample more people in order to get a lower margin of error. But I'll leave you there, and I'll see you in future videos.
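The calculation walked through in the video can be sketched in a few lines of Python. The function names and the `multiplier` parameter are hypothetical; the multiplier of 2 is the video's approximation for 95% confidence, and the larger sample sizes in the loop are made-up polls that reuse the same p-hat purely to show the effect of n.

```python
import math


def margin_of_error(p_hat, n, multiplier=2):
    """Margin of error: multiplier times the standard error of p-hat."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return multiplier * standard_error


def confidence_interval(p_hat, n, multiplier=2):
    """Return (low, high): p-hat minus and plus one margin of error."""
    margin = margin_of_error(p_hat, n, multiplier)
    return p_hat - margin, p_hat + margin


# The video's poll: 54 of 100 sampled voters support candidate A.
p_hat, n = 54 / 100, 100
low, high = confidence_interval(p_hat, n)
print(f"interval: {low:.2f} to {high:.2f}")

# Hypothetical larger polls with the same p-hat, to show the effect of n.
for bigger_n in (100, 400, 1600):
    print(f"n = {bigger_n:>4}: margin of error = {margin_of_error(p_hat, bigger_n):.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error, which is why tightening a poll's interval requires many more respondents.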