
# Margin of error 1

Finding the 95% confidence interval for the proportion of a population voting for a candidate. Created by Sal Khan.

## Want to join the conversation?

- The variance you got doesn't match the variance calculated by sqrt[p(1 - p)].

I would like to know why. (5 votes)

- There are two possible explanations:

The relevant variance is p(1-p); your calculation of √(p(1-p)) is the standard deviation.

If that's not the reason, then note that Sal is treating "successes" as 1 and "failures" as 0, and then applying the usual sample variance formula, including division by n-1. The p(1-p) formula assumes division by n. Using p(1-p) will get you 0.2451, whereas Sal got 0.2476. (7 votes)
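The two formulas in this answer can be compared directly. Here is a minimal Python sketch, assuming the 57 zeros and 43 ones from Sal's example; the variable names are illustrative and not from the video.

```python
import statistics

# Sal's sample: 57 votes coded as 0 and 43 votes coded as 1.
sample = [0] * 57 + [1] * 43

p_hat = statistics.mean(sample)  # sample proportion

# pvariance divides by n, which matches the p(1-p) formula.
var_divide_by_n = statistics.pvariance(sample)

# variance divides by n - 1, which is what Sal does in the video.
var_divide_by_n_minus_1 = statistics.variance(sample)

print(round(p_hat * (1 - p_hat), 4))       # 0.2451
print(round(var_divide_by_n, 4))           # 0.2451
print(round(var_divide_by_n_minus_1, 4))   # 0.2476
```

The gap between the two results is the factor n/(n-1) = 100/99, so it shrinks as the sample grows.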

- In my stats module the interpretation of the confidence interval is not that we estimate the true population mean to be in a certain interval, as the true population mean is not a variable and is not subject to probability statements.

Rather, the confidence interval should be interpreted as saying: if I took a large number of samples from the population and used each sample mean as the center of a confidence interval, then the percentage of those intervals that contain the true population mean would be the confidence level. Is this correct? (6 votes)

- There are two things to note here:

1. Yes, you are correct in your understanding of a confidence interval and its interpretation.

2. The population mean *can* be subject to probability statements. For instance, it is perfectly valid to write:

0.95 = P( Xbar - 1.96 σ/√n < µ < Xbar + 1.96 σ/√n )

This is how we derive the formula for the confidence interval. However, this is only a valid probability statement when we are thinking of the sample mean, Xbar, as a random variable. The moment we use our sample to calculate the sample mean and plug it in, we have two actual numbers for the bounds of the interval, and hence there is nothing random anymore and we need to switch to the "confidence" interpretation. But before we plug in the observed sample mean to the formula, the "probability" interpretation is still valid.(7 votes)
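The repeated-sampling interpretation can be checked with a small simulation. This is only a sketch under assumed values: a normal population with a known σ, and a hypothetical helper named `coverage` that is not part of the video or the answer above.

```python
import math
import random

def coverage(mu, sigma, n, trials, z=1.96, seed=0):
    """Fraction of simulated intervals xbar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Check whether this sample's interval captures the true mean.
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials

# Assumed population: mean 10, standard deviation 2, samples of size 25.
print(coverage(mu=10.0, sigma=2.0, n=25, trials=2000))
```

The printed fraction should land close to 0.95. Each individual interval either contains µ or it doesn't; the 95% describes the procedure over many samples.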

- I am totally confused. What is s?

s-squared is the variance of the sample. So if you take its square root, that's the standard deviation of the sample at 7:12. Then why do you have to divide that by the square root of the sample size (n) to get the standard deviation? Why are there two standard deviations? What is the second standard deviation of? (7 votes)

- Around 7:10, I don't understand: if the sample mean was 0.43 and the sample standard deviation was 0.50, would this not possibly result in one standard deviation to the left being a negative value? (2 votes)
- I can't watch the video right now, but even if it's impossible for the things being measured to be negative (e.g. rainfall), those sorts of numbers can come out. If a lot of the data cluster around 0.4 but there are a few pretty large numbers (outliers), then the standard deviation can get pretty inflated. Some non-normal distributions might exhibit something like this. (5 votes)

- At 8:18 Sal mentions 100 possible values. What do these values represent? And the distribution is the sampling distribution of the sample mean; why is that, and how is it related to 100 discrete values? Thanks. (3 votes)
- The values represent the people who answered the survey about which candidate they were going to vote for. If someone indicated that they would vote for person A, then their vote would be assigned a value of 1. Otherwise, they indicated that they would vote for B and thus their vote would be assigned a value of 0. Hope that helps.(3 votes)

- None of that made sense to me, and I didn't understand where exactly you explained margin of error. Could you tell me what the margin of error is and how to calculate it? (3 votes)
- Hi,

The formula is ME (margin of error) = 2 times the square root of [p-hat times (1 minus p-hat), divided by the number of people surveyed]. The 2 stands for two standard deviations, which corresponds to roughly a 95% confidence interval. P-hat is the result of the survey as a decimal.

So suppose you have a survey with, let's say, 100 people taking it. You put 100 in the denominator. Say the survey finds that 48% disliked English, so this is the p-hat. Then you work through the formula: 0.48 times (1 minus 0.48) equals 0.2496. Divide that by 100 to get 0.002496, then take the square root, which is 0.049959984. Multiply by 2 to get 0.099919968. Then multiply by 100 to make it a percent, which gives 9.99%, which we round to 10%. That is the margin of error. At least I think it is; I am still learning all this stuff. (1 vote)
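The arithmetic in this answer can be sketched as a short Python function. The name `margin_of_error` and the default multiplier of 2 are illustrative choices that follow the answer's approximation; 1.96 is the more precise 95% critical value.

```python
import math

def margin_of_error(p_hat, n, z=2.0):
    """Margin of error for a sample proportion: z * sqrt(p_hat * (1 - p_hat) / n)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# The survey from the answer: 48% of 100 people.
me = margin_of_error(0.48, 100)
print(round(me, 4))        # 0.0999
print(round(me * 100, 1))  # 10.0 (as a percentage)
```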

- Could there be a possibility that the sample mean would not equal the population mean? For example, if we surveyed only the people who were going to vote for the given candidate, the sample mean would not equal the population mean. Please correct me if I'm wrong.(2 votes)
- The sample mean will *often* not equal the population mean. That's somewhat the point of Statistics: different samples will give different results, and we want to use just one sample to be able to generalize to the population. (3 votes)

- At 9:15, why does Sal write µ_x̄ instead of x̄ when he refers to the sampling distribution of the sample mean? Isn't that the same as the sample mean x̄? (3 votes)
- Mu of xbar is the mean of the normal distribution of sample means (and is also the population mean). So of course it's not the same as our (one) sample mean.(1 vote)
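To see the difference between one sample mean and the mean of the sampling distribution, one can simulate many surveys. This is a sketch that assumes a "true" p of 0.43 purely for illustration (the real p is unknown in the video); `simulate_sample_means` is a hypothetical helper.

```python
import random
import statistics

def simulate_sample_means(p, n, trials, seed=0):
    """Draw `trials` samples of n Bernoulli(p) values and return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        means.append(sum(sample) / n)
    return means

means = simulate_sample_means(p=0.43, n=100, trials=5000)

# Mean of the sample means: close to the assumed population mean p.
print(statistics.mean(means))

# Standard deviation of the sample means: close to sqrt(p * (1 - p) / n),
# which is the population standard deviation divided by sqrt(n).
print(statistics.stdev(means))
```

Each entry in `means` plays the role of one x̄. Their average approximates µ_x̄, and their spread is the "second" standard deviation: the one you get by dividing by the square root of n.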

- How does he know which value should equal 0 and which value should equal 1? He said that they could be switched, but if they were, the sample mean would be different, so how does that work? (1 vote)
- If you switched 0 and 1, the mean would indeed have a different value, because it would represent the proportion of people who vote for candidate A instead of the proportion who vote for candidate B. But it wouldn't affect the variance or standard deviation.

Using the values from Sal's example:

`x-bar = (57 * 0 + 43 * 1) / 100 = 0.43`

`s^2 = (57(0 - 0.43)^2 + 43(1 - 0.43)^2) / 99`

`= (57 * 0.43^2 + 43 * 0.57^2) / 99`

`≈ 0.2476`

If you switch 0 and 1, you get:

`x-bar = (57 * 1 + 43 * 0) / 100 = 0.57`

`s^2 = (57(1 - 0.57)^2 + 43(0 - 0.57)^2) / 99`

`= (57 * 0.43^2 + 43 * 0.57^2) / 99`

`≈ 0.2476`

So the sample variance is unchanged.

Since the **size** of a confidence interval only depends on the sample variance, the confidence interval bounds would be the same distance either side of the new mean as they were from the original mean.

In the second part to this video, Sal concludes that the 95% confidence interval is from 33% to 53% (10% either side of x-bar = 43%). That represents the expected proportion of votes for candidate **B**, which means the expected proportion of votes for candidate **A** would be 47% (= 100% - 53%) to 67% (= 100% - 33%).

After switching 0 and 1, you'd instead get a confidence interval of 47% to 67% (10% either side of x-bar = 57%), which represents the expected proportion of votes for candidate **A**. Thus, the expected proportion of votes for candidate **B** would be 33% (= 100% - 67%) to 53% (= 100% - 47%).

So, ultimately, you get the same results whichever way you assign 0 and 1.(5 votes)
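The symmetry described in this answer can be checked numerically. This is a minimal sketch, assuming Sal's 57/43 split and the "two standard deviations" interval from the video; `two_se_interval` is an illustrative name.

```python
import math
import statistics

def two_se_interval(sample):
    """Return (mean, low, high) using mean +/- 2 * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    half_width = 2 * s / math.sqrt(n)
    return mean, mean - half_width, mean + half_width

b_coded_1 = [0] * 57 + [1] * 43         # A -> 0, B -> 1 (Sal's coding)
a_coded_1 = [1 - x for x in b_coded_1]  # A -> 1, B -> 0 (switched coding)

mean_b, low_b, high_b = two_se_interval(b_coded_1)
mean_a, low_a, high_a = two_se_interval(a_coded_1)

print(f"B coded as 1: mean={mean_b:.2f}, interval=({low_b:.3f}, {high_b:.3f})")
print(f"A coded as 1: mean={mean_a:.2f}, interval=({low_a:.3f}, {high_a:.3f})")
```

The second interval is the first one reflected through 1: its lower bound equals 1 minus the first interval's upper bound, and vice versa.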

- Can we also use the formula for "sampling distribution of sample proportion" here?

sqrt(p(1-p)/n) = sqrt(0.43*0.57/100) = sqrt(0.002451) = 0.0495 ≈ 0.05 (2 votes)

## Video transcript

Say I live in a country of 100
million people and there's a presidential election
coming up. And in that presidential
election there are two candidates. There's candidate A,
and candidate B. And there's some reality-- let's
say I live in a very decisive country and everyone is
going to vote for either-- and everyone participates in
the election and everyone is going to vote for either candidate
A or candidate B. And so there's some percentage,
there's some reality there, that p-- let me
write it over here-- maybe 1 minus p percent-- let me do
the p first. There's some reality that maybe p percent
will vote for B, and I could switch them around
if I wanted. So p percent are going to vote
for B, and the rest of the people are going to vote for A,
so maybe 1 minus p percent are going to vote for A. And you might already recognize
that this is a Bernoulli Distribution. There's one of two values
for a sample I can get. And right here, the values I
said you're either voting for candidate A or you're voting
for candidate B. It's very hard to deal
with those values. You can't calculate a mean
between A and B and all of that-- those are letters,
they're not numbers. So to make it manipulatable
mathematically we're going to say sampling someone who's
going to vote for A is equivalent to sampling a 0,
and sampling someone who's going to vote for B is
equivalent to sampling a 1. And if you do that with a
Bernoulli Distribution, we learned in the video on
Bernoulli Distributions, that the mean of this distribution
right here is going to be equal to p. And it's a pretty
straightforward proof for how we got that. So the mean of this
distribution, which will actually be not a value that
this distribution can take on, is going to be some place over
here and it is going to be equal to p. Now my country has a
100 million people. It is practically, or is
definitely impossible for me to be able to go and ask all
100 million people who are they going to vote for. So I won't be able to exactly
figure out what these parameters are going to be. What my mean is, what
p is going to be. But instead of doing that, what
I'm going to do is do a random survey. I'm going to sample this
population, look at that data, and then get an estimate
of what p really is. Because this is what I
really care about. I really care about p. So I'm going to try to estimate
p with a sample, and then we're also going to think
about how good of an estimate that is. So I am going to randomly
survey, or sample, 100 people. And let's say I got the
following results. Let's say that 57 people say
that they were going to vote for person A. Let me write it this way. So 57 people say they're going
to vote for A, or that's equivalent to getting
57 samples of 0. And then the rest of the people,
once again, very decisive population, no one is
undecided, the rest of the people, so 43 people say they're
going to vote for B. Or that's the equivalent
of sampling 43 1's. Now given this sample here, what
is my sample mean and my sample variance? My sample mean right here, well
that's just going to be the average of these 0's and
1's. So I've got 57 0's, so it's going to be 57 times
0 plus my 43 1's. So the sum of all of my samples,
so it's 43 1's, plus 43 times 1, over the total
number of samples I took, over 100. So what does this get me? So 57 times 0 is 0. 43 times 1 divided
by 100 is 0.43. That is my sample mean, the
mean of just the 100 data points that I actually got. Now what is my sample
variance? Sample variance is going to be
equal to the sum of my squared distances to the mean divided
by my samples minus 1. Remember, this is a sample
variance, and we want to get the best estimator of the real
variance of this distribution. And to do that you don't divide
by 100, you're going to divide by 100 minus 1. We learned that many,
many videos ago. So I have 57. So I had 57 samples of 0. We'll do it in that same
yellow color-- 57 samples of 0. And so each of those samples
are 0 minus 0.43 away from the mean. Each of those samples are 0. You subtract 0.43-- this
is the difference between 0 and 0.43. And if I want the squared
distance, I square it-- that's how we calculate variance. There's 57 of those. And then there's 43 times that
I sampled a 1 in my sample population-- 43 times I sampled
a 1, and the 1 is 1 minus 0.43 away from the mean
because that is the mean, and I want to square
that distance. And then I don't want to
just divide it by n. I don't want to just divide by
100-- remember, I'm trying to estimate the true
population variance. In order for this to be the best
estimator of that, and I gave you the intuition of why
many, many videos ago, we divide by 100 minus 1 or 99. Let's get the calculator out
to actually figure out our sample variance. So let me get the calculator
out, and we have-- I'll do the numerator first. I have 57 times
0 minus 0.43 squared, plus 43 times 1 minus
0.43 squared. And then all of that divided
by 100 minus 1, or 99-- divided by 99 is equal
to 0.2475. So my variance, my sample
variance, is equal to 0.2475. And if I want to figure out my
sample standard deviation I just take the square
root of that. My sample standard deviation is
just going to be the square root of my sample variance. So I take the square root of
that value that I just had, which is 0.497. So actually let me just
round that up as 0.50. So my sample standard
deviation is 0.50. Now if you just look at this,
you say OK, well your best estimate of the percentage of
people voting for A or B is really what you just saw here. Your best estimate or your best
estimate of the mean is that 43% of people are going
to vote for B and everyone else is going to vote for A. But an interesting question
is how good of a sample is that? Let's take it to
the next level. Let's try to think of an
interval around 43% for which we are 95%, that we're
reasonably confident, roughly 95% sure that the real mean
is in that interval. Let me make it very clear. Let me draw. So when we get our sample mean
we are sampling from the sampling distribution of
the sample mean. Let me draw that. The sampling distribution
of the sample mean. So since we're sampling from a
discrete distribution it's actually going to be a discrete
distribution, but it's going to have 100
possible values. This can take on 100 different
values here. Really anything between
0 and 1. But I'll draw it kind of
continuous because it would be hard for me to draw 100
different bars. If I did, you'd have a bar
there, you'd have a bar there. The odds that your sample mean
would be 1, it would be a very low probability, and then you
would have one more bar, a bar like that, a bar like
that, but that takes forever to draw. So I'm just going to approximate
it with this normal curve right over there. So the sampling distribution
of the sample mean-- let me write it over here. So this is the sampling
distribution of the sample mean. It has some mean here. It has a mean, and I can denote
it with the mu sub x bar-- this tells us this is
the mean of the sampling distribution. But we know from many, many
videos that this is going to be the same thing as the mean of
the population that we are sampling from, that each
sample comes from, each of these 100 samples come from. So this is going to be equal
to mu, which is going to be equal to p. Now this variance over here,
the variance of this distribution-- let me draw it
like this, or even better let's do the standard
deviation of this distribution. The standard deviation of this
distribution, that distance right over here, the standard
deviation of the sampling distribution of the sample
mean-- we've seen it multiple times already-- it's going to
be this standard deviation-- it's going to be the standard
deviation of our population distribution. So that standard deviation
is going to be that distance over there. So there's some standard
deviation associated with this distribution. It's going to be that standard
deviation divided by the square root of our
sample size. And we saw many videos ago why
that, at least experimentally makes sense, or why it
intuitively makes sense. So it's going to be the
square root of 100. So it's going to be this
guy divided by 10. Now we do not know
what this guy is. The only way to figure out
what that guy is is to actually survey 100 million
people, which would have been impossible. So to estimate the standard
deviation of this, we will use our sample standard deviation
as our best estimate for the population standard
deviation. So we could say-- and remember,
this is an estimate. We cannot come up with the exact
number for this just from a sample. But we can estimate it. Because this is our best
estimator for this standard deviation, and if we divide it
by 10, we will have our best estimator for the standard
deviation of the sampling distribution of the
sample mean. So remember, this is
just an estimate. It is just an estimate. So you kind of have to take
everything after this point with a little bit of
a grain of salt. So it's going to be roughly
equal to or an estimate of it is going to be 0.5. And remember, every time we take
a different sample from here this number is
going to change. So this isn't like something
in stone. This is dependent
on our sample. So it's going to wiggle around a
little bit depending on what numbers we actually
get in our sample. But it's going to be 0.50. This is the s right here, this
0.50 divided by 10, which is equal to 0.05. So our best estimate of this
standard deviation is 0.05, or you could even view it as 5%. Now what I want to do is come up
with an interval around the sample mean where I'm reasonably
confident using all my estimates and all that that
there's a-- let me say I'm really confident that there's
a 95% chance that the true mean is within two standard
deviations-- or let me put it this way, there's a 95%
chance that the true mean is in that interval. So let me write this down. I want to find an interval
such that I am reasonably confident-- and I'm putting
this kind of touchy-feely language over here because it's
all around the fact that I don't know for a fact that
the standard deviation is 0.05, I'm just estimating. But I'm reasonably confident
that there is a 95% chance that the true mean of the
population, which is the same thing as the proportion of the
population who are going to vote for person B, or the
proportion of the population that are going to be a 1. So this is also, we just
have to remember that mu is equal to p. That there's a 95% chance
that the true p is in that interval. And actually, since I've already
gone 14 minutes into this video, I'm going to pause
this video, I'm going to stop this video here, and maybe I'll
even let you think about it just based on everything
we've done so far. We figured out the sample mean--
sorry, we figured out the sample mean right
over here. We've figured out an estimate
for the-- and remember, this is just a sample mean. We don't know the true-- this
is the mean of our sample. We don't know the true mean of
the sampling distribution, and we also don't know the true
standard deviation of the sampling distribution. But we were able to estimate
it with the sample standard deviation. Now everything that we have so
far, and based on what we've seen before on confidence
intervals and all that, how can we find an interval such
that roughly-- and I'm saying roughly because we had to
estimate the standard deviation-- that there's a 95%
chance that the true mean of our population, or the p, the
proportion of the population saying 1, is in that interval? And we're going to do that
in the next video.
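The calculations in this transcript can be reproduced in a few lines. This is a sketch that follows the steps as spoken, with variable names chosen here for illustration; it stops where the video stops, at the estimated standard deviation of the sampling distribution.

```python
import math

# Survey results from the video: 57 people for A (coded 0), 43 for B (coded 1).
n_a, n_b = 57, 43
n = n_a + n_b

# Sample mean: the average of the 0's and 1's.
sample_mean = (n_a * 0 + n_b * 1) / n

# Sample variance: squared distances to the mean, divided by n - 1.
sample_variance = (
    n_a * (0 - sample_mean) ** 2 + n_b * (1 - sample_mean) ** 2
) / (n - 1)

# Sample standard deviation: square root of the sample variance.
sample_sd = math.sqrt(sample_variance)

# Estimated standard deviation of the sampling distribution of the sample mean.
standard_error = sample_sd / math.sqrt(n)

print(f"sample mean     = {sample_mean:.2f}")
print(f"sample variance = {sample_variance:.4f}")
print(f"sample sd       = {sample_sd:.3f}")
print(f"standard error  = {standard_error:.3f}")
```

The video rounds the sample standard deviation up to 0.50 before dividing by 10, which gives the 0.05 used in the rest of the discussion.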