## Estimating a population proportion

# Margin of error 2

## Video transcript

Where we left off in the
last video I kind of gave you a question. Find an interval so that we're
reasonably confident-- we'll talk a little bit more about why
I have to give this kind of vague wording right here--
reasonably confident that there's a 95% chance that the
true population mean, which is p, which is the same thing as
the mean of the sampling distribution of the
sample mean. So there's a 95% chance that
the true mean-- and let me put this here. This is also the same thing as
the mean of the sampling distribution of the sample
mean-- is in that interval. And to do that let me just
throw out a few ideas. What is the probability that if
I take a sample and I were to take a mean of that sample,
so the probability that a random sample mean is within two
standard deviations of the sampling mean, the mean of the sampling distribution of
our sample mean? So what is this probability
right over here? Let's just look at our
actual distribution. So this is our distribution,
this right here is our sampling mean. Maybe I should do it in
blue because that's the color up here. This is our sampling mean. And so what is the probability
that a random sample mean is going to be within two standard
deviations of it? Well a random sample mean is a
sample from this distribution. It is a sample from the sampling
distribution of the sample mean. So it's literally what is the
probability of finding a sample within two standard
deviations of the mean? That's one standard deviation,
that's another standard deviation right over there. In general, if you haven't
committed this to memory already, it's not a bad thing
to commit to memory, is that if you have a normal
distribution the probability of taking a sample within two
standard deviations of the mean is 95-- and if you want to get
a little bit more accurate it's 95.4%. But you could say it's roughly--
or maybe I could write it like this--
it's roughly 95%. And really that's all that
matters because we have this little funny language here
called reasonably confident, and we have to estimate the
standard deviation anyway. In fact, we could say if we
want, I could say that it's going to be exactly
equal to 95.4%. But in general, two standard
deviations, 95%, that's what people equate with each other. Now this statement is the
exact same thing as the probability that the sampling mean-- not
the sample mean, but the mean of the sampling distribution-- is within
two standard deviations (of the sampling distribution) of
the sample mean x, which is also going to be the same number, is also going
to be equal to 95.4%. These are the exact
same statements. If x is within two standard
deviations of this, then this, then the mean, is within two
standard deviations of x. These are just two ways of
phrasing the same thing. Now we know that the mean of the
sampling distribution is the same thing as the mean of the
population distribution, which is the same thing as the
parameter p-- the proportion of people or the proportion of
the population that is a 1. So this right here is the same
thing as the population mean. So this statement right here
we can switch this with p. So the probability that p is
within two standard deviations of the sampling distribution
of x is 95.4%. Now we don't know what this
number right here is. But we have estimated it. Remember,
this is the true standard
deviation of the population divided by the square root of n, which for our 100 samples is 10. We can estimate the true
standard deviation of the population with our sample
standard deviation, which was 0.5, so that's 0.5 divided by 10. Our best estimate of the
standard deviation of the sampling distribution of the
sample mean is 0.05. So now we can say-- and I'll
switch colors-- the probability that the parameter
p, the proportion of the population saying 1, is within
two times-- remember, our best estimate of this right here is
0.05-- of a sample mean that we take is equal to 95.4%. And so we could say the
probability that p is within 2 times 0.05-- which is going to be equal
to 0.10-- of our mean is equal to
95-- and actually let me be a little careful here. I can't say the equal now,
because over here if we knew this, if we knew this parameter
of the sampling distribution of the sample
mean, we could say that it is 95.4%. We don't know it. We are just trying to find our
best estimator for it. So actually what I'm going to
do here is actually just say is roughly-- and just to show
that we don't even have that level of accuracy, I'm going
to say roughly 95%. We're reasonably confident that
it's about 95% because we're using this estimator that
came out of our sample, and if the sample is really
skewed this is going to be a really weird number. So this is why we just have to
be a little bit more careful about what we're claiming. But this is the tool
for at least saying how good is our result. So this is going to
be about 95%. Or we could say that the
probability that p is within 0.10 of the sample mean
that we actually got is about 95%. So what was the sample mean
that we actually got? It was 0.43. So if we're within 0.1 of 0.43,
that means we are within 0.43 plus or minus 0.1, and the probability of that is
also, roughly, we're reasonably confident,
about 95%. And I want to be very clear. Everything that I started all
the way from up here in brown to yellow and all this magenta,
I'm just restating the same thing inside of this. It became a little bit more
loosey-goosey once I went from the exact standard deviation of
the sampling distribution to an estimator for it. And that's why this is just
becoming-- I kind of put the squiggly equal signs there
to say we're reasonably confident-- and I even got rid
of some of the precision. But we just found
our interval. An interval that we can be
reasonably confident that there's a 95% probability that
p is within that, is going to be 0.43 plus or minus 0.1. Or an interval of-- we have
a confidence interval. We have a 95% confidence
interval of, and we could say, 0.43 minus 0.1 is 0.33. If we write that as a percent
we could say 33% to-- and if we add the 0.1, 0.43 plus
0.1 we get 53%-- to 53%. So we are 95% confident. So we're not saying kind of
precisely that the probability of the actual proportion being in this interval is 95%,
but we're 95% confident that the true proportion
is between 33% and 53%. That p is in this
range over here. Or another way, and you'll see
this in a lot of surveys that have been done, people will say
we did a survey and we got 43% will vote for number one,
and number one in this case is candidate B. And then the other side, since
everyone else voted for candidate A, 57% will
vote for A. And then they're going to
put on margin of error. And you'll see this in any
survey that you see on TV. They'll put a margin of error. And the margin of error is just
another way of describing this confidence interval. And they'll say that the margin
of error in this case is 10%, which means that there's
a 95% confidence interval, if you go plus or
minus 10% from that value right over there. And I really want to emphasize,
you can't say with certainty that there is a 95%
chance that the true result will be within 10% of this,
because we had to estimate the standard deviation of
the sampling distribution. But this is the best measure
we can get with the information we have. If you're going to
do a survey of 100 people, this is the best kind of
confidence that we can get. And this number is actually
fairly big. So if you were to look at this
you would say, roughly there's a 95% chance that the true
value of this number is between 33% and 53%. So there's actually still a
chance that candidate B can win, even though only
43% of your 100 are going to vote for him. If you wanted to make it a
little bit more precise you would want to take
more samples. You can imagine. Instead of taking 100 samples,
instead of n being 100, if you made n equal 1,000, then you
would take this number
here and divide by the square root of 1,000 instead
of the square root of 100. So you'd be dividing
by about 31.6 instead of 10. And so then the size of the
standard deviation of your sampling distribution
will go down. And so the distance of two
standard deviations will be a smaller number, and so
then you will have a smaller margin of error. And maybe you want to get the
margin of error small enough so that you can figure out
decisively who's going to win the election.