Confidence intervals
Confidence Interval Example
- In a local teaching district, a technology grant is
- available to teachers in order to install a cluster of four
- computers in their classroom.
- From the 6,250 teachers in the district, 250 were randomly
- selected and asked if they felt that computers were an
- essential teaching tool for their classroom.
- Of those selected, 142 teachers felt that the
- computers were an essential teaching tool.
- And then they ask us, calculate a 99% confidence
- interval for the proportion of teachers who felt that the
- computers are an essential teaching tool.
- So let's just think about the entire population.
- We weren't able to survey all of them, but the entire
- population, some of them fall in the bucket, and we'll
- define that as 1, they thought it was a good tool.
- They thought that the computers were a good tool.
- And we'll just define a 0 value as a teacher
- that says not good.
- And some proportion of the total teachers think that it
- is a good learning tool.
- So that proportion is p.
- And then the rest of them think it's a bad learning
- tool, 1 minus p.
- We have a Bernoulli Distribution right over here,
- and we know that the mean of this distribution or the
- expected value of this distribution is
- actually going to be p.
- So it's going to be a value that is neither 0 nor 1, so
- not a value that you could actually get out of a
- teacher if you were to ask them.
- They cannot say something in between good and not good.
- The actual expected value is something in between.
- It is p.
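The expected value described above can be written out as a one-line calculation. The symbol X (the 0-or-1 answer of a single randomly chosen teacher) is introduced here only for illustration; it does not appear in the video.

```latex
E[X] = 1 \cdot p + 0 \cdot (1 - p) = p
```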
- Now what we did is take a sample of 250
- teachers, and we got that 142 felt that the computers were
- an essential teaching tool.
- So in our survey we had 250 sampled, and 142
- said that it is good, and we'll say that this is a 1.
- So we got 142 1's, or we sampled a 1 142 times from this
- distribution.
- And then the rest of the time, so what's left over?
- There's another 108 who said that it's not good.
- So 108 said not good, or you could view them as you were
- sampling a 0, right?
- 108 plus 142 is 250.
- So what is our sample mean here?
- We have 1 times 142, plus 0 times 108 divided by our total
- number of samples, divided by 250.
- It is equal to 142 over 250.
- You could even view this as the sample proportion of
- teachers who thought that the computers were a
- good teaching tool.
- Now let me get a calculator out to calculate this.
- So we have 142 divided by 250 is equal to 0.568.
- So our sample proportion is 0.568,
- or 56.8%, either one.
- So 0.568.
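As a rough sketch of the calculation just described, the survey can be encoded as a list of 1's and 0's and averaged. The variable names are illustrative assumptions, not anything from the video.

```python
# Encode the survey as 0/1 outcomes: 1 = "essential tool", 0 = "not good".
n_total = 250          # teachers sampled
n_yes = 142            # teachers who answered 1
n_no = n_total - n_yes # the remaining 108 answered 0

responses = [1] * n_yes + [0] * n_no

# For 0/1 data the sample mean is the same thing as the sample proportion.
sample_proportion = sum(responses) / len(responses)
print(sample_proportion)  # 0.568
```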
- Now let's also figure out our sample variance because we can
- use it later for building our confidence interval.
- For our sample variance, we're going to take the weighted sum of the squared
- differences from the mean and divide by the number of samples minus 1,
- so we can get the best
- estimator of the true variance.
- We have 142 samples that were 1 minus 0.568 away from our
- sample mean, or we're this far from the sample mean 142
- times, and we're going to square those distances.
- Plus the other 108 times we got a 0, so we were 0 minus
- 0.568 away from the sample mean.
- And then we are going to divide that by the total
- number of samples minus 1.
- That minus 1 is our adjuster so that we don't
- underestimate.
- So 250 minus 1.
- Let's get our calculator out again.
- We put parentheses around everything in the numerator:
- I have 142 times 1 minus 0.568 squared, plus
- 108 times 0 minus 0.568 squared-- you could obviously do parts of
- this in your head, but I'm just going to write the whole
- thing out-- and then all of that
- divided by 250 minus 1, which is 249.
- So our sample variance is-- well, I'll just say 0.246.
- If you take the square root of that, our sample
- standard deviation is going to be 0.496.
- I'll just round that up to 0.50.
- So that is our sample standard deviation.
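The variance arithmetic above can be checked with a short sketch. It computes the weighted sum by hand, as in the video, and compares it with `statistics.variance` from the Python standard library, which also divides by n minus 1. The variable names are assumptions made for this example.

```python
import math
import statistics

responses = [1] * 142 + [0] * 108
n = len(responses)
sample_mean = sum(responses) / n  # 0.568

# Weighted sum of squared distances from the sample mean, divided by n - 1.
manual_variance = (
    142 * (1 - sample_mean) ** 2 + 108 * (0 - sample_mean) ** 2
) / (n - 1)

# The standard library's sample variance uses the same n - 1 denominator.
library_variance = statistics.variance(responses)

sample_std = math.sqrt(manual_variance)
print(round(manual_variance, 3), round(sample_std, 3))  # 0.246 0.496
```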
- Now for this interval, let's think of it this way: we are
- sampling from the sampling distribution
- of the sample mean.
- So it looks like this over here.
- And it has some mean, and so the mean of the sampling
- distribution of the sample mean is actually going to be
- the same thing as this mean over here-- it's going to be
- the same mean value-- which is the same thing as our
- population proportion.
- We've seen this multiple times.
- And the sampling distribution's standard
- deviation, so the standard deviation of the sampling
- distribution, so we could view that as one standard deviation
- right over there.
- So the standard deviation of the sampling distribution,
- we've seen multiple times, is equal to the standard
- deviation-- let me do this in a different color-- is equal
- to the standard deviation of our original population
- divided by the square root of the number of samples.
- So it's divided by the square root of 250.
- Now we do not know this right over here.
- We do not know the actual standard deviation in our
- population.
- But our best estimate of that is the sample standard
- deviation, and that's why we call it a confidence interval.
- We're confident that the real mean, or the real
- population proportion, is going to be in this interval.
- We're confident, but we're not 100% sure because we're going
- to estimate this over here, and if we're estimating this
- we're really estimating that over there.
- So this is only going to be approximately right, and
- if we got a weird, completely skewed sample, it
- might not even be approximately right.
- So we are confident that the standard deviation of our sampling
- distribution is going to be around our
- sample standard deviation divided by the square root of 250.
- So 0.50 divided by the square root of 250, and what's that
- going to be?
- That is going to be-- so we have this value right over
- here, and actually I don't have to round it, divided by
- the square root of 250.
- We get 0.031.
- So this is equal to 0.031 over here.
- So that's one standard deviation.
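A minimal sketch of this step, assuming the sample standard deviation is used in place of the unknown population standard deviation, as the video does. The names `sample_std` and `standard_error` are illustrative.

```python
import math

n = 250
p_hat = 142 / n

# Sample variance with the n - 1 denominator, as computed earlier.
sample_variance = (142 * (1 - p_hat) ** 2 + 108 * (0 - p_hat) ** 2) / (n - 1)
sample_std = math.sqrt(sample_variance)

# Estimated standard deviation of the sampling distribution of the sample mean.
standard_error = sample_std / math.sqrt(n)
print(round(standard_error, 3))  # 0.031
```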
- Now they want a 99% confidence interval.
- So the way I think about it is this: if I randomly pick a sample
- mean from the sampling distribution,
- how many standard deviations away from the mean do we have
- to be so that we can be 99% confident that any sample from
- the sampling distribution will be in that interval?
- So another way to think about it: think about how many
- standard deviations we need to be away from the mean, so
- we're going to be a certain number of standard deviations
- away from the mean such that any sample, any mean that we
- sample from here, any sample from this distribution has a
- 99% chance of being plus or minus that many standard
- deviations.
- So it might be from there to there.
- So that's what we want.
- We want a 99% chance that if we pick a sample from the
- sampling distribution of the sample mean, it will be within
- this many standard deviations of the actual mean.
- And to figure that out let's look at an actual Z-table.
- So we want 99% confidence.
- So another way to think about it: if we want 99% confidence
- and we just look at the upper half right over here, that
- orange area should be 0.495, because if this is 0.495 then
- the symmetric area below the mean is also going to be 0.495,
- so that their sum will be 99% of the area.
- Now, the Z-table gives the cumulative area, which includes
- everything below the mean, and all of that area is 0.5.
- So the value we need to look for on the Z-table
- is going to be 0.5 plus 0.495.
- It's going to be 0.995.
- So let's look at our Z-table.
- So where do we get 0.995 on our Z-table?
- The closest value, with just a little error, is
- right over here-- this is 0.9951.
- This value right here gives us the whole cumulative area
- up to that many standard deviations above the mean.
- So if you look at the entire distribution like this, this
- is the mean right over here.
- This tells us that at 2.5 standard deviations above the
- mean, so this is 2.5 standard deviations above the mean.
- So this is 2.5 times the standard deviation of the
- sampling distribution.
- If you look at this whole area, this whole area over
- here, if you look at the Z-table, is going to be
- 0.9951, which tells us that just this area right over here
- is going to be 0.4951, which tells us that this area plus
- the symmetric area of that many standard deviations below
- the mean, if you combine them, 0.4951 times
- 2 gets us to 0.9902.
- So this whole area right here is 99.02%.
- So if we look at the area 2.5 standard deviations above and
- below the mean-- oh, let me be careful.
- This isn't just 2.5, we have to add
- another digit of precision.
- This is 2.5, and the next digit of precision is given by
- this column over here.
- So we have to look all the way up into the second to the last
- column, and we have to add a digit of 8 here.
- So this is 2.58 standard deviations.
- We have 2.5 over here, and then we get the next digit 8
- from the column.
- 2.58 standard deviations above and below the
- mean encompasses a little over 99% of the total
- probability.
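The table lookup above can be reproduced, as a sketch, with `statistics.NormalDist` from the Python standard library (available since Python 3.8). The video reads 2.58 off a printed Z-table; the code also shows the slightly smaller value that comes from inverting the cumulative distribution directly.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# The point that has cumulative area 0.995 below it (0.5 + 0.495).
z_exact = standard_normal.inv_cdf(0.995)

# The table value used in the video, and the central area it covers.
z_table = 2.58
central_area = 2 * standard_normal.cdf(z_table) - 1

print(round(z_exact, 4))       # 2.5758
print(round(central_area, 4))  # 0.9901
```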
- So there's a little over a 99% chance that any sample mean
- that I select from the sampling distribution of the
- sample mean will fall within this much of
- the standard deviation.
- So let me put it this way.
- There is a 99.02% chance, right?
- If you multiply 0.4951 times 2 you
- get 0.9902.
- So we'll say there is roughly a 99% chance that a
- random sample mean is within 2.58 standard deviations of
- the mean of the sampling distribution
- of the sample mean, which is the same thing as our actual
- population mean, which is the same thing as our population
- proportion, p.
- And we know what this value is right here.
- At least we have a decent estimate for this value.
- We don't know exactly what this is, but our best estimate
- for this value is this over here.
- So we could re-write this, so we could say that we are
- confident because we are really using an estimator to
- get this value here.
- We are confident that there is a 99% chance that a random x,
- a random sample mean, is within-- and let's figure out
- this value right here using a calculator.
- So it is 2.58 times our best estimate of the standard
- deviation of the sampling distribution, so times 0.031
- is equal to roughly 0.08, so a random sample mean
- is within 0.08 of the population
- proportion.
- Or you could say that you're confident that the population
- proportion is within 0.08 of your sample mean.
- That's the exact same statement.
- So if we want our confidence interval, our actual number
- that we got for there, our actual sample
- mean we got was 0.568.
- So we could replace this, and actually let me do it.
- I can delete this right here.
- Let me clear it.
- I can replace this, because we actually did take a sample.
- So I can replace this with 0.568.
- So we could be confident that there's a 99% chance that
- 0.568 is within 0.08 of the population proportion, which
- is the same thing as the population mean, which is the
- same thing as the mean of the sampling distribution of the
- sample mean, so forth and so on.
- And just to make it clear we can actually swap these two.
- It wouldn't change the meaning.
- If this is within 0.08 of that, then that is
- within 0.08 of this.
- So let me switch this up a little bit.
- So we could say that p is within 0.08
- of 0.568.
- And now linguistically it sounds a little bit more like
- a confidence interval.
- We are confident that there's a 99% chance that p is within
- 0.08 of the sample mean of 0.568.
- So what would be our confidence interval?
- It will be 0.568 plus or minus 0.08.
- And what would that be?
- If you add 0.08 to this right over here, at the upper end
- you're going to have 0.648.
- And at the lower end of our range,
- if we subtract 0.08 from this we get 0.488.
- So we are 99% confident that the true population proportion
- is between these two numbers.
- Or, put another way, we're 99% confident that
- the true percentage of teachers that like the
- computers is between 48.8% and 64.8%.
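Putting the steps together, here is a minimal sketch of the whole interval calculation. The function name `proportion_confidence_interval` is hypothetical, and the code keeps full precision instead of rounding the margin to 0.08, so its endpoints differ from the video's 0.488 and 0.648 in the third decimal place.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.99):
    """Normal-approximation interval for a proportion (illustrative sketch)."""
    p_hat = successes / n
    # Sample variance of the 0/1 data with the n - 1 denominator.
    sample_variance = (
        successes * (1 - p_hat) ** 2 + (n - successes) * (0 - p_hat) ** 2
    ) / (n - 1)
    # Estimated standard deviation of the sampling distribution.
    standard_error = math.sqrt(sample_variance / n)
    # Critical value: cumulative area of 0.5 + confidence / 2.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


lower, upper = proportion_confidence_interval(142, 250)
print(round(lower, 3), round(upper, 3))  # 0.487 0.649
```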
- Now we answered the first part of the question.
- The second part: how could the survey be changed to narrow
- the confidence interval, but maintain the
- 99% confidence level?
- Well, you could just take more samples.
- If you take more samples then our estimate of the standard
- deviation of this distribution will go down because this
- denominator will be higher.
- If the denominator is higher then this whole
- thing will go down.
- So if the standard deviations go down here, then when we
- count the standard deviations, when we do the plus or minus
- on the range, this value will go down and
- will narrow our range.
- So you just take more samples.
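To illustrate how the margin shrinks with sample size, the sketch below holds the sample proportion at 0.568 and varies n. Both are assumptions made for illustration: a real larger survey would produce its own proportion, and the code uses the simpler p(1 - p)/n form of the variance, not the n minus 1 version from the video. The function name `margin_of_error` is hypothetical.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.99):
    """Half-width of the normal-approximation interval (illustrative sketch)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Simplification: plug-in variance p_hat * (1 - p_hat) with denominator n.
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical sample sizes; the proportion is held fixed at 0.568.
for n in (250, 1000, 4000):
    print(n, round(margin_of_error(0.568, n), 4))
```

Because the margin scales with one over the square root of n, quadrupling the sample size halves the width of the interval at the same confidence level.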