Video transcript

In a local teaching district, a technology grant is available to teachers in order to install a cluster of four computers in their classroom. From the 6,250 teachers in the district, 250 were randomly selected and asked if they felt that computers were an essential teaching tool for their classroom. Of those selected, 142 teachers felt that the computers were an essential teaching tool. And then they ask us: calculate a 99% confidence interval for the proportion of teachers who felt that the computers are an essential teaching tool.

So let's just think about the entire population. We weren't able to survey all of them, but in the entire population some of them fall in the bucket that we'll define as 1: they thought that the computers were a good tool. And we'll define a 0 value as a teacher that says not good. Some proportion of the total teachers think that it is a good learning tool, and that proportion is p. The rest of them think it's a bad learning tool, so that proportion is 1 - p. We have a Bernoulli distribution right over here, and we know that the mean of this distribution, or the expected value of this distribution, is actually going to be p. So it's a value that is neither 0 nor 1, not an actual value that you could get out of a teacher if you were to ask them. They cannot say something in between good and not good, but the expected value is something in between: it is p.

Now what we do is take a sample of those 250 teachers, and we got that 142 felt that the computers were an essential teaching tool. So in our survey we had 250 sampled, and 142 said that it is good, and we'll say that this is a 1. So we got 142 ones, or we sampled a 1 142 times from this distribution. And then the rest of the times, so what's left
over, there's another 108 who said that it's not good, or you could view them as sampling a 0. 108 plus 142 is 250.

So what is our sample mean here? We have 1 times 142, plus 0 times 108, divided by our total number of samples, 250. It is equal to 142 over 250. You could even view this as the sample proportion of teachers who thought that the computers were a good teaching tool. Let me get a calculator out to calculate this: 142 divided by 250 is equal to 0.568. So our sample proportion is 0.568, or 56.8%, either one.

Now let's also figure out our sample variance, because we can use it later for building our confidence interval. We're going to take the weighted sum of the squared differences from the mean and divide by the number of samples minus 1, so we can get the best estimator of the true variance. We have 142 samples that were 1 minus 0.568 away from our sample mean, so we're this far from the sample mean 142 times, and we're going to square those distances. Plus the other 108 times we got a 0, so we were 0 minus 0.568 away from the sample mean. And then we divide that by the total number of samples minus 1. That minus 1 is our adjuster so that we don't underestimate. So 250 minus 1.

Let's get our calculator out again. We put parentheses around everything: 142 times (1 minus 0.568) squared, plus 108 times (0 minus 0.568) squared. You could obviously do parts of this in your head, but I'm just going to write the whole thing out. Then all of that is divided by 250 minus 1, which is 249. So our sample variance is, we'll just say, 0.246. Our
sample variance, I'll write it over here, is equal to 0.246. If you were to take the square root of that, you get our sample standard deviation. Let's take the square root of that answer right over there, and we get 0.496. I'll just round that up to 0.50. So that is our sample standard deviation.

Now let's think of it this way: we are sampling from some sampling distribution of the sample mean. It looks like that over there, and it has some mean. The mean of the sampling distribution of the sample mean is actually going to be the same thing as this mean over here, which is the same thing as our population proportion. We've seen this multiple times. And this is the sampling distribution's standard deviation, so we could view that as one standard deviation right over there. The standard deviation of the sampling distribution, as we've seen multiple times, is equal to the standard deviation of our original population divided by the square root of the number of samples, so this divided by the square root of 250.

Now, we do not know the actual standard deviation in our population. But we do have a best estimate of it, and that's why we call it "confident": we're confident that the real population proportion is going to be in this interval, but we're not 100% sure, because we're going to estimate this over here, and if we're estimating this
we're really estimating that over there. So this is going to be estimated by this sample standard deviation. It's going to be approximate, and if we got a weird, completely skewed sample, it actually might not even be approximate. So maybe we should write "confident": we are confident that the standard deviation of our sampling distribution is going to be around, instead of using this, our sample standard deviation, so 0.50 divided by the square root of 250. And what's that going to be? We have this value right over here, and actually I don't have to round it, divided by the square root of 250, and we get 0.031. So that's one standard deviation.

Now they want a 99% confidence interval. The way I think about it is: how many standard deviations away from the mean do we have to be so that we can be 99% confident that any sample from the sampling distribution will be in that interval? Another way to think about it: we're going to be a certain number of standard deviations away from the mean such that any mean that we sample from this distribution has a 99% chance of being within plus or minus that many standard deviations. So it might be from there to there. That's what we want: a 99% chance that if we pick a sample from the sampling distribution of the sample mean, it will be within this many standard deviations of the actual mean. To figure that out, let's look at an actual z-table. So we want
99% confidence. Another way to think about it: if we just look at the upper half right over here, that orange area should be 0.475, because if this is 0.475 then this other part is going to be 0.475 and we will get to... sorry, we want to get to 99%, so it's not going to be 0.475. We're going to have to go to 0.495 if we want 99% confidence. So this area has to be 0.495 over here, because then the matching area over here will also be 0.495, so that their sum will be 99% of the area.

Now if this is 0.495, the value we look up on the z-table has to include the lower half too, because the table gives cumulative area. All of this area below the mean is 0.5, so it's going to be 0.5 plus 0.495, which is 0.995. Let me make sure I got that right: 0.995.

So let's look at our z-table. Where do we get 0.995? The closest entry, just to have a little error, is right over here: this is 0.9951. This value gives us the whole cumulative area up to that point. If you look at the entire distribution, and this is the mean right over here, this tells us that at 2.5 standard deviations above the mean, so 2.5 times the standard deviation of the sampling distribution, this whole area from the z-table is going to be 0.9951. That tells us just this area right over here, between the mean and that point, is going to be 0.4951, which tells us that this area plus a symmetric area that many standard deviations below the mean, if you combine them, is 0.4951 times 2, which gets us to 0.9902. So this whole area right here is 99.02%. So if we look at the area 2.5 standard deviations above and below the
mean, and let me be careful, this isn't just 2.5. We have to add another digit of precision. This row is 2.5, and the next digit of precision is given by the column, so we have to look all the way up the second-to-last column, and we get a digit of 8. So this is 2.58 standard deviations. 2.58 standard deviations above and below the mean encompasses a little over 99% of the total probability. So there's a little over a 99% chance that any sample mean that I select from the sampling distribution of the sample mean will fall within this many standard deviations of the mean.

Let me put it this way: there is a 99% chance, it's actually a 99.02% chance, right? If you multiply this by 2 you get 0.9902. So we'll say there is roughly a 99% chance that a random sample mean is within 2.58 standard deviations of the mean of the sampling distribution of the sample mean, which is the same thing as our actual population mean, which is the same thing as our population proportion, p.

And we know what this value is right here, or at least we have a decent estimate for it. We don't know exactly what this is, but our best estimate for this value is this over here. So we could rewrite this and say that we are confident, because we're really using an estimator to get this value, that there is a 99% chance that a random sample mean is within, and let's figure out this value using a calculator: it is 2.58 times our best estimate of the standard deviation of the
sampling distribution, so times 0.031, which is equal to, well, let's just round this up because it's so close, 0.08. So a random sample mean is within 0.08 of the population proportion. Or you could say that you're confident that the population proportion is within 0.08 of your sample mean. That's the exact same statement.

So if we want our confidence interval: the actual sample mean we got was 0.568. I can replace this, because we actually did take a sample, with 0.568. So we could be confident that there's a 99% chance that 0.568 is within 0.08 of the population proportion, which is the same thing as the population mean, which is the same thing as the mean of the sampling distribution of the sample mean, so forth and so on. And just to make it clear, we can actually swap these two; it wouldn't change the meaning. If this is within 0.08 of that, then that is within 0.08 of this. So let me switch this up a little bit: p is within 0.08 of 0.568. Now linguistically it sounds a little bit more like a confidence interval: we are confident that there's a 99% chance that p is within 0.08 of the sample mean of 0.568.

So what would be our confidence interval? It would be 0.568 plus or minus 0.08. If you add 0.08 to this, at the upper end you're going to have 0.648, and at the lower end of our range, if we subtract 0.08 from this, we get 0.488. So we are 99% confident that the true population proportion is between these two numbers. Or, another way, that the true percentage of teachers who think those computers are
good ideas is in that range. We're confident that there's a 99% chance that the true percentage of teachers that like the computers is between 48.8% and 64.8%.

Now that we've answered the first part of the question, the second part: how could the survey be changed to narrow the confidence interval but maintain the 99% confidence level? Well, you could just take more samples. If you take more samples, then our estimate of the standard deviation of this distribution will go down, because this denominator will be higher. If that denominator is higher, then this whole thing will go down. So if the standard deviation goes down here, then when we count the standard deviations, when we do the plus or minus on the range, this value will go down and we'll narrow our range. So you just take more samples.
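
The whole calculation from the video can be sketched in a few lines of Python. This is only an illustration: the function name `proportion_confidence_interval` is made up here, and instead of reading 2.58 off a z-table it asks the standard library's `statistics.NormalDist` for the z-value (about 2.576), so the results differ slightly from the rounded numbers in the video.

```python
import math
from statistics import NormalDist  # available in Python 3.8+


def proportion_confidence_interval(successes, n, confidence=0.99):
    """Return (lower, upper, margin) for a sample proportion.

    Hypothetical helper that follows the steps in the transcript:
    sample mean, sample variance with n - 1, standard error, z-value.
    """
    # Sample mean of the 0/1 responses, i.e. the sample proportion.
    p_hat = successes / n

    # Sample variance: squared distances from the sample mean,
    # weighted by how many 1s and 0s we saw, divided by n - 1.
    failures = n - successes
    sample_var = (successes * (1 - p_hat) ** 2
                  + failures * (0 - p_hat) ** 2) / (n - 1)
    sample_sd = math.sqrt(sample_var)

    # Estimated standard deviation of the sampling distribution.
    std_error = sample_sd / math.sqrt(n)

    # For 99% confidence we need the cumulative area 0.5 + 0.495 = 0.995.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)

    margin = z * std_error
    return p_hat - margin, p_hat + margin, margin


# The survey from the video: 142 of 250 teachers said "good".
lower, upper, margin = proportion_confidence_interval(142, 250)
print(f"99% interval: {lower:.3f} to {upper:.3f} (margin {margin:.3f})")

# Same sample proportion with four times as many teachers (assumed numbers,
# not from the video) to show the interval narrowing.
_, _, bigger_sample_margin = proportion_confidence_interval(568, 1000)
print(f"margin with n = 1000: {bigger_sample_margin:.3f}")
```

With the survey numbers this gives an interval of roughly 0.487 to 0.649, in line with the 0.488 to 0.648 worked out by hand above. The second call shows the answer to the last part of the question: with the confidence level unchanged, a larger sample shrinks the margin.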