# Significance test for a proportion free response example

AP.STATS: DAT‑3 (EU), DAT‑3.A (LO), DAT‑3.A.1 (EK), DAT‑3.A.2 (EK), DAT‑3.B (LO), DAT‑3.B.2 (EK), DAT‑3.B.8 (EK), VAR‑6 (EU), VAR‑6.D (LO), VAR‑6.D.1 (EK), VAR‑6.D.2 (EK), VAR‑6.D.3 (EK), VAR‑6.D.4 (EK), VAR‑6.D.5 (EK), VAR‑6.E (LO), VAR‑6.E.1 (EK), VAR‑6.F (LO), VAR‑6.F.1 (EK), VAR‑6.G (LO), VAR‑6.G.1 (EK), VAR‑6.G.2 (EK), VAR‑6.G.3 (EK), VAR‑6.G.4 (EK)

## Video transcript

We're told that some boxes of a certain brand of breakfast cereal include a voucher for a free video rental inside the box. The company that makes the cereal claims that a voucher can be found in 20% of boxes. However, based on their experiences eating the cereal at home, a group of students believes that the proportion of boxes with vouchers is less than 20%. This group of students purchased 65 boxes of the cereal to investigate the company's claim. The students found a total of 11 vouchers for free video rentals in the 65 boxes. Suppose it is reasonable to assume that the 65 boxes purchased by the students are a random sample of all boxes of this cereal. Based on this sample, is there support for the students' belief that the proportion of boxes with vouchers is less than 20%? Provide statistical evidence to support your answer.

As always, pause this video and see if you can answer it by yourself. This actually is a question from an AP Statistics exam.

All right, now let's work through this together, and we're going to try to model some of what you might want to do if you were actually answering this on an exam. The first thing you might want to ask is: what are our null and alternative hypotheses? Our null hypothesis would be that the reality is what the breakfast brand claims, that 20% of the boxes contain a voucher. Our alternative hypothesis would be what we suspect: that the true proportion of boxes that contain a voucher is actually less than 20%.

Now, if you're going to do a significance test, it's good practice to set your significance level, which you're eventually going to compare your p-value to, ahead of time. So let me write this: significance level, alpha. Let's just go with 0.05. Then we'll want to think about the sample, and we're going to figure out, if we assume that the null hypothesis is true, what's the probability
that we get the sample proportion that we did. If that is below this significance level, then we would reject the null hypothesis.

So what do we know about the sample? We know that we took 65 boxes of cereal, so n is equal to 65; they tell us that right over there. From that we can calculate the sample proportion: it's going to be 11 out of 65. We can get our calculator out, since calculators are allowed on this part of the exam. What is 11 divided by 65? Rounding to the nearest thousandth, it gives us 0.169. I'll say approximately, because I rounded it there.

The next thing we want to do before we make an inference is to make sure we're meeting the conditions for inference, so I'll write this down over here: conditions for inference. This is to feel good that we have properly sampled the population and that our sampling distribution is going to be roughly normal.

The first one is random sample: that it is truly a random sample. Here they tell us it is reasonable to assume that the 65 boxes purchased by the students are a random sample, so that checks that off.

The next one is the normal condition: that the shape is roughly normal and isn't skewed dramatically one way or the other. In order to meet that condition, the sample size times the assumed proportion needs to be greater than or equal to 10, and n times 1 minus the assumed proportion needs to be greater than or equal to 10. Here we're assuming that the null hypothesis is true, so this is the proportion assumed in the null hypothesis; that's what that notation would imply. If you're doing this on the actual test, you should explain your use of notation a little bit more than I might do for the sake of time. Let's see: n is 65, so 65 times the assumed proportion, which is 0.2, is going to be
equal to 13. 13 is indeed greater than or equal to 10, so that checks off. Then we would take n, 65, times 1 minus the assumed proportion, so 0.8, and that would just be 65 minus 13, which is equal to 52. That indeed is also greater than or equal to 10, so we met that condition right over there.

The last one is independence. We aren't sampling these boxes with replacement, so we need to feel good that they are less than 10% of the population of boxes. They don't tell us that explicitly, but it would be good practice to say that we're going to assume there are more than 10 times that, 650 boxes, in the population, which would imply that n is less than or equal to 10% of the population, which would allow us to check off the independence condition.

Given that we've met our conditions for inference, let's think about the sampling distribution of the sample proportions, because that's what we're going to use to calculate a p-value. We know a few things about it. We know that the mean of the sampling distribution of the sample proportions is just going to be the assumed true proportion, the proportion from the null hypothesis. And we know that the standard deviation of the sampling distribution of the sample proportions, and we've seen this in multiple videos already, is equal to the square root of the assumed proportion times 1 minus the assumed proportion from our null hypothesis, divided by n, which in this case is the square root of 0.2 times 0.8, all of that over 65. Once again, let's get our calculator out. We take the square root of 0.2 times 0.8 divided by 65, close the parentheses, and I get approximately 0.0496.

Now the next
step is to figure out the p-value, which we can then compare to our significance level to decide whether or not to reject the null hypothesis. In order to calculate the p-value, let's figure out our z statistic, which tells us how many standard deviations above or below the mean of the sampling distribution our sample statistic is, for this sample of 65. We have seen this in previous videos: it is equal to our sample proportion minus the assumed population proportion from the null hypothesis, divided by the standard deviation of the sampling distribution of the sample proportions.

In this particular situation, this is going to be 0.169 minus 0.2, all of that over this value right over here, which is approximately 0.0496. I can get the calculator out again. We have 0.169 minus 0.2, which is how far our sample proportion is below the mean of the sampling distribution (the assumed population proportion from the null hypothesis), and then we divide that by the standard deviation of the sampling distribution of the sample proportions, 0.0496. We get a z value of approximately, because remember this is using a bunch of approximations, negative 0.625. So z is approximately negative 0.625.

Now we can think about the actual p-value. Our p-value is equal to the probability of getting a sample proportion that is at least as low as the one that we got, that is, a sample proportion less than or equal to 0.169, assuming the null hypothesis is true. That is equal to the probability of getting a z statistic that is less than or equal to this value right over here,
negative 0.625. Now we can use our calculator to actually calculate this. We go to second, distribution, and choose normal CDF. Our lower bound is going to be, we could say, negative infinity, and our upper bound is going to be negative 0.625. And this is, you could say, the normalized normal distribution here, so we'll just go with all of this as it is, because we're just thinking about the z statistic. Click enter, then click enter again, and we get approximately 0.266.

Let's make sure we understand what we just did. If this right over here is the assumed sampling distribution of the sample proportions, where we are assuming that our null hypothesis is true, then the mean of our sampling distribution is going to be our assumed proportion. What we're saying is: look, we got a result over here; this is where our p-hat happened to be. What's the probability of getting a result that far below the true proportion, or further? That is what we calculated just now.

When you look at this, it's almost a 27% probability. We're going to compare our p-value to our significance level, and we see that 0.266 is clearly greater than our significance level of 0.05. What we were saying is that if there was less than a 5% chance of getting the sample proportion that we got, then we would reject the null hypothesis, which would suggest the alternative. But here the probability of getting the sample proportion that we got, if we assume that the null hypothesis is true, is almost 27%, and so that's well above our significance level. Because of this, we fail to reject our null hypothesis, and from that we can say there is not enough evidence to suggest our alternative hypothesis,
and if you have time, you might want to say that there's not enough evidence to suggest that less than 20% of the boxes have the free video rental voucher that they talked about in the original problem description.
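
The calculator steps in this walkthrough can also be checked with a short script. What follows is a minimal sketch using only Python's standard library; the function name `one_proportion_z_test`, its parameters, and the dictionary keys are illustrative choices, not part of the original problem. It assumes the random-sample and independence conditions hold, as the video does, and only checks the normal condition numerically. Because it carries full precision instead of the rounded 0.169 and 0.0496, it should give a z statistic of about -0.620 and a p-value of about 0.268 rather than -0.625 and 0.266. The conclusion is unchanged.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Left-tailed one-sample z test for a proportion.

    H0: p = p0    Ha: p < p0
    (Hypothetical helper written for this example.)
    """
    # Sample proportion, e.g. 11 / 65.
    p_hat = successes / n

    # Normal condition: expected successes and failures under H0
    # must both be at least 10.
    normal_ok = n * p0 >= 10 and n * (1 - p0) >= 10

    # Standard deviation of the sampling distribution of the sample
    # proportion, assuming the null hypothesis is true.
    sd = sqrt(p0 * (1 - p0) / n)

    # How many standard deviations p_hat lies from p0.
    z = (p_hat - p0) / sd

    # Left-tail area under the standard normal curve
    # (the same quantity as normalcdf with upper bound z).
    p_value = NormalDist().cdf(z)

    return {
        "p_hat": p_hat,
        "normal_ok": normal_ok,
        "sd": sd,
        "z": z,
        "p_value": p_value,
        "reject_h0": p_value < alpha,
    }


# Values from the problem: 11 vouchers in 65 boxes, claimed proportion 0.2.
result = one_proportion_z_test(successes=11, n=65, p0=0.2)
for name, value in result.items():
    print(f"{name}: {value}")
```

Since the p-value is far above 0.05, `reject_h0` comes out `False`, matching the decision to fail to reject the null hypothesis.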
AP® is a registered trademark of the College Board, which has not reviewed this resource.