## Testing hypotheses about a proportion

# Calculating a P-value given a z statistic

AP Stats: DAT‑3 (EU), DAT‑3.A (LO), DAT‑3.A.1 (EK), DAT‑3.A.2 (EK), VAR‑6 (EU), VAR‑6.G (LO), VAR‑6.G.4 (EK)

## Video transcript

- [Instructor] Fay read an article that said 26% of Americans can speak more than one language. She was curious if this figure was higher in her city, so she tested her null hypothesis that the proportion in her city is the same as all Americans, 26%. Her alternative hypothesis is that it's actually greater than 26%, where P represents the proportion of people in her city that can speak more than one language. She found that 40 of 120 people sampled could speak more than one language.

So what's going on is here's the population of her city, she took a sample, and her sample size is 120. And then she calculates her sample proportion, which is 40 out of 120, and this is going to be equal to one-third, which is approximately equal to 0.33. And then she calculates that the test statistic for these results is Z approximately equal to 1.83.

We do this in other videos, but just as a reminder
of how she gets this, she's really trying to say, well, how many standard deviations above the assumed proportion is our sample proportion? Remember, when we're doing these significance tests, we're assuming that the null hypothesis is true, and then we figure out, well, what's the probability of getting a result at least this extreme? And then if that probability is below a threshold, we would reject the null hypothesis, which would suggest the alternative. But that's what this Z statistic is: how many standard deviations above the assumed proportion the sample proportion is.

So for the Z statistic, and we did this in previous videos, you would find the difference between what we got for our sample, our sample proportion, and the assumed true proportion. So 0.33 minus 0.26, all of that over the standard deviation of the sampling distribution
of the sample proportions. And we've seen in previous videos that this is just going to be the square root of the assumed population proportion times one minus the assumed population proportion, all of that over N. In this particular situation, that would be the square root of 0.26 times one minus 0.26, all of that over our N, that's our sample size, 120. And if you calculate this, keeping the sample proportion as one-third rather than the rounded 0.33, this should give us approximately 1.83. So they did all of that for us.

And they say, assuming that the necessary conditions are met. They're talking about the necessary conditions to assume that the sampling distribution of the sample proportions
is roughly normal, and that's the random condition, the normal condition, and the independence condition that we have talked about in the past.

What is the approximate P value? Well, this P value would be equal to the probability, in a normal distribution, and we're assuming that the sampling distribution is normal 'cause we met the necessary conditions, of getting a Z greater than or equal to 1.83. So to help us visualize this, let's visualize what the sampling distribution would look like. We're assuming it is roughly normal. The mean of the sampling distribution right over here would be the assumed population proportion, so that would be P-naught. When we put that little zero there, that means the assumed population proportion from the null hypothesis, and that's 0.26. And this result that we got from our sample is 1.83 standard deviations above the mean of the sampling distribution. So that would be 1.83 standard deviations. And so this probability is this area under our normal curve right here.

So now let's get our Z table. Notice this Z table gives us the area to the left of a certain Z value. We want the area to the right
of a certain Z value. But a normal distribution is symmetric. So instead of saying anything greater than or equal to 1.83 standard deviations above the mean, we could say anything less than or equal to 1.83 standard deviations below the mean. So this is negative 1.83. And so we could look that up on this Z table right over here: negative 1.8, and then negative 1.83 is this right over here. So 0.0336. So there we have it. This P value is approximately 0.0336, a little over 3% and a little less than 4%.

And so what Fay would
then do is compare that to the significance level that she should have set before conducting this significance test. And so if her significance level was, say, 5%, well, in that situation, since the P value is lower than that significance level, she would be able to reject the null hypothesis. She would say, hey, the probability of getting a result at least this extreme, assuming that the null hypothesis is true, is below my threshold. It's quite low. And so I will reject it, and it would suggest the alternative. However, if her significance level was lower than this for whatever reason, if she had, say, a 1% significance level, then she would fail to reject the null hypothesis.
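
The calculation walked through above can be checked with a short Python sketch. This is not part of the original video: the function names `right_tailed_z_test` and `decide` are hypothetical, and the area to the right of Z is computed with the standard library's `math.erfc` instead of a Z table. Because the sketch uses the unrounded Z, its P value may differ slightly in the last digit from the table value of 0.0336.

```python
import math


def right_tailed_z_test(successes, n, p0):
    """One-proportion z test for H0: p = p0 against Ha: p > p0.

    Hypothetical helper for illustration; assumes the random, normal,
    and independence conditions are met.
    """
    p_hat = successes / n                      # sample proportion (40/120 here)
    std_dev = math.sqrt(p0 * (1 - p0) / n)     # std dev of the sampling distribution under H0
    z = (p_hat - p0) / std_dev                 # standard deviations above p0
    # Area to the right of z under the standard normal curve:
    # P(Z >= z) = 0.5 * erfc(z / sqrt(2))
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return p_hat, z, p_value


def decide(p_value, alpha):
    """Compare the P value to a significance level chosen in advance."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


p_hat, z, p_value = right_tailed_z_test(successes=40, n=120, p0=0.26)
print(f"sample proportion = {p_hat:.4f}")
print(f"z statistic       = {z:.2f}")
print(f"P value           = {p_value:.4f}")

# The two significance levels discussed in the transcript.
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(p_value, alpha)}")
```

Using the left-tail symmetry from the transcript would give the same number: the area to the left of negative Z equals the area to the right of positive Z.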