Calculating a P-value given a z statistic

AP.STATS: DAT‑3 (EU), DAT‑3.A (LO), DAT‑3.A.1 (EK), DAT‑3.A.2 (EK), VAR‑6 (EU), VAR‑6.G (LO), VAR‑6.G.4 (EK)

In a significance test about a population proportion, we first calculate a test statistic based on our sample results. We then calculate a p-value based on that test statistic using a normal distribution.

Want to join the conversation?

  • Parizad Babaei
    If you calculate the z-score, it will actually be 1.75, not 1.83.
    (18 votes)
  • Thye Tzy Yee
    It is not mentioned in the video, but in the practice exercise "Calculating the P-value in a z test for a proportion", how do I know whether the test is two-tailed or one-tailed?
    (6 votes)
  • Brian Bale
    The P-value equation is misleading here. Whether Ha is p > 0.26 or p < 0.26, P-value = P(z ≤ -|1.83|). That is the only way you can validly argue that the P-value for Ha: p ≠ 0.26 is 2 * P(z ≤ -|1.83|).
    (2 votes)
    • Emily H
      Note that P(z ≤ -1.83) = P(z ≥ 1.83), since a normal curve is symmetric about the mean. The distribution for z is the standard normal distribution; it has a mean of 0 and a standard deviation of 1. For Ha: p ≠ 0.26, the P-value would be P(z ≤ -1.83) + P(z ≥ 1.83) = 2 * P(z ≤ -1.83). Regardless of Ha, z = (p̂ - p0) / sqrt(p0 * (1 - p0) / n), where z gives the number of standard deviations p̂ is from p0.
      (4 votes)
  • tahseen1995
    At why do we not divide by n - 1 to get an unbiased estimate?
    (3 votes)
    • Kevin L
      Because the sampling distribution of the sample proportion, whose standard deviation we're calculating, is itself a population and not a sample. We're not trying to estimate anything there; this is a "true" standard deviation.

      Think of it this way: while a single sample is part of a population, several samples are collectively a separate thing, a population of samples.

      And the mean of the sampling distribution equals the mean of the parent distribution (the central limit theorem then describes the distribution's approximately normal shape):
      µ[p̂] = p
      µ[x̄] = µ
      (0 votes)
  • ju lee
    For this question, am I right in saying that the p-value is also known as the probability of getting a sample proportion of 1/3, given that the null hypothesis is true?
    (2 votes)
    • sziszi.szilvi
      Hm, let me think out loud. The test statistic here is z, so we run the p-value calculation on our test statistic: the probability of z being at least as large as the value from the sample. Since we got the reference z value from a sample with a sample proportion of 1/3, yes, I would say what you are saying is true, in the sense that
      P(z at least this extreme | H0 is true) = P(sample proportion is at least 1/3 | H0 is true)
      or at least I cannot imagine how else we could get a z value at least this large from a sample of the same size.

      Any mistake in my logic?
      (2 votes)
  • Bryan
    Why do we decide what kind of p-value we're using based on the alternative hypothesis?
    For example, if our Ha was p > 10, then we would have a one-tailed p-value: the probability of getting a sample proportion at least as deviant as our actual sample proportion, given that H0 is true.

    What's the logic behind this?
    (2 votes)
  • Alexandr Dmitrichenko
    Why is the population proportion located at the mean, and why do we measure the z statistic from it? I thought a proportion was an area under a normal distribution.
    (2 votes)
  • mahendra.bm
    In the z-score table, the value for -1.83 is 0.0336, but for +1.83 it is 0.96638. Could you please tell me which one to choose? Sal used 0.0336 for both +1.83 and -1.83.
    (2 votes)
    • Kevin L
      Sal used a simple shortcut.
      A z table indicates the proportion of the area of the distribution TO THE LEFT of a given z score. Given that normal distributions are by definition symmetric around their means, if we're looking for the area of just one tail in the positives, we can either subtract the proportion given by the z table from 1, or simply look at the corresponding negative z-score. To put it more formally:

      P(z ≤ -a) = P(z ≥ +a)

      Hope that helps!
      (1 vote)
  • dfbarbour
    At , Sal says, "...all of that over the standard deviation of the sampling distribution of the sample proportions." Why doesn't he just say "over the standard error"?
    (1 vote)
  • Iron Programming
    Hmmm... I calculated the z and I got;
    0.07/sqrt((0.26 * 0.74)/120) = 1.748
    Why did they round to 1.83? Did the numbers just work out better for the question (to simplify the calculations for the user)?

    Thanks! :)
    (1 vote)
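
The gap between 1.748 and 1.83 raised in the comments appears to come from rounding the sample proportion to 0.33 before subtracting. Here is a minimal Python sketch of that comparison, assuming the one-proportion z formula quoted in the replies above; the function name `z_statistic` is made up for illustration.

```python
import math

def z_statistic(p_hat, p0, n):
    """One-proportion z statistic: (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# Fay's sample: 40 of 120 people, tested against p0 = 0.26.
z_exact = z_statistic(40 / 120, 0.26, 120)   # uses the exact proportion 1/3
z_rounded = z_statistic(0.33, 0.26, 120)     # uses the rounded proportion 0.33

print(f"exact 1/3:    z = {z_exact:.3f}")
print(f"rounded 0.33: z = {z_rounded:.3f}")
```

With the exact proportion the statistic rounds to 1.83; with 0.33 it comes out near 1.75, which matches the values reported in the comments.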

Video transcript

- [Instructor] Fay read an article that said 26% of Americans can speak more than one language. She was curious if this figure was higher in her city, so she tested her null hypothesis that the proportion in her city is the same as all Americans, 26%. Her alternative hypothesis is it's actually greater than 26%, where P represents the proportion of people in her city that can speak more than one language. She found that 40 of 120 people sampled could speak more than one language. So what's going on is here's the population of her city, she took a sample, her sample size is 120. And then she calculates her sample proportion which is 40 out of 120 and this is going to be equal to one-third, which is approximately equal to 0.33. And then she calculates the test statistic for these results, which was Z is approximately equal to 1.83. We do this in other videos, but just as a reminder of how she gets this, she's really trying to say well how many standard deviations above the assumed proportion is this? Remember, when we're doing these significance tests we're assuming that the null hypothesis is true, and then we figure out well what's the probability of getting something at least this extreme, this extreme or more? And then if it's below a threshold, then we would reject the null hypothesis which would suggest the alternative. But that's what this Z statistic is, is how many standard deviations above the assumed proportion is that? So the Z statistic, and we did this in previous videos, you would find the difference between this, what we got for our sample, our sample proportion, and the assumed true proportion. So one-third, or about 0.33, minus 0.26, all of that over the standard deviation of the sampling distribution of the sample proportions. And we've seen that in previous videos. That is just going to be based on the assumed proportion, so it would be just this. It would be the square root of the assumed population proportion times one minus the assumed population proportion, all over N.
In this particular situation, that would be the square root of 0.26 times one minus 0.26, all of that over our N, that's our sample size, 120. And if you calculate this, this should give us approximately 1.83. So they did all of that for us. And they say assuming that the necessary conditions are met, they're talking about the necessary conditions to assume that the sampling distribution of the sample proportions is roughly normal and that's the random condition, the normal condition, the independence condition that we have talked about in the past. What is the approximate P value? Well this P value would be equal to the probability of, in a normal distribution, we're assuming that the sampling distribution is normal 'cause we met the necessary conditions, so in a normal distribution, what is the probability of getting a Z greater than or equal to 1.83? So to help us visualize this, let's visualize what the sampling distribution would look like. We're assuming it is roughly normal. The mean of the sampling distribution right over here would be the assumed population proportion, so that would be P naught. When we put that little zero there that means the assumed population proportion from the null hypothesis, and that's 0.26, and this result that we got from our sample is 1.83 standard deviations above the mean of the sampling distribution. So 1.83. So that would be 1.83 standard deviations. And so what we wanna do, this probability is this area under our normal curve right here. So now let's get our Z table. So notice this Z table gives us the area to the left of a certain Z value. We wanted it to the right of a certain Z value. But a normal distribution is symmetric. So instead of saying anything greater than or equal to 1.83 standard deviations above the mean, we could say anything less than or equal to 1.83 standard deviations below the mean. So this is negative 1.83.
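
The symmetry argument can be checked numerically. The sketch below is an illustration only: it builds the standard normal CDF from Python's `math.erfc` (the helper name `normal_cdf` is an assumption, not something from the video) and compares the right-tail area above 1.83 with the left-tail area below -1.83.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

z = 1.83
right_tail = 1 - normal_cdf(z)   # area to the right of +1.83
left_tail = normal_cdf(-z)       # area to the left of -1.83 (what a z table lists)

# The two areas agree up to floating-point error because the curve is symmetric.
print(math.isclose(right_tail, left_tail, rel_tol=1e-9))
```

This is why a table that only lists left-tail areas is enough: looking up the negative z value gives the right-tail area directly.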
And so we could look at that on this Z table right over here, negative 1.8, negative 1.83 is this right over here. So 0.0336. So there we have it. So this is approximately 0.0336 or a little over 3% or a little less than 4%. And so what Fay would then do is compare that to the significance level that she should have set before conducting this significance test. And so if her significance level was say 5%, well then in that situation, since this is lower than that significance level, she would be able to reject the null hypothesis. She would say hey the probability of getting this result assuming that the null hypothesis is true, is below my threshold. It's quite low. And so I will reject it and it would suggest the alternative. However, if her significance level was lower than this for whatever reason, if she has say a 1% significance level, then she would fail to reject the null hypothesis.
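
Putting the last step together, here is a rough Python sketch of the P-value lookup and the comparison against a significance level. The function names and the `alternative` argument are hypothetical choices for this example; the "less" and "two-sided" branches follow the rules described in the comments rather than anything shown in the video, which only covers the right-tail case.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def p_value(z, alternative="greater"):
    """P-value for a z statistic under the chosen alternative hypothesis."""
    if alternative == "greater":      # Ha: p > p0, area in the right tail
        return normal_cdf(-z)
    if alternative == "less":         # Ha: p < p0, area in the left tail
        return normal_cdf(z)
    if alternative == "two-sided":    # Ha: p != p0, area in both tails
        return 2 * normal_cdf(-abs(z))
    raise ValueError(f"unknown alternative: {alternative}")

# Fay's test: z is about 1.83 and Ha is p > 0.26.
p = p_value(1.83, "greater")
print(f"P-value = {p:.4f}")

# Compare against significance levels chosen before running the test.
for alpha in (0.05, 0.01):
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"alpha = {alpha}: {decision}")
```

Under these assumptions the right-tail P-value lands near the 0.0336 read from the table, so the decision flips between the 5% and 1% significance levels, as described in the transcript.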