Sampling distribution of a sample proportion example

Here's the type of problem you might see on the AP Statistics exam where you have to use the sampling distribution of a sample proportion.

Example: Proportions in polling results

According to the US Census Bureau's American Community Survey, 87% of Americans over the age of 25 have earned a high school diploma. Suppose we are going to take a random sample of 200 Americans in this age group and calculate what proportion of the sample has a high school diploma.
What is the probability that the proportion of people in the sample with a high school diploma is less than 85%?
Let's solve this problem by breaking it down into smaller parts.

Part 1: Establish normality

Note: The sampling distribution of a sample proportion $\hat{p}$ is approximately normal as long as the expected numbers of successes and failures are both at least 10.
Question A (Part 1)
What is the expected number of people in the sample with a high school diploma?

Question B (Part 1)
What is the expected number of people in the sample without a high school diploma?

Question C (Part 1)
Is the sampling distribution of $\hat{p}$ approximately normal?
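The large-counts check in Part 1 can be sketched in a few lines of Python. This is only an illustration: the variable names are assumptions, while n = 200 and p = 0.87 come from the problem statement.

```python
# Large-counts check for the normal approximation (names are illustrative).
n = 200   # sample size from the problem
p = 0.87  # population proportion from the problem

expected_successes = n * p        # expected number with a diploma
expected_failures = n * (1 - p)   # expected number without a diploma

# The sampling distribution is treated as approximately normal
# when both expected counts are at least 10.
approximately_normal = expected_successes >= 10 and expected_failures >= 10

print(round(expected_successes), round(expected_failures), approximately_normal)
# prints: 174 26 True
```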

Part 2: Find the mean and standard deviation of the sampling distribution

The sampling distribution of a sample proportion $\hat{p}$ has:
$\mu_{\hat{p}} = p$
$\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$
Note: For this standard deviation formula to be accurate, our sample size needs to be 10% or less of the population so we can assume independence.
Question A (Part 2)
What is the mean of the sampling distribution of $\hat{p}$?

Question B (Part 2)
What is the standard deviation of the sampling distribution of $\hat{p}$?
You may round your answer to three decimal places.
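As a rough sketch, the mean and standard deviation formulas from Part 2 can be evaluated as follows. The variable names are illustrative assumptions; the inputs are the n = 200 and p = 0.87 given in the problem.

```python
import math

n = 200   # sample size from the problem
p = 0.87  # population proportion from the problem

# Mean of the sampling distribution of the sample proportion.
mean_p_hat = p

# Standard deviation of the sampling distribution: sqrt(p(1 - p) / n).
sd_p_hat = math.sqrt(p * (1 - p) / n)

print(mean_p_hat, round(sd_p_hat, 3))
# prints: 0.87 0.024
```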

Part 3: Use normal calculations to find the probability in question

What is the probability that the proportion of people in the sample with a high school diploma is less than 85%?
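One way to carry out the normal calculation without a graphing calculator is sketched below. It assumes the normal approximation from Part 1 is acceptable and uses a hypothetical helper, `normal_cdf`, built on the standard library's error function; it is not the method prescribed by the exercise.

```python
import math

def normal_cdf(x, mean, sd):
    """Return P(X < x) for a normal distribution, via the error function."""
    z = (x - mean) / sd
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 200, 0.87                        # values from the problem
mean_p_hat = p                          # mean of the sampling distribution
sd_p_hat = math.sqrt(p * (1 - p) / n)   # standard deviation of the sampling distribution

# z-score of a sample proportion of 0.85, and the area to its left.
z = (0.85 - mean_p_hat) / sd_p_hat
prob = normal_cdf(0.85, mean_p_hat, sd_p_hat)

print(round(z, 2), round(prob, 2))
# prints: -0.84 0.2
```

Under the normal approximation, the probability comes out to roughly 0.20.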

Want to join the conversation?

  • Ollie
    Why is the formula for the standard deviation sometimes sqrt(np(1-p)) and sometimes sqrt(p(1-p)/n)?
    (3 votes)
    • Ian Pulizzotto
      It depends on what quantity you’re taking the standard deviation of. In a binomial distribution, the first formula you wrote is the standard deviation of the number of successes, while the second formula you wrote is the standard deviation of the sample proportion of successes.

      Have a blessed, wonderful day!
      (19 votes)
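The link between the two formulas in this answer can be written out as a short derivation. Here X is assumed to denote the number of successes in n independent trials and $\hat{p} = X/n$ the sample proportion; the numeric check uses the n = 200 and p = 0.87 from the problem on this page.

```latex
\[
\sigma_X = \sqrt{np(1-p)}, \qquad
\sigma_{\hat{p}} = \sigma_{X/n} = \frac{\sigma_X}{n}
  = \frac{\sqrt{np(1-p)}}{n} = \sqrt{\frac{p(1-p)}{n}}
\]
\[
n = 200,\; p = 0.87: \qquad
\sigma_X = \sqrt{22.62} \approx 4.756, \qquad
\sigma_{\hat{p}} \approx \frac{4.756}{200} \approx 0.0238
\]
```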
  • dennisj
    How do you tell whether the sampling distribution is describing proportions or means?
    (2 votes)
    • Brad Barakat
      Proportions would sound like "40% of the population knows Morse code," where it's a "yes or no" situation: each person either has a certain trait or item or doesn't. In the case of proportions, the given p is the mean of the sampling distribution, as seen in the problem on this page.
      Means would sound more like "The average age of people who know Morse code is 50 years," where there's a range of possible values.
      I hope this helps!
      (The situations I made up don't contain real data lol)
      (3 votes)
  • rdeyke
    Sorry, but using a normal distribution to solve this problem gives incorrect results.

    The sampling distribution is a binomial distribution. Using the binomial formula, one can determine that the probability that exactly 85% of the sample has a high school diploma is a whopping 0.0561. It therefore makes a big difference whether we want the probability that 85% or less of the sample has a high school diploma or the probability that strictly less than 85% does. Using the binomial formula again, the former is 0.2273 and the latter is 0.1711. Since the question asks about $P(\hat{p} < 0.85)$, 0.17 is in fact the correct answer.
    (3 votes)
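The exact binomial figures quoted in this comment can be checked with a short sketch like the one below. It assumes the count of diploma holders is binomial with n = 200 and p = 0.87; `binom_pmf` is a hypothetical helper, not something from the original page.

```python
from math import comb

n, p = 200, 0.87  # values from the problem

def binom_pmf(k):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

k85 = 170  # 85% of a sample of 200

p_exact = binom_pmf(k85)                         # P(exactly 85%)
p_less = sum(binom_pmf(k) for k in range(k85))   # P(strictly less than 85%)
p_at_most = p_less + p_exact                     # P(85% or less)

print(round(p_exact, 4), round(p_less, 4), round(p_at_most, 4))
```

The three printed values should land close to the 0.0561, 0.1711, and 0.2273 quoted in the comment. The normal approximation gives about 0.20, which falls between the two cumulative values.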
    • daniella
      The sampling distribution of a sample proportion is based on the binomial distribution. The binomial distribution provides the exact probabilities for the number of successes in a fixed number of independent Bernoulli trials (like success/failure or yes/no).

      When the sample size is large, the sampling distribution of the sample proportion can be approximated by a normal distribution due to the Central Limit Theorem. However, when dealing with exact probabilities, especially at specific proportions like "exactly 85%", it's crucial to use the binomial distribution rather than the normal approximation.
      (1 vote)
  • jplatt
    What is the best way to find the standard deviation?
    (0 votes)
  • Rorisang Lesaoana
    In a set of 10,000 invoices, it is known that 500 contain errors. If 100 of the 10,000 invoices are randomly selected, what is the probability that the sample proportion of invoices with errors will exceed 0.08?
    (0 votes)
  • Mohamed Ibrahim
    Why do we need to establish independence to get the standard deviation of the sample proportion but not to get the mean?
    (1 vote)
    • daniella
      Independence is a crucial assumption for using the standard deviation formula of the sample proportion. This assumption ensures that the number of successes in the sample behaves like a binomial random variable, which is where the formula comes from.

      The mean does not require the same independence assumption because expected values add up whether or not the individual outcomes are independent, so the expected value of the sample proportion equals the population proportion either way.
      (1 vote)
  • ariannab
    I don't really know what I'm doing... How do I find the answer to something using only the mean, the standard deviation, and the total population, while knowing there are only two possible outcomes in the total population?
    (1 vote)
    • daniella
      You'd use the properties of the normal distribution. The mean (μ) represents the center of the distribution, and the standard deviation (σ) represents the spread.

      If there are only two possible outcomes and you want to find the probability of one of them, you'd use the Normal CDF function in Excel or similar tools.
      (1 vote)
  • Bryan
    I'm still confused as to how we can use normal calculations, like a z-table.

    The sampling distribution (of sample proportions) is a discrete distribution, and on a graph, the tops of the rectangles represent the probabilities.
    The z-table and normal calculations give us information about the area underneath the normal curve, since normal distributions are continuous.

    So we have a sampling distribution, and we want to find the probability that we get a sample proportion that is less than 0.7. We know that the distribution is approximately normal, and we have its mean and SD.
    The probability that the sample proportion is < 0.7 is the sum of the tops of all the rectangles below 0.7 in the sampling distribution.
    But for the normal distribution (density curve) that approximates our sampling distribution, using normalcdf on a calculator or a z-table gives us the proportion of the area under the curve that is < 0.7.

    So how is the sum of the tops of all the rectangles below 0.7 equal to the area of the rectangles (the area under the normal curve) below 0.7?
    (1 vote)
    • Victor Gutierrez
      "The sampling distribution (of sample proportions) is a discrete distribution, and on a graph, the tops of the rectangles represent the probability.
      The z-table/normal calculations gives us information on the area underneath the normal curve, since normal dists are continuous."

      But what we have here is not bars where the top of each bar represents a probability. What we have here is a dot plot, and the height of each "bar" in a dot plot doesn't represent a probability.
      (0 votes)
  • Deepti Bist
    Hi, I do not have a calculator like the one used in this exercise. Can someone explain how I can use Excel to get the Normal CDF? I have tried, and my answer is not the same. Many thanks.
    (0 votes)
    • daniella
      To calculate the Normal CDF in Excel, you can use the NORM.DIST function.

      For the cumulative probability to the left of a specific value (e.g., less than 0.85), use:
      =NORM.DIST(0.85, mean, standard deviation, TRUE)

      For the cumulative probability to the right of a specific value, use: =1 - NORM.DIST(0.85, mean, standard deviation, TRUE)
      (1 vote)
  • isaac.uzzle
    Question 2 Part A has a typo. The question on my screen clearly said to round to 3 decimal places, but when I entered the correct answer it told me it was incorrect. So I checked the explanation, and it had the answer rounded to the 4th decimal place instead of the third. I put the 4th-decimal-place answer in and it worked. You need to fix that because it's going to confuse people who follow directions but then get the problem wrong.
    (0 votes)