# Sampling distribution of a sample proportion example

Here's the type of problem you might see on the AP Statistics exam where you have to use the sampling distribution of a sample proportion.

## Example: Proportions in polling results

According to the US Census Bureau's American Community Survey, $87\%$ of Americans over the age of 25 have earned a high school diploma. Suppose we are going to take a random sample of $200$ Americans in this age group and calculate what proportion of the sample has a high school diploma.
What is the probability that the proportion of people in the sample with a high school diploma is less than $85\%$?
Let's solve this problem by breaking it down into smaller parts.

### Part 1: Establish normality

Note: The sampling distribution of a sample proportion $\hat{p}$ is approximately normal as long as the expected number of successes and the expected number of failures are both at least $10$.
Question A (Part 1)
What is the expected number of people in the sample with a high school diploma?

Question B (Part 1)
What is the expected number of people in the sample without a high school diploma?

Question C (Part 1)
Is the sampling distribution of $\hat{p}$ approximately normal?
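
As a rough check of the condition in the note above, here is a minimal Python sketch that computes both expected counts for this example. The variable names are illustrative and not part of the original lesson.

```python
# Values taken from the problem statement
n = 200    # sample size
p = 0.87   # population proportion with a high school diploma

expected_successes = n * p          # expected people WITH a diploma
expected_failures = n * (1 - p)     # expected people WITHOUT a diploma

# Large-counts rule of thumb: both expected counts should be at least 10
approximately_normal = expected_successes >= 10 and expected_failures >= 10

print(round(expected_successes), round(expected_failures), approximately_normal)
```

Both counts are well above $10$ here, so treating the sampling distribution as approximately normal is reasonable under this rule of thumb.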

### Part 2: Find the mean and standard deviation of the sampling distribution

The sampling distribution of a sample proportion $\hat{p}$ has:
$\begin{aligned}\mu_{\hat{p}} &= p \\ \sigma_{\hat{p}} &= \sqrt{\frac{p(1-p)}{n}}\end{aligned}$
Note: For this standard deviation formula to be accurate, our sample size needs to be $10\%$ or less of the population so we can assume independence.
Question A (Part 2)
What is the mean of the sampling distribution of $\hat{p}$?
$\mu_{\hat{p}} =$

Question B (Part 2)
What is the standard deviation of the sampling distribution of $\hat{p}$?
You may round your answer to three decimal places.
$\sigma_{\hat{p}} =$
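
The two formulas above can be evaluated directly. This is a small sketch using only Python's standard library; the variable names are assumptions made for illustration.

```python
import math

# Values taken from the problem statement
n = 200    # sample size
p = 0.87   # population proportion

mean_p_hat = p                            # mu_p-hat = p
sd_p_hat = math.sqrt(p * (1 - p) / n)     # sigma_p-hat = sqrt(p(1-p)/n)

print(mean_p_hat, round(sd_p_hat, 3))
```

The standard deviation comes out to roughly $0.024$ when rounded to three decimal places.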

### Part 3: Use normal calculations to find the probability in question

What is the probability that the proportion of people in the sample with a high school diploma is less than $85\%$?
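
One way to carry out the normal calculation is with `statistics.NormalDist` from Python's standard library, as sketched below. This assumes the normal approximation from Part 1 and the mean and standard deviation from Part 2; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values taken from the problem statement
n = 200
p = 0.87
cutoff = 0.85

sd_p_hat = math.sqrt(p * (1 - p) / n)

# z-score of the cutoff within the sampling distribution
z = (cutoff - p) / sd_p_hat

# Normal model for the sampling distribution of p-hat
sampling_dist = NormalDist(mu=p, sigma=sd_p_hat)
prob_below_cutoff = sampling_dist.cdf(cutoff)

print(round(z, 2), round(prob_below_cutoff, 2))
```

Under this approximation the z-score is about $-0.84$ and the probability is about $0.20$, the same area a z-table lookup would give.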

## Want to join the conversation?

• I want to thank whoever wrote this equation.
• Why is the formula for the standard deviation sometimes sqrt(np(1-p)) and sometimes sqrt(p(1-p)/n)?
• It depends on what quantity you’re taking the standard deviation of. In a binomial distribution, the first formula you wrote is the standard deviation of the number of successes, while the second formula you wrote is the standard deviation of the sample proportion of successes.

Have a blessed, wonderful day!
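
The link between the two formulas can be written out in one line. This is a short derivation sketch, assuming $X$ denotes the number of successes in $n$ independent trials and $\hat{p} = X/n$:

```latex
\begin{aligned}
\sigma_X &= \sqrt{np(1-p)} \\
\sigma_{\hat{p}} &= \sigma_{X/n} = \frac{\sigma_X}{n}
  = \frac{\sqrt{np(1-p)}}{n}
  = \sqrt{\frac{p(1-p)}{n}}
\end{aligned}
```

Dividing a random variable by the constant $n$ divides its standard deviation by $n$, which is why the two formulas differ by exactly that factor.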
• How do you tell if the sampling distribution is describing proportions or means?
• Proportions would sound like "40% of the population knows Morse code," where it's a "yes or no" situation. They either do have a certain trait/item or don't. In the case of proportions, the p given is the mean of the sampling distribution, as seen in the problem on this page.
For means, it would be more like "The average age of people that know Morse code is 50 years old," where there's a range of possible values.
I hope this helps!
(The situations I made up don't contain real data lol)
• Sorry, but using a normal distribution to solve this problem gives incorrect results.

The sampling distribution is a binomial distribution. Using the formula for binomial distributions, one can determine that the probability that exactly 85% of the sample has a high school diploma is a whopping 0.0561. It therefore makes a huge difference whether we are looking at the probability that 85% or less of the sample have a high school diploma, or the probability that strictly less than 85% have a diploma. Using the binomial distribution formula again, the former gives a value of 0.2273 and the latter 0.1711. Since the question is asking about P(p̂ < 0.85), 0.17 is in fact the correct answer.
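
The binomial figures quoted in this comment can be reproduced with a short sketch using only Python's standard library. It assumes the number of diploma holders in the sample follows a Binomial($200$, $0.87$) distribution; `binom_pmf` is a helper defined here for illustration, not a library function.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 200, 0.87
cutoff_count = 170   # 85% of 200 people

p_exactly = binom_pmf(cutoff_count, n, p)                               # P(X = 170)
p_strictly_less = sum(binom_pmf(k, n, p) for k in range(cutoff_count))  # P(X <= 169)
p_at_most = p_strictly_less + p_exactly                                 # P(X <= 170)

print(round(p_exactly, 4), round(p_strictly_less, 4), round(p_at_most, 4))
```

The normal approximation without a continuity correction (about $0.20$) falls between the strict and non-strict binomial values, which is the gap the comment is pointing at.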
• What is the best way to find the standard deviation?
• In a set of 10,000 invoices, it is known that 500 contain errors. If 100 of the 10,000 invoices are randomly selected, what is the probability that the sample proportion of invoices with errors will exceed 0.08?
• First, calculate your population proportion.
p = 500/10,000 = 0.05

Your sample size is 100.

Next, check for normality.
np >= 10 AND n(1-p) >= 10
100*0.05 = 5 which is NOT >= 10.
100*0.95 = 95 which IS >= 10.

The sampling distribution of the sample proportion does not meet the normality condition, so the normal approximation isn't appropriate here.
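
When the normality condition fails like this, one option is to compute the probability from a binomial model instead. The sketch below assumes the 100 draws are approximately independent (the sample is 1% of the 10,000 invoices, within the 10% guideline), so the number of erroneous invoices is treated as Binomial($100$, $0.05$). The helper `binom_pmf` is defined here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 100              # invoices sampled
p = 500 / 10_000     # population proportion with errors

# A sample proportion above 0.08 means more than 8 erroneous invoices,
# i.e. at least 9 of the 100.
max_count_not_exceeding = 8
p_at_most_8 = sum(binom_pmf(k, n, p) for k in range(max_count_not_exceeding + 1))
p_exceeds = 1 - p_at_most_8

print(round(p_exceeds, 3))
```

Under these assumptions the probability is small, in the neighborhood of 6%.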
• Why do we need to establish independence to get the standard deviation of the sample proportion but not to get the mean?
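
A short derivation sketch may help with this question. It assumes each sampled person is coded as $X_i = 1$ (success) or $X_i = 0$ (failure), so that $\hat{p}$ is the average of the $X_i$:

```latex
\begin{aligned}
\hat{p} &= \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad X_i \in \{0, 1\},\quad P(X_i = 1) = p \\
E[\hat{p}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = p
  \quad \text{(linearity of expectation; no independence needed)} \\
\operatorname{Var}(\hat{p}) &= \frac{1}{n^2}\left[\sum_{i=1}^{n}\operatorname{Var}(X_i)
  + 2\sum_{i<j}\operatorname{Cov}(X_i, X_j)\right] \\
&= \frac{p(1-p)}{n} \quad \text{when every } \operatorname{Cov}(X_i, X_j) = 0
\end{aligned}
```

The mean only relies on linearity of expectation, while the variance picks up covariance terms that vanish when the observations are independent. The 10% condition is the usual rule of thumb for treating those terms as negligible.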
• I don't really know what I'm doing... How do I find the answer to something using only the mean, the standard deviation, and the total population, while knowing there are only two possible outcomes in the total population?
• I'm still confused as to how we can use normal calculations, like a z-table.

The sampling distribution (of sample proportions) is a discrete distribution, and on a graph, the tops of the rectangles represent the probability.
The z-table/normal calculations gives us information on the area underneath the normal curve, since normal dists are continuous.

So we have a sampling dist, and we want to find the probability that we get a sample proportion that is less than 0.7. We know that the dist is approximately normal, and we have its mean and SD.
The probability that sample proportion < 0.7 is the tops of all the rectangles below 0.7 summed up for the sampling distribution.
But, for the normal dist (density curve) that approximates our sampling dist, using normalcdf on a calculator or a z-table gives us the proportion of the area under the curve that is < 0.7.

So how are the tops of all the rects below 0.7 summed up equal to the area of the rectangles (area under the normal curve) that is below 0.7?
• "The sampling distribution (of sample proportions) is a discrete distribution, and on a graph, the tops of the rectangles represent the probability.
The z-table/normal calculations gives us information on the area underneath the normal curve, since normal dists are continuous."

But here we don't have bars where the top of each bar represents a probability. What we have here is a dot plot, and the height of each "bar" in a dot plot doesn't represent a probability.