Reference: Conditions for inference on a proportion

AP.STATS: UNC‑4 (EU), UNC‑4.B (LO), UNC‑4.B.1 (EK), UNC‑4.B.2 (EK), VAR‑6 (EU), VAR‑6.F (LO), VAR‑6.F.1 (EK)
When we want to carry out inference on one proportion (building a confidence interval or doing a significance test), the accuracy of our methods depends on a few conditions. Before doing the actual computations of the interval or test, it's important to check whether these conditions have been met; otherwise the calculations and conclusions that follow aren't actually valid.
The conditions we need for inference on one proportion are:
  • Random: The data needs to come from a random sample or randomized experiment.
  • Normal: The sampling distribution of p̂ needs to be approximately normal: we need at least 10 expected successes and 10 expected failures.
  • Independent: Individual observations need to be independent. If sampling without replacement, our sample size shouldn't be more than 10% of the population.
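These three checks are mechanical enough to sketch in code. Here is a minimal illustration (the function name and arguments are ours, not from the article); note that the random condition comes from the study design and can't be computed from the data, so the caller has to assert it:

```python
def check_conditions(successes, failures, population_size=None,
                     is_random=True):
    """Return a dict mapping each condition name to True/False."""
    n = successes + failures
    return {
        # Random: asserted from the study design, not computed.
        "random": is_random,
        # Normal: at least 10 successes and 10 failures.
        "normal": successes >= 10 and failures >= 10,
        # Independent: the 10% condition, which matters when sampling
        # without replacement from a finite population.
        "independent": (population_size is None
                        or n <= 0.10 * population_size),
    }

# A sample of 150 from a population of 1500 sits right at the 10% limit.
print(check_conditions(successes=60, failures=90, population_size=1500))
```

If any entry comes back False, the interval or test that follows shouldn't be trusted.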
Let's look at each of these conditions a little more in-depth.

The random condition

Random samples give us unbiased data from a population. When samples aren't randomly selected, the data usually has some form of bias, so using data that wasn't randomly selected to make inferences about its population can be risky.
More specifically, sample proportions are unbiased estimators of their population proportion. For example, if we have a bag of candy where 50% of the candies are orange and we take random samples from the bag, some will have more than 50% orange and some will have less. But on average, the proportion of orange candies in each sample will equal 50%. We write this property as μ_p̂ = p, which holds true as long as our sample is random.
This won't necessarily happen if our sample isn't randomly selected though. Biased samples lead to inaccurate results, so they shouldn't be used to create confidence intervals or carry out significance tests.
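The "averages out to p" claim is easy to verify with a quick simulation. This sketch (ours, not from the article) draws many random samples from a population with p = 0.5 and averages the sample proportions:

```python
import random

random.seed(1)  # reproducible illustration

p, n, trials = 0.5, 25, 10_000
p_hats = []
for _ in range(trials):
    # Each candy drawn is orange with probability p.
    sample_successes = sum(random.random() < p for _ in range(n))
    p_hats.append(sample_successes / n)

# Individual sample proportions vary, but their average is close to p.
mean_p_hat = sum(p_hats) / trials
print(round(mean_p_hat, 3))
```

Any one p̂ can miss p by quite a bit; it's the long-run average of random samples that lands on p.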

The normal condition

The sampling distribution of p̂ is approximately normal as long as the expected numbers of successes and failures are both at least 10. This happens when our sample size n is reasonably large. The proof of this is beyond the scope of AP statistics, but our tutorial on sampling distributions can provide some intuition and verification that this condition indeed works.
So we need:
  expected successes: np ≥ 10
  expected failures: n(1 - p) ≥ 10
If we are building a confidence interval, we don't have a value of p to plug in, so we instead count the observed number of successes and failures in the sample data to make sure they are both at least 10. If we are doing a significance test, we use our sample size n and the hypothesized value of p to calculate our expected numbers of successes and failures.
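The two versions of the check described above can be sketched as follows (function names are ours, chosen for illustration):

```python
def normal_ok_interval(successes, failures):
    # Confidence interval: no p to plug in, so we check the
    # observed counts of successes and failures directly.
    return successes >= 10 and failures >= 10

def normal_ok_test(n, p0):
    # Significance test: use the hypothesized value p0 to compute
    # expected successes n*p0 and expected failures n*(1 - p0).
    return n * p0 >= 10 and n * (1 - p0) >= 10

print(normal_ok_interval(12, 38))   # True
print(normal_ok_test(100, 0.05))    # False: only 5 expected successes
```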

The independence condition

To use the formula for the standard deviation of p̂, we need individual observations to be independent. When we are sampling without replacement, individual observations aren't technically independent, since removing each item changes the population.
But the 10% condition says that if we sample 10% or less of the population, we can treat individual observations as independent, since removing each observation doesn't significantly change the population as we sample. For instance, if our sample size is n = 150, there should be at least N = 1500 members in the population.
This allows us to use the formula for the standard deviation of p̂:
  σ_p̂ = √(p(1 - p)/n)
In a significance test, we use the sample size n and the hypothesized value of p.
If we are building a confidence interval for p, we don't actually know what p is, so we substitute p̂ as an estimate for p. When we do this, we call it the standard error of p̂ to distinguish it from the standard deviation.
So our formula for the standard error of p̂ is
  σ_p̂ ≈ √(p̂(1 - p̂)/n)
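As a quick numeric check, here's a short sketch (function names are ours) computing both quantities, the standard deviation with a known or hypothesized p and the standard error with p̂ substituted in:

```python
from math import sqrt

def sd_p_hat(p, n):
    # Standard deviation of p-hat: sqrt(p(1 - p)/n),
    # used in a significance test with the hypothesized p.
    return sqrt(p * (1 - p) / n)

def se_p_hat(p_hat, n):
    # Standard error: the same formula with p-hat in place of p,
    # used when building a confidence interval.
    return sqrt(p_hat * (1 - p_hat) / n)

print(round(sd_p_hat(0.5, 100), 3))   # 0.05
print(round(se_p_hat(0.62, 100), 4))
```

The two functions are the same formula; the difference is only in what gets plugged in for the proportion.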

Want to join the conversation?

  • Warren Sunada-Wong:
    Why don't we use the sample standard deviation for the standard error?

    At the end, it says the formula for standard error ≈ sqrt(p-hat*(1-p-hat)/n). But since p-hat comes from a sample, why don't we use the sample standard deviation with the n-1 correction to estimate the true standard deviation of the sampling distribution? Shouldn't it be sqrt(p-hat*(1-p-hat)/(n-1))?
    (14 votes)
    • Schrödinger's Cat:
      The appearance of n in the expression for the standard deviation for p-hat is not due to sampling, but due to the number of trials n for the Binomial random variable X~B(n,p), where n is the number of trials and p is the probability of a success in any given trial.

      Unfortunately, in this context, the letter p is used for both the probability and the proportion.

      So, the random variable p-hat is actually a scaling, by 1/n, of the Binomial random variable X~B(n,p). That is, p-hat = B(n,p)/n. That's how we get the proportion of successes - divide the number of successes, X, by the number of trials, n.

      So, by the properties of scaling a random variable by the factor 1/n, the expected value E(p-hat)=(1/n)E(X) and the variance V(p-hat)=(1/n^2)V(X).

      Thus, the standard deviation for p-hat is given by the square root of (1/n^2)V(X)

      Recall, the mean and variance for the binomial random variable are np and np(1-p), respectively. Hence the variance for p-hat is...
      V(p-hat) = np(1-p)/n^2,
      so that, the standard deviation for p-hat is...
      sqrt(np(1-p)/n^2) = sqrt(p(1-p)/n) as shown in the video.

      Hope this helped,
      with kind regards...
      (9 votes)
  • marcello834:
    I remember another condition where something (sample size maybe?) had to be at equal or greater than 30. What was that?
    (4 votes)
  • Soerenna Farhoudi:
    Can someone show me this proof for the normal condition or reference a link?
    All I can find is information about the 10% rule
    (4 votes)
  • Qingyun:
    What is the difference between the standard error of the mean (sigma^2/n) and the standard error of the sample proportion mentioned above? Thanks!
    (3 votes)
  • Mohamed Ibrahim:
    Wouldn't "not being independent" also affect the sampling distribution of the sample mean?
    (1 vote)
    • Bryan:
      Yes it would! I'm fairly sure the CLT assumes that the instances in the samples that you're taking the mean of are independent.

      Also, the formula for the SD of the sampling distribution of the sample mean would not work if our instances aren't independent.
      (3 votes)
  • Priscilla Baltezar:
    If our data does not meet the normality, randomness, and independence conditions for statistical inference, what is the consequence? Can you still technically make inferences if you do not meet one or more of these conditions?
    (1 vote)
  • Robert905:
    If we are building a confidence interval for p, why do we get the standard error for the answer, and how do we find the standard error without any value for p?
    (1 vote)
  • Renee E.:
    If a problem asking to check the 3 conditions has the random condition not met, but the Normal and independent conditions met, does that mean we can't check the confidence interval at all?
    (1 vote)
  • Andrea Menozzi:
    It talks about significance tests; these are yet to be explained in this course, right?
    (1 vote)
  • johnson.jaylin:
    When we don't know what p is, is it always going to be a standard error of p?
    (1 vote)