Random sampling vs. random assignment (scope of inference)

Hilary wants to determine if any relationship exists between Vitamin D and blood pressure.
She is considering using one of a few different designs for her study.
Determine what type of conclusions can be drawn from each study design.

Scenario 1

Hilary obtains a random sample of residents from her town. She surveys those residents on whether or not they consume Vitamin D and how much Vitamin D they get. She also measures their blood pressures.
Suppose Hilary finds that, among the people sampled, those who consumed higher amounts of Vitamin D had significantly lower blood pressure than those who consumed less.
Problem A (Scenario 1)
Based on this study, we can safely say this result probably holds true for:
Choose 1 answer:

Problem B (Scenario 1)
Can we conclude that the difference in blood pressures is caused by the Vitamin D?
Choose 1 answer:

Scenario 2

Hilary recruits residents from her town who have physical exams scheduled in the next month with the local doctor's office. She randomly assigns the volunteers to either a Vitamin D supplement pill or a placebo pill. Participants do not know which pill they are taking. They have their blood pressures measured before the study begins and at the end of the study.
Suppose Hilary finds that the group who took the Vitamin D supplements had a significant decrease in blood pressure, while the placebo group showed no significant change in blood pressure.
Problem A (Scenario 2)
Based on this study, we can safely say this result probably holds true for:
Choose 1 answer:

Problem B (Scenario 2)
Can we conclude that the difference in blood pressures is caused by the Vitamin D?
Choose 1 answer:

Note: In the real world, we can't ethically take a random sample of people and make them participate in a study involving drugs. However, there are more advanced methods for controlling for this type of selection bias. When we rely on volunteers for testing new drugs and we see significant results, we need to be willing to assume that the volunteers are representative of the larger population. We can also repeat the study on a different group of volunteers to see if we get the same results.
Key idea: If a sample isn't randomly selected, it may not be representative of the larger population. On the AP test, be ready to apply this concept and some nuance when it comes to discussing if a sample is representative of the larger population.
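
The two kinds of randomness in the scenarios above can be sketched in code. This is a minimal illustration, not part of Hilary's actual study: the town list, the group names, and the helper functions `draw_random_sample` and `randomly_assign` are all hypothetical. Random sampling decides who gets into the study, while random assignment decides which treatment each participant receives.

```python
import random

def draw_random_sample(population, n, seed=None):
    """Random sampling: every member of the population is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def randomly_assign(participants, seed=None):
    """Random assignment: shuffle the participants, then split them into two groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"vitamin_d": shuffled[:half], "placebo": shuffled[half:]}

# Hypothetical town of 1,000 residents (an assumption for illustration only).
town = [f"resident_{i}" for i in range(1000)]

# Scenario 1 style: a random sample of residents, with no assigned treatment.
sample = draw_random_sample(town, 50, seed=1)

# Scenario 2 style: volunteers (not randomly selected) are randomly assigned.
volunteers = town[:40]
groups = randomly_assign(volunteers, seed=1)
```

In the first case the randomness supports generalizing to the town; in the second it balances the two groups against each other, which is what supports a causal conclusion for those volunteers.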

Summary

The table below summarizes what type of conclusions we can make based on the study design.
| | Random sampling | Not random sampling |
| --- | --- | --- |
| Random assignment | Can determine causal relationship in population. This design is relatively rare in the real world. | Can determine causal relationship in that sample only. This design is where most experiments would fit. |
| No random assignment | Can detect relationships in population, but cannot determine causality. This design is where many surveys and observational studies would fit. | Can detect relationships in that sample only, but cannot determine causality. This design is where many unscientific surveys and polls would fit. |
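
The summary can also be written as a small lookup. This is only a sketch of the table's logic; the function name `scope_of_inference` and its return keys are made up for illustration.

```python
def scope_of_inference(random_sampling: bool, random_assignment: bool) -> dict:
    """Map a study design to the conclusions the summary table allows."""
    return {
        # Random sampling controls how far the result generalizes.
        "generalizes_to": "population" if random_sampling else "sample only",
        # Random assignment controls whether a causal claim is justified.
        "can_infer_causation": random_assignment,
    }

# Scenario 1: random sample of residents, no assigned treatment.
scenario_1 = scope_of_inference(random_sampling=True, random_assignment=False)

# Scenario 2: volunteers randomly assigned to Vitamin D or placebo.
scenario_2 = scope_of_inference(random_sampling=False, random_assignment=True)
```

The two inputs are independent of each other: how participants were selected answers the "who does this apply to" question, and how treatments were assigned answers the "is it causal" question.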

Want to join the conversation?

  • 3Garcia, Elijah
    what is the meaning of life
    (20 votes)
  • HarleyQuinn
    Can you delve a bit deeper into generalizability please?

    ~HarleyQuinn
    (10 votes)
    • Erik L.
      Generalizability is a measure of how useful the results of a study are for a wider group of people. For example, if the results of a study are broadly applicable to several different types of people or situations, the study is said to have good generalizability. I hope this answers your question.
      (2 votes)
  • Christian Fernandes
    I understand the basic idea of why randomization is so important for drawing valid conclusions in any study design, but what I don't really get is, for scenario 1, how can we be so sure that Hilary's random sample is truly representative of all residents in the town? She could've randomly sampled 3 folks in her town, which I think may be an insufficient amount of data to draw any valid conclusion. On the other hand, she could have sampled, say, 100 or even 1,000 people. But we just don't know because the prompt doesn't say. Do we just assume that, when it says "obtains a random sample blah blah blah," said random sample has a sufficient number of observations? Or does this even matter at all? Appreciate the help! :)
    (8 votes)
    • Saivishnu Tulugu
      By the law of large numbers, the larger the sample size, the closer the sample mean tends to be to the population mean. By the Central Limit Theorem, the sampling distribution of the mean also becomes approximately normal as the sample size grows, and a sample size of 30 or more is a common rule of thumb for that. Randomization is used to prevent bias. If the sample is random and has 30 or more observations, we can usually assume it's good.
      (2 votes)
  • Em
    For Scenario 2, why does the result only hold true for the people involved in the experiment and not the whole town? I'm not sure I understand this part.
    (2 votes)
  • Zachary Keefe
    Problem A scenario 2 is absolutely ridiculous. Coincidence does not equal causality. Just because the ones taking vitamin D happened to have lower blood pressure absolutely does not unequivocally make one the cause of the other. This is simply incorrect.
    (4 votes)
    • Chuck B
      There are some unstated assumptions, for instance that the treatment and control groups are similar in terms of demographic makeup, health, health-related habits, etc. To the extent the assumptions hold true, however, the differentiating factor between the two groups was exactly the consumption of vitamin D. Does this prove causality beyond any doubt? No. But in the absence of counter-evidence or alternative hypotheses, it is convincing.
      (2 votes)
  • 12azxsl
    I got myself into a pretty deep hole by taking ap stats.
    (4 votes)
  • David Bryant
    I do not agree with your contention that mere correlation (the result of a statistical analysis of a limited number of human subjects) can ever establish "causation". Cause and effect are categories of human action. One ought never conflate mere statistical correlation, no matter how "perfect" it appears to be, with causation. It is an epistemological error.
    (1 vote)
    • Arjun
      While it might not be perfectly established as causation, multiple experiments showing the same results can, as you say, reduce the error bars to the point where they are no longer relevant. There is a small chance that the sun might not rise tomorrow, but would you change your plans for tomorrow based on that extremely small chance?
      (5 votes)
  • tashfiarahman20
    How can we differentiate between an RBD (randomized block design) and a CRD (completely randomized design) by observing an experimental design layout?
    (2 votes)
  • A A
    Regarding representation: with random sampling, each person has an equal chance of being selected, and the conditions of those people are not known in advance. It could be anyone from the population, with some conditions or none, so the sample is not selective.

    With random assignment, researchers are selective and can choose which group from the population to conduct their study on, so how can this be representative of the population? It's not, because the selected group's condition(s) don't apply to the whole population.

    Regarding causality: with random sampling, many conditions apply to each person, like taking vitamins D and C, sleeping early, etc., so no causality can be inferred since confounding factors exist.

    With random assignment, causality can be inferred. The treatment caused the effect because other, irrelevant factors are more limited for the placebo group and the treatment group, and thus causality can be inferred only for those two groups.
    (1 vote)
  • Daphne M
    Jared is interested in finding out which of two types of soda the students in his school prefer. To find out, he wants to randomly select 50 students to participate in a study. Read the following options and determine whether or not each represents a random selection.

    1. All of the students select a marble from a bag, and the 50 students with green marbles participate.
    2. Jared asks 50 of his friends to participate in the study.
    3. The names of all of the students in the school are put in a bowl and 50 names are drawn.
    4. The first 50 students who come into the cafeteria are asked to participate.
    5. All of the students vote, and the 50 students with the most votes participate.

    Which of these represent a random selection, and which do not?
    (1 vote)
    • daniella
      Represents a Random Selection:
      1. All of the students select a marble from a bag, and the 50 students with green marbles participate (assuming the marbles are drawn without looking).
      3. The names of all of the students in the school are put in a bowl and 50 names are drawn.

      Does Not Represent a Random Selection:
      2. Jared asks 50 of his friends to participate in the study.
      4. The first 50 students who come into the cafeteria are asked to participate.
      5. All of the students vote, and the 50 students with the most votes participate.
      (1 vote)