Correlation and Causation | Lesson

What is the difference between correlation and causation?

Many studies and surveys consider data on more than one variable. For example, suppose a study finds that, over the years, the prices of burgers and fries have both increased. Does this mean that an increase in the price of burgers causes an increase in the price of fries? To answer questions like this, we need to understand the difference between correlation and causation.
Correlation means there is a relationship or pattern between the values of two variables. A scatterplot displays data about two variables as a set of points in the xy-plane and is a useful tool for determining if there is a correlation between the variables.
Causation means that one event causes another event to occur. Causation can only be determined from an appropriately designed experiment. In such experiments, similar groups receive different treatments, and the outcomes of each group are studied. We can only conclude that a treatment causes an effect if the groups have noticeably different outcomes.

What skills are tested?

  • Describing a relationship between variables
  • Identifying statements consistent with the relationship between variables
  • Identifying valid conclusions about correlation and causation for data shown in a scatterplot
  • Identifying a factor that could explain why a correlation does not imply a causal relationship

How can we determine if variables are correlated?

If there is a correlation between two variables, a pattern can be seen when the variables are plotted on a scatterplot. If this pattern can be approximated by a line, the correlation is linear. Otherwise, the correlation is non-linear.
There are three ways to describe correlations between variables.
  • Positive correlation: As x increases, y tends to increase.
  • Negative correlation: As x increases, y tends to decrease.
  • No correlation: As x increases, y tends to stay about the same or have no clear pattern.
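The three descriptions above can also be checked numerically. The following is a minimal sketch, not part of the lesson itself: it computes the Pearson correlation coefficient r, which measures only linear correlation, and labels the result. The data lists, the `describe` helper, and the 0.3 cutoff are all illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def describe(r, threshold=0.3):
    """Label a correlation; the threshold is an arbitrary illustrative cutoff."""
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "none"


# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5, 6]
y_up = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0]    # y tends to increase with x
y_down = [9.0, 8.1, 6.8, 6.2, 4.9, 4.1]  # y tends to decrease with x
y_flat = [5.0, 5.2, 4.9, 5.1, 4.8, 5.2]  # no clear pattern

for name, ys in [("y_up", y_up), ("y_down", y_down), ("y_flat", y_flat)]:
    r = pearson_r(x, ys)
    print(name, round(r, 2), describe(r))
```

A value of r near 1 or -1 indicates that the points lie close to a line; a value near 0 means there is no linear pattern, although a non-linear pattern could still be present, so a scatterplot remains the first thing to look at.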

Why doesn't correlation mean causation?

Even if there is a correlation between two variables, we cannot conclude that one variable causes a change in the other. This relationship could be coincidental, or a third factor may be causing both variables to change.
For example, Liam collected data on the sales of ice cream cones and air conditioners in his hometown. He found that when ice cream sales were low, air conditioner sales tended to be low and that when ice cream sales were high, air conditioner sales tended to be high.
  • Liam can conclude that sales of ice cream cones and air conditioners are positively correlated.
  • Liam can't conclude that selling more ice cream cones causes more air conditioners to be sold. It is likely that the increases in the sales of both ice cream cones and air conditioners are caused by a third factor, an increase in temperature!
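The third-factor explanation can be illustrated with a small simulation. This is a sketch under stated assumptions, not Liam's actual data: temperature is generated at random, and both sales figures are made to depend on temperature only, never on each other. All numbers (temperature range, slopes, noise levels, sample size) are invented for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: temperature drives both kinds of sales.
temps = [rng.uniform(10, 35) for _ in range(500)]
ice_cream = [10 * t + rng.gauss(0, 20) for t in temps]
air_cond = [2 * t + rng.gauss(0, 5) for t in temps]

# Correlation between the two sales figures as observed.
r_raw = pearson_r(ice_cream, air_cond)

# Correlation after removing the part of each variable explained by temperature.
r_controlled = pearson_r(residuals(ice_cream, temps), residuals(air_cond, temps))

print("raw correlation:", round(r_raw, 2))
print("after accounting for temperature:", round(r_controlled, 2))
```

In this simulated setup the raw correlation is strongly positive even though neither kind of sale influences the other, and it largely disappears once temperature is accounted for. That matches the reasoning above: the correlation is real, but the cause is the third factor.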

Your turn!

TRY: DESCRIBING A RELATIONSHIP
Vivek notices that students in his class with larger shoe sizes tend to have higher grade point averages. Based on this observation, what is the best description of the relationship between shoe size and grade point average?

TRY: FINDING A CONSISTENT STATEMENT
A principal collected data on all students at her high school and concluded that there is no correlation between the number of absences and grade point average. Which of the following statements are consistent with the principal's findings?

TRY: INTERPRETING A SCATTERPLOT
The scatterplot above shows the price of a hot dog and a small drink at seventeen different baseball stadiums. Based on the scatterplot, which of the following statements is true?

TRY: IDENTIFYING A CAUSAL FACTOR
Data from a certain city shows that the size of an individual's home is positively correlated with the individual's life expectancy. Which of the following factors would best explain why this correlation does not necessarily imply that the size of an individual's home is the main cause of increased life expectancy?

Things to remember

If there is a correlation between two variables, a pattern will be seen when the variables are plotted on a scatterplot.
There are three ways to describe the correlation between variables.
  • Positive correlation: As x increases, y tends to increase.
  • Negative correlation: As x increases, y tends to decrease.
  • No correlation: As x increases, y tends to stay about the same or have no clear pattern.
Causation can only be determined from an appropriately designed experiment.
  • Sometimes when two variables are correlated, the relationship is coincidental or a third factor is causing them both to change.
