© 2024 Khan Academy

# Correlation and Causation | Lesson

## What is the difference between correlation and causation?

Many studies and surveys consider data on more than one variable. For example, suppose a study finds that, over the years, the prices of burgers and fries have both increased. Does this mean that an increase in the price of burgers *causes* an increase in the price of fries? To answer questions like this, we need to understand the difference between correlation and causation.

**Correlation** means there is a relationship or pattern between the values of two variables. A scatterplot displays data about two variables as a set of points in the $xy$-plane.

**Causation** means that one event causes another event to occur. Causation can only be determined from an appropriately designed experiment. In such experiments, similar groups receive different treatments, and the outcomes of each group are studied. We can only conclude that a treatment *causes* an effect if the groups have noticeably different outcomes.
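
As a rough illustration of the experiment design described above, the following Python sketch simulates one. Everything in it is hypothetical and not part of the lesson: the made-up baseline scores, the 5-point `treatment_effect`, and the use of random assignment as one way to make the two groups similar.

```python
import random
from statistics import mean


def difference_in_means(baselines, treatment_effect, seed=0):
    """Randomly split participants into two groups and compare outcomes.

    Hypothetical model: a participant's outcome is their baseline score,
    plus `treatment_effect` if they are in the treated group.
    """
    rng = random.Random(seed)
    shuffled = list(baselines)
    rng.shuffle(shuffled)  # random assignment keeps the groups similar

    half = len(shuffled) // 2
    treated = [score + treatment_effect for score in shuffled[:half]]
    control = shuffled[half:]
    return mean(treated) - mean(control)


# Made-up baseline scores for 1,000 participants (an assumption).
rng = random.Random(1)
baselines = [rng.gauss(70, 10) for _ in range(1000)]

with_effect = difference_in_means(baselines, treatment_effect=5)
without_effect = difference_in_means(baselines, treatment_effect=0)

print(f"Treatment adds 5 points: groups differ by {with_effect:.2f}")
print(f"Treatment does nothing:  groups differ by {without_effect:.2f}")
```

When the simulated treatment has a real effect, the group averages differ noticeably; when it has none, the difference stays close to zero.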

### What skills are tested?

- Describing a relationship between variables
- Identifying statements consistent with the relationship between variables
- Identifying valid conclusions about correlation and causation for data shown in a scatterplot
- Identifying a factor that could explain why a correlation does not imply a causal relationship

## How can we determine if variables are correlated?

If there is a correlation between two variables, a pattern can be seen when the variables are plotted on a scatterplot. If this pattern can be approximated by a line, the correlation is **linear**. Otherwise, the correlation is **non-linear**.

There are three ways to describe correlations between variables.

- **Positive correlation**: As $x$ increases, $y$ tends to increase.
- **Negative correlation**: As $x$ increases, $y$ tends to decrease.
- **No correlation**: As $x$ increases, $y$ tends to stay about the same or have no clear pattern.
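
The three cases above can also be checked numerically. The sketch below uses the Pearson correlation coefficient $r$, a common measure of linear correlation that the lesson itself does not introduce: $r$ near $1$ suggests a positive correlation, $r$ near $-1$ a negative one, and $r$ near $0$ no linear correlation. The data sets are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data sets, one per case.
xs = [1, 2, 3, 4, 5, 6]
y_positive = [2, 3, 5, 6, 8, 9]  # y tends to increase with x
y_negative = [9, 8, 6, 5, 3, 2]  # y tends to decrease with x
y_none = [4, 2, 5, 5, 2, 4]      # no clear trend in y

r_positive = pearson_r(xs, y_positive)
r_negative = pearson_r(xs, y_negative)
r_none = pearson_r(xs, y_none)

print(f"positive: r = {r_positive:.2f}")
print(f"negative: r = {r_negative:.2f}")
print(f"none:     r = {r_none:.2f}")
```

Because $r$ only measures how well a line fits the pattern, a strong non-linear correlation can still produce a value of $r$ that is far from $1$ or $-1$.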

## Why doesn't correlation mean causation?

Even if there is a correlation between two variables, we cannot conclude that one variable causes a change in the other.
This relationship could be coincidental, or a third factor may be causing both variables to change.

For example, Liam collected data on the sales of ice cream cones and air conditioners in his hometown. He found that when ice cream sales were low, air conditioner sales tended to be low and that when ice cream sales were high, air conditioner sales tended to be high.

- Liam can conclude that sales of ice cream cones and air conditioners are positively correlated.
- Liam can't conclude that selling more ice cream cones causes more air conditioners to be sold. It is likely that the increases in the sales of both ice cream cones and air conditioners are caused by a third factor, an increase in temperature!
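
Liam's situation can be imitated with a small simulation. In this sketch, which rests entirely on invented numbers, daily temperature drives both kinds of sales and neither kind of sale affects the other. The sales formulas, the noise levels, and the 20 to 22 degree band are all assumptions chosen for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(42)

# Hypothetical model: temperature is the third factor behind both sales.
temps = [rng.uniform(10, 35) for _ in range(10_000)]
ice_cream = [5 * t + rng.gauss(0, 10) for t in temps]
air_con = [2 * t + rng.gauss(0, 5) for t in temps]

# Across all days, the two sales figures move together.
r_all = pearson_r(ice_cream, air_con)

# Among days with nearly the same temperature, the pattern mostly vanishes.
band = [(i, a) for t, i, a in zip(temps, ice_cream, air_con) if 20 <= t <= 22]
r_band = pearson_r([i for i, _ in band], [a for _, a in band])

print(f"all days:            r = {r_all:.2f}")
print(f"similar-temperature: r = {r_band:.2f} ({len(band)} days)")
```

The strong correlation across all days appears even though the simulated ice cream sales never influence air conditioner sales, which is the pattern a third factor produces.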

## Things to remember

If there is a correlation between two variables, a pattern will be seen when the variables are plotted on a scatterplot.

There are three ways to describe the correlation between variables.

- Positive correlation: As $x$ increases, $y$ increases.
- Negative correlation: As $x$ increases, $y$ decreases.
- No correlation: As $x$ increases, $y$ stays about the same or has no clear pattern.

Causation can only be determined from an appropriately designed experiment.

- Sometimes when two variables are correlated, the relationship is coincidental or a third factor is causing them both to change.

## Want to join the conversation?

- I don't like the use of the word "linear" in question two. If there were no correlation, then the relationship could still be linear in that the "line" would be a flat line along one of the axes showing that one factor stays consistent whether or not the other factor is changed (no correlation). Do people refer to "linear" relationship to strictly mean correlated or has our definition become more precise?(11 votes)
- Two variables can have a linear relationship and not be correlated, or have a linear relationship and be correlated (positively or negatively).

The 'linear' is important because you could have other ways of correlating data which are not linear (for example, variables which are very strongly correlated in an exponential relationship, but only slightly correlated in a linear relationship)(5 votes)

- to be honest, I knew what the answer to each one was, but i didnt know how to phrase it(8 votes)
- how can the data on a scatter-plot be considered linear if it is not linear but instead it seems to have no correlation.(3 votes)
- Is there a way to identify if a relationship is causal rather than correlated?(1 vote)
- We need explainability. If we can explain why the relationship is causal, that still only makes it a theory. In order to verify causality, we would need to design an experiment in such a way that all other variables are controlled/constant so that any change in our Y variable could only be occurring because of the changes in our X variables (as all other factors are being kept constant).

Maybe this article could further clarify:

https://towardsdatascience.com/correlation-is-not-causation-ae05d03c1f53(4 votes)

- I don't like linear since it doesn't go straight(1 vote)
- what is causation..? I'm so confused rn.(1 vote)
- I think I'm going to need more help on this(1 vote)
- how do you know when something is negatively correlated?(1 vote)
- Discuss why you think people assume a cause-and-effect relationship (use your example) when such a relationship has not been demonstrated with real data(1 vote)