Correlation and causation | Worked example

Sal Khan works through a question on correlation and causation from the Praxis Core Math test.

Video transcript

- What we have here is a correlation and causation question, and it says: "Data from a certain village hospital show that the number of frostbite cases is positively correlated with the number of sledding accidents. Which of the following factors would best explain why this correlation does not necessarily imply that frostbites are the main cause of sledding accidents?" And they give us five choices over here, and they want us to pick the factor that would best explain why this correlation does not necessarily imply that frostbites are the main cause of sledding accidents. So, like always, pause this video and see if you can answer it on your own before we do it together.

Okay, now let's do it together. Just to make sure we understand what's going on: when they say that the number of frostbite cases is positively correlated with the number of sledding accidents, one way to visualize this is with some type of scatter plot. You don't have to understand this to be able to answer the question, but it might help some of you. This axis is the number of frostbite cases, and this axis is the number of sledding accidents. Each dot could be a day, so, let's say, this right over here is a day that had a high number of frostbite cases and a high number of sledding accidents. This would be a day where you had a low number of frostbite cases and a low number of sledding accidents. And if you were to plot all of the days that you have data for, like this, the fact that it's positively correlated means that days that have more frostbite cases tend to have more sledding accidents, and vice versa.

Now, you have to be very careful when you have this type of correlation. It's very tempting to say, "Well, maybe one of them causes the other." Maybe frostbite somehow causes sledding accidents, or maybe with sledding accidents, people are stuck out in the snow, and that causes frostbite.
And maybe that's the case, or maybe it isn't. Maybe there is some other thing that drives both of these. What jumps into my head, even before looking at the choices, is that when things get very cold, that probably drives both frostbite cases and sledding accidents. Another way to think about it is that both of these things are negatively correlated with temperature. Cold is low temperature, so when you have low temperature you have a high number of these cases, and when you have high temperature you have a low number of these cases. That's one thing that I could imagine. So you have a positive correlation between these two, but they both might have a negative correlation with temperature. Or, another way of thinking about it, they both might be driven by cold, or in some ways even caused by it; it might be more than correlation.

So let's look at the choices here. "People with frostbites tend to have more difficulty controlling sleds." Well, if that were true, that would actually speak to the causality argument, that it actually is frostbite that is a major cause of sledding accidents. But we want to explain why this correlation does not necessarily imply that frostbites are the main cause of sledding accidents. So this is not what we would want to pick.

"The frostbite cases and sledding accidents were not randomly selected." Well, this just speaks to how the data was collected, but it doesn't speak to the main issue of not necessarily implying that frostbites are the main cause of sledding accidents.

"The hospital treats more sledding accidents than frostbites." That's just making a comparison between the two, but it still does not explain why there could be another main cause of the sledding accidents besides frostbite. So I would rule that out.

"It is likely that both the number of frostbite cases and the number of sledding accidents are negatively correlated with temperature."
Well, this is exactly what we were talking about. Saying that both are negatively correlated with temperature is the same thing as saying that both are positively correlated with cold. The colder it is, the more likely you are to have more frostbite cases and more sledding accidents. And I would go even further: that might be more than a correlation, and cold might be the underlying variable, the underlying cause that drives both of these things. So I like this choice a lot.

Choice E: "Statistical errors were made in the data collection process." Well, that's kind of like choice B right over here. It speaks to the data itself not being good, but that doesn't help us explain why the correlation does not imply that frostbites are the main cause of sledding accidents.
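
For anyone who wants to see the "cold drives both" idea in numbers, here is a minimal sketch in Python. Everything in it is made up for illustration and is not data from the question: the temperature range, the coefficients, the noise levels, and the function names `simulate_days`, `pearson`, and `partial_correlation` are all assumptions. In this toy model, each day's frostbite level and sledding-accident level depend only on that day's temperature plus independent noise, so neither one causes the other. The values are idealized real numbers rather than rounded counts.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after controlling for zs (first-order formula)."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_days(n_days=2000, seed=0):
    """Hypothetical village: temperature drives both outcomes independently."""
    rng = random.Random(seed)
    temps, frostbite, sledding = [], [], []
    for _ in range(n_days):
        temp = rng.uniform(-20, 10)  # assumed daily temperature in degrees C
        # Colder days raise both levels; frostbite never feeds into sledding.
        frostbite.append(10 - 0.5 * temp + rng.gauss(0, 2))
        sledding.append(15 - 0.7 * temp + rng.gauss(0, 3))
        temps.append(temp)
    return temps, frostbite, sledding


if __name__ == "__main__":
    temps, frostbite, sledding = simulate_days()
    print("frostbite vs sledding:  ", round(pearson(frostbite, sledding), 2))
    print("temperature vs frostbite:", round(pearson(temps, frostbite), 2))
    print("temperature vs sledding: ", round(pearson(temps, sledding), 2))
    print("frostbite vs sledding, controlling for temperature:",
          round(partial_correlation(frostbite, sledding, temps), 2))
```

Running it should show a strong positive correlation between the two outcomes, a negative correlation between temperature and each of them, and a correlation close to zero once temperature is controlled for. That is the pattern the temperature choice describes: the two variables move together because a third variable moves both.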