Correlation and causation | Worked example
Sal Khan works through a question on correlation and causation from the Praxis Core Math test.
Video transcript
What we have here is a correlation and causation question, and it says: "Data from a certain village hospital show that the number of frostbite cases is positively correlated with the number of sledding accidents. Which of the following factors would best explain why this correlation does not necessarily imply that frostbites are the main cause of sledding accidents?"

And they give us five choices over here, and they want us to pick which of the following factors would best explain why this correlation does not necessarily imply that frostbites are the main cause of sledding accidents. So, like always, pause this video and see if you can answer it on your own before we do it together.

Okay, now let's do it together. And just to make sure we
understand what's going on: when they say that the number of frostbite cases is positively correlated with the number of sledding accidents, one way to visualize this is with some type of scatter plot. You don't have to understand this to be able to answer the question, but it might help some of you. This axis is the number of frostbite cases, and this axis would be the number of sledding accidents. Each dot could be a day, so that, let's say, this right over here is a day that had a high number of frostbite cases and a high number of sledding accidents. This would be a day where you had a low number of frostbite cases and a low number of sledding accidents. And then if you were to plot all of the days that you have data for, like this, the fact that it's positively correlated means that days that have more frostbite cases tend to have more sledding accidents, and vice versa.

Now, you have to be very careful when you have this type of correlation. It's very tempting to say, "Well, maybe one of them causes the other." Maybe frostbite somehow
causes sledding accidents, or maybe with sledding accidents, people are stuck out in the snow, and that causes frostbite. And maybe that's the case, or maybe it isn't. Maybe there is some other thing that drives both of these.

And what jumps into my head, even before looking at the choices, is: well, when things get very cold, that probably drives both frostbite cases and sledding accidents. Or another way to think about it is that both of these things are negatively correlated with temperature. Cold is low temperature, so when you have low temperature, you have high numbers of these; when you have high temperature, you have low numbers of these. That is one thing that I could imagine. So you have a positive correlation between these two, but they both might have a negative correlation with temperature. Or, another way of thinking about it, they both might be driven, or in some ways even caused (it might be more than correlation), by cold.

So let's look at the choices here. "People with frostbites tend to have more difficulty controlling sleds." Well, if that were true,
that would actually speak to the causality argument: that it actually is frostbite that is a major cause of sledding accidents. But we want to explain why this correlation does not necessarily imply that frostbites are the main cause of sledding accidents. So this is not what we would want to pick.

"The frostbite cases and sledding accidents were not randomly selected." Well, this just speaks to how the data was collected, but it doesn't speak to the main issue of not necessarily implying that frostbites are the main cause of sledding accidents.

"The hospital treats more sledding accidents than frostbites." That's just making a comparison between the two, but it still does not explain why there could be another cause of the sledding accidents. Is there another main cause of the sledding accidents
outside of frostbite? So I would rule that out.

"It is likely that both the number of frostbite cases and the number of sledding accidents are negatively correlated with temperature." Well, this is exactly what we were talking about. Saying that they're negatively correlated with temperature is the same thing as saying that they're positively correlated with cold. The colder it is, the more likely it is that you're going to have more frostbite cases or more sledding accidents. And I would go even further: that might not be just a correlation. Cold might be the underlying variable, the underlying cause that would drive both of these things. So I like this choice a lot.

Choice E: "Statistical errors were made in the data collection process." Well, that's kind of like choice B right over here. It speaks to the data itself not being good, but that doesn't help us explain why the correlation does not imply that frostbites are the main cause of sledding accidents.
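
The idea that cold could drive both numbers can be illustrated with a small simulation. This is only a sketch under made-up assumptions, not data from the question: the temperature range, the 5-degree threshold, the scaling factors, and the names `simulate_village` and `pearson` are all hypothetical. In the simulated village, each day's frostbite count and sledding-accident count depend only on that day's temperature plus independent noise, so neither one causes the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_village(days=365, seed=0):
    """Generate hypothetical daily data where cold drives both counts."""
    rng = random.Random(seed)
    temps, frostbite, sledding = [], [], []
    for _ in range(days):
        temp = rng.uniform(-20, 20)      # assumed daily temperature
        cold = max(0.0, 5 - temp)        # how far below 5 degrees it is
        # Each count depends on cold plus its own independent noise.
        # Frostbite never feeds into sledding accidents, or vice versa.
        frostbite.append(max(0, round(0.4 * cold + rng.gauss(0, 1))))
        sledding.append(max(0, round(0.6 * cold + rng.gauss(0, 1))))
        temps.append(temp)
    return temps, frostbite, sledding


temps, frostbite, sledding = simulate_village()
print("frostbite vs sledding:   ", round(pearson(frostbite, sledding), 2))
print("temperature vs frostbite:", round(pearson(temps, frostbite), 2))
print("temperature vs sledding: ", round(pearson(temps, sledding), 2))
```

Under these assumptions, the frostbite and sledding counts come out positively correlated with each other while each is negatively correlated with temperature, even though the simulation contains no causal link between them. That is the same pattern the chosen answer describes.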