Describing scatterplots (form, direction, strength, outliers)

AP.STATS: DAT‑1 (EU), DAT‑1.A (LO), DAT‑1.A.1 (EK)
When we look at a scatterplot, we should be able to describe the association we see between the variables.
A quick description of the association in a scatterplot should always include a description of the form, direction, and strength of the association, along with the presence of any outliers.
Form: Is the association linear or nonlinear?
Direction: Is the association positive or negative?
Strength: Does the association appear to be strong, moderately strong, or weak?
Outliers: Do there appear to be any data points that are unusually far away from the general pattern?
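Direction and strength are judged by eye in this article, but for a roughly linear association they can also be summarized with the correlation coefficient r. The sketch below is one possible way to do that. The data values, the function names, and the cutoffs of 0.8 and 0.5 are illustrative assumptions, not rules from this lesson, and r says nothing useful about strength when the form is nonlinear.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def direction_and_strength(r, strong=0.8, moderate=0.5):
    """Turn r into words. The cutoffs are assumed conventions, not fixed rules."""
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size >= moderate:
        strength = "moderately strong"
    else:
        strength = "weak"
    return direction, strength


# Made-up data for illustration only: y falls steadily as x rises.
xs = [1, 2, 3, 4, 5]
ys = [10, 8, 7, 4, 2]

r = pearson_r(xs, ys)
direction, strength = direction_and_strength(r)
print(f"r = {r:.2f}: a {strength}, {direction} association")
```

Different textbooks draw the line between "strong" and "moderately strong" in different places, so the labels are best treated as a rough guide alongside the plot itself.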
It's also important to include the context of the two variables in the description of these features. Here's an example.

Example

Let's describe this scatterplot, which shows the relationship between the age of drivers and the number of car accidents per 100 drivers in the year 2009.
Here's a possible description that mentions the form, direction, strength, and the presence of outliers—and mentions the context of the two variables:
"This scatterplot shows a strong, negative, linear association between age of drivers and number of accidents. There don't appear to be any outliers in the data."
Notice that the description mentions the form (linear), the direction (negative), the strength (strong), and the lack of outliers. It also mentions the context of the two variables in question (age of drivers and number of accidents).
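Whether a point is "unusually far away from the general pattern" is a visual judgment here. As a rough numeric cross-check for a linear pattern, one could fit a least-squares line and flag points with large residuals. The sketch below assumes invented data, a hypothetical helper name, and a cutoff of 2 residual standard deviations; none of these come from the driver data in the example.

```python
import math


def flag_potential_outliers(xs, ys, k=2.0):
    """Return the (x, y) points whose residual from the least-squares line
    is more than k residual standard deviations in size (k is an assumption)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed y minus the y predicted by the fitted line.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual standard deviation uses n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return [(x, y) for x, y, e in zip(xs, ys, residuals) if abs(e) > k * s]


# Made-up data: y = 2x everywhere except at x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [2, 4, 6, 8, 25, 12, 14, 16, 18]

flagged = flag_potential_outliers(xs, ys)
print(flagged)
```

A flagged point is only a candidate. The plot and the context of the variables should still decide whether it is described as an outlier.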

Practice

Problem 1
Choose the scatterplot that best fits this description:
"There is a strong, positive, linear association between the two variables."
Choose 1 answer:

Problem 2
Choose the scatterplot that best fits this description:
"There is a moderately strong, negative, linear association between the two variables with a few potential outliers."
Choose 1 answer:

Problem 3
Choose the scatterplot that best fits this description:
"There is a strong, negative, nonlinear association between the two variables."
Choose 1 answer:

Want to join the conversation?

  • Art Lightstone
    In Problem #3, illustrations A and B, you show something we see in economics quite a bit. In economics, we're always interested in identifying "effects" that take place between variables. However, sometimes one effect drops off and then a new effect takes over. I call this phenomenon a "split" effect.

    For example, in the Laffer curve, we at first see the government raise more tax revenue as tax rates increase because they collect more money from citizens. Simple enough. However, after a certain tax rate is reached, we start to see a new effect take place wherein the tax revenue drops off as the tax rate is increased further. This is because at very high rates of taxation, people either lose interest in working, or they start to seek ways of hiding their income from the government. Thus, we often see two or more different effects express themselves through a full range of data.

    While I have always used the term "split" effect to describe this phenomenon, I have not been able to find it acknowledged or identified (by any particular term) among economists or mathematicians. Mathematicians seem to simply call these scenarios "non-linear" or "curvilinear" relationships, without seeming to notice that there are invariably two distinct relationships being identified by the data.

    Am I mistaken? Do mathematicians acknowledge split effects? If so, what term do mathematicians use to describe this type of phenomenon?
    (32 votes)
  • Andrew McClellen
    Aren't there too many outliers in Problem 2?
    (7 votes)
  • Arbaaz Ibrahim
    How is it possible to tell whether the correlation is strong or moderately strong?
    (5 votes)
  • jacob collier
    no questions i understand
    (4 votes)
  • sa06383
    why hast this world lose its mind?
    (4 votes)
  • Brian Pedregon
    This is not the last of PEDTROL!
    (2 votes)
  • larkin23
    I get confused with strong and not so strong relationships.
    (0 votes)
  • Maria Groff
    In graph C, how can you tell whether the association is moderate or strong? The dots are packed closely together, so could it be strong and positive?
    (2 votes)
  • 27linkaili
    For question 2, the dots in the plot look kind of scattered, so why is it the correct answer?
    (1 vote)
  • jlopez1829
    I get confused with strong and not so strong relationships.
    (1 vote)