
© 2023 Khan Academy

# Describing scatterplots (form, direction, strength, outliers)

AP.STATS: DAT‑1 (EU), DAT‑1.A (LO), DAT‑1.A.1 (EK)

When we look at a scatterplot, we should be able to describe the association we see between the variables.

A quick description of the association in a scatterplot should always include a description of the *form*, *direction*, and *strength* of the association, along with the presence of any *outliers*.

*Form:* Is the association linear or nonlinear?

*Direction:* Is the association positive or negative?

*Strength:* Does the association appear to be strong, moderately strong, or weak?

*Outliers:* Do there appear to be any data points that are unusually far away from the general pattern?
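For a linear form, direction and strength can also be estimated numerically with the correlation coefficient r. Here is a minimal Python sketch. It assumes made-up data (the `ages` and `accidents` lists are hypothetical, not read from any graph in this article), and the cutoffs of 0.8 and 0.5 are illustrative assumptions rather than official thresholds. Form and outliers still have to be judged by looking at the plot, since r only measures linear association.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe(xs, ys, strong=0.8, moderate=0.5):
    """Label direction and strength from r.

    The cutoffs `strong` and `moderate` are assumed for illustration only.
    """
    r = pearson_r(xs, ys)
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "no linear direction"
    if abs(r) >= strong:
        strength = "strong"
    elif abs(r) >= moderate:
        strength = "moderately strong"
    else:
        strength = "weak"
    return r, direction, strength


# Hypothetical data: driver age and accidents per 100 drivers.
ages = [20, 30, 40, 50, 60, 70]
accidents = [38, 31, 26, 19, 14, 8]

r, direction, strength = describe(ages, accidents)
print(f"r = {r:.3f}: a {strength}, {direction} association")
```

Because the made-up points fall almost exactly on a downward-sloping line, the sketch labels them a strong, negative association.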

It's also important to include the context of the two variables in the description of these features. Here's an example.

## Example

Let's describe this scatterplot, which shows the relationship between the age of drivers and the number of car accidents per 100 drivers in the year 2009.

Here's a possible description that mentions the form, direction, strength, and the presence of outliers—and mentions the context of the two variables:

"This scatterplot shows a strong, negative, linear association between age of drivers and number of accidents. There don't appear to be any outliers in the data."

Notice that the description mentions the *form* (linear), the *direction* (negative), the *strength* (strong), and the lack of *outliers*. It also mentions the context of the two variables in question (age of drivers and number of accidents).

## Practice

## Want to join the conversation?

- In Problem #3, illustrations A and B, you show something we see in economics quite a bit. In economics, we're always interested in identifying "effects" that take place between variables. However, sometimes one effect drops off and then a new effect takes over. I call this phenomenon a "split" effect.

For example, in the Laffer curve, we at first see the government raise more tax revenue as tax rates increase because they collect more money from citizens. Simple enough. However, after a certain tax rate is reached, we start to see a new effect take place wherein the tax revenue drops off as the tax rate is increased further. This is because at very high rates of taxation, people either lose interest in working, or they start to seek ways of hiding their income from the government. Thus, we often see two or more different effects express themselves through a full range of data.

While I have always used the term "split" effect to describe such phenomena, I have not been able to find them acknowledged or identified (by any particular term) amongst economists or mathematicians. Mathematicians seem to simply call these scenarios "non-linear" or "curvilinear" relationships, without seeming to notice that there are invariably two distinct relationships being identified by the data.

Am I mistaken? Do mathematicians acknowledge split effects? If so, what term do mathematicians use to describe this type of phenomenon? (32 votes)
- Mathematicians probably include your "split effect" in the category of nonlinear correlation. (2 votes)

- Aren't there too many outliers in problem 2? (7 votes)
- What do you mean? In general, there aren't a lot of outliers. There are only two, and both are in answer C. Was that a statement or a question? (3 votes)

- How is it possible to tell whether the correlation is strong or moderately strong? (5 votes)
- A strong correlation means the points follow the overall pattern closely. In simple words, the dots on the graph stay close to the trend, with few points far away from it. (2 votes)

- I get confused with strong and not so strong relationships. (0 votes)
- If the points stay close to the overall pattern, with few or no points far from it, that is a strong correlation. (5 votes)

- In graph C, how can you know whether the association is moderate or strong? Because the dots are packed closely together, can it be strong positive? (2 votes)
- For question 2, the dots in the plot look kind of scattered, so why is that the actual answer? (1 vote)
- I get confused with strong and not so strong relationships. (1 vote)
- Right now we identify strong or weak associations intuitively, but in the future we will evaluate strength mathematically. (1 vote)