# Center, spread, and shape of distributions

A guide to center, spread, and shape of distributions on the digital SAT

## What are center, spread, and shape of distributions?

Center, spread, and shape of distributions are also known as summary statistics (or statistics for short). These measurements are used to concisely describe data sets.
• Center describes a typical value in a data set. The SAT covers three measures of center: mean, median, and occasionally mode.
• Spread describes the variation of the data. Two measures of spread are range and standard deviation.
You can learn anything. Let's do this!

## What do the measures of center represent?

### Statistics intro: mean, median, & mode

Video: Statistics intro: mean, median, & mode

### How do I find the mean, median, and mode?

On the SAT, we need to know how to find the mean, median, and mode of a data set.

#### Mean

The mean is the average value of a data set.
$\text{mean}=\frac{\text{sum of values}}{\text{number of values}}$

Example:
$2$, $5$, $6$, $7$, $10$
What is the mean of the data set above?
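
As a minimal sketch, the formula can be applied to this data set by dividing the sum of the values by the number of values. Python is used only for illustration, and the function name `mean` is our own choice, not part of the lesson.

```python
def mean(values):
    """Return the sum of the values divided by the number of values."""
    return sum(values) / len(values)

data = [2, 5, 6, 7, 10]
result = mean(data)
print(result)  # 30 / 5 = 6.0
```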

Example:

| Pets owned | Number of students |
| --- | --- |
| $0$ | $4$ |
| $1$ | $3$ |
| $2$ | $3$ |
| $3$ | $2$ |

A teacher asked $12$ students how many pets they owned. The results are shown in the table above. What is the average number of pets owned by the students?
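
When the data come as a frequency table, each value has to be counted once per student who reported it. The sketch below shows one way to do that; the variable names are illustrative assumptions.

```python
# Frequency table from the example: pets owned -> number of students.
pets_to_students = {0: 4, 1: 3, 2: 3, 3: 2}

# Each row contributes (pets owned x number of students) to the total pet count.
total_pets = sum(pets * students for pets, students in pets_to_students.items())
total_students = sum(pets_to_students.values())

average_pets = total_pets / total_students
print(total_pets, total_students, average_pets)  # 15 12 1.25
```

Dividing by the $4$ rows of the table instead of the $12$ students is a common slip; the denominator is always the number of values.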

#### Median

The median is the middle value when the data are ordered from least to greatest.
• If the number of values is odd, the median is the middle value.
• If the number of values is even, the median is the average of the two middle values.

Example:
$9$, $7$, $12$, $5$, $9$
What is the median of the data set above?

Example:
$2$, $5$, $6$, $7$, $7$, $10$
What is the median of the data set above?
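
Both cases (odd and even number of values) can be sketched in a few lines. The function name `median` is our own; the key step is sorting before picking the middle.

```python
def median(values):
    """Return the middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair

odd_result = median([9, 7, 12, 5, 9])      # sorted: 5, 7, 9, 9, 12
even_result = median([2, 5, 6, 7, 7, 10])  # middle values: 6 and 7
print(odd_result, even_result)  # 9 6.5
```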

#### Mode

The mode is the value that appears most frequently in a data set. A data set can have no mode if no value appears more than any other; a data set can also have more than one mode.

Example:
$1$, $1$, $2$, $3$, $3$, $3$, $3$, $3$, $8$
What is the mode of the data set above?
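
One way to find the mode is to count how often each value appears. The helper below is an illustrative sketch (the name `modes` is ours) that assumes a non-empty data set. It returns a list so that it can report several modes, or none when every value is tied.

```python
from collections import Counter

def modes(values):
    """Return all values tied for the highest count, or [] if no value stands out."""
    counts = Counter(values)
    highest = max(counts.values())
    winners = sorted(v for v, c in counts.items() if c == highest)
    # If there are several distinct values and all are tied, report no mode.
    if len(counts) > 1 and len(winners) == len(counts):
        return []
    return winners

example_modes = modes([1, 1, 2, 3, 3, 3, 3, 3, 8])
print(example_modes)  # [3], since 3 appears five times
```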

### Try it!

Try: find the centers of a distribution

| Item | Price (dollars) |
| --- | --- |
| VHS tape | $3$ |
| Salad bowl | $5$ |
| Salt box | $2$ |
| Hammock | $15$ |
| Concert poster | $5$ |
| Hoodie | $5$ |
| Raccoon statue | $7$ |

The table above shows the items Stevie bought from a garage sale and their prices.
What is the mean price of the items Stevie bought? ____ dollars
What is the median price of the items Stevie bought? ____ dollars
What is the mode of the prices? ____ dollars
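
To check your answers, one option is Python's standard `statistics` module. This is only a sketch for verification; on the SAT these are worked by hand or with a calculator.

```python
import statistics

# Prices in dollars, in the order listed in the table.
prices = [3, 5, 2, 15, 5, 5, 7]

mean_price = statistics.mean(prices)      # 42 / 7 = 6
median_price = statistics.median(prices)  # sorted: 2, 3, 5, 5, 5, 7, 15 -> middle value 5
mode_price = statistics.mode(prices)      # 5 appears three times
print(mean_price, median_price, mode_price)
```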

## What do the measures of spread represent?

### Measures of spread: range, variance & standard deviation

Video: Measures of spread: range, variance & standard deviation
Note: variance is not covered on the SAT, and while you may be asked about standard deviation, you will not need to calculate it on your own.

### How do I find the range and standard deviation?

On the SAT, we need to know how to find the range of a data set. While we won't be asked to calculate the standard deviation, we do need to have a sense of the relative standard deviations of two data sets.

#### Range

The range measures the total spread of the data; it is the difference between the maximum and minimum values.
$\text{range}=\text{maximum value}-\text{minimum value}$
A larger range indicates a greater spread in the data.

Example:
$1$, $9$, $4$, $3$, $8$
What is the range of the data set above?
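
The range formula translates directly into code. A minimal sketch:

```python
data = [1, 9, 4, 3, 8]

# range = maximum value - minimum value
data_range = max(data) - min(data)
print(data_range)  # 9 - 1 = 8
```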

#### Standard deviation

Standard deviation measures the typical spread from the mean; roughly speaking, it is the typical distance between a value in the data set and the mean.
Larger standard deviations indicate greater spread in the data.

Example:
Of the two dot plots shown above, which one has a greater standard deviation?

### Try it!

Try: compare two distributions
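Guitar practice time in minutes

| Day | Jazmin | Pablo |
| --- | --- | --- |
| Monday | $30$ | $30$ |
| Tuesday | $45$ | $0$ |
| Wednesday | $30$ | $45$ |
| Thursday | $45$ | $30$ |
| Friday | $45$ | $0$ |
| Saturday | $60$ | $120$ |
| Sunday | $60$ | $90$ |
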
The table above shows the amount of time Jazmin and Pablo spent practicing guitar last week.
The range of Jazmin's practice times is ____ minutes.
The range of Pablo's practice times is ____ minutes.
Both Jazmin and Pablo practiced an average of $45$ minutes a day. However, because Jazmin's practice times are ____ the $45$-minute mean than Pablo's, the standard deviation of Jazmin's practice times is ____ that of Pablo's practice times.
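
The comparison can be checked with a short sketch. The SAT does not ask you to compute a standard deviation, so the `statistics.pstdev` call below is only there to confirm the intuition. Choosing the population version is an assumption on our part; the sample version, `statistics.stdev`, gives the same ordering.

```python
import statistics

# Practice times in minutes, Monday through Sunday, from the table above.
jazmin = [30, 45, 30, 45, 45, 60, 60]
pablo = [30, 0, 45, 30, 0, 120, 90]

def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

jazmin_range = value_range(jazmin)  # 60 - 30 = 30
pablo_range = value_range(pablo)    # 120 - 0 = 120

# Both means are 315 / 7 = 45, so only the spread around the mean differs.
jazmin_sd = statistics.pstdev(jazmin)
pablo_sd = statistics.pstdev(pablo)

print(jazmin_range, pablo_range)
print(jazmin_sd < pablo_sd)  # True: Jazmin's times sit closer to the 45-minute mean
```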

## How do outliers affect summary statistics?

### Impact on median & mean: removing an outlier

Video: Impact on median & mean: removing an outlier

### The effect of outliers

An outlier is a value in a data set that significantly differs from other values. The inclusion of outliers in data sets can greatly skew the summary statistics, which is why outliers are often removed from data sets.

#### Effect on the range and standard deviation

The inclusion of outliers increases the spread of data, leading to larger range and standard deviation. Conversely, removing outliers decreases the spread of data, leading to smaller range and standard deviation.

#### Effect on the mean

An outlier can significantly skew the mean of a data set. For example, consider the data set $\left\{3,5,7,7,10,100\right\}$.
$100$ is an outlier; it is significantly larger than the other values in the data set. If we include the $100$, the mean of the data set is:
$\frac{3+5+7+7+10+100}{6}=22$
Notice that the mean, $22$, is greater than $5$ of the $6$ values in the data set! If we remove the $100$, however, the mean of the remaining values is:
$\frac{3+5+7+7+10}{5}=6.4$
The removal of an outlier is guaranteed to change the mean.
• If a very large outlier is removed, the mean of the remaining values will decrease.
• If a very small outlier is removed, the mean of the remaining values will increase.
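
The two means computed above can be reproduced with a short sketch (variable names are illustrative):

```python
data = [3, 5, 7, 7, 10, 100]
without_outlier = [v for v in data if v != 100]

mean_with = sum(data) / len(data)                                # 132 / 6 = 22.0
mean_without = sum(without_outlier) / len(without_outlier)       # 32 / 5 = 6.4

print(mean_with, mean_without)   # 22.0 6.4
print(mean_without < mean_with)  # True: removing a very large outlier lowers the mean
```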

#### Effect on the median

The median of the data set $\left\{3,5,7,7,10,100\right\}$ is $7$.
If we remove the outlier $100$, the median of the remaining values, $\left\{3,5,7,7,10\right\}$, is still $7$!
Because the median is based on the middle values of a data set, an outlier does not affect the median of a data set as strongly as it affects the mean. As such, the removal of an outlier can still change the median, but that change is not guaranteed.
• If a very large outlier is removed, the median of the remaining values will either decrease or remain the same.
• If a very small outlier is removed, the median of the remaining values will either increase or remain the same.
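
Both outcomes for the median can be seen in a small sketch. The first data set is the one from this lesson; the second is a hypothetical variation, not from the lesson, chosen so that removing the outlier does change the median.

```python
import statistics

data = [3, 5, 7, 7, 10, 100]
trimmed = [v for v in data if v != 100]

median_with = statistics.median(data)        # (7 + 7) / 2 = 7.0
median_without = statistics.median(trimmed)  # middle of 3, 5, 7, 7, 10 is 7

# Hypothetical variation: here removing 100 lowers the median.
other = [3, 5, 7, 9, 10, 100]
other_with = statistics.median(other)                              # (7 + 9) / 2 = 8.0
other_without = statistics.median([v for v in other if v != 100])  # middle of 3, 5, 7, 9, 10 is 7

print(median_with == median_without)  # True: unchanged
print(other_without < other_with)     # True: decreased
```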

### Try it!

Try: determine the effect of removing an outlier
The dot plot above shows the height in inches of $20$ elementary school students.
If the shortest student is removed from the data set and the summary statistics are re-calculated, how would they compare to the summary statistics for all $20$ students?
The mean height of the $19$ remaining students would be ____ that of all $20$ students.
The median height of the $19$ remaining students would be ____ that of all $20$ students.
The range of the heights of the $19$ remaining students would be ____ that of all $20$ students.

## How do I use the mean to calculate a missing value?

### Missing value given the mean

Video: Missing value given the mean

### How do I solve for a missing value?

If we know the mean of a data set and the number of values, we can calculate a missing value in the data set by:
1. Calculating the sum of values by multiplying the mean by the number of values.
2. Subtracting all known values from the sum of values.

Example:
$20$, $20$, $40$, $60$, $x$
If the mean of the five numbers above is $30$, what is the value of $x$?
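
The two steps can be traced for this example in a short sketch (variable names are illustrative):

```python
known_values = [20, 20, 40, 60]
mean = 30
count = 5  # four known values plus x

total = mean * count           # step 1: 30 * 5 = 150
x = total - sum(known_values)  # step 2: 150 - 140 = 10
print(x)  # 10

# Sanity check: with x filled in, the mean is 30 again.
print((sum(known_values) + x) / count)  # 30.0
```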

### Try it!

Try: find a missing value using the mean

| Game | Points scored |
| --- | --- |
| $1$ | $11$ |
| $2$ | $x$ |
| $3$ | $13$ |
| $4$ | $7$ |
| $5$ | $9$ |
| $6$ | $12$ |

The table above shows the number of points Marco scored in the last six basketball games he played. Marco doesn't remember how many points he scored in game $2$, but his coach tells him he averaged $10$ points per game.
What is the total number of points Marco scored in the six games? ____ points
How many points did Marco score in games $1$, $3$, $4$, $5$, and $6$? ____ points
How many points did Marco score in game $2$? ____ points
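
The three questions follow the same two-step method, which can be wrapped in a small helper. The function `missing_value` is a hypothetical name introduced here for illustration.

```python
def missing_value(known_values, mean, count):
    """Return the one unknown value, given the mean and the total number of values."""
    total = mean * count              # step 1: sum of all values
    return total - sum(known_values)  # step 2: subtract the known values

known_points = [11, 13, 7, 9, 12]  # games 1, 3, 4, 5, and 6

total_points = 10 * 6            # 60 points over six games
known_total = sum(known_points)  # 52 points in the five known games
game_2_points = missing_value(known_points, mean=10, count=6)  # 60 - 52 = 8

print(total_points, known_total, game_2_points)  # 60 52 8
```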

Practice: compare two distributions

| Name | Test $1$ | Test $2$ | Test $3$ | Test $4$ | Test $5$ |
| --- | --- | --- | --- | --- | --- |
| Amara | $98$ | $95$ | $94$ | $93$ | $95$ |
| Lance | $96$ | $95$ | $100$ | $88$ | $96$ |

Amara and Lance are taking the same class. The table above shows their test scores for the class. Which of the following statements about their test scores is true?

Practice: find the median given frequency data
Ned runs a soybean farm and recorded the yields for $175$ different one-acre sections. The results are shown in the graph above. Which of the following could be the median yield of Ned's soybean acres?

Practice: determine the effects of changing a data set
The minimum value of a data set consisting of $15$ positive integers is $29$. A new data set consisting of $16$ positive integers is created by including $22$ in the original data set. Which of the following measures must be $7$ greater for the new data set than for the original data set?
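
The reasoning can be tested on a concrete case. The original data set is not given, so the one below is a hypothetical stand-in that satisfies the conditions ($15$ positive integers, minimum $29$). Since every original value is at least $29$, adding $22$ lowers the minimum by $7$ and leaves the maximum alone.

```python
import statistics

# Hypothetical original data set: 15 positive integers whose minimum is 29.
original = list(range(29, 44))  # 29, 30, ..., 43
new = original + [22]           # the new data set includes 22

def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

range_change = value_range(new) - value_range(original)
mean_change = statistics.mean(new) - statistics.mean(original)
median_change = statistics.median(new) - statistics.median(original)

print(range_change)                # 7: the minimum drops from 29 to 22, the maximum is unchanged
print(mean_change, median_change)  # neither of these is 7 for this data set
```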

Practice: find a missing value using the mean
Last week, George drove an average of $52$ miles per day. If the day he drove the longest distance is removed, the average distance he drove in the remaining $6$ days becomes $40$ miles per day. What was the longest distance, in miles, George drove in a single day last week?
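
This problem uses the same idea as the missing-value method: turn each average back into a total, then subtract. A sketch, assuming "last week" means $7$ days (consistent with the $6$ remaining days):

```python
days_in_week = 7
average_all_days = 52   # miles per day over the full week
remaining_days = 6
average_remaining = 40  # miles per day once the longest day is removed

total_all_days = average_all_days * days_in_week      # 52 * 7 = 364
total_remaining = average_remaining * remaining_days  # 40 * 6 = 240
longest_day = total_all_days - total_remaining        # 364 - 240 = 124

print(longest_day)  # 124
```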

## Things to remember

$\text{mean}=\frac{\text{sum of values}}{\text{number of values}}$
The median is the middle value when the data are ordered from least to greatest.
• If the number of values is odd, the median is the middle value.
• If the number of values is even, the median is the average of the two middle values.
The mode is the most common value in a data set.
$\text{range}=\text{maximum value}-\text{minimum value}$
Standard deviation measures the typical spread from the mean.

## Want to join the conversation?

• didn't understand how 48 is the answer for the median of the yield.
• What exactly is a standard deviation??
• The standard deviation describes the amount of variability in your data set: roughly, how far each value typically lies from the mean. A low standard deviation means the values are clustered close to the mean, while a high standard deviation means the values are more spread out from the mean. I hope this answer makes sense!
• in the practice section, i cant understand that bar graph question
• The number of one-acre sections is 175, so the median is the yield of the (175+1)/2 = 88th section when the sections are ordered by yield. Since the yield ranges on the graph are already in ascending order, we can count across the bars as is. We see that 40-45 bushels occur in 25 sections, which is not enough to reach the 88th. Next, 45-50 bushels occur in the next 70 sections (sections 26 through 95), so the 88th section falls within this range. Since 48 bushels is the only option that falls into that range, that is our answer.
• (For the soybean one)
On the graph:
the x-axis represents the yield of soybeans in bushels from the soybean acres
the y-axis represents the number of acres

The median of the 175 one-acre sections is the middle one: (175 + 1)/2 = 88, so we want the 88th section

Using the number of acres from the first 2 bars, 25 and 70:
25 + 70 = 95 acres

25 acres is less than 88, but 95 acres is greater than 88. Thus, the 88th acre will be found within the 2nd bar, where the yields of soybeans are between 45 and 50 bushels. The answer should be greater than 45 and less than 50 (45 < x < 50).

From the options, the only answer that fits this criterion is 48 bushels