# Impact on median & mean: removing an outlier

AP.STATS: UNC‑1 (EU), UNC‑1.K (LO), UNC‑1.K.2 (EK)
CCSS.Math:
Sal thinks through the effects of removing a low outlier from a data set. What will happen to the mean and median?

## Want to join the conversation?

• Won't removing an outlier be manipulating the data set? This video shows how the mean and median can change when the outlier is removed. So, if a scientist does some tests and gets an outlier, he/she can remove it to change the results to what he/she wants. So, I ask again, won't removing an outlier be unfairly changing the results?
• Depends. You're right that a scientist can't just arbitrarily discard a result, but if she'd been getting consistent results previously, an outlier would suggest some kind of experimental error. If she can identify the source of that error, then she is justified in removing the data.
In the video, it turned out that the score of 80 was the result of "cheating", so we are right to discount it.
• I remember much about the mean, but not so much about the rest. Can someone fill me in?
• Mean: Add all the numbers together and divide the sum by the number of data points in the data set.
Example: Data set; 1, 2, 2, 9, 8. Mean: (1 + 2 + 2 + 9 + 8) / 5 = 22 / 5 = 4.4

Median: Arrange all the data points from small to large and choose the number that is physically in the middle. If there is an even number of data points, then choose the two numbers in the (physical) middle and find the mean of the two numbers.
Example: Data set; 1, 2, 2, 9, 8, 10. Small to large; 1, 2, 2, 8, 9, 10. The two middle numbers are 2 and 8, so the median is their mean: (2 + 8) / 2 = 5.

Mode: The mode is the number that appears most frequently in a data set.
Example: Data set; 1, 2, 2, 9, 4, 10, 4. Mode: 2 and 4
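
The three definitions above can be checked with Python's standard `statistics` module. This is a minimal sketch that reuses the example data sets from this answer; the variable names are only illustrative, and `multimode` assumes Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Mean: sum of the values divided by how many values there are.
mean_value = mean([1, 2, 2, 9, 8])           # (1 + 2 + 2 + 9 + 8) / 5

# Median: the middle of the sorted values. With an even count,
# it is the mean of the two middle values (here 2 and 8).
median_value = median([1, 2, 2, 9, 8, 10])

# Mode: the most frequent value(s). multimode returns every tied value.
mode_values = multimode([1, 2, 2, 9, 4, 10, 4])

print(mean_value, median_value, mode_values)
```

Note that `median` sorts the data itself, so the list does not need to be arranged from small to large first.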
• If removing a number that is larger than the mean makes the mean go down, what happens to the median when you remove a number larger than the median?
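• The median can also change because you've altered the data set. Removing a number larger than the median shifts the middle position toward the smaller values, so the median either goes down or stays the same (it stays the same when the values around the middle are equal). However, if you simply alter a number other than the median, and it stays on the same side of the median, then the mean will change but the median will not.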
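
To see how the median reacts when a value above it is removed, here is a small sketch. Both data sets are made up for illustration and do not come from the video.

```python
from statistics import mean, median

# Hypothetical scores, invented for this example.
scores = [70, 80, 90, 100, 110]
without_high = [70, 80, 90, 100]      # the 110 (above the median) removed

# Both the mean and the median drop from 90 to 85.
print(mean(scores), mean(without_high))
print(median(scores), median(without_high))

# A second hypothetical set with repeated values around the middle.
tied = [1, 2, 2, 2, 9]
tied_without_high = [1, 2, 2, 2]      # the 9 removed

# The mean drops, but the median is 2 before and after.
print(mean(tied), mean(tied_without_high))
print(median(tied), median(tied_without_high))
```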
• How does Sal find the mean without calculating? I thought about it and still can't understand how the mean increases, because removing one number decreases the total. If he removed 80, wouldn't the original mean drop?
• Actually, Sal is correct: if you remove a number that is lower than the mean, the mean increases. Remember that you are not only removing the 80, which decreases the total; you are also removing one of the data points, so the denominator drops from 5 to 4. Because 80 is below the mean, the total falls by less than one mean's worth, so dividing the remaining sum by 4 gives a larger mean.
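
A quick numeric sketch of this argument follows. Only the 80 comes from the discussion; the other four scores are hypothetical values chosen for illustration.

```python
# Hypothetical scores: the 80 is from the discussion, the rest are made up.
scores = [80, 92, 94, 96, 98]

old_mean = sum(scores) / len(scores)         # 460 / 5

remaining = scores[1:]                       # drop the 80
new_mean = sum(remaining) / len(remaining)   # 380 / 4

# The total shrinks by 80, but the count shrinks too, so the mean rises.
print(old_mean, new_mean)
```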
• How can you remember all of this?
• Why does the mean have to go up?
• 80 is the lowest score.
All the other four scores are greater than 80, so they can be written as
80 + 𝑎, 80 + 𝑏, 80 + 𝑐, and 80 + 𝑑, for some positive values 𝑎, 𝑏, 𝑐, 𝑑.

The mean of these five scores is
(80 + (80 + 𝑎) + (80 + 𝑏) + (80 + 𝑐) + (80 + 𝑑))∕5 =
= (5 ∙ 80 + 𝑎 + 𝑏 + 𝑐 + 𝑑)∕5 = 80 + (𝑎 + 𝑏 + 𝑐 + 𝑑)∕5

If we remove the lowest score, then the new mean will be
((80 + 𝑎) + (80 + 𝑏) + (80 + 𝑐) + (80 + 𝑑))∕4 =
= (4 ∙ 80 + 𝑎 + 𝑏 + 𝑐 + 𝑑)∕4 = 80 + (𝑎 + 𝑏 + 𝑐 + 𝑑)∕4

𝑎, 𝑏, 𝑐, 𝑑 > 0 ⇒ 𝑎 + 𝑏 + 𝑐 + 𝑑 > 0 ⇒
⇒ (𝑎 + 𝑏 + 𝑐 + 𝑑)∕4 > (𝑎 + 𝑏 + 𝑐 + 𝑑)∕5, and thereby the new mean must be greater than the previous mean.
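
The two closed forms in this derivation can be spot-checked by brute force. The sketch below is only a check over a small range of positive whole-number offsets, not a proof; `means_from_offsets` is a hypothetical helper name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product


def means_from_offsets(a, b, c, d):
    """Return (mean of all five scores, mean after dropping the 80)."""
    scores = [80, 80 + a, 80 + b, 80 + c, 80 + d]
    old_mean = Fraction(sum(scores), 5)
    new_mean = Fraction(sum(scores[1:]), 4)
    return old_mean, new_mean


always_greater = True
for a, b, c, d in product(range(1, 6), repeat=4):
    old_mean, new_mean = means_from_offsets(a, b, c, d)
    total = a + b + c + d
    # Closed forms from the derivation: 80 + total/5 and 80 + total/4.
    assert old_mean == 80 + Fraction(total, 5)
    assert new_mean == 80 + Fraction(total, 4)
    always_greater = always_greater and new_mean > old_mean

print(always_greater)
```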
• Pretty useful, but how do we find the mean if the data set has a negative number?
• Why does the mean increase? There were still 5 games. Shouldn't the lowest score become 0, and shouldn't we still divide by 5?
• Since Ana "cheated" in that last game, the score didn't count, and you calculate the total as if she sat out that round.
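
The difference between the two approaches can be sketched as follows. Only the 80 comes from the discussion; the other scores are hypothetical.

```python
# Hypothetical scores: the 80 is from the discussion, the rest are made up.
scores = [80, 92, 94, 96, 98]
original_mean = sum(scores) / len(scores)

# Approach from the question: count the round as a 0 and still divide by 5.
counted_as_zero = [0 if s == 80 else s for s in scores]
mean_with_zero = sum(counted_as_zero) / len(counted_as_zero)

# Approach from the answer: treat the round as never played, divide by 4.
removed = [s for s in scores if s != 80]
mean_removed = sum(removed) / len(removed)

print(original_mean, mean_with_zero, mean_removed)
```

Counting the round as a 0 pulls the mean down, while dropping the round raises it, so the choice depends on whether the game is considered to have happened at all.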