# Impact on median & mean: removing an outlier

In this golf game, Ana's lowest score of 80 was removed due to rule-breaking. This change increased both the mean and median of her remaining scores. However, the mean increased more than the median.
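The effect described above can be sketched with a few lines of Python. The five scores here are hypothetical; only the removed score of 80 comes from the summary, and the other four values are assumptions chosen for illustration.

```python
from statistics import mean, median

# Hypothetical scores: only the removed 80 is taken from the summary above.
scores = [80, 90, 92, 94, 96]
remaining = [s for s in scores if s != 80]

# How much each measure moves once the 80 is dropped.
mean_change = mean(remaining) - mean(scores)
median_change = median(remaining) - median(scores)

print("mean:  ", mean(scores), "->", mean(remaining))
print("median:", median(scores), "->", median(remaining))
print("mean moved more than median:", mean_change > median_change)
```

With these assumed scores, both measures go up, and the mean moves further than the median because every remaining score contributes to the mean while the median only depends on the middle of the sorted list.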

## Want to join the conversation?

• Won't removing an outlier be manipulating the data set? This video shows how the mean and median can change when the outlier is removed. So, if a scientist does some tests and gets an outlier, he/she can remove it to change the results to what he/she wants. So, I ask again, won't removing an outlier be unfairly changing the results?
• Depends. You're right that a scientist can't just arbitrarily discard a result, but if she'd been getting consistent results previously an outlier would suggest some kind of experimental error. If she can identify the source of that error then she is justified in removing the data.
In the video, it turned out that the score of 80 was the result of "cheating", so we are right to discount it.
• Why is Ana so bad at golf?
• She never learned how to play :(
• I remember much about mean, but not so much about the rest. Can someone fill me in?
• Mean: Add all the numbers together and divide the sum by the number of data points in the data set.
Example: Data set; 1, 2, 2, 9, 8. Mean: (1 + 2 + 2 + 9 + 8) / 5 = 22 / 5 = 4.4

Median: Arrange all the data points from small to large and choose the number that is physically in the middle. If there is an even number of data points, then choose the two numbers in the (physical) middle and find the mean of the two numbers.
Example: Data set; 1, 2, 2, 9, 8, 10. Small to Large; 1, 2, 2, 8, 9, 10. Find the mean of 2 & 8: (2 + 8) / 2 = 5.

Mode: The mode is the number that appears most frequently in a data set.
Example: Data set; 1, 2, 2, 9, 4, 10, 4. Mode: 2 and 4
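The three examples in this reply can be checked with Python's standard `statistics` module. This is only a quick sketch using the same data sets as the reply; the variable names are made up for illustration.

```python
from statistics import mean, median, multimode

# Mean: add everything up and divide by how many values there are.
mean_value = mean([1, 2, 2, 9, 8])

# Median: the function sorts the data, then averages the two middle values (2 and 8).
median_value = median([1, 2, 2, 9, 8, 10])

# Mode: multimode returns every value tied for most frequent.
modes = multimode([1, 2, 2, 9, 4, 10, 4])

print(mean_value, median_value, modes)  # 4.4 5.0 [2, 4]
```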
• Sal, the lower the score the better in golf.
• If removing a number that is larger than the mean will make the mean itself go down, what will then happen with the median in this case? (when removing a number larger than the median)
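• The median may also change because you've altered the data set: removing a number larger than the median either lowers the median or leaves it the same. However, if you simply alter a number so that it stays on the same side of the median, then the mean will change but the median will not.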
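A small sketch of this question, reusing the six-value data set from the mean/median/mode reply earlier in the thread (picking that data set is an assumption for illustration). It also checks one caveat: altering a number can move the median if the new value lands on the other side of the middle.

```python
from statistics import mean, median

data = [1, 2, 2, 8, 9, 10]            # median is (2 + 8) / 2

# Remove a value larger than the median.
removed = [1, 2, 2, 8, 9]             # the 10 is gone

# Alter the largest value but keep it above the median.
altered_same_side = [1, 2, 2, 8, 9, 100]

# Alter the largest value so that it drops below the median.
altered_crossing = [1, 2, 2, 8, 9, 0]

for name, values in [("original", data), ("removed 10", removed),
                     ("10 -> 100", altered_same_side), ("10 -> 0", altered_crossing)]:
    print(name, "mean:", mean(values), "median:", median(values))
```

Here removing the 10 lowers the median, changing 10 to 100 moves only the mean, and changing 10 to 0 moves both.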
• Pretty useful but how will we solve for the mean if it has a negative number?
• When calculating the mean of a dataset that includes negative numbers, you simply follow the same process as you would for positive numbers. Negative numbers are treated the same way as positive numbers when performing arithmetic operations. You sum all the values in the dataset, including negative ones, and then divide by the total number of values to find the mean. Just be mindful of the sign convention when adding and subtracting negative numbers.
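As a quick illustration of this answer, here is a sketch with a made-up data set that mixes negative and positive values. The numbers are hypothetical; the steps are the same sum-then-divide process described above.

```python
# Hypothetical data set containing negative numbers.
values = [-3, 5, -2, 4]

total = sum(values)               # -3 + 5 - 2 + 4 = 4
mean_value = total / len(values)  # 4 / 4 = 1.0

print(total, mean_value)  # 4 1.0
```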
• I like math
• How does Sal find the mean without calculating? I thought about it and still couldn't understand how the mean increases, because removing one number means decreasing the total. If he removed 80, the original mean would drop.
• Actually, Sal is correct: if you remove a number that is lower than the mean, the mean increases. Remember that you are not only removing the 80, which decreases the total, but also removing one of the data points, so the denominator drops from 5 to 4. The four remaining scores are all higher than 80, so dividing their sum by 4 gives a larger mean than before.
• Why does the mean have to go up?
• 80 is the lowest score.
All the other four scores are greater than 80, so they can be written as
80 + 𝑎, 80 + 𝑏, 80 + 𝑐, and 80 + 𝑑, for some positive values 𝑎, 𝑏, 𝑐, 𝑑.

The mean of these five scores is
(80 + (80 + 𝑎) + (80 + 𝑏) + (80 + 𝑐) + (80 + 𝑑))∕5 =
= (5 ∙ 80 + 𝑎 + 𝑏 + 𝑐 + 𝑑)∕5 = 80 + (𝑎 + 𝑏 + 𝑐 + 𝑑)∕5

If we remove the lowest score, then the new mean will be
((80 + 𝑎) + (80 + 𝑏) + (80 + 𝑐) + (80 + 𝑑))∕4 =
= (4 ∙ 80 + 𝑎 + 𝑏 + 𝑐 + 𝑑)∕4 = 80 + (𝑎 + 𝑏 + 𝑐 + 𝑑)∕4

𝑎, 𝑏, 𝑐, 𝑑 > 0 ⇒ 𝑎 + 𝑏 + 𝑐 + 𝑑 > 0
⇒ (𝑎 + 𝑏 + 𝑐 + 𝑑)∕4 > (𝑎 + 𝑏 + 𝑐 + 𝑑)∕5, and thereby the new mean must be greater than the previous mean.
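The algebra above can be spot-checked numerically. In this sketch the offsets standing in for 𝑎, 𝑏, 𝑐, 𝑑 are hypothetical positive values (the actual scores are not given in this thread), and `Fraction` is used so the comparison is exact rather than rounded.

```python
from fractions import Fraction

lowest = 80
offsets = [10, 12, 14, 16]   # hypothetical positive values for a, b, c, d

scores = [lowest] + [lowest + x for x in offsets]

# Mean of all five scores, and mean after dropping the lowest score.
old_mean = Fraction(sum(scores), 5)
new_mean = Fraction(sum(scores[1:]), 4)

# Both means match the simplified forms from the derivation.
print(old_mean == lowest + Fraction(sum(offsets), 5))
print(new_mean == lowest + Fraction(sum(offsets), 4))
print(new_mean > old_mean)
```

Any other positive offsets can be substituted; the last comparison holds whenever their sum is positive.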