Impact on median & mean: removing an outlier

In this golf game, Ana's lowest score of 80 was removed due to rule-breaking. This change increased both the mean and median of her remaining scores. However, the mean increased more than the median.

Video transcript

- "Ana played five rounds of golf and her lowest score was an 80. The scores of the first four rounds and the lowest round are shown in the following dot plot." And we see it right over here. The lowest round she scores an 80, she also scores a 90 once, a 92 once, a 94 once, and a 96 once. "It was discovered that Ana broke some rules when she scored 80, so that score," so I guess cheating didn't help her, "so that score will be removed from the data set." So they removed that 80 right over there. We're just left with the scores from the other four rounds. "How will the removal of the lowest round affect the mean and the median?" So let's actually think about the median first. So the median is the middle number. So over here when you had five data points the middle data point is gonna be the one that has two to the left and two to the right. So the median up here is going to be 92. The median up there is 92. And what's the median once you remove this? Now you only have four data points. When you're trying to find the median of an even number of numbers you look at the middle two numbers. So that's a 92 and a 94. And then you take the average of them. You go halfway between them to figure out the median. So the median here is going to be, let me do that a little bit clearer. The median over here is going to be halfway between 92 and 94 which is 93. So the median, the median is 93. Median is 93. So removing the lowest data point in this case increased the median. So the median, let me write it down here. So the median increased by a little bit. The median increases. Now what's going to happen to the mean? What's going to happen to the mean? Well one way to think about it without having to do any calculations is if you remove a number that is lower than the mean, lower than the existing mean, and I haven't calculated what the existing mean is, but if you remove that the mean is going to go up. The mean is going to go up.
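The median comparison described above can be checked with a short Python sketch. The variable names are illustrative and not from the video; `median` comes from Python's standard `statistics` module.

```python
from statistics import median

# Scores from the dot plot, including the round that was later removed.
all_rounds = [80, 90, 92, 94, 96]
# Scores left once the 80 is taken out of the data set.
remaining_rounds = [90, 92, 94, 96]

# Five points: the median is the single middle value.
median_before = median(all_rounds)
# Four points: the median is halfway between the two middle values.
median_after = median(remaining_rounds)

print(median_before, median_after)
```

With an odd number of points the function returns the middle value itself; with an even number it averages the two middle values, which is the same "halfway between 92 and 94" step done by hand.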
So hopefully that gives you some intuition. If you removed a number that's larger than the mean your mean is, your mean is going to go down cause you don't have that large number anymore. If you remove a number that's lower than the mean, well you take that out, you don't have that small number bringing the average down and so the mean will go up. But let's verify it mathematically. So let's calculate the mean over here. So we're gonna add 80, plus 90, plus 92, plus 94, plus 96. Those are our data points. And that gets us: two plus four is six, plus six is 12. And then we have one plus eight is nine, and this is, so these are nine and then you have another nine, another nine, another nine, another nine. You essentially have, this is five nines right over here. So this is going to be 452. So that's the sum of the scores of these five rounds, and then you divide it by the number of rounds you have. So it would be 452 divided by five. So 452 divided by five is going to give us, five goes into, it doesn't go into four, it goes into 45 nine times. Nine times five is 45, you subtract, get zero, bring down the two. Five goes into two zero times, zero times five is, zero times five is zero, subtract. You have two left over, so you can say that the mean here, the mean here is 90 and 2/5. Not nine and 2/5, 90 and 2/5. So the mean is right around here. So that's the mean of these data points right over there. And if you remove it what is the mean going to be? So here we're just going to take our 90, plus our 92, plus our 94, plus our 96, add 'em together. So let's see, two plus four plus six is 12. And then you add these together you're gonna get 37. 372 divided by four, cause I have four data points now, not five. Four goes into, let me do this in a place where you can see it. So four goes into 372, goes into 37 nine times. Nine times four is 36, subtract, you get a one. Bring down the two, it goes exactly three times. Three times four is 12. You have no remainder. 
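As a cross-check on the long division above, here is a minimal Python sketch using exact fractions. The variable names are assumptions for illustration. The shift formula in the comment is not stated in the video; it is a standard identity that matches the intuition about removing a value below the mean.

```python
from fractions import Fraction

all_rounds = [80, 90, 92, 94, 96]
removed = 80
remaining_rounds = [90, 92, 94, 96]

# Sum of the scores divided by the number of rounds, kept as an exact fraction.
mean_before = Fraction(sum(all_rounds), len(all_rounds))            # 452 / 5
mean_after = Fraction(sum(remaining_rounds), len(remaining_rounds))  # 372 / 4

# Removing a value x from n values with mean m changes the mean by
# (m - x) / (n - 1): positive when x is below the mean, negative when above.
predicted_shift = (mean_before - removed) / (len(all_rounds) - 1)

print(mean_before, mean_after, predicted_shift)
```

Here `mean_before` is 452/5, which is 90 and 2/5, and `mean_after` is exactly 93. The predicted shift of 13/5 agrees with the difference between the two means.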
So the median and the mean here are both, so this is also the mean. The mean here is also 93. So you see that the median, the median went from 92 to 93, it increased. The mean went from 90 and 2/5 to 93. So the mean increased by more than the median. They both increased but the mean increased by more. And it makes sense cause this number was way, way below all of these over here. So you could imagine if you take this out the mean should increase by a good amount. But let's see which of these choices are what we just described. "Both the mean and the median will decrease", nope. "Both the mean and the median will decrease", nope. "Both the mean and the median will increase, but the mean will increase by more than the median." That's exactly, that's exactly, what happened. The mean went from 90 and 2/5 or 90.4, went from 90.4 or 90 and 2/5 to 93. And then the median only increased by one. So this is the right answer.
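The final comparison can also be written as a small reusable check. This is only a sketch: `removal_effect` is a hypothetical helper name, not something from the video, and the scores are converted to `Fraction` so the differences are exact rather than rounded.

```python
from fractions import Fraction
from statistics import mean, median

def removal_effect(scores, value):
    """Return (change in mean, change in median) after removing one `value`."""
    before = [Fraction(s) for s in scores]
    after = list(before)
    after.remove(Fraction(value))  # drops a single occurrence of the value
    return mean(after) - mean(before), median(after) - median(before)

mean_change, median_change = removal_effect([80, 90, 92, 94, 96], 80)

# Both changes are positive, and the mean moves further than the median.
print(mean_change, median_change, mean_change > median_change)
```

For Ana's scores the mean rises by 13/5 (2.6) while the median rises by 1, which matches the answer choice selected above.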