
© 2024 Khan Academy

# Estimating mean and median in data displays


## Want to join the conversation?

- I am so confused. How did Sal get the 16th data point in B? (10 votes)
- If you try to solve the example yourself you'll **realise** that **A is the 12.5th data point** and **B is the 17.5th data point**. *Upon averaging the values* (of the data points) **you'll get 15**, which is close to **16** (the solution); considering **that Sal and we all estimate**, you can consider this to be *accurate if observed from the terms of estimation.* (3 votes)

- Wouldn't the mean be around 29 if you actually calculate it? (7 votes)
- I believe it will be around 2.8 (divide 31 by 11, which is the total number of bars). (3 votes)

- I don't know how to do any of this. The videos don't make any sense at all. (7 votes)
- Can someone give me a toddler-level explanation of how to think of it intuitively? (3 votes)
- In brief, the mean is a value that is **nonresistant** to extreme values/significant outliers, meaning that it would fluctuate severely if a new extreme value were introduced to the data. In contrast, the median is somewhat **resistant** to extreme values, as explained in the video. (The mean changes more than the median when an extreme value is introduced.)

These concepts are relatively difficult to grasp when we're just starting to learn about statistics, but eventually it'll come to you with a snap of your fingers!(6 votes)

- At about 1:32 into the video, Sal mentioned the 16th highest and lowest data point. What did he mean by this? (6 votes)
- I have recalculated the first example and received another answer. Let me show my thoughts.

We have 31 different marks and we want to know "mean" of these marks.

Mean = (2×0 + 1×1 + 2×2 + 1×3 + 1×4 + 1×5 + 1×6 + 2×7 + 2×8 + 5×9 + 13×10)/31 = 228/31 ≈ 7.35

Which is actually one step to the left.

Correct me where I am wrong. (4 votes)
- The explanation is a bit technical, but hopefully you can follow.

The mean has to be a value between 0 and 10.

So, the mean sits on a segment of the number line, and that segment has length 10.

Now, because our histogram has 11 columns, we divide that segment into 11 equal parts.

That means that each part has a length of 10∕11.

As you mentioned, the mean is approximately 7.35, and we need

7.35∕(10∕11) ≈ 8.09 of these smaller parts to reach a total length of 7.35

So the mean sits on the 9th of the 11 parts, or in the 9th column, i.e. column A.(3 votes)
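The arithmetic in this thread can be checked with a short Python sketch. It assumes the bar heights are exactly the counts the commenter read off the histogram and that every athlete in a bar has that bar's whole-number score; the video does not list the exact values, so both are assumptions made for illustration.

```python
from statistics import mean, median

# Counts per score as read off the histogram by the commenter above
# (an assumption: the video does not give the exact data).
counts = {0: 2, 1: 1, 2: 2, 3: 1, 4: 1, 5: 1, 6: 1, 7: 2, 8: 2, 9: 5, 10: 13}

# Expand the frequency table into one entry per athlete.
scores = [score for score, n in counts.items() for _ in range(n)]

n_scores = len(scores)          # number of athletes
mean_score = mean(scores)       # sum of all scores divided by n_scores
median_score = median(scores)   # middle (16th) value once sorted

print(n_scores, round(mean_score, 2), median_score)
```

Under these assumptions the mean comes out below the median, which matches the left-skew intuition from the video.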

- *I'm kinda not understanding. Anyone mind explaining?* (5 votes)
- This video is so confusing, it doesn't make sense to me at all. Is there an easier way to explain this? (3 votes)
- Median: the middle number. If the count of numbers is odd, just find the middle number; if it is even, take the two middle numbers and find the number halfway between them.

Mean: all the numbers added together and divided by how many numbers there are. I hope this helps, as I am not the best at explaining. (3 votes)

- The median doesn't make sense at all. I know what it is, but the way he explains it on a graph is confusing. (4 votes)
- What would happen if you had a chart and then a... CAT ate it? On paper, of course. (4 votes)

## Video transcript

- [Instructor] We are told researchers scored 31 athletes on an agility test. Here are their scores. It's in this histogram. And what I'm going to ask you is, which of these intervals, interval A, B, or C, which one contains the median of the scores, and which one, or give an estimate which one contains the mean of the scores. Pause this video and see if you can figure that out.

So let's just start with the median. Remember, the median you could view as the middle number, or if you have an even number of data points, it would be the average of the middle two. Here we have an odd number of data points, so it would be the middle number. So what would be the middle number if you were to order them from least to greatest? Well, it would be the one that has 15 on either side, so it would be the 16th data point. 16th data point. And so we could just think about which interval here contains the 16th data point. You could view it as the 16th from the highest or the 16th from the lowest. It is the middle one.

All right, so let's start from the highest. So this interval C contains the 13 highest data points and then interval B goes from the 14th highest all the way to the 18th highest. So this B contains the median. It contains the 16th highest data point, or if you started from the left, it would also be the 16th lowest data point. So that's where the median is. The median.

Now what about an estimate for the mean? Well, you have calculated the mean in the past, but when you're looking at a distribution like this, when you're looking at a histogram, one way to think about the mean is it would be the balancing point. If you imagine that this histogram was made out of some material of, let's say, uniform density, where would you put a fulcrum in order to balance it? If you put the fulcrum right over here, it feels like you would tip over to the left because this is a left-skewed distribution. You have this long tail to the left. If you really wanted to balance it out, it seems like you would have to move your fulcrum in that direction of that left skew, in the direction of the tail. And so I would estimate to balance it out, it would actually be closer to that, which would be interval A. Interval A would contain the mean.

The intention of this type of exercise isn't for you to try to calculate every data point, in fact, they don't give you all of the information here, and add them all up and then divide by 31. It's really to estimate and to also get the intuition that when you have a left-skewed distribution like this, you will often see a situation where your mean is to the left of the median. If you have a right-skewed distribution, it would be the other way around. And as we will see, when we see a symmetric distribution, the mean and the median will be awfully close to each other, or when you have a roughly symmetric distribution. If you have a perfectly symmetric distribution, they might be exactly in the same place.

So let's do another example. So here it says we have the ages of 14 coworkers, and what I want you to do is say roughly where is the mean and roughly where is the median? Is it roughly at A, is it roughly at B, or is it roughly at C? Pause this video and try to figure it out.

So let's first start off with the median. We have 14 data points. So this would be the average of the middle two data points. It would be the average of the seventh and eighth data point. Well, you could say one, two, three, four, five, six, seven, and then the eighth one is here. So the seventh data point is a 30. The eighth one is in the 31 bucket. So the average of the two would get you to B. Another way that you could think about it is you can just eyeball it and see you have just as many data points below B as you do have above B, and so that also gives you a good indication that B would be where the median is. So that is where the median is.

Now what about the mean? Well, this is a perfectly symmetric distribution. If I wanted to balance it, I would put the fulcrum right in the middle. So I would say that the mean would also, the mean would also be at B, and we are done.
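The rule for finding the middle value(s) used in both examples can be written as a small Python sketch. The helper name `median_positions` is hypothetical and introduced only for illustration. The only data used are the two counts (31 and 14) and the seventh and eighth ages (30 and 31) quoted in the transcript.

```python
def median_positions(n):
    """Return the 1-based positions of the middle value(s) among n sorted points."""
    if n % 2 == 1:
        # Odd count: a single middle point with equally many on each side.
        middle = (n + 1) // 2
        return (middle, middle)
    # Even count: the median is the average of the two middle points.
    return (n // 2, n // 2 + 1)


athletes = median_positions(31)   # 15 points on either side of the middle one
coworkers = median_positions(14)  # the two middle points to average

# Seventh and eighth ages as stated in the transcript.
coworker_median = (30 + 31) / 2

print(athletes, coworkers, coworker_median)
```

For the 31 athletes this gives the 16th data point, and for the 14 coworkers it gives the seventh and eighth points, whose average of 30.5 sits between the 30 and 31 buckets, at B.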