# Comparing means of distributions

## Video transcript

Voiceover: Kenny interviewed
freshmen and seniors at his high school, asking them how many pieces
of fruit they eat each day. The results are shown
in the 2 plots below. The first statement
that we have to complete is the mean number of
fruits is greater for, and actually, let me go
down the actual screen, is greater for, we have to pick between
freshmen and seniors. Then they said the mean is a good measure for the center of distribution of, and we pick either freshmen or seniors. Let me go back to my scratch pad here, and let's think about this. Let's first think about the first part. Let's just calculate the mean for each of these distributions. I encourage you to pause the video and try to calculate it out on your own. Let's first think about
the mean number of fruit for freshmen. Essentially, we're just going to take each of these data points, add them all together, and then divide by the
number of data points that we have. We have one data point at 0, so I'll write 0. And then we have two data points at 1, so we could say plus 2 times 1. And then we have two data points at 2, so you write plus 2 times 2. And then, let's see, we
have a bunch of data. We have four data points at 3, so we could say we have four 3s. Let me circle that. So we have four 3s, plus 4 times 3. And then we have three 4s, so plus 3 times 4. And then we have a 5, so plus 5, and then we have a 6. Let me do this in a
color that you can see. And then we have a 6 right over here, plus 6. How many total points did we have? We had 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, oh, actually, be careful. We had 15 points and I
didn't put that one in there. Actually, let me just ... So we have 15 points, and I can't forget this one over here, so plus ... my pen is acting a little funny right now, but we'll power through that, plus 19. So what is this going to be? This is just going to be 0. This is going to be 2. This is going to be 4. This is going to be 12. My pen is really acting up. It's almost like it's
running out of digital ink or something. This is going to be another 12, and then we have 5, 6, and 19. So what is this going to be? 2 plus 4 is 6, plus 24 is 30, plus 11 is 41, plus 19 gets us to 60. 60 divided by 15 is 4, so the mean number of fruit per day for the freshmen is 4
pieces of fruit per day. This right over here, that right over there
is our mean for the ... Let me put that in a color
that you can actually see. Now let's do the same
calculation for the seniors. We have one data point where they didn't eat any fruit at all each
day, not too healthy. Then you have one 1, so I'll just write that as, we could actually write that as 1 times 1, but I'll just write that as 1. Then we have two 2s, so plus 2 times 2.
Then we have one, two, three, four, five 3s, so plus 5 times 3. And then we have three 4s, so plus 3 times 4. And then we have two 5s, plus 2 times 5, and then we have a 6. We have a 6, plus 6, and we have a 7, someone eats 7 pieces of fruit each day, a lot of fiber, plus 7. And now, how many data points did we have? We have 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16 data points. So we're going to divide this by 16. So what is this going to be? This is just 0. Let's see. This is, just right over, that's 0. This is 4. This is 15. This is 12. This is 10. So we have 1 plus 4 is 5 plus 15 is 20 plus 12 is 32 plus 10 is 42. 42 plus 6 is 48, 48. Am I doing ... 42 plus 6 is 48 plus 7, 48 plus 7 is 55. Did I do that right? Let me do that one more time. 1 plus 4 is 5 plus 15 is 20, 32, 42. 42 plus 13 is 55. So this is equal to 55 over 16, which is the same thing as, let's see, that's the same thing as 3 and 3 that ... 3 times 16 is 48, so 3 and 7/16. So the mean for the seniors, 3 and 7/16, that's right around ... let's see. This is 3, that's 4, so 7/16, it's a little less than a half. It's right around there. So the mean number of
fruits is definitely greater for the freshmen. They have 4 ... Their mean number of
fruit eaten per day is 4 versus 3 and 7/16. The mean is a good measure for the center of the distribution of ... So when we think about whether
it's freshmen or seniors, the mean is fairly sensitive
to when you have outliers here. For example, someone here was eating 19 pieces of fruit per day. That's an enormous amount of fruit. They must be only eating fruit. You can imagine if it
was even a bigger number, if someone was eating 20
or 30 pieces of fruit, just that one data point will
skew the entire mean upwards. That wouldn't be the effect on the median because the median is the middle number. Even if you change this one point all the way out here, it's not going to change
what the middle number is. So the mean is more
sensitive to these outliers, to these points that are really, really high or really, really low. And because the seniors don't seem to have any outliers like that, I would say that the
mean is a good measure for the center of
distribution for the seniors, or a better measure for
the center of distribution for the seniors. Let's fill both of those out. The mean number of fruit is
greater for the freshmen, and the mean is a good measure for the center of
distribution for the seniors. You actually even see it here. We saw that the mean number
for freshmen was at 4, but if you just ignored
this person right over here and just thought about the bulk of this distribution right over here, 4 really doesn't look
like the center of it. The center of it looks closer to 3 here. What happened is this one person eating 19 pieces of fruit per day skewed the mean upwards. While here, that 3 and
7/16 really did look closer to the actual distribution, closer to the ... actually, I shouldn't say ... I mean, both times we actually did calculate the mean of the actual distribution. But here, since there are no outliers, the mean does
seem much closer to, I guess you could say
the middle of this pile right over here. Let's check our answer, and we got it right.
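
As a cross-check of the arithmetic in this video, here is a minimal Python sketch. It assumes the dot-plot counts exactly as they were read aloud above. The variable names and the `expand` helper are illustrative and not part of the original exercise. It recomputes both means, then compares each mean with the median to show how the single 19 pulls the freshman mean upward.

```python
from statistics import median

# Dot-plot counts as read aloud in the video (assumed transcription):
# {pieces of fruit per day: number of students}
freshmen_counts = {0: 1, 1: 2, 2: 2, 3: 4, 4: 3, 5: 1, 6: 1, 19: 1}
seniors_counts = {0: 1, 1: 1, 2: 2, 3: 5, 4: 3, 5: 2, 6: 1, 7: 1}


def expand(counts):
    """Turn {value: frequency} into a sorted list of individual data points."""
    return sorted(value for value, n in counts.items() for _ in range(n))


freshmen = expand(freshmen_counts)
seniors = expand(seniors_counts)

# Mean = sum of all data points divided by the number of data points.
freshmen_mean = sum(freshmen) / len(freshmen)  # 60 / 15
seniors_mean = sum(seniors) / len(seniors)     # 55 / 16

print("freshmen:", len(freshmen), "points, mean", freshmen_mean,
      "median", median(freshmen))
print("seniors: ", len(seniors), "points, mean", seniors_mean,
      "median", median(seniors))

# Drop the one student who eats 19 pieces to see how far that point moved the mean.
freshmen_without_outlier = [x for x in freshmen if x != 19]
mean_without_outlier = sum(freshmen_without_outlier) / len(freshmen_without_outlier)
print("freshmen without the 19: mean", round(mean_without_outlier, 2),
      "median", median(freshmen_without_outlier))
```

Under these counts the freshman mean is 4 while the freshman median is 3, and removing the 19 drops the mean to about 2.93 without moving the median. The senior mean (3 and 7/16, or 3.4375) sits close to the senior median of 3, which matches the conclusion that the mean describes the center better for the seniors.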