## Box and whisker plots

## Video transcript

- [Voiceover] So I have a box and whiskers plot showing us the ages of students at a party. And what I'm hoping to do in this video is get a little bit of practice interpreting this. And what I have here are five different statements, and I want you to look at these statements. Pause the video, look at these statements, and think about which of these, based on the information in the box and whiskers plot, are for sure true, which of these are for sure false, and which of these we don't have enough information about, so it could go either way.

Alright, so let's work through these. So the first statement is
that all of the students are less than 17 years old. Well, we see right over here that the maximum age, that's the right end of this right whisker, is 16. So it is the case that all of the students are less than 17 years old. So this is definitely going to be true.

The next statement: at least 75% of the students
are 10 years old or older. So, when you look at this, this feels right, because 10 is the value that is at the beginning of the second quartile. This is the second quartile right over there. And actually, let me do this in a different color. So, this is the second quartile. So roughly 25% of the numbers are going to be in this second quartile (sometimes it's not exactly 25%), approximately 25% are going to be in the third quartile, and approximately 25% are going to be in the fourth quartile. So it seems reasonable to say that at least 75% are 10 years old or older, that this is going to be true. In fact, you could even have a couple of values in the first quartile that are 10.

So I'm feeling good that this is true, but let's look at a few more examples to make this a little more concrete. We don't know, based on the information here, exactly how many students are at the party. We'll have to construct some scenarios. So let's see if I can construct something
where, let's see, the median is 13. We know that for sure. The median is 13, so if I have an odd number, I would have 13 in the middle, just like that, and maybe I have three on either side. And I'm just making that number up. I'm just trying to see what I can learn about different types of data sets that could be described by this box and whiskers plot. So 10 is going to be the middle of the bottom half. So that's 10 right over there. And 15 is going to be the middle of the top half. That's what this box and whiskers plot is telling us. And they of course tell us that the minimum is seven. And they tell us that the maximum is 16. So we know that's seven and then that is 16. And then this, right over here, could be anything. It could be 10, it could be 11, it could be 12, it could be 13. It wouldn't change what these medians are. It wouldn't change this box and whiskers plot. Similarly, this could be 13, it could be 14, it could be 15, and any of those values wouldn't change it. And so are 75% 10 or older? Well, in this case, six out of seven are 10 years old or older.

And we could try it out with other scenarios. Let's try to minimize the number
of values that are 10 or older. Well, we could do something like, let's say that we have eight. So let's see, one, two, three, four, five, six, seven, eight. And so here we know that the minimum is seven, we know that the maximum is 16. We have an even number now, so the median is going to be the mean of these middle two values. So the mean of this and this is going to be 13. And we know that the mean of this and this is going to be 10, and that the mean of this and this is going to be 15. So what could we construct? Well actually, we don't even have to construct to answer this question. We know that this is going to have to be 10 or larger. And then all of these other things are going to be 10 or larger, so, if we assume that this other one is less than 10, exactly 75% are going to be 10 years old or older. So feeling very good about this one right over here. And actually, just to make this concrete, I'll put in some values here. This could be a nine and an 11. This could be a 12 and a 14. This could be a 14 and a 16. Or it could be a 15 and a 15. You could think about it in any of those ways. But feeling very good that this is definitely going to be true based on the information given in this plot.

Now they say there's only one
seven-year-old at the party. Well, in this first possibility that we looked at, that was the case. There was only one seven-year-old at the party and there was one 16-year-old at the party. And actually, that was the next statement, there's only one 16-year-old at the party. So for both of these, it seems like we can definitely construct data that's consistent with this box and whiskers plot where this is true. But could we construct one where it's not true? Well sure. Let's imagine we have our median at 13. And then we have, let's see, one, two, three, four, five. One, two, three, four, five. This is gonna be the 10, the median of this bottom half. This is going to be 15. This is going to be seven. This is going to be 16. Well, this could also be seven. It doesn't have to be. It could be seven, eight, nine, or ten. This could also be 16. Doesn't have to be. It could be 15 as well. But just like that, I've constructed a data set, and, you know, this could be 10, 11, 12, 13. This could be 10, 11, 12, 13. This could be 13, 14, 15. This one also could be 13, 14, 15. But the basic idea here is, I can have a data set where I have multiple sevens and multiple 16s, or I could have a data set where I only have one seven or only one 16. So both of these statements, we just plain don't know.

Now the next statement, exactly half the students
are older than 13. Well, if you look at this possibility up here, we saw that three out of the seven are older than 13. So that's not exactly half. 3/7 is not 1/2. But in this one over here, we did see that exactly half are older than 13. We have an even number right over here, and so it is exactly half. So it's possible that it's true, it's possible that it's not true, based on the information given. We once again do not know.

Anyway, hopefully you
found this interesting. The whole point of me doing this is, when you look at statistics, sometimes it's easy to kind of say, okay, I think it roughly means that, and that's sometimes okay. But it's very important to think about what types of actual statements you can make and what you can't make, and it's very important when you're looking at statistics to be able to say, well, you know what, I just don't know. The data actually is not telling me that thing for sure.
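
The scenarios constructed in the video can be checked with a short Python sketch. The helper names (`five_number_summary`, `check_statements`) and the specific values filled into each scenario are assumptions chosen for illustration, not part of the original exercise. The quartiles are computed as medians of the lower and upper halves, as described in the video; other quartile conventions may give different values.

```python
from statistics import median

# Five-number summary read off the plot: min, Q1, median, Q3, max.
PLOT = (7, 10, 13, 15, 16)


def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles as medians of each half.

    For an odd count the middle value is left out of both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


def check_statements(values):
    """Evaluate the five statements from the video for one data set."""
    n = len(values)
    return {
        "all under 17": all(v < 17 for v in values),
        # Integer arithmetic avoids any floating-point rounding at exactly 75%.
        "at least 75% are 10 or older": 4 * sum(v >= 10 for v in values) >= 3 * n,
        "only one 7-year-old": values.count(7) == 1,
        "only one 16-year-old": values.count(16) == 1,
        "exactly half older than 13": 2 * sum(v > 13 for v in values) == n,
    }


# Hypothetical data sets, each one possible filling of a scenario in the video.
scenarios = {
    "seven students": [7, 10, 11, 13, 14, 15, 16],
    "eight students": [7, 9, 11, 12, 14, 15, 15, 16],
    "eleven students": [7, 7, 10, 11, 12, 13, 14, 14, 15, 16, 16],
}

for name, ages in scenarios.items():
    # Every scenario must reproduce the same box and whiskers plot.
    assert five_number_summary(ages) == PLOT
    print(name)
    for statement, holds in check_statements(ages).items():
        print(f"  {statement}: {holds}")
```

On these three data sets the first two statements hold every time, while the last three hold for some data sets and fail for others, which matches the "we don't know" conclusion. A handful of examples can show that a statement is undetermined, but it cannot by itself prove that a statement is always true; that still needs the reasoning about the whisker and the quartiles given above.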