Interpreting box plots

A box and whisker plot is a handy tool to understand the age distribution of students at a party. It helps us identify the minimum, maximum, median, and quartiles of the data. However, it doesn't provide specific details like the exact number of students at certain ages.
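
As a rough illustration of where those five numbers come from, here is a minimal Python sketch. It assumes the "median of each half" rule for quartiles (other conventions exist), and both the function name `five_number_summary` and the list of ages are hypothetical, chosen only for this example.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]           # bottom half; the middle value is left out when n is odd
    upper = data[n // 2 + n % 2 :]   # top half; the middle value is left out when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical ages of seven students at a party.
ages = [7, 10, 11, 13, 14, 15, 16]
print(five_number_summary(ages))
```

The summary keeps only five numbers, so two quite different lists of ages can produce the same plot. That is why the plot cannot say how many students are a particular age.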

Video transcript

- [Voiceover] So I have a box and whiskers plot showing us the ages of students at a party. What I'm hoping to do in this video is get a little bit of practice interpreting it. I have five different statements here. Pause the video, look at these statements, and think about which of them, based on the information in the box and whiskers plot, are for sure true, which are for sure false, and which we don't have enough information about, so they could go either way.
Alright, so let's work through these. The first statement is that all of the students are less than 17 years old. Well, we see right over here that the maximum age, the right end of this right whisker, is 16. So it is the case that all of the students are less than 17 years old, and this is definitely going to be true.
The next statement: at least 75% of the students are 10 years old or older. When you look at this, it feels right, because 10 is the value at the beginning of the second quartile. This is the second quartile right over there. Let me do this in a different color. Roughly 25% of the numbers are going to be in this second quartile (sometimes it's not exactly 25%), approximately 25% are going to be in the third quartile, and approximately 25% are going to be in the fourth quartile. So it seems reasonable that "10 years old or older" covers at least 75% of the students, and that this is going to be true. In fact, you could even have a couple of values in the first quartile that are 10. So I'm feeling good that this is true, but let's look at a few examples to make this a little more concrete.
We don't know, based on the information here, exactly how many students are at the party, so we'll have to construct some scenarios. The median is 13; we know that for sure. So if I have an odd number of students, I would have 13 in the middle, just like that, and maybe I have three on either side. I'm just making that number up. I'm just trying to see what I can learn about the different types of data sets that could be described by this box and whiskers plot. So 10 is going to be the middle of the bottom half. That's 10 right over there. And 15 is going to be the middle of the top half. That's what this box and whiskers plot is telling us. And of course it tells us that the minimum is seven and that the maximum is 16. So that's seven, and that is 16. And then this value, right over here, could be 10, 11, 12, or 13. It wouldn't change what these medians are, and it wouldn't change this box and whiskers plot. Similarly, this one could be 13, 14, or 15, and any of those values wouldn't change it. So are at least 75% 10 or older? In this case, six out of seven are 10 years old or older.
We could try it out with other scenarios. Let's try to minimize the number of students who are 10 or older. Let's say that we have eight: one, two, three, four, five, six, seven, eight. Here we know that the minimum is seven and that the maximum is 16. We have an even number now, so the median is going to be the mean of the middle two values. So the mean of this and this is going to be 13. And we know that the mean of this and this is going to be 10, and that the mean of this and this is going to be 15. So what could we construct? Well, actually, we don't even have to construct anything to answer this question. We know that this one is going to have to be 10 or larger, and then all of these other values are going to be 10 or larger. So if we assume that this one is less than 10, exactly 75% are going to be 10 years old or older. So I'm feeling very good about this statement right over here. And just to make this concrete, I'll put in some values. This could be a nine and an 11. This could be a 12 and a 14. This could be a 14 and a 16, or it could be a 15 and a 15. You could think about it in any of those ways. But I'm feeling very good that this is definitely going to be true based on the information given in this plot.
Now they say there's only one seven-year-old at the party. In the first possibility that we looked at, that was the case: there was only one seven-year-old at the party, and there was one 16-year-old at the party. And that was the next statement: there's only one 16-year-old at the party. So for both of these, we can definitely construct data that's consistent with this box and whiskers plot where the statement is true. But could we construct one where it's not true? Well, sure. Let's imagine we have our median at 13, and then we have one, two, three, four, five values on this side and one, two, three, four, five on this side. This is going to be the 10, the median of this bottom half. This is going to be 15. This is going to be seven. This is going to be 16. Well, this one could also be seven. It doesn't have to be; it could be seven, eight, nine, or 10. And this one could also be 16. It doesn't have to be; it could be 15 as well. But just like that, I've constructed a data set. These could be 10, 11, 12, or 13, and these could be 13, 14, or 15. The basic idea here is that I can have a data set with multiple sevens and multiple 16s, or I could have a data set with only one seven or only one 16. So for both of these statements, we just plain don't know.
Now the next statement: exactly half the students are older than 13. Well, if you look at this possibility up here, we saw that three out of the seven are older than 13. So that's not exactly half; 3/7 is not 1/2. But in this one over here, we have an even number, and we did see that exactly half are older than 13. So it's possible that it's true and it's possible that it's not true, based on the information given. Once again, we do not know.
Anyway, hopefully you found this interesting. The whole point of me doing this is that when you look at statistics, sometimes it's easy to say, "Okay, I think it roughly means that," and that's sometimes okay. But it's very important to think about what types of actual statements you can make and what you can't make. And it's very important, when you're looking at statistics, to be able to say, "Well, you know what, I just don't know. The data actually is not telling me that thing for sure."
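
The scenario-building in the video can be replayed in code. The sketch below is only an illustration: the three guest lists are hypothetical data sets modeled on the scenarios described above, the names `PLOT`, `SAMPLES`, `STATEMENTS`, and `verdicts` are made up for this example, and quartiles are computed with the median-of-halves rule the video uses. A few samples can show that a statement is not determined by the plot, but they cannot prove a statement is always true; that proof still comes from the plot itself (the maximum is 16, and the first quartile is 10).

```python
from fractions import Fraction
from statistics import median

# Five-number summary read off the plot in the video: min, Q1, median, Q3, max.
PLOT = (7, 10, 13, 15, 16)

def five_number_summary(values):
    """Summarize a data set with the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]           # bottom half; middle value excluded when n is odd
    upper = data[n // 2 + n % 2 :]   # top half; middle value excluded when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])

def share(ages, predicate):
    """Exact fraction of ages that satisfy the predicate."""
    return Fraction(sum(1 for a in ages if predicate(a)), len(ages))

# The five statements from the video, written as checks on a list of ages.
STATEMENTS = {
    "all students are younger than 17": lambda ages: all(a < 17 for a in ages),
    "at least 75% are 10 or older": lambda ages: share(ages, lambda a: a >= 10) >= Fraction(3, 4),
    "only one 7-year-old": lambda ages: ages.count(7) == 1,
    "only one 16-year-old": lambda ages: ages.count(16) == 1,
    "exactly half are older than 13": lambda ages: share(ages, lambda a: a > 13) == Fraction(1, 2),
}

# Hypothetical guest lists that all produce the same box plot.
SAMPLES = [
    [7, 10, 11, 13, 14, 15, 16],                  # seven students
    [7, 9, 11, 12, 14, 14, 16, 16],               # eight students
    [7, 7, 10, 11, 12, 13, 13, 14, 15, 16, 16],   # eleven students, repeated extremes
]

def verdicts(samples):
    """Report, per statement, whether it held in all, none, or only some samples."""
    for ages in samples:
        # Every sample must be consistent with the plot before we compare statements.
        assert five_number_summary(ages) == PLOT
    result = {}
    for name, holds in STATEMENTS.items():
        outcomes = {holds(ages) for ages in samples}
        if outcomes == {True}:
            result[name] = "holds in every sample"
        elif outcomes == {False}:
            result[name] = "fails in every sample"
        else:
            result[name] = "not determined by the plot"
    return result

for name, verdict in verdicts(SAMPLES).items():
    print(f"{name}: {verdict}")
```

Using `Fraction` keeps the 75% and one-half comparisons exact, so a share like 6/8 is treated as exactly three quarters rather than as a rounded decimal.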