


© 2024 Khan Academy

# Interpreting box plots

A box and whisker plot is a handy tool to understand the age distribution of students at a party. It helps us identify the minimum, maximum, median, and quartiles of the data. However, it doesn't provide specific details like the exact number of students at certain ages.
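
The five values a box plot displays can be computed with a short script. This is a minimal sketch, not the lesson's own code: it assumes the "median of each half" method for the quartiles that the video transcript describes, the function name `five_number_summary` is made up for illustration, and the ages are hypothetical values chosen to reproduce the plot from the video (minimum 7, quartiles 10 and 15, median 13, maximum 16).

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving the middle value out of both halves when the count is odd.
    Needs at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]        # values before the middle position
    upper = data[n - half:]    # values after the middle position
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Hypothetical ages, chosen to match the plot discussed in the video.
ages = [7, 10, 11, 13, 14, 15, 16]
print(five_number_summary(ages))  # (7, 10, 13, 15, 16)
```

Other quartile conventions exist (statistical software often interpolates), so different tools may report slightly different Q1 and Q3 for the same data.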

## Want to join the conversation?

- I still don't understand what a 'quartile' is. Can someone please help?(24 votes)
- quartile: The values that divide a list of numbers into quarters.(36 votes)

- I am very confused. Where did he get the percents from?(19 votes)
- Think of the box-and-whisker plot as split into four parts (the first, second, third, and fourth quartiles), making each part equal to 1/4 (essentially 25%) of the plot.

As shown in the video, there are three quartiles that have values larger than ten; that means that 3/4 of the quartiles have kids older than 10. In other words, 75% of the plot accounts for kids 10 and older (since 3/4 can be written as 75%).

The fact that every quartile is 25% is a guesstimate; the point is that the three upper quartiles together should add up to roughly 75% or more of the data.

Hope this clears things up!😄(26 votes)

- At 7:14, isn't there 50% on one side of the median and 50% on the other? So technically, isn't 13 still in the middle, and therefore the statement would be true? Thanks!(10 votes)
- The curly equals sign (≈) means that the value is approximate, so he can't be sure that it is truly 50% on that side. On the flip side, because we don't know, it might well be exactly 50%.(11 votes)

- I do not get the last statement. We DO know that exactly half of the students are older than 13, because 50% is on the right side of the median which is also 13. Don't we?(6 votes)
- If he'd said that at least 50% ARE 13 or older, that would be true, because that count includes the median.

For example, if I say that 5 is greater than 10/2, that would be false, because 10/2 is 5 but I said it was GREATER. On the other hand, if I said that 5 >= 10/2 (greater than or equal to), that would be true.(16 votes)

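- At 1:06, when he says that it is the second quartile, wouldn't that be Q1 and the median be Q2?(11 votes)
- No. In the video, "quartile" refers to a section of the plot, and the first quartile is the line (whisker) before the box. The second quartile is the part of the box that starts at 10.(4 votes)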

- Would it be true if the second question was "Exactly 75% of the students are >10"?(6 votes)
- No.

As an example, let's say there were 17 students at the party, of the following ages:

7, 8, 8, 9, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 15, 16, 16

The median is 13.

The first quartile is 10.

The third quartile is 15.

The minimum age is 7.

The maximum age is 16.

So, this data set gives the same box plot as shown in the video.

But 13 of the 17 students are older than 10.

13∕17 ≈ 76.47%, which is of course greater than 75%.

So, the number of students older than 10 is not necessarily exactly 75%.(6 votes)

- Where does he get the percentage.(5 votes)
- Think of the plot as a chocolate bar; 25% would be 1/4 of the bar. Say that you had to share that chocolate bar among yourself and your 3 friends. You would divide it into fourths. Then one friend says, "He got more than me." What would you say?

I would've said, "I have given us all 25% of the chocolate, so it is all equal."(5 votes)

- How does he find 25%(3 votes)
- The box and whiskers plot breaks the line of numbers into four parts. Split it in half at the median, then split each half again at the quartiles.

All of the plot is equal to 100%, so a fourth of that would be 25%.

I hope I managed to explain this well. Here are some KA videos on finding percentages that might explain it better if you need: https://www.khanacademy.org/math/cc-sixth-grade-math/x0267d782:cc-6th-rates-and-percentages/cc-6th-percent-problems/v/finding-percentages-example and https://www.khanacademy.org/math/pre-algebra/xb4832e56:percentages/xb4832e56:intro-to-percents/v/percent-from-fraction-models.

I hope this helps!🍀(7 votes)

- If the 4th quartile is about 25%, then isn't the 1st like 100% or something? Why is it not?(3 votes)
- No, each quartile just represents about 25% of the data. So the 1st quartile would cover roughly the 0-25th percentile range, and the 4th quartile the 75th-100th.(4 votes)

- nice video it helped me understand the topic a little bit better(4 votes)
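
The 17-student example from the discussion above can be checked with a few lines of Python. This is only a sketch: it assumes the "median of each half" method for the quartiles (leaving the middle value out of both halves), and the variable names are made up for illustration.

```python
from statistics import median

# The 17 ages listed in the comment above.
ages = [7, 8, 8, 9, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 15, 16, 16]

data = sorted(ages)
n = len(data)
half = n // 2

# Assumption: quartiles are the medians of the lower and upper halves.
q1 = median(data[:half])
q3 = median(data[n - half:])

older_than_10 = sum(1 for age in data if age > 10)
share = older_than_10 / n

print(min(data), q1, median(data), q3, max(data))  # 7 10.0 13 15.0 16
print(f"{older_than_10} of {n} = {share:.2%}")      # 13 of 17 = 76.47%
```

The five-number summary matches the plot in the video, yet the share of students older than 10 is a bit above 75%, which is why "exactly 75%" cannot be read off the plot.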

## Video transcript

- [Voiceover] So I have a box and whiskers plot showing us the ages of students at a party. And what I'm hoping to do in this video is get a little bit of practice interpreting this. And what I have here are five different statements and I want you to look at these statements. Pause the video, look at these statements, and think about which of these, based on the information in the box and whiskers plot, which of these are for sure true, which of these are for sure false, and which of these we don't have enough information, it could go either way.

Alright, so let's work through these. So the first statement is that all of the students are less than 17 years old. Well we see, right over here, that the maximum age, that's the right end of this right whisker is 16. So it is the case that all of the students are less than 17 years old. So this is definitely going to be true.

The next statement. At least 75% of the students are 10 years old or older. So, when you look at this, this feels right, because 10 is, 10 is the value that is at the beginning of the second quartile. This is the second quartile right over there. And actually, let me do this, let me do this in a different color. So, this is the second quartile. So 25% of the value of the numbers are in the second, or roughly, sometimes it's not exactly, so approximately, I'll say roughly 25% are going to be in this second quartile, approximately 25% are going to be in the third quartile, and approximately 25% are going to be in the fourth quartile. So it seems reasonable for saying 10 years old or older that this is going to be, this is going to be true. In fact, you could even have a couple of values in the first quartile that are 10.

But to make that a little more tangible, let's look at some, so I'm feeling, I'm feeling good that this is true, but let's look at a few more examples to make this a little more concrete. So they don't know, we don't know, based on the information here exactly how many students are at the party. We'll have to construct some scenarios.

So we could do a scenario, let's see if we can do... We could do a scenario where well let's see, let's see if I can, I can construct something where, let's see, the median is 13. We know that for sure. The median is 13, so if I have an odd number, I would have 13 in the middle, just like that, and maybe I have three on either side. And I'm just making that number up. I'm just trying to see what I can learn about different types of data sets that could be described by this box and whiskers plot. So 10 is going to be the middle of the bottom half. So that's 10 right over there. And 15 is going to be the middle of the top half. That's what this box and whiskers plot is telling us. And they of course tell us what the minimum, the minimum is seven. And they tell us that the maximum is 16. So we know that's seven and then that is 16. And then this, right over here, could be anything. It could be 10, it could be 11, it could be 12, it could be 13. It wouldn't change what these medians are. It wouldn't change this box and whiskers plot. Similarly, this could be 13, it could be 14, it could be 15, and so any of those values wouldn't change it. And so 75% are 10 or older, well, this value, in this case, six out of seven are 10 years old or older.

And we could try it out with other, other scenarios where... let's try to minimize the number of 10s given this data set. Well we could do something like, let's say that we have eight. So let's see, one, two, three, four, five, six, seven, eight. And so here we know that the minimum, we know that the minimum is seven, we know that the maximum is 16. We know, we know that the, we know that the mean of these middle two values, we have an even number now so, the median is going to be the mean of these two values. So, it's going to be the mean of this and this, is going to be 13. And we know that the mean of, we know that the mean of this and this is going to be 10 and that the mean of this and this is going to be, is going to be 15. So what could we construct? Well actually, we don't even have to construct to answer this question. We know that, we know that this is going, this is going to have to be 10 or larger. And then all of these other things are going to be 10 or larger, so this is exactly 75%. Exactly 75% if we assume that this is less than 10, are going to be 10 years old or older. So feeling very good, very good, about this one right over here.

And actually, just to make this concrete, I'll put in some values here. You know this could be, this could be a nine and an 11. This could be a 12 and a 14. This could be a 14 and a 16. Or, it could be, it could be a 15 and a 15. You could think about it and in any of those, in any of those ways. But feeling very good that this is definitely going to be true based on the information given in this plot.

Now they say there's only one seven-year-old at the party. One seven-year-old at the party. Well this first, this first possibility that we looked at, that was the case. There was only one seven-year-old at the party and there was one 16-year-old at the party. And actually, that was the next statement, there's only one 16-year-old at the party. So both of these seem like we can definitely construct data that's consistent with this box plot, box and whiskers plot, where this is true.

But could we construct one where it's not true? Well sure. Let's imagine, let's see, we have our median at 13. Median at 13. And then we have, let's see, one, two, three, four, five. One, two, three, four, five. This is gonna be, this is gonna be the 10, the median of this bottom half. This is going to be 15. This is going to be seven. This is going to be 16. Well this could also be seven. It doesn't have to be. It could be seven, eight, nine, or ten. This could also be 16. Doesn't have to be. It could be 15 as well. But just like that, I've constructed a data set, and these could be, you know this could be 10, 11, 12, 13. This could be 10, 11, 12, 13. This could be 13, 14, 15. This one also could be 13, 14, 15. But the simple thing is, or the basic idea here, I can have a data set where I have multiple sevens and multiple 16s, or I could have a data set where I only have one seven or only one 16. So both of these statements, we just plain don't know. We just don't, we just don't know.

Now the next statement, exactly half the students are older than 13. Well if you look at this possibility up here, we saw that three out of the seven are older than 13. So that's not exactly half. 3/7 is not 1/2. But in this one over here, we did see that exactly half are over, are older than 13. In fact, if you're saying exactly half... Well, in this one we're saying that exactly half are older than 13. We have an even number right over here. And so it is exactly half. So it's possible that it's true, it's possible that it's not true based on the information given. We once again, we once again don't, we once again do not know.

Anyway, hopefully you found this interesting. This is, the whole point of me doing this is when you look at statistics, sometimes it's easy to kind of say, okay I think it roughly means that, and that's sometimes okay. But it's very important to think about what types of actual statements you can make and what you can't make and it's very important when you're looking at statistics to say, well you know what, I just don't know. That the data actually is not telling me that thing for sure.
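
The scenario-building in the transcript can be sketched in Python. This is an illustration under stated assumptions, not code from the lesson: the three data sets are hypothetical fills of the seven-, eight-, and eleven-student layouts described above, the quartiles use the "median of each half" method, and names such as `five_number_summary`, `scenarios`, and `statements` are made up. Agreement across a handful of scenarios does not prove a statement (that takes the reasoning in the video), but a single disagreement is enough to show that the plot alone cannot settle it.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum), quartiles as medians of halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    return (data[0], median(data[:half]), median(data),
            median(data[n - half:]), data[-1])


# Hypothetical data sets that all produce the same box plot.
scenarios = {
    "seven students": [7, 10, 11, 13, 14, 15, 16],
    "eight students": [7, 9, 11, 12, 14, 15, 15, 16],
    "eleven students": [7, 7, 10, 11, 12, 13, 13, 14, 15, 16, 16],
}
for ages in scenarios.values():
    assert five_number_summary(ages) == (7, 10, 13, 15, 16)

# The five statements from the video, written as checks on a list of ages.
statements = {
    "all younger than 17": lambda a: max(a) < 17,
    "at least 75% are 10 or older":
        lambda a: sum(x >= 10 for x in a) / len(a) >= 0.75,
    "only one 7-year-old": lambda a: a.count(7) == 1,
    "only one 16-year-old": lambda a: a.count(16) == 1,
    "exactly half are older than 13":
        lambda a: 2 * sum(x > 13 for x in a) == len(a),
}

results = {}
for name, check in statements.items():
    outcomes = {check(ages) for ages in scenarios.values()}
    if outcomes == {True}:
        results[name] = "holds in every scenario tried"
    elif outcomes == {False}:
        results[name] = "fails in every scenario tried"
    else:
        results[name] = "depends on the data"
    print(f"{name}: {results[name]}")
```

With these scenarios, the first two statements hold every time and the last three come out as "depends on the data", which lines up with the "true, true, don't know, don't know, don't know" conclusions reached in the video.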