Box-and-Whisker Plots
- "The owner of a restaurant wants to find out
- more about where his patrons are coming from.
- One day he decided to gather data about the distance
- in miles that people commuted to get to his restaurant.
- People reported the following distances traveled."
- This is our data right over here.
- "He wants to create a graph that helps him understand
- the spread of the distances and the median distance that people travel.
- What kind of a graph should he create?"
- And you can plot this data in many different types of graphs,
- but they tend to depict things in different ways.
- For example, a line graph shows a trend over time.
- He's not interested in a trend over time, so a line graph doesn't make sense.
- Or a line graph could be a trend of one variable with respect to another variable;
- it doesn't just have to be time.
- But he doesn't want to see a trend here.
- A bar graph is good when you're trying to bucket things,
- put things into buckets and see how those buckets are performing.
- Once again, that's not exactly what he wants to see.
- A pie graph, you want to see kind of how things make up a whole.
- That's not what he wants to see right over here.
- A stem and leaf plot does help with distributions a little bit,
- but it really doesn't tackle the median distance and the spread really, really well.
- So the one that does -- and especially when people talk about medians and spread --
- is a box-and-whiskers plot.
- I'll show you how to do it right now.
- Box-and-whiskers. And what a box-and-whiskers plot literally does is
- it shows the spread of the data, it splits it into quartiles
- (I'll talk about that in a second),
- and it also shows you where the median of the data actually is.
- And that's one of the things that the owner of the restaurant cares about.
- So whenever you're dealing with medians --
- And box-and-whiskers plots deal with medians --
- the first thing you want to do is order all the numbers.
- 'Cause a median is really the middle number when you order them all up.
- So let's order this; let's write it in order.
- So, first we have 1 (so get rid of that 1).
- Then we have one 2 right over here.
- And then we have another 2, right over there; that's all of our 2's.
- Then we have this 3 (that 3) and then we have that three;
- so those are two people who've traveled three miles to the restaurant.
- And let's see, do we have any 4's?
- We have one 4 right over there, and then we have another 4 right over there.
- And then, let's see, do we have any 5's?
- (Actually I just realized that I skipped one of the 1's. We have another 1 right over there.
- So let's write that 1 right over there. We actually had two 1's.)
- And then let's see. Do we have any 5's? We do not have any 5's over here.
- Do we have any 6's? We have one 6 right over here. Then we have no more 6's.
- Do we have any 7's? We have one 7 right over there.
- 8's? We have an 8 right over there; that's the only one.
- Let's see. Do we have any 9's? No nines.
- Any 10's? Yes, we have a 10.
- Do we have an 11? Yup. 11.
- Do we have any 12's? No 12's, no 13's.
- 14. And we have a 15.
- (People must like this restaurant; they're traveling a good bit.)
- And we have a 20, and then we have a 22.
- So I've ordered all the numbers. Let me make sure I haven't skipped or gotten some duplicates.
- So 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 people were surveyed,
- seventeen patrons of the restaurant.
- And we have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17.
- Alright. It seems like I got all of them, and I've ordered them now.
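The ordering and counting step above can be sketched in a few lines of Python. The values are the seventeen distances read out in the video; the order they are entered in below is arbitrary, since the transcript does not record the on-screen order, and the variable names are only illustrative.

```python
# Distances in miles, as read out in the video.
# The entry order here is arbitrary (an assumption), not the on-screen order.
distances = [14, 6, 3, 2, 4, 15, 11, 8, 1, 7, 2, 1, 3, 4, 10, 22, 20]

# Sorting puts the values in the order needed to find a median.
ordered = sorted(distances)

# Counting confirms that no value was skipped or duplicated.
count = len(ordered)

print(ordered)
print(count)
```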
- Now the median is the middle number.
- We just said that we had seventeen of these numbers. So we want the number --
- And since it's an odd number of numbers, the median actually will be one of these numbers.
- It's actually the middle number. It'll be the number where eight are larger and eight are smaller.
- So 1, 2, 3, 4, 5, 6, 7, 8. It looks like this is our median. It will be the ninth number.
- And then you have 1, 2, 3, 4, 5, 6, 7, 8 that are larger.
- So eight are smaller than 6; eight are larger than 6.
- So 6 is the middle number.
- If we had an even number of numbers here,
- we had two middle numbers, then we would take the average of them.
- But when you have an odd number of data points, then you can just take the middle one.
- So this right over here is our median.
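As a minimal sketch of the rule just described (middle value for an odd count, average of the two middle values for an even count), here is a small Python function. The function name `median_of_sorted` is made up for this illustration, and it assumes the list is already sorted.

```python
def median_of_sorted(ordered):
    """Return the median of a list that is already in ascending order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


ordered = [1, 1, 2, 2, 3, 3, 4, 4, 6, 7, 8, 10, 11, 14, 15, 20, 22]
print(median_of_sorted(ordered))  # 6 (the ninth of seventeen values)
```

With seventeen values, `mid` is 8, which is the ninth position when counting from one, so eight values sit on each side of it.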
- And then when you do a box plot,
- what you want to do is you want to find the median of the set of numbers that are smaller than the median.
- And you also want to find the median of the set of numbers that are larger than the median.
- And these are called the first quartile and the third quartile.
- Because when you do that, you split your data into four sections, or quartiles.
- "Quar-" you normally associate with four.
- So let's look at this set that's smaller than 6.
- So we have 1, 2, 3, 4, 5, 6, 7, 8 numbers.
- So if you have eight data points, you're going to actually have two middle numbers.
- So, for example, you have these two right over here are the two middle numbers.
- Three less, three more. You can't just pick the 2,
- because if you just pick the 2, you would have three less and four more.
- And you can't just pick the 3, because then you'd have three more and four less.
- So these are our two middle numbers.
- So the median of this group right over here is 2.5.
- I averaged these two middle numbers.
- And then let's do it over here with this group. (I'll do it in blue.)
- So once again we have eight numbers, we're going to have two middles,
- and so it's going to be the fourth and the fifth number, which are 11 and 14.
- And if you average 11 and 14 --
- let's see 11+14 is 25, divided by 2 is 12.5.
- So this essentially divides the data into four sections.
- You have everything up to this first quartile -- so you have this first section,
- or this first median of the lower half of the data is 2.5.
- Then you have everything between 2.5 and 6.
- Then 6 to 12.5,
- and then everything more than 12.5.
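The split into halves and the two extra medians can be sketched as follows. This is one way to write it, assuming the method used in the video, where the overall median is left out of both halves; it leans on `statistics.median` from Python's standard library, and the variable names are only illustrative.

```python
from statistics import median

ordered = [1, 1, 2, 2, 3, 3, 4, 4, 6, 7, 8, 10, 11, 14, 15, 20, 22]

n = len(ordered)
mid = n // 2

# Leave the overall median out of both halves when the count is odd.
lower_half = ordered[:mid]
upper_half = ordered[mid + 1:] if n % 2 == 1 else ordered[mid:]

q1 = median(lower_half)  # average of 2 and 3
q2 = median(ordered)     # the overall median
q3 = median(upper_half)  # average of 11 and 14

print(lower_half, upper_half)
print(q1, q2, q3)
```

The three values 2.5, 6 and 12.5 are the cut points between the four sections described above.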
- And so a box-and-whiskers plot is essentially a graphical depiction of this over here.
- So let's do that. I'm going to set up a number line.
- And let's say that this is 0.
- And let's say that this is 10,
- And this would be 20,
- And this would be 5 (a little bit cleaner than that).
- And this would be 15.
- 25 would be right around there.
- So first thing on the box and whiskers plot, you want to show the entire range of data.
- So our smallest data point starts at 1.
- So 1 is right about here. (This is 2-1/2, so 1 is right about here.)
- And then our largest data point is 22.
- 22 would sit right about there.
- And I'll even label it, although you often won't see it labeled like that.
- That's 1, and then that is 22.
- And the middle half of the data we do in the box.
- So the middle half is this quartile, and this quartile right over there.
- So the second quartile starts at 2.5. So 2.5 is right there.
- This is where we start our box, 2.5.
- And then our third quartile ends at 12.5.
- 12.5 is right over there.
- And then we can draw the box here.
- So the box shows where the middle half of our data is.
- Now I can draw these two arrows.
- So that's the box part.
- And then these two arrows are what you would call the whiskers,
- and that shows where all the other data is.
- It's really showing the spread of the data.
- And then the last thing we need to show is the actual median.
- And the median (I'll do it in purple) we already figured out is 6.
- So the median sits right about
- (so let's see, this is 5, this would be 7-1/2)
- 6 would be right over there.
- So with this one diagram, we've actually depicted all of this information,
- in terms of where is the median? The median is at 6; that is 6 right over there.
- Where is the middle half of the data? Well, it's between 2-1/2 and 12-1/2.
- And all of the data, the entire spread, for all of the customers,
- sits between -- and this is what the whiskers do for us --
- it sits between 1 and 22.
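To make the layout of the diagram concrete, here is a rough text-only sketch of the same plot in Python. It is not how the video draws it; the function name `text_box_plot`, the marker characters and the half-mile-per-character scale are all choices made for this illustration, and the function assumes every value lies between 0 and `axis_max`.

```python
def text_box_plot(minimum, q1, med, q3, maximum, axis_max=25, scale=2):
    """Return a one-line text sketch; each character spans 1/scale miles."""

    def col(value):
        # Map a distance in miles to a character position.
        return round(value * scale)

    row = [" "] * (axis_max * scale + 1)
    for i in range(col(minimum), col(maximum) + 1):
        row[i] = "-"  # whiskers: the full spread of the data
    for i in range(col(q1), col(q3) + 1):
        row[i] = "="  # box: the middle half of the data
    row[col(minimum)] = "|"
    row[col(maximum)] = "|"
    row[col(q1)] = "["
    row[col(q3)] = "]"
    row[col(med)] = "M"  # the median, drawn inside the box
    return "".join(row)


# Values found above: smallest 1, quartiles 2.5 and 12.5, median 6, largest 22.
line = text_box_plot(1, 2.5, 6, 12.5, 22)
print(line)
```

Reading the printed line from left to right gives the left whisker, the box with the median marked inside it, and the right whisker, in the same order as on the number line in the video.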
- And if you wanted to color-code it a little bit better, we could do that just 'cause it's fun.
- We could make ---
- So, this data right over here --
- and really, if you think about it, it's kind of including this data too --
- that's what this whisker is depicting.
- This data right over here (I'll do that in a different color.)
- is kind of the first half of the box,
- then you have your median in magenta,
- then this data right over here is the second part of the box.
- So that's all of this stuff, right over here.
- And then finally (let me pick a new color that I haven't used yet)
- this data is kind of represented by this part, by this whisker, right over here.
- Now there's one thing I want to leave you with.
- I used a method for this box-and-whisker diagram where for the two halves,
- I got rid of the median and I found the median of this part,
- and I found the median of that part.
- And that's the more typical method for box-and-whisker plots.
- Sometimes, when you find the median for the lower half and the upper half,
- some people will include this median in both sets when they calculate it.
- I just want you to know that's out there.
- But the way I did it is actually more the mainstream way.
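The difference between the two methods can be seen by running both on the restaurant data. This is a sketch, assuming "include the median" means adding the overall median to both halves when the count is odd; the function name `quartiles` and its `include_median` flag are invented for this illustration.

```python
from statistics import median


def quartiles(ordered, include_median=False):
    """Return (Q1, Q3) for a list that is already in ascending order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1 and include_median:
        # Alternative method: the median belongs to both halves.
        lower, upper = ordered[:mid + 1], ordered[mid:]
    elif n % 2 == 1:
        # Method used in the video: the median belongs to neither half.
        lower, upper = ordered[:mid], ordered[mid + 1:]
    else:
        # Even count: the data splits cleanly, so both methods agree.
        lower, upper = ordered[:mid], ordered[mid:]
    return median(lower), median(upper)


ordered = [1, 1, 2, 2, 3, 3, 4, 4, 6, 7, 8, 10, 11, 14, 15, 20, 22]
print(quartiles(ordered))                       # (2.5, 12.5)
print(quartiles(ordered, include_median=True))  # (3, 11)
```

For this data the box would run from 3 to 11 instead of from 2.5 to 12.5, so it is worth checking which method a textbook or a piece of software uses before comparing plots.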