# Constructing a box plot

Here's a word problem that's perfectly suited for a box and whiskers plot to help analyze data. Let's construct one together, shall we? Created by Sal Khan and Monterey Institute for Technology and Education.

Video transcript

The owner of a restaurant wants to find out where his patrons are coming from. One day he decided to gather data about the distance (in miles) that people commuted to his restaurant. People reported the following distances travelled. So here are all the distances travelled. He wants to create a graph that helps him understand the spread of distances (this is the key word, the spread of distances) and the median distance that people travelled. What kind of graph should he create?

So, the answer to what kind of graph he should create might be a little more straightforward than the actual creation of the graph, which we will also do. He is trying to visualise the spread of information, and at the same time he wants the median. So what graph captures both of those pieces of information? Well, a box and whisker plot.

So let's actually try to draw a box and whisker plot. To do that we need to come up with the median, and we will also need the median of each of the two halves of the data. Whenever we are trying to take the median of something, it's really helpful to order our data.

So, let's start off by attempting to order our data. What is the smallest number here? Well, let's see. There is one 2, so we mark it off. And then we have another 2, so we've got all the twos. Then we have this 3, and then this 3. I think we've got all the threes. Then we have that 4, and then this 4. Do we have any fives? No. Do we have any sixes? Yes, we have that 6, and that looks like the only 6. Any sevens? Yes, we have this 7 right over here. And I just realised that we missed this 1, so I'm going to put this 1 right at the beginning of our set. Actually, there are two ones. I missed both of them. So both of those ones are right over there. So I have the ones, twos, threes, fours, no fives, one 6, one 7, and there is one 8 right over here. Let's see, any nines? No nines. Any tens? Yes, there is a 10. Any elevens? We have an 11 right over there. Any twelves? No. Any thirteens? No. There is a 14,
then we have a 15, then a 20, and then a 22. So we've ordered all our data. Now it should be relatively straightforward to find the middle of our data, the median.

So, how many data points do we have? 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 and 17. So the middle number is going to be the number that has eight numbers larger and eight numbers smaller than it. Let's think about it. 1, 2, 3, 4, 5, 6, 7, 8. So the number 6 here is larger than eight of the values, and if I did the calculations right, it should be smaller than eight of the values: 1, 2, 3, 4, 5, 6, 7 and 8. So it is indeed the median.

Now, when we are trying to construct a box and whisker plot, the convention is: OK, we have our median, and it is essentially dividing our data into two sets. Now let's take the median of each of those sets. The convention is to take our median out and work with the sets that are left over. Sometimes people leave it in, but the standard convention is to take this median out and look separately at this set and look separately at this set.

So, if we look at this first set, the bottom half of our numbers, what's the median of these numbers? Well, we have 1, 2, 3, 4, 5, 6, 7, 8 data points, so we are actually going to have two middle numbers. The two middle numbers are this 2 and this 3. There are three numbers less than them and three numbers greater than them. When we are looking for a median and we have two middle numbers, we take the mean of those two numbers. Halfway between 2 and 3 is 2.5, or you could say 2 plus 3 is 5, divided by 2 is 2.5. So here we have a median of the bottom half of 2.5.

Then, for the middle of the top half, once again we have 8 data points. So our middle two numbers are going to be this 11 and this 14. If we take the mean of these two numbers, 11 plus 14 is 25, and half of that is 12.5. So 12.5 is exactly halfway between 11 and 14.
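The ordering and the three medians worked out so far can be checked with a short Python sketch. The list of distances is an assumption: it is reconstructed from the values read aloud in the transcript, since the original on-screen data is not reproduced here. The helper name `split_halves` is made up for illustration, and it follows the convention described in the video of leaving the median out of both halves.

```python
from statistics import median

# Distances in miles, reconstructed from the values read out in the transcript
# (an assumption: the original on-screen list is not available here).
distances = [1, 1, 2, 2, 3, 3, 4, 4, 6, 7, 8, 10, 11, 14, 15, 20, 22]


def split_halves(sorted_values):
    """Return (lower, upper) halves of sorted data.

    When the count is odd, the middle value (the median) is left out of
    both halves, which is the convention used in the video.
    """
    n = len(sorted_values)
    mid = n // 2
    lower = sorted_values[:mid]
    upper = sorted_values[mid + 1:] if n % 2 else sorted_values[mid:]
    return lower, upper


ordered = sorted(distances)
lower, upper = split_halves(ordered)

overall_median = median(ordered)  # middle of all 17 values
lower_median = median(lower)      # mean of the two middle values, 2 and 3
upper_median = median(upper)      # mean of the two middle values, 11 and 14

print(overall_median, lower_median, upper_median)  # 6 2.5 12.5
```

Other conventions exist (for example, keeping the median in both halves), and they can give slightly different quartile values for the same data.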
And now we have figured out all of the information we need to actually draw our box and whisker plot. So let me draw a number line, my best attempt at a number line. So that's my number line, and I'm going to make sure that I get all the way up to 22 or beyond 22. Let's say this right over here is 0, this is 5, this is 10, that could be 15, that could be 20, and this could be 25. We could keep going: 30, maybe 35.

There are several ways to draw it. The first thing we might want to think about is the box part of the box and whisker plot, which essentially represents the middle half of our data. So it's essentially trying to represent this data right over here, the data between the medians of the two halves. This is the part that we would attempt to represent with the box. So we would start right over here at this 2.5. This is essentially separating the first quartile from the second quartile, the first quarter of our numbers from the second quarter of our numbers. So let's put it right over here. This is 2.5; halfway between 0 and 5 is 2.5. And then up here we have 12.5. This is 10, and halfway between 10 and 15 is 12.5. So this is 12.5 right over here. That separates the third quartile from the fourth quartile, and our box is everything in between. So this is the middle half of our numbers.

And we'd want to show where the actual median is. That is actually one of the things the owner of the restaurant wanted to know when he was thinking about how far people are travelling. So the median is 6, and we can plot it right over here.
So this is right over here, at about 6. Let me do that in that same pink colour. So right over here is 6.

And then the whiskers of the box and whisker plot essentially show the range of our data. I could do this in a different colour that I haven't used for anything else; I'll do it in orange. Our numbers go all the way up to 22, so let's say that this is 22 right over here. And they go as low as 1. So 1 is right about here; let me label that, that's 1.

So there you have it. We have our box and whisker plot. If you have a plot like this, just visually you can immediately see: OK, what is the median? It's the line marked inside the box, essentially. The box shows you the middle half, so it shows you where the meat of the spread is. And then, well beyond that, we have the range: the whiskers show how far the total spread of our data is.
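As a rough companion to the drawing, the five numbers that define the plot (minimum, lower-half median, median, upper-half median, maximum) can be computed and laid out on a text number line. This is only a sketch: the data list is reconstructed from the transcript, and the function names `five_number_summary` and `text_box_plot` are hypothetical helpers written for this example, not part of any library.

```python
from statistics import median

# Distances in miles, reconstructed from the transcript (an assumption).
distances = [1, 1, 2, 2, 3, 3, 4, 4, 6, 7, 8, 10, 11, 14, 15, 20, 22]


def five_number_summary(values):
    """Min, lower-half median, median, upper-half median, max.

    The median is left out of both halves when the count is odd,
    matching the convention used in the video.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    lower = ordered[:mid]
    upper = ordered[mid + 1:] if n % 2 else ordered[mid:]
    return {
        "min": ordered[0],
        "q1": median(lower),
        "median": median(ordered),
        "q3": median(upper),
        "max": ordered[-1],
    }


def text_box_plot(summary, chars_per_mile=2):
    """Draw the box and whiskers on one line of text, starting at 0 miles."""
    pos = {name: round(value * chars_per_mile) for name, value in summary.items()}
    line = [" "] * (pos["max"] + 1)
    for i in range(pos["min"], pos["max"] + 1):
        # '#' fills the box (middle half of the data), '-' draws the whiskers.
        line[i] = "#" if pos["q1"] <= i <= pos["q3"] else "-"
    line[pos["min"]] = "|"     # low end of the range
    line[pos["max"]] = "|"     # high end of the range
    line[pos["q1"]] = "["      # left edge of the box
    line[pos["q3"]] = "]"      # right edge of the box
    line[pos["median"]] = "M"  # median marker inside the box
    return "".join(line)


summary = five_number_summary(distances)
data_range = summary["max"] - summary["min"]  # total spread: 22 - 1
iqr = summary["q3"] - summary["q1"]           # width of the box: 12.5 - 2.5
plot = text_box_plot(summary)

print(summary)
print("range:", data_range, "box width:", iqr)
print(plot)
```

With two characters per mile, the half-mile values 2.5 and 12.5 land exactly on character positions, which is why that scale was picked for this data.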