# Comparing dot plots, histograms, and box plots

## Video transcript

What I want to do in this video is look at some examples of data represented in different ways and think about which representation is best for answering different questions.

So we see this first example: "A statistician recorded the length of each of Pixar's first 14 films. The statistician made a dot plot (each dot is a film), a histogram, and a box plot to display the running time data. Which display could be used to find the median?"

All right, let's look at these displays. Over here is the dot plot. We have a dot for each of the 14 films. One film had a running time of 81 minutes; we see that there. One film had a running time of 92, one had 93, one had 95, two had running times of 96 minutes, and so on and so forth. So I claim that I could use this to figure out the median, because I could make a list of all of the running times, order them, and then find the middle value. I could literally write down 81, then 92, then 93, then 95, then 96 twice, then 98, then 100. I think you see where this is going. I could write out the entire list and then find the middle value. So the dot plot I could definitely use to find the median.

Now what about the histogram? This is the histogram right over here. The key is that to figure out a median, I need a list of the numbers. Here, they say I have one film that's between 80 and 85, but I don't know its exact running time. It might have been 81 minutes; it might have been 84 minutes. So I can't really make a list of the running times and find the middle value, and I don't think I'm going to be able to do it using the histogram. So I'm not going to click histogram.

With the box plot over here, I might not be able to make a list of all the values, but the box plot explicitly tells us what the median is. This line in the middle of the box tells us the median. This is 100, so this is 95, 96, 97, 98, 99. It explicitly tells us the median is 99. This is actually the easiest display for finding the median, so I'll go with the box plot as well. The histogram is of no use to me if I want to calculate the median.

Let's do a couple more of these. "Nam owns a used-car lot. He checked the odometers of the cars and recorded how far they had driven. He then created both a histogram and a box plot to display the same data. Both diagrams are shown below. Which display can be used to find how many vehicles had driven more than 200,000 kilometers?"

So how many vehicles had driven more than 200,000 kilometers? In this histogram, I have three vehicles that were between 200,000 and 250,000, and two vehicles that were between 250,000 and 300,000. So it's pretty clear that I have five vehicles: three with a mileage between 200,000 and 250,000, and two with a mileage between 250,000 and 300,000. I'm able to answer the question: five vehicles had a mileage of more than 200,000. So I would say that the histogram is pretty useful.

But let's verify that the box plot isn't so useful. I want to know how many vehicles had a mileage of more than 200,000. I know that if I have a mileage of more than 200,000, I'm going to be in the fourth quartile, but I don't know how many values I have sitting there in the fourth quartile just by looking at this display. So that's not going to be useful for answering that question.

Let's look at the second question: "Which display can be used to find that the median distance was approximately 140,000 kilometers?"

To calculate the median, you essentially want to be able to list all of the numbers and then find the middle number. With the histogram, I can't list all of the numbers. I know that there are three values between 0 and 50,000 kilometers, but I don't know what they are. They could be 10,000, 10,000, and 10,000; they could be 10,000, 15,000, and 40,000. If I can't list all of these values and put them in order, I'm going to have trouble finding the middle value. The middle value is going to be in this range right around here, but I don't know exactly what it is. The histogram is not useful because it throws all the values into these buckets.

The box plot, on the other hand, directly tells me the median value. This line right over here, in the middle of the box, tells us the median. This is 100, 110, 120, 130, 140, so 140,000 kilometers is the median mileage for the cars. The box plot clearly gives us that information.
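
The reasoning in the transcript can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the video. The first eight running times are the ones read off the dot plot in the transcript; the last six are made-up placeholders. For the car lot, only the 0 to 50, 200 to 250, and 250 to 300 thousand-kilometer counts come from the transcript; the three middle counts are placeholders. The function names `bin_counts`, `median_range`, and `count_at_or_above` are hypothetical helpers introduced here.

```python
from statistics import median

# Dot plot: every value is visible, so the full list can be written down.
# First eight values come from the transcript; the last six are placeholders
# (assumption) and are NOT the real running times.
running_times = [81, 92, 93, 95, 96, 96, 98, 100, 103, 106, 107, 111, 115, 116]

# With the full list, the median is a direct computation.
dot_plot_median = median(running_times)


def bin_counts(values, start, width, n_bins):
    """Count values in half-open bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = (v - start) // width
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


def median_range(counts, start, width):
    """Return (low, high) edges of the bin(s) holding the middle value(s).

    A histogram only keeps counts per bin, so this is the tightest
    statement about the median it supports.
    """
    n = sum(counts)
    if n == 0:
        raise ValueError("histogram has no data")
    lo_rank, hi_rank = (n - 1) // 2, n // 2  # 0-based ranks of the middle value(s)

    def bin_of(rank):
        seen = 0
        for i, c in enumerate(counts):
            seen += c
            if rank < seen:
                return i

    return (start + bin_of(lo_rank) * width, start + (bin_of(hi_rank) + 1) * width)


def count_at_or_above(counts, start, width, threshold):
    """Add up the bins whose lower edge is at or above the threshold.

    This only works when the threshold sits on a bin edge, as 200 does here.
    """
    return sum(c for i, c in enumerate(counts) if start + i * width >= threshold)


# Histogram of the running times: bins of width 5 starting at 80 minutes.
film_counts = bin_counts(running_times, start=80, width=5, n_bins=8)

# Car lot histogram in thousands of km, bins of width 50 starting at 0.
# Counts 3, 3, 2 (first, fifth, sixth bins) are from the transcript;
# the middle three counts are placeholders (assumption).
car_counts = [3, 4, 6, 2, 3, 2]

print(dot_plot_median)                          # exact median from the list
print(median_range(film_counts, 80, 5))         # histogram only brackets it
print(count_at_or_above(car_counts, 0, 50, 200))  # vehicles at 200 or more
print(median_range(car_counts, 0, 50))          # median bracketed, not pinned down
```

With these numbers, the list gives a median of exactly 99, while the histogram can only say the median lies somewhere between 95 and 105. For the car lot, the histogram answers the counting question (five vehicles) but only places the median between 100,000 and 150,000 kilometers, which is why the box plot is the better display for that question.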