Histograms let you see the frequency distribution of a data set. Learn how and when to use histograms to visualize data. Created by Sal Khan and CK-12 Foundation.
Video transcript
In this video we're going to talk about another way of visualizing data called the histogram, which is a very fancy word for a not so fancy thing. I think it's probably fair to say that the histogram is the most used way of representing statistical data. Let me just show you how to figure out a histogram for some data, and I think you're going to get the point pretty easily.
So I have some data here and I want to represent it with a histogram. What we're going to see is how frequent each of these numbers is. And in order to figure that out, let me just write the numbers down, let me just categorize them in their respective buckets. So I have a 1 here. I have a 4, so I want to leave space for the 2, the 3, and put a 4 there. I have a 2. I have a 1. Let me put that 1 on a stack right above that 1. I have a 0-- let me put the 0 to the left of the 1. I want to put them in order. I have a 2, another 2. Let me stack that above my first 2. I have another 1. Let me stack that above my other two 1's. I have another 0. Let me stack it there. I have another 1. Then I have another 2. Another 1. I have two more 0's. 0, 0. I have two more 2's. I have a 3. I have two more 1's. Another 3. And then I have a 6. So no 5's, and then I have a 6. And that space right there was unnecessary. But right here I've listed-- I've just rewritten those numbers and I've essentially categorized them.
Now what I want to do is calculate how many of each of these numbers we have. So let me go down here. So I want to look at the frequency of each of these numbers. So I have one, two, three, four 0's. I have one, two, three, four, five, six, seven 1's. I have one, two, three, four, five 2's. I have one, two 3's. I have one 4, and one 6. So we could write it this way. We could write the number, and then we can have the frequency. So I have the numbers 0, 1, 2, 3, 4-- we could even throw 5 in there, although 5 has a frequency of 0. And we have a 6. So the 0 showed up four times in this data set.
1 showed up seven times in this data set. 2 showed up five times, 3 showed up two times, 4 showed up one time, 5 didn't show up, and 6 showed up one time. All I did is I counted this data set, and I did this first. But you could say how many times do I see a 0? I see it one, two, three, four times. How many times do I see a 1? One, two, three, four, five, six, seven times. That's what we mean by frequency.
Now a histogram is really just a plot, kind of a bar graph, plotting the frequency of each of these numbers. It's going to look a lot like this original thing that I drew. So let me draw some axes here. So the different buckets here are the numbers. And that worked out because we're dealing with very clean integers that tend to repeat. If you're dealing with data where the exact numbers don't repeat, oftentimes people will put the numbers into buckets or ranges. But here they fit into nice little buckets. You have the numbers 0, 1, 2, 3, 4, 5, and 6. These are the actual numbers.
And then on the vertical axis we're going to plot the frequency. So one, two, three, four, five, six, seven. So that's 7, 6, 5, 4, 3, 2, 1. So 0 shows up four times. So we'll draw a little bar graph here. 0 shows up four times. Draw it just like that. 0 shows up four times. That is that information right there. 1 shows up seven times. So I'll do a little bar graph. 1 shows up seven times. Just like that. I want to make it a little bit straighter than that-- 1 shows up seven times. 2-- I'll do it in a different color-- 2 shows up five times. Do a bar graph, go all the way up to five. 2 shows up five times. 3 shows up two times. We have one 3, two 3's. 4 shows up one time here. 5 doesn't show up at all. So it doesn't even get any height there. And then finally, 6 shows up one time. So I'll do 6 showing up one time.
What I just plotted here, this is a histogram. This right here is a histogram. Very fancy word, but I think you will agree it's a fairly simple idea.
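The counting and plotting steps described in the video can be sketched in a few lines of Python. This is only an illustration, not part of the video: the `data` list is reconstructed from the order in which the numbers are read out (an assumption, since only the counts are stated explicitly), and the names `frequency_table` and `text_histogram` are made up for this example. The bars are drawn sideways with `#` characters rather than upward as in the drawing.

```python
from collections import Counter

# Data set reconstructed from the order the numbers are read out in the
# video (an assumption; only the final counts are stated explicitly).
data = [1, 4, 2, 1, 0, 2, 1, 0, 1, 2, 1, 0, 0, 2, 2, 3, 1, 1, 3, 6]


def frequency_table(values):
    """Count each integer from min to max, keeping values that never occur (count 0)."""
    counts = Counter(values)
    return {v: counts[v] for v in range(min(values), max(values) + 1)}


def text_histogram(table):
    """Draw one horizontal bar of '#' characters per value."""
    return "\n".join(f"{value}: {'#' * count}" for value, count in table.items())


table = frequency_table(data)
print(table)
print(text_histogram(table))
```

Including every integer between the smallest and largest value is what gives 5 its own empty slot, matching the bar with no height in the drawing.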
Figure out the frequency of each of these numbers and then plot the frequency of each of these numbers and you get yourself a histogram.
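The video mentions that when exact values don't repeat, people often group the numbers into buckets or ranges before counting. A minimal sketch of that idea follows, assuming equal-width ranges; the `measurements` values and the function name `bucket_counts` are hypothetical and chosen only for illustration.

```python
def bucket_counts(values, width):
    """Count how many values fall in each range [start, start + width)."""
    counts = {}
    for v in values:
        # Floor division finds the lower edge of the range containing v.
        start = (v // width) * width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical measurements where no exact value repeats.
measurements = [1.2, 3.7, 4.1, 0.4, 2.9, 4.8, 1.9]
print(bucket_counts(measurements, 2))
```

Each key is the lower edge of a range, so with a width of 2 the key 0.0 stands for the range from 0 up to, but not including, 2. Ranges that contain no values are simply left out of the result in this sketch.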