### Course: High school statistics > Unit 1

Lesson 2: Histograms

# Creating a histogram

A histogram is a graphical display of data using bars of different heights. In a histogram, each bar groups numbers into ranges. Taller bars show that more data falls in that range. A histogram displays the shape and spread of continuous sample data.

## Want to join the conversation?

- In my mind, histograms and bar graphs are very similar. What situations would histograms work better than bar graphs? I feel like you could just organize the categories into buckets and then just use a bar graph. Why choose the histogram?(112 votes)
- Well, histograms group numbers, while bar graphs group categorical things, like cats or dogs, not the ages of cats or dogs.(61 votes)

- So please tell me the difference between a bar chart and a histogram because I thought that histograms had frequency density as the y-axis with interval on the x-axis.(14 votes)
- On a bar chart, the bars are not connected. On a histogram, they are connected!(19 votes)

- So, if I did a histogram, I could record ages of people in my school and in the primary?

P.S. Upvote the question if you think it's a good way to teach kids. I might do it.(20 votes)

- Is a histogram the same as a histograph?

*Is a histograph actually a thing?*(10 votes)
- Um, yes, they are. They may be written differently, but they're still the same thing. I'm not fully sure, but search it up!(2 votes)

- Do the bucket intervals need to have the same value? For example, do they all need to go by the same number, or can they have different ranges?(12 votes)
- You can set the bucket size however you like, but you'll get much better clarity with equal sized buckets. Remember that the purpose of making a histogram (or scatter plot or dot plot) is to tell a story, using the data to illustrate your point. Using equal-sized buckets will make your histogram easy to read, and make it more useful.(15 votes)

- Who is still alive?(14 votes)
- is it a histogram or histograph?(8 votes)
- How do you find the interval?(8 votes)
- Find the interval? I don't get it. If you meant the domain, it's from the lowest number to the highest number.(7 votes)

- If you have numbers in a set of data that are decimals, should you round them? For example, say the minimum is 1.1 and the maximum is 138. Could you round the minimum to 1 and leave 138 as 138?(10 votes)
- Yes and no. If it says to give an estimate, then it would be fine, or if all the answers are rounded.(0 votes)

- fruits or vegetables?

(cannot do both)(7 votes)
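
On the bucket-size and frequency-density questions above: when buckets have different widths, one common convention is to draw each bar's height as frequency density (count divided by bucket width), so that a wide bucket does not look bigger just because it covers more values. Here is a minimal Python sketch of that idea. The buckets are hypothetical: they reuse the counts from the restaurant example in the video transcript, with some 10-year ranges merged purely for illustration, and the function name `frequency_density` is an assumption, not something from the lesson.

```python
# Hypothetical buckets as (lower bound, upper bound, count), with the
# upper bound exclusive. The wider buckets are illustrative merges of
# the 10-year ranges used in the video.
buckets = [(0, 10, 6), (10, 20, 3), (20, 40, 6), (40, 70, 5)]


def frequency_density(low, high, count):
    """Return the count per unit of bucket width."""
    return count / (high - low)


for low, high, count in buckets:
    density = frequency_density(low, high, count)
    width = high - low
    print(f"{low} to {high}: count={count}, width={width}, density={density:.2f}")
```

The first and third buckets both hold six people, but the third is twice as wide, so its density is half as large. With equal-width buckets, as in the video, plotting the raw counts gives the same picture.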

## Video transcript

[Voiceover] So let's say you were to go to a restaurant and just out of curiosity you want to see what the makeup of the ages at the restaurant is. So you go around the restaurant and you write down everyone's age. And so these are the ages of everyone in the restaurant at that moment. And so you're interested in somehow presenting this, somehow visualizing the distribution of the ages, because you want to just say, well, are there more young people? Are there more teenagers? Are there more middle-aged people? Are there more seniors here? And so when you just look at these numbers it really doesn't give you a good sense of it. It's just a bunch of numbers. And so how could you do that?

Well, one way to think about it is to put these ages into different buckets, and then to think about how many people are there in each of those buckets? Or sometimes someone might say how many in each of those bins? So let's do that. So let's do buckets or categories. So, I like, sometimes it's called a bin. So the bucket, I like to think of it more as a bucket, the bucket and then the number in the bucket. The number in the bucket. Number, I'll just write the number, oops. It's the, oops. It's the number (laughing), it's the number in the bucket. Alright.

So let's just make buckets. Let's make them 10-year ranges. So let's say the first one is ages zero to nine. So how many people... Why don't we just define all of the buckets here? So the next one is ages 10 to 19, then 20 to 29, then 30 to 39, and 40 to 49, 50 to 59, let me make sure you can read that properly, then you have 60 to 69. And I think that covers everyone. I don't see anyone 70 years old or older here.

So then how many people fall into the zero to nine-year-old bucket? Well it's gonna be one, two, three, four, five, six people fall into that bucket. How many people fall into the... How many people fall into the 10 to 19-year-old bucket? Well, let's see. One, two, three. Three people. And I think you see where this is going. What about 20 to 29? So that's one, two, three, four, five people. Five people fall into that bucket. Alright, what about 30 to 39? We have one, and that's it. Only one person in that 30 to 39 bin or bucket or category. Alright, what about 40 to 49? We have one, two people. Two people are in that bucket. And then 50 to 59. Let's see, you have one, two people. Two people. And then finally, finally, ages 60 to 69. Let me do that in a different color. 60 to 69. There is one person, right over there.

So this is one way of thinking about how the ages are distributed, but let's actually make a visualization of this. And the visualization that we're gonna create, this is called a histogram. Histogram. Histogram. We're taking data that can take on a whole bunch of different values, we're putting them into categories, and then we're gonna plot how many folks are in each category. How big are each of those? How big are each of those categories? And actually, I wrote histogram. I wrote histograph, I should have written histogram. So a histogram.

So let's do this. Alright. So on this axis, let's see, the largest category has six. So this is the number, number of folks. And it's gonna go one, two, three, four, five, six. One, two, three, four, five, six. This is the number. And on this axis I'm gonna make the buckets. The buckets, and let me scroll up a little bit. Now that I have my data here, I don't have to look at my data set again.

So I have one bucket. This is going to be the zero to nine bucket, right over here. Zero to nine. Then I'm going to have the three... Actually, let me just plot them, since I have my pen that color. So in zero to nine there are six people. Zero to nine, there are six people. So I'll just plot it like that. And then we have the 10 to 19. There are three people. So 10 to 19, there are three people. So I'll do a bar, like this. Then, 20 to 29, I have five people. 20 to 29, which is gonna be this one, just getting, I'm writing too big. So 20 to 29 is gonna be this bar. There's five people. Five people there. So it'll look like this. I should have made the bars wide enough so I could write below them. But I've already, that train has already left. (laughing) Alright, alright.

Then 30 to 39, I'll try to write smaller. 30 to 39, that's gonna be this bar right over here. We have one person. One person. And then we have 40 to 49. We have two people. 40 to 49, two people. So, it looks like this. 40 to 49, two people. Almost there. 50 to 59. We have two people. 50 to 59, we also have two people. So that's that right over there. That's this category. And then finally, 60 to 69 we have one person. 60 to 69, we have one person. We have one person.

And what I have just constructed, I took our data. I took our data. I put it into buckets that are kind of representative of the categories I care about. Zero to nine is kind of young kids. 10 to 19, I guess you could call them adolescents or roughly teenagers, although, obviously if you're 10 you're not quite a teenager yet. And then all the different age groups. And then when I counted the number in each bucket and I plotted it, now I can visually get a sense of how the ages are distributed in this restaurant. This must be some type of a restaurant that gives away toys or something, because there's a lot of younger people. Maybe it's very family-friendly. So every adult that comes in, maybe there's a lot of young adults with kids, or maybe grandparents up here, and they just bring a lot of kids to this restaurant. So it gives you a view of what's going on here. Just a lot of kids here, a lot fewer senior citizens.

So once again, this is just a way of visualizing things. We took a lot of data that can take multiple data points. Instead of plotting each data point, like we might do in a dot plot, instead of saying how many one-year-olds are there? Well there's only one one-year-old. How many three-year-olds are there? There's only one three-year-old. That wouldn't give us much information. We would just have these single dots if we were doing a dot plot. But as a histogram, we're able to put them into buckets. Everybody was like, hey, you know generally between the ages zero and nine we have six people. And so you see that plotted out, just like that. And obviously this doesn't apply just to ages of people in a restaurant, it applies to all sorts of data that you might want to collect and observe.
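
The bucketing and counting walked through in the transcript can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transcript never lists the raw ages, so the `ages` list below is hypothetical, made up to reproduce the bucket counts that the video reports (6, 3, 5, 1, 2, 2, 1), and the helper names `bucket_counts` and `text_histogram` are likewise invented for this sketch.

```python
from collections import Counter

# Hypothetical ages: the transcript does not give the raw data, so these
# values are invented to match the bucket counts it reports.
ages = [1, 3, 4, 5, 7, 8,      # six people aged 0 to 9
        11, 14, 17,            # three people aged 10 to 19
        21, 22, 25, 27, 28,    # five people aged 20 to 29
        33,                    # one person aged 30 to 39
        42, 45,                # two people aged 40 to 49
        51, 58,                # two people aged 50 to 59
        64]                    # one person aged 60 to 69

BUCKET_WIDTH = 10  # each bucket spans 10 years, e.g. 0 to 9


def bucket_counts(values, width):
    """Count how many values fall into each equal-width bucket.

    Returns a dict mapping each bucket's lower bound to its count,
    including any empty buckets between the lowest and highest one.
    """
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    return {low: counts.get(low, 0)
            for low in range(lowest, highest + width, width)}


def text_histogram(counts, width):
    """Render one row per bucket, with one '#' per person in it."""
    rows = []
    for low, n in counts.items():
        label = f"{low:>2} to {low + width - 1:<2}"
        rows.append(f"{label} | {'#' * n} ({n})")
    return "\n".join(rows)


counts = bucket_counts(ages, BUCKET_WIDTH)
print(text_histogram(counts, BUCKET_WIDTH))
```

Integer division (`v // width`) is what assigns each age to its bucket: every age from 20 to 29 maps to the bucket starting at 20. Changing `BUCKET_WIDTH` to 5 or 20 regroups the same data, which is a quick way to see how the choice of bucket size changes the shape of the picture.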