# Classifying shapes of distributions

AP Stats: UNC‑1 (EU), UNC‑1.H (LO), UNC‑1.H.3 (EK), UNC‑1.H.4 (EK)

## Video transcript

- [Instructor] What we have here are six different distributions. And what we're gonna do with this video is think about how to classify them, or use the words people typically use to classify distributions.

So, let's first look at this distribution right over here, it's the distribution of the lengths of houseflies. So, someone went out there and measured a bunch of houseflies. And then said, "Hey look, there's many houseflies that are between six tenths of a centimeter and six and a half tenths of a centimeter." Looks like there's about 40 houseflies there. And then if you say between six and a half and seven tenths, there's about 30 houseflies. And if you were to say between five and a half tenths and six tenths, it looks like it's about the same amount. This type of distribution is usually described as being symmetric. Why is it called that? Because if you were to draw a line down the middle of this distribution, both sides look like mirror images of each other. This one looks pretty exactly symmetric. But more typically when you're collecting data, you'll see roughly symmetric distributions.

Now, here we have a distribution that gives us the dates on pennies. So, someone went out there, observed a bunch of pennies, looked at the dates on them. They saw many pennies, looks like a little bit more than 55 pennies, had a date between 2010 and 2020. While very few pennies had a date older than 1980 on them. And this type of distribution when you have a tail to the left, you can see it right over here, you have a long tail to the left, this is known as a left-skewed distribution. Left-skewed. Now in future videos, we'll come up with more technical definitions of what makes it left-skewed, but the way that you can recognize it is, you have the high points of your distribution on the right, but then you have this long tail that skews it to the left.

Now, the other side of a left-skewed, you might say, well, that would be a right-skewed distribution, and that's exactly what we see right over here. This is a distribution of state representatives, and as you can see, most of the states in the United States have between zero and ten representatives. It looks like it's a little over 35. None of them actually have zero, they all have at least one representative, but they would fall into this bucket, while very few have more than 50 representatives. So, here where the bulk of our distribution is to the left, where we have this tail that skews us to the right, this is known as a right-skewed distribution.

Now, if we look at this next distribution, what would this be? Pause this video and think about it. Well, this could be a distribution of maybe someone went around the office and surveyed how many cups of coffee each person drank, and if they found someone who drank one cup of coffee per day, maybe this would be them. If they found another person who drinks one cup of coffee, that's them, then they found three people who drank two cups of coffee. Well, this is a very similar situation to what we saw on the dates on pennies. A large amount of our data fell into this right bucket of three cups of coffee, but then we have this tail to the left. So, this would be left-skewed.

Now, these right two distributions are interesting. One could argue that this distribution here, which is telling us the number of days that we had different high temperatures, that this looks roughly symmetric, or actually even looks exactly symmetric. 'Cause if you did that little exercise of drawing a dotted line down the middle, it looks like the two sides are mirror images of each other. Now, that would not be technically incorrect. But typically when you see these two peaks, this would typically be called a bi-modal distribution. So, even though bi-modal distributions can sometimes be symmetric or roughly symmetric, you wanna be more precise, and here when you have these two peaks, that's where the bi comes from. You'd call it bi-modal, and this makes sense because you have a lot of days that are warm that might happen during the summer and you might have a lot of days that are cold that are happening during the winter.

Now, this last distribution here, the results from die rolls, one could argue as well that this is roughly symmetric. It's not exact, it's not perfectly symmetric, but when you look at this dotted line here on the left and the right sides it looks roughly symmetric. But a more exact classification here would be that it looks approximately uniform. So, rather than calling it a symmetric distribution, or a roughly symmetric distribution, most people would classify this as an approximately uniform distribution.
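
The video classifies these shapes by eye and leaves the technical definitions for later. As a rough companion, here is a minimal Python sketch of one common numerical check, the sample skewness coefficient. This is not necessarily the definition the course goes on to use. The function names, the 0.5 cutoff, and the use of bin midpoints are assumptions made for illustration. The housefly counts are the approximate values read off in the transcript, and the count of ten people drinking three cups is hypothetical, since the transcript gives no number for that bucket.

```python
import math


def sample_skewness(values):
    """Return the moment-based skewness m3 / m2**1.5 of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # all values identical: no spread, so no skew
    return m3 / math.pow(m2, 1.5)


def classify_shape(values, threshold=0.5):
    """Label data by the sign and size of its skewness.

    The threshold of 0.5 is an assumed rule of thumb, not a standard.
    """
    g = sample_skewness(values)
    if g < -threshold:
        return "left-skewed"
    if g > threshold:
        return "right-skewed"
    return "roughly symmetric"


# Housefly lengths in cm, using bin midpoints and the approximate
# counts mentioned in the transcript (about 30, 40, and 30).
houseflies = [0.575] * 30 + [0.625] * 40 + [0.675] * 30

# Cups of coffee: two people at 1 cup and three at 2 cups come from the
# transcript; the ten people at 3 cups is a hypothetical count.
coffee = [1] * 2 + [2] * 3 + [3] * 10

print(classify_shape(houseflies))  # roughly symmetric
print(classify_shape(coffee))      # left-skewed
```

A skewness near zero only says the two sides balance. It cannot tell a bi-modal or approximately uniform distribution apart from a single-peaked symmetric one, so looking at the histogram, as the video does, is still needed for those cases.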