© 2022 Khan Academy

# Measures of spread: range, variance & standard deviation

Range, variance, and standard deviation all measure the spread or variability of a data set in different ways. The range is easy to calculate: it's the difference between the largest and smallest data points in a set. The population variance is the average of the squared differences between each data point and the mean. Standard deviation is the square root of the variance, and it measures how spread out the data is from its mean. Created by Sal Khan.
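
As a compact reference, the population measures summarized above can be written as follows. The notation is an assumption made for illustration: $N$ is the number of data points, $x_i$ are the individual values, and $\mu$ is the population mean. The video itself only names the symbols $\sigma^2$ (variance) and $\sigma$ (standard deviation).

$$
\text{range} = x_{\max} - x_{\min}, \qquad
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\sigma = \sqrt{\sigma^2}
$$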

## Video transcript

In the last video we talked
about different ways to represent the central tendency
or the average of a data set. What we're going to do in this
video is to expand that a little bit to understand
how spread apart the data is as well. So let's just think about
this a little bit. Let's say I have negative
10, 0, 10, 20 and 30. Let's say that's one data
set right there. And let's say the other data
set is 8, 9, 10, 11 and 12. Now let's calculate the
arithmetic mean for both of these data sets. So let's calculate the mean. And when you go further on in
statistics, you're going to understand the difference
between a population and a sample. We're assuming that
this is the entire population of our data. So we're going to be dealing
with the population mean. We're going to be dealing
with, as you see, the population measures
of dispersion. I know these are all
fancy words. In the future, you're not going
to have all of the data. You're just going to have some
samples of it, and you're going to try to estimate
things for the entire population. So I don't want you to worry too
much about that just now. But if you are going to go
further in statistics, I just want to make that
clarification. Now, the population mean, or
the arithmetic mean of this data set right here, it is
negative 10 plus 0 plus 10 plus 20 plus 30 over-- we have
five data points-- over 5. And what is this equal to? That negative 10 cancels out
with that 10, 20 plus 30 is 50 divided by 5, it's
equal to 10. Now, what's the mean
of this data set? 8 plus 9 plus 10 plus 11 plus
12, all of that over 5. And the way we could think about
it, 8 plus 12 is 20, 9 plus 11 is another 20, so
that's 40, and then we have a 50 there. Add another 10. So this, once again, is
going to be 50 over 5. So these have the exact same
population mean. Or if you don't want to worry
about the word population or sample and all of that, both
of these data sets have the exact same arithmetic mean. When you take the sum of these
numbers and divide by 5, you get 10, and when you take the sum of these numbers
and divide by 5, you get 10 as well. But clearly, these sets of
numbers are different. You know, if you just looked at
this number, you'd say, oh, maybe these sets are very
similar to each other. But when you look at these two
data sets, one thing might pop out at you. All of these numbers are
very close to 10. I mean, the furthest number
here is two away from 10. 12 is only two away from 10. Here, these numbers are
further away from 10. Even the closer ones are still
10 away and these guys are 20 away from 10. So this right here, this data
set right here is more dispersed, right? These guys are further away from
our mean than these guys are from this mean. So let's think about different
ways we can measure dispersion, or how far
away we are from the center, on average. Now one way, this is
kind of the most simple way, is the range. And you won't see it used too
often, but it's kind of a very simple way of understanding how
far is the spread between the largest and the
smallest number. You literally take the largest
number, which is 30 in our example, and from that, you
subtract the smallest number. So 30 minus negative 10, which
is equal to 40, which tells us that the difference between the
largest and the smallest number is 40, so we have a range
of 40 for this data set. Here, the range is the largest
number, 12, minus the smallest number, which is 8, which
is equal to 4. So here range is actually
a pretty good measure of dispersion. We say, OK, both of these
guys have a mean of 10. But when I look at the range,
this guy has a much larger range, so that tells me this
is a more dispersed set. But range is not always going to
tell you the whole picture. You might have two data sets
with the exact same range where still, based on how things
are bunched up, it could still have very different
distributions of where the numbers lie. Now, the one that you'll
see used most often is called the variance. Actually, we're going
to see the standard deviation in this video. That's probably what's used most
often, but it has a very close relationship
to the variance. So the symbol for the variance--
and we're going to deal with the population
variance. Once again, we're assuming that
this is all of the data for our whole population, that
we're not just sampling, taking a subset, of the data. So the variance, its symbol is
literally this sigma, this Greek letter, squared. That is the symbol
for variance. And we'll see that the sigma
letter actually is the symbol for standard deviation. And that is for a reason. But anyway, the definition of
a variance is you literally take each of these data points,
find the difference between those data points and
your mean, square them, and then take the average
of those squares. I know that sounds very
complicated, but when I actually calculate it, you're
going to see it's not too bad. So remember, the mean
here is 10. So I take the first
data point. Let me do it over here. Let me scroll down
a little bit. So I take the first
data point. Negative 10. From that, I'm going to subtract
our mean and I'm going to square that. So I just found the difference
from that first data point to the mean and squared it. And that's essentially
to make it positive. Plus the second data point, 0
minus 10, minus the mean-- this is the mean; this is that
10 right there-- squared plus 10 minus 10 squared-- that's
the middle 10 right there-- plus 20 minus 10-- that's
the 20-- squared plus 30 minus 10 squared. So these are the squared
differences between each number and the mean. This is the mean right there. I'm finding the difference
between every data point and the mean, squaring them, summing
them up, and then dividing by that number
of data points. So I'm taking the average
of these numbers, of the squared distances. So when you say it kind of
verbally, it sounds very complicated. But you're taking each number, finding the difference between
that and the mean, squaring it, and taking the average of those. So I have 1, 2, 3, 4,
5, divided by 5. So what is this going
to be equal to? Negative 10 minus 10
is negative 20. Negative 20 squared is 400. 0 minus 10 is negative 10
squared is 100, so plus 100. 10 minus 10 squared, that's just
0 squared, which is 0. Plus 20 minus 10 is 10
squared, is 100. Plus 30 minus 10, which
is 20, squared is 400. All of that over 5. And what do we have here? 400 plus 100 is 500, plus
another 500 is 1000. It's equal to 1000/5, which
is equal to 200. So in this situation, our
variance is going to be 200. That's our measure of
dispersion there. And let's compare it to this
data set over here. Let's compare it to the
variance of this less-dispersed data set. So let me scroll over a little
bit so we have some real estate, although I'm
running out. Maybe I could scroll up here. There you go. Let me calculate the variance
of this data set. So we already know its mean. So the variance of this data set
is going to be equal to 8 minus 10 squared plus 9 minus
10 squared plus 10 minus 10 squared plus 11 minus 10-- let
me scroll up a little bit-- squared plus 12 minus
10 squared. Remember, that 10 is just the
mean that we calculated. You have to calculate the mean
first. Divided by-- we have 1, 2, 3, 4, 5 squared
differences. So this is going to be equal
to-- 8 minus 10 is negative 2 squared, is positive 4. 9 minus 10 is negative 1
squared, is positive 1. 10 minus 10 is 0 squared. You still get 0. 11 minus 10 is 1. Square it, you get 1. 12 minus 10 is 2. Square it, you get 4. All of that over 5. And what is this equal to? 4 plus 1 plus 0 plus 1 plus 4 is 10, so this is 10/5. So this is going to be--all
right, this is 10/5, which is equal to 2. So the variance here-- let me
make sure I got that right. Yes, we have 10/5. So the variance of this
less-dispersed data set is a lot smaller. The variance of this data set
right here is only 2. So that gave you a sense. That tells you, look, this is
definitely a less-dispersed data set than that one there. Now, the problem with the
variance is you're taking these numbers, you're taking
the difference between them and the mean, then you're
squaring it. It kind of gives you a bit of
an arbitrary number. And if you're dealing with
units, let's say these are distances, so this is negative 10 meters, 0
meters, 10 meters, this is 8 meters, so on and so forth, then
when you square it, you get your variance in terms
of meters squared. It's kind of an odd
set of units. So what people like to do is
talk in terms of standard deviation, which is just the
square root of the variance, or the square root
of sigma squared. And the symbol for the standard deviation is just sigma. So now that we've figured out
the variance, it's very easy to figure out the standard
deviation of both of these characters. The standard deviation of this
first one up here, of this first data set, is going to
be the square root of 200. The square root of
200 is what? The square root of
2 times 100. This is equal to 10 times
the square root of 2. That's that first data set. Now the standard deviation of
the second data set is just going to be the square root of
its variance, so it's just the square root of 2. So the second data set has 1/10
the standard deviation of this first data set. This is 10 roots of 2, this
is just the root of 2. So this is 10 times the
standard deviation. And this, hopefully, will make
a little bit more sense. Let's think about it. This has 10 times the
standard deviation of this. And let's remember how
we calculated it. Variance, we just took each
data point, how far it was away from the mean,
squared that, took the average of those. Then we took the square root,
really just to make the units look nice, but the end result
is we said that that first data set has 10 times the
standard deviation of the second data set. So let's look at the
two data sets. This has 10 times the standard
deviation, which makes sense intuitively, right? I mean, they both have a 10 in
here, but for each of these guys, 9 is only one away from
the 10, while 0 is 10 away from the 10, 10 less. 8 is only two away. This guy is 20 away. So it's 10 times, on average,
further away. So the standard deviation, at
least in my mind, is giving a much better sense of how far
away, on average, we are from the mean. Anyway, hopefully, you
found that useful.
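
The calculations walked through in the video can be reproduced with a short Python sketch. The function name `spread_summary` and the variable names are illustrative assumptions, not something from the video. Both data sets are treated as entire populations, so the sum of squared differences is divided by the number of data points.

```python
import math
import statistics


def spread_summary(data):
    """Return the mean, range, population variance, and population standard deviation."""
    n = len(data)
    mean = sum(data) / n
    data_range = max(data) - min(data)
    # Population variance: the average of the squared differences from the mean.
    variance = sum((x - mean) ** 2 for x in data) / n
    # Standard deviation: the square root of the variance (same units as the data).
    std_dev = math.sqrt(variance)
    return {
        "mean": mean,
        "range": data_range,
        "variance": variance,
        "std_dev": std_dev,
    }


# The two data sets used in the video.
first = [-10, 0, 10, 20, 30]
second = [8, 9, 10, 11, 12]

first_summary = spread_summary(first)
second_summary = spread_summary(second)

print("first: ", first_summary)
print("second:", second_summary)

# Cross-check against the standard library's population functions.
assert math.isclose(first_summary["variance"], statistics.pvariance(first))
assert math.isclose(second_summary["std_dev"], statistics.pstdev(second))
```

Both data sets come out with a mean of 10, ranges of 40 and 4, and variances of 200 and 2, matching the values worked out by hand. The ratio of the two standard deviations is 10, since the square root of 200 is 10 times the square root of 2.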