# Standard deviation

Learn how to calculate standard deviation, how it relates to variance and mean, and the difference between population and sample standard deviation. Created by Sal Khan.

Video transcript

Let's review a little bit of
everything we learned so far and hopefully it'll make
everything fit together a little bit better. Then we'll do a bunch of
calculations with real numbers and I think it'll really
hit the point home. So, first of all if we're
dealing with a-- let me actually write down, let
me make some columns. So if we're dealing with--
let's see, we could call it the concept and then we'll call it
whether we're dealing with a population or a sample. So the first statistical
concept we came up with was the notion of the mean or the
central tendency, and we learned that was one way to measure
the average or central tendency of a data set. The other ways were the
median and the mode. But the mean tends to show up a
lot more, especially when we start talking about variances
and, as we'll do in this video, the standard deviation. But the mean of a population we
learned-- we use the Greek letter mu-- is equal to the sum
of each of the data points in the population. That's an i. Let me make sure it
looks like an I. So you're going to sum up
each of those data points. You're going to start with the
first one and you're going to go to the nth one. We're assuming that there are n
data points in the population. And then you divide by the
total number that you have. And this is like the average
that you're used to taking before you learned any of
the statistics stuff. You add up all the data
points and you divide by the number there are. The sample is the same thing. We just use a slightly
different terminology. The mean of a sample-- and
I'll do it in a different color-- just write it as
x with a line on top. And that's equal to the
sum of all the data points in the sample. So each of the xi
in the sample. But we're assuming the
sample is something less than the population. So you start with the
first one still. And then you go to the lower
case n where we assume that lowercase n is less
than the big N. If this was the same thing then
we're actually taking the average or we're taking the
mean of the entire population. And then you divide by
the number of data points you added. You get to n. Then we said OK, how far-- this
gives us the central tendency. It's one measure of
the central tendency. But what if we wanted to know
how good of an indicator this is for the population
or for the sample? Or, on average, how far are the
data points from this mean? And that's where we came up
with the concept of variance. And I'll arbitrarily
switch colors again. Variance. And in a population the
variable or the notation for variance is sigma squared. This means variance. And that is equal to-- you
take each of the data points. You find the difference between
that and the mean that you calculated up there. You square it so you get
the squared difference. And then you essentially take
the average of all of these. You take the average of all
of these squared distances. So that's-- so you take the
sum from i is equal to 1 to n and you divide it by n. That's the variance. And then the variance of a
sample-- and this was a little bit more interesting
and we talked a little bit about it in the last video. You actually want to provide
a-- you want to estimate the variance of the population
when you're taking the variance of a sample. And in order to provide an
unbiased estimate you do something very similar
to here but you end up dividing by n minus 1. So let me write that down. So the variance of a
population-- I'm sorry, the variance of a sample, or the sample
variance, or the unbiased sample variance, since we're
going to divide by n minus 1-- that's denoted by s squared. What you do is you take the
difference between each of the data points in the sample
and the sample mean. We assume that we don't
know the population mean. Maybe we did. If we knew the population mean
we actually wouldn't have to do the unbiased thing that we're
going to do here in the denominator. But when you have a sample the
only way to kind of figure out the population mean is to
estimate it with the sample mean. So we assume that we only
have the sample mean. And you're going to square
those and then you're going to sum them up from i is equal to
1 to i is equal to n because you have n data points. And if you want an unbiased
estimator you divide by n minus 1. And we talked a little bit
before about why you want this to be n minus 1 instead of n. And actually in a couple
of videos I'll actually prove this to you. One, I'll prove it maybe
experimentally using Excel and then I'll-- which wouldn't be a
proof, it'll just give you a little bit of intuition-- and
then I'll actually prove it a little bit more
formally later on. But you don't have to
worry about it right now. The next thing we'll learn is
something that you've probably heard a lot of, especially
sometimes in class, teachers talk about the standard
deviation of a test or-- it's actually probably one of the
most used words in statistics. I think a lot of people
unfortunately maybe use it without fully
appreciating everything that it involves. But the goal is that we'll eventually,
hopefully, appreciate all that it involves soon. But the standard deviation--
and once you know variance it's actually quite straightforward. It's the square root
of the variance. So the standard deviation of a
population is written as sigma which is equal to the square
root of the variance. And now I think you understand
why a variance is written as sigma squared. And that is equal to just the
square root of all that. It's equal to the square root--
I'll probably run out of space-- of all of that. So the sum-- I won't write at
the top or the bottom, that makes it messy-- of xi minus mu
squared, everything over n. And then if you wanted the
standard deviation of a sample-- and it actually gets a
little bit interesting because the standard deviation of a
sample, which is equal to the square root of the variance of
a sample-- it actually turns out that the sample standard deviation is not an
unbiased estimator for the population standard deviation-- and I don't want to get too
technical about it right now-- even though the sample variance is actually a very
good estimate of the population variance. The expected value of the sample variance
is going to be the population variance. And I'll go into more depth on
expected values in the future. But it turns out that the expected value of the sample standard deviation
is not quite the same as the population standard deviation. But you don't have to
worry about it for now. So why even talk about
the standard deviation? Well, one, the units work
out a little better. Let's say all of our
data points were measured in meters, right? If we were taking a bunch of
measurements of length then the units of the variance
would be meters squared, right? Because we're taking
meters minus meters. This would be a meter. Then you're squaring. You're getting meters squared. And that's kind of a strange
concept if you say, you know, the average dispersion from the
center is in meters squared. Well, first, when you take the
square root of it you get-- you get something
that's again in meters. So you're kind of saying, oh
well the standard deviation is x or y meters. And then we'll learn in a little
bit that if you can actually model your data as a bell curve,
or if you assume that your data has the distribution of a bell
curve, then this tells you some interesting things about
the probability of finding someone within one or
two standard deviations of the mean. But anyway, I don't want to
go too technical right now. Let's just calculate a bunch. Let's calculate. Let's see, if I had numbers
1, 2, 3, 8, and 7. And let's say that
this is a population. So what would its mean be? So I have 1 plus 2 plus 3. So it's 3 plus 3 is 6. 6 plus 8 is 14. 14 plus 7 is 21. So the mean of this
population-- you sum up all the data points. You get 21 divided by the
total number of data points, 1, 2, 3, 4, 5. 21 divided by 5 which
is equal to what? 4.2. Fair enough. Now we want to figure
out the variance. And we're assuming that this
is the entire population. So the variance of this
population is going to be equal to the sum of the squared
differences of each of these numbers from 4.2. I'm going to have to
get my calculator out. So it's going to be 1 minus
4.2 squared plus 2 minus 4.2 squared plus 3 minus 4.2
squared plus 8 minus 4.2 squared plus 7
minus 4.2 squared. And it's going to be all of
that-- I know it looks a little bit funny-- divided by the
number of data points we have-- divided by 5. So let me take the
calculator out. All right. Here we go. Actually maybe I should
have used the graphing calculator that I have. Let me see if I can get this
thing-- if I could get this. There you go. Yeah, I think the graphing
one will be better because I can see everything
that I'm writing. OK, so let me clear this. So I want to take 1 minus 4.2
squared plus 2 minus 4.2 squared plus 3 minus 4.2
squared plus 8 minus 4.2 squared-- I'm just taking
the sum of the squared distances from the mean--
one more, plus 7 minus 4.2 squared. So that's the sum. The sum is 38.8. So the numerator is 38.8, and the variance is going to be
equal to 38.8 divided by 5. So this is the sum of the
squared distances, right? Each of these-- just so you can
relate to the formula-- each of that is xi minus
the mean squared. And so if we take the sum of
all of them-- this numerator is the sum of each of the xi minus
the mean squared from i equals 1 to n. And that ended up to be 38.8. And I just calculated
like that. I just took each to the
data points minus the mean squared, add them all
up, and I got 38.8. And I went and divided
by n which is 5. So this n up here is
actually also 5. Right? And so 38.8 divided
by 5 is 7.76. So the variance-- let me scroll
down a little bit-- the variance is equal to 7.76. Now if this was a sample of a
larger distribution, if this was a sample-- if the 1, 2,
3, 8, and 7, weren't the population-- if it was a sample
from a larger population, instead of dividing by 5 we
would have divided by 4. And we would have gotten the
variance as 38.8 divided by n minus 1, which is divided by 4. So then we would have gotten
the variance-- we would have gotten the sample variance
9.7 if you divided by n minus 1 instead of n. But anyway, don't worry
about that right now. That's just a change in the denominator, from n to n minus 1. But once you have the variance,
it's very easy to figure out the standard deviation. You just take the
square root of it. The square root of 7.76 is about 2.786. Let's say 2.79 is the
standard deviation. So this gives us some measure
of, on average, how far the numbers are away from
the mean which was 4.2. And it gives it in kind
of the units of the original measurement. Anyway, I'm all out of time. I'll see you in the next video. Or actually, let's figure out--
we said if this was a sample, if those numbers were a sample
and not the population, we figured out that the
sample variance was 9.7. And so then the sample standard
deviation is just going to be the square root of that. The square root of 9.7,
which would be 3.1. 3.11. Anyway, hopefully that makes it
a little bit more concrete. We've been dealing with these
sigma notation variables and all that so far. So when you actually do it
with numbers you see it's hopefully not that difficult. Anyway, see you in
the next video.
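
If you want to check the arithmetic from the video yourself, here is a minimal Python sketch of the same calculation on the numbers 1, 2, 3, 8, and 7. It is not part of the original video; the variable names are just illustrative choices, and the cross-check at the end assumes Python's standard `statistics` module is available.

```python
import math
import statistics

# The five data points used in the video
data = [1, 2, 3, 8, 7]
n = len(data)

# Mean: add up all the data points and divide by how many there are
mean = sum(data) / n

# Numerator of the variance: sum of squared differences from the mean
sum_sq = sum((x - mean) ** 2 for x in data)

# Treating the data as the whole population: divide by n
population_variance = sum_sq / n
population_std = math.sqrt(population_variance)

# Treating the data as a sample: divide by n - 1 instead
sample_variance = sum_sq / (n - 1)
sample_std = math.sqrt(sample_variance)

print("mean:", round(mean, 2))                                # 4.2
print("sum of squared differences:", round(sum_sq, 2))        # 38.8
print("population variance:", round(population_variance, 2))  # 7.76
print("population std dev:", round(population_std, 2))        # 2.79
print("sample variance:", round(sample_variance, 2))          # 9.7
print("sample std dev:", round(sample_std, 2))                # 3.11

# Cross-check against the standard library's implementations
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))
assert math.isclose(population_std, statistics.pstdev(data))
assert math.isclose(sample_std, statistics.stdev(data))
```

The only difference between the population and sample versions is the denominator: n in one case, n minus 1 in the other. The standard deviation in each case is just the square root of the corresponding variance.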