Question 1

At 3:30, why square the distance to the mean to get a positive number? Why not just take the absolute value?

Accepted Answer

You're right--we could instead take the absolute value. In fact, what you're describing sounds like what statisticians usually call the "Mean Absolute Deviation", or "MAD", for short (other names sometimes used are the "average deviation" or "mean deviation"). Just as there are different "measures of central tendency" of a set of observations (such as mean, median, mode, etc.) there are also different "measures of dispersion" (besides the MAD and standard deviation, another common "measure of dispersion" is the "Interquartile Range").
However, statisticians usually prefer the variance/standard deviation over the MAD because the MAD is not as "mathematically tractable" as the variance (i.e. the variance is easier to work with than the MAD--though the exact reasons why this is true are beyond the scope of the video).
The important thing to remember, however, is that, in general, the MAD and the standard deviation will NOT be equal. Thus, if your teacher asks for the standard deviation and you calculate the MAD, you will probably get the wrong answer.
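
To see that the two measures really do differ, here is a minimal Python sketch. The data set is made up for illustration (it is not the one from the video), and the function names are just hypothetical helpers.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def mean_absolute_deviation(values):
    # Average of the absolute distances to the mean.
    mu = mean(values)
    return sum(abs(x - mu) for x in values) / len(values)

def population_std_dev(values):
    # Square root of the average squared distance to the mean.
    mu = mean(values)
    variance = sum((x - mu) ** 2 for x in values) / len(values)
    return sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, mean is 5

mad = mean_absolute_deviation(data)
sd = population_std_dev(data)
print(mad)  # 1.5
print(sd)   # 2.0
```

Same data, same mean, but the MAD comes out as 1.5 and the standard deviation as 2.0.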

For more information, check out:
http://en.wikipedia.org/wiki/Mean_absolute_deviation

Question 2

Ok, so the variance of this population is 20.  But what does that tell me, really, about this population?  What does the number 20 tell me about the experience levels at the Khan Academy?  I understand that variance is a measure of spread in the data, but is 20 a large spread?  Would we say that the population is very various?  Or are these questions not meaningful given the small size of the sample?

Accepted Answer

It helps you figure out how good an indication the mean is of a typical employee. If there's a large variance, you know that there's a large gap in experience level between different employees; if the variance is small, you know that all the employees have more or less the mean experience.

Question 3

What is the purpose of finding population variance? What does this value represent in simple terms?

Accepted Answer

Earlier in the playlist, Khan described different "measures of central tendency", specifically, the mean, median, & mode. The next step, however, is to learn about different "measures of dispersion"--i.e. how dispersed the data is. Of these different "measures of dispersion", the variance (and, hence, the standard deviation) is the most frequently used and, thus, the most important.
An example might help:
If you have a city where the average height is 5'6", it could be the case that every adult in the city is exactly 5'6", or it could be the case that half the adults are exactly 4 feet tall and the other half are exactly 7 feet tall--the average height in both cases is 5'6". Thus, we use the variance to measure how spread out a set of data is.
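
The two cities can be sketched in Python like this. The city size of 10 adults is an assumption made just to keep the lists short, and `population_variance` is a hypothetical helper name; heights are converted to inches.

```python
def mean(values):
    return sum(values) / len(values)

def population_variance(values):
    # Average squared distance from the mean.
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

# Heights in inches: 5'6" = 66, 4 ft = 48, 7 ft = 84.
uniform_city = [66] * 10            # everyone is exactly 5'6"
split_city = [48] * 5 + [84] * 5    # half are 4 ft, half are 7 ft

print(mean(uniform_city), population_variance(uniform_city))  # 66.0 0.0
print(mean(split_city), population_variance(split_city))      # 66.0 324.0
```

Both cities have the same mean of 66 inches, but only the variance (0 versus 324 square inches) tells them apart.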

Or, as another example, in Finance, the standard deviation of returns is often used to represent the "riskiness" of a company's stock (where a high standard deviation would suggest a risky stock).

Does this help??

Question 4

Around 6:10 he's talking about the 20 being the squared distance away from the population mean - is there a time when you would take the square root?

Accepted Answer

Yep.  The square root of the variance is called the standard deviation, which will be another crucial concept that you'll get to pretty soon.
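
For the variance of 20 from the video, the calculation would look roughly like this (writing the population variance as sigma squared and the standard deviation as sigma, which is the usual notation):

```latex
\sigma = \sqrt{\sigma^2} = \sqrt{20} \approx 4.47
```

Taking the square root puts the number back in the same units as the original data, instead of squared units.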

Question 5

In real life, when would you need to know how to solve this problem? Can someone give me an example on how you would use variance in real life?

Accepted Answer

There was an episode of the US television show "Mythbusters" last year where they tested the idea that you can "hold" urination by dancing. The problem is, in their experiment, they did one sample of how long they could "hold it" without dancing, and one sample with dancing. With only one sample, they could not estimate the variance in their data (and thus, they made senseless conclusions).

I assume you're not a television scientist, but the same idea applies any time you collect information about the world. If you're planning a dinner for 50 people, you need to consider the variance in that number: if the variance is large, you'd better have an extra table ready. When I receive post from abroad, it usually comes after one week, but since there's a large variance, I'm not worried if something hasn't yet arrived after 10 days.

Like many things in math, you won't do these explicit calculations every day. Instead, you'll internalize them. You'll understand the main idea, and use mental estimates instead of calculations. But those are estimations you wouldn't have made before you studied these things, and that's why variance is valuable. Hope that helps! :)

Question 6

Where in real life would we use variance? I mean, I understand the equation, but not the concept behind it. Like why do we square the numbers, why is the answer larger than the population mean, and what's the difference between the upper-case and lower-case sigma?

Accepted Answer

Variance is a measure of how much a data set differs from its mean.

Old math joke: Two mathematicians go duck hunting.  One shoots 1 foot in front of the duck, the other shoots 1 foot behind the duck.  The first cries out "on average, we got it".

The mean of their shots was on the duck, but the variance was too large.

If two data sets have the same mean, are they really the same data set (from the same population)?  Variance gives you more information about the distribution of the data.

We square the differences to make them all positive... in the duck joke, if you only added up the signed difference between each data point and the mean, you would get zero (-1 + 1 = 0), which would wrongly suggest no spread at all.  So you find the difference between a data point and the mean, then square that difference (to make it positive), then find the mean of all of those squared differences.
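
Here is the duck joke as a small Python sketch. The positions are in feet relative to the duck, and the choice of -1 for "behind" and +1 for "in front" is just an assumed sign convention.

```python
shots = [-1, 1]  # feet relative to the duck: one behind, one in front

mu = sum(shots) / len(shots)  # the "average shot" lands on the duck

# Signed differences cancel each other out.
signed_differences = [x - mu for x in shots]
sum_of_signed = sum(signed_differences)

# Squared differences cannot cancel, so their mean measures the spread.
squared_differences = [(x - mu) ** 2 for x in shots]
variance = sum(squared_differences) / len(shots)

print(mu)             # 0.0
print(sum_of_signed)  # 0.0
print(variance)       # 1.0
```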

If the data is widely distributed the variance can get very large... the real world is annoying like that.

Upper case sigma (big E) usually means 'sum up a bunch of stuff' while lower case sigma (small o with a tail at the top) means 'standard deviation' which is the square root of the variance.

Question 7

Hello, everyone. I can understand why we compute mean - it represents all numbers in data. But variance is more complicated. I do not understand where and how I might use that number. And I don't know any of life areas where variance is used. So, can you please give me some examples of practical use of variance? Thank you.

Accepted Answer

Variance and standard deviation allow you to quickly understand how close most of a population is to the mean.
For instance, say the average adult male height in the USA is 70in, with a standard deviation of 2in.
If heights are roughly normally distributed, about 2/3 of adult males in the US are then between 68in and 72in, so the standard deviation tells you that it's normal to be within about 2 inches of the mean. It lets you know that someone 6'4'' is rather tall, although he wouldn't be especially tall if the standard deviation were 6 inches.
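
One way to make "rather tall" concrete is to count how many standard deviations a height is from the mean. This is a rough sketch using the numbers above; the function name is a hypothetical helper.

```python
def standard_deviations_from_mean(value, mean, std_dev):
    # How far a value is from the mean, measured in standard deviations.
    return (value - mean) / std_dev

height = 76       # 6'4" in inches
mean_height = 70  # mean from the example above

with_sd_2 = standard_deviations_from_mean(height, mean_height, 2)
with_sd_6 = standard_deviations_from_mean(height, mean_height, 6)

print(with_sd_2)  # 3.0
print(with_sd_6)  # 1.0
```

With a standard deviation of 2 inches, 6'4'' is three standard deviations above the mean, which is unusual. With 6 inches it would be only one standard deviation above, which is fairly ordinary.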

Question 8

at 1:27 whats that big 'e' ??

Accepted Answer

It's a capital sigma - that's the Greek equivalent of the letter S.  The capital sigma is used to write a sum when all the terms are very similar.  In the video, all the terms are x with some subscript.
This video explains it better than I can in plain text: https://www.khanacademy.org/math/algebra2/sequences-and-series/copy-of-sigma-notation/v/sigma-notation-sum
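
If you can read LaTeX-style notation, a sum written with capital sigma expands like this (the index i and the count N are the conventional symbols, assumed here rather than copied from the video):

```latex
\sum_{i=1}^{N} x_i = x_1 + x_2 + \cdots + x_N
```

The number under the sigma says where the subscript starts, and the number on top says where it stops.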

Question 9

How does the Std. Dev compare to the average of the absolute value of the differences from the mean? In other words, is taking the average of the absolute values of the differences between each data point and the mean a useful number?

Accepted Answer

It used to be that Mean Absolute Deviation (MAD) was the standard way of communicating dispersion in scores.  However, when MAD is calculated for a sample, it tends to be (some argue) a negatively biased estimate of the population MAD.  In other words, if you knew what the population MAD was, you'd find that a sample MAD would more often be lower than the population MAD.

This is not a good characteristic for a sample statistic.  We want a sample statistic to be an unbiased estimator of the population parameter.  The sample statistic can be higher or lower than the actual population parameter (there is always sampling error), but we'd like the sampling error to be random.  It could be too high or too low, but we don't want it to be consistently too low (like sample MAD is).

The sample variance, when it is computed with n - 1 rather than n in the denominator, is an unbiased estimator of the population variance.  Its square root is the usual sample estimate of SD (strictly speaking that estimate is still slightly too low on average, but the variance itself is not).  Squaring the differences also means that an occasional outlier counts for more, so the impact of the outlier on the overall sample statistic is greater.  This makes the SD more sensitive to outliers than the MAD is.
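
The bias in the sample MAD can be checked on a tiny example. The sketch below is only an illustration under stated assumptions: the population of five values is made up, samples of size 2 are drawn with replacement, the sample MAD is taken around the sample mean and divided by n, and the sample variance is divided by n - 1. Instead of simulating, it lists every possible sample and averages over all of them.

```python
from itertools import product

def mean(values):
    return sum(values) / len(values)

def mad(values):
    # Mean absolute deviation around the mean of `values`.
    mu = mean(values)
    return mean([abs(x - mu) for x in values])

def variance(values, ddof=0):
    # ddof=0 gives the population variance, ddof=1 divides by n - 1.
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / (len(values) - ddof)

population = [1, 2, 3, 4, 5]  # hypothetical population

# All 25 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

population_mad = mad(population)
population_variance = variance(population)

average_sample_mad = mean([mad(s) for s in samples])
average_sample_variance = mean([variance(s, ddof=1) for s in samples])

print(population_mad, average_sample_mad)
print(population_variance, average_sample_variance)
```

In this setup the sample MAD averages out below the population MAD, while the n - 1 sample variance averages out to the population variance.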

Variance of a population

Video transcript