# Calculating standard deviation step by step

## Introduction

In this article, we'll learn how to calculate standard deviation "by hand".
Interestingly, in the real world no statistician would ever calculate standard deviation by hand. The calculations involved are somewhat complex, and the risk of making a mistake is high. Also, calculating by hand is slow. Very slow. This is why statisticians rely on spreadsheets and computer programs to crunch their numbers.
So what's the point of this article? Why are we taking time to learn a process statisticians don't actually use? The answer is that learning to do the calculations by hand will give us insight into how standard deviation really works. This insight is valuable. Instead of viewing standard deviation as some magical number our spreadsheet or computer program gives us, we'll be able to explain where that number comes from.

## Overview of how to calculate standard deviation

The formula for standard deviation (SD) is
$\text{SD}=\sqrt{\frac{\sum |x-\mu|^{2}}{N}}$
where $\sum$ means "sum of", $x$ is a value in the data set, $\mu$ is the mean of the data set, and $N$ is the number of data points in the population.
The standard deviation formula may look confusing, but it will make sense after we break it down. In the coming sections, we'll walk through a step-by-step interactive example. Here's a quick preview of the steps we're about to follow:
Step 1: Find the mean.
Step 2: For each data point, find the square of its distance to the mean.
Step 3: Sum the values from Step 2.
Step 4: Divide by the number of data points.
Step 5: Take the square root.
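
For readers who like to see a procedure as code, the five steps can be sketched in Python roughly as follows. This is only an illustration: the function name `population_sd` and the sample numbers are made up for this sketch and are not part of the lesson.

```python
import math

def population_sd(data):
    """Population standard deviation, one line per step."""
    n = len(data)
    mean = sum(data) / n                                   # Step 1: find the mean
    squared_distances = [(x - mean) ** 2 for x in data]    # Step 2: squared distance to the mean
    total = sum(squared_distances)                         # Step 3: sum the values from Step 2
    variance = total / n                                   # Step 4: divide by the number of data points
    return math.sqrt(variance)                             # Step 5: take the square root

# Hypothetical data set, chosen so the arithmetic comes out evenly.
print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

Each line of the function body matches one step, so you can trace a hand calculation against it.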

## An important note

The formula above is for finding the standard deviation of a population. If you're dealing with a sample, you'll want to use a slightly different formula (below), which uses $n-1$ instead of $N$. The point of this article, however, is to familiarize you with the process of computing standard deviation, which is basically the same no matter which formula you use.
$\text{SD}_{\text{sample}}=\sqrt{\frac{\sum |x-\overline{x}|^{2}}{n-1}}$
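
To see how small the difference between the two formulas is in practice, here is a minimal Python sketch. The helper `standard_deviation` and its `sample` flag are assumptions made for illustration, and the data set is hypothetical. The results are compared with `statistics.pstdev` (population) and `statistics.stdev` (sample) from Python's standard library.

```python
import math
import statistics

def standard_deviation(data, sample=False):
    """Standard deviation; divides by n - 1 for a sample, N for a population."""
    mean = sum(data) / len(data)
    total = sum((x - mean) ** 2 for x in data)
    divisor = len(data) - 1 if sample else len(data)
    return math.sqrt(total / divisor)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

# Population formula (divide by N) next to the library's version.
print(standard_deviation(data), statistics.pstdev(data))

# Sample formula (divide by n - 1) next to the library's version.
print(standard_deviation(data, sample=True), statistics.stdev(data))
```

Only the divisor changes, so the sample result is always a little larger than the population result for the same numbers.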

## Step-by-step interactive example for calculating standard deviation

First, we need a data set to work with. Let's pick something small so we don't get overwhelmed by the number of data points. Here's a good one:
$6,2,3,1$

### Step 1: Finding $\mu$ in $\sqrt{\frac{\sum |x-\mu|^{2}}{N}}$

In this step, we find the mean of the data set, which is represented by the variable $\mu$.
Fill in the blank.
$\mu =$

### Step 2: Finding $|x-\mu|^{2}$ in $\sqrt{\frac{\sum |x-\mu|^{2}}{N}}$

In this step, we find the distance from each data point to the mean (i.e., the deviations) and square each of those distances.
For example, the first data point is $6$ and the mean is $3$, so the distance between them is $3$. Squaring this distance gives us $9$.
Complete the table below.
| Data point $x$ | Square of the distance from the mean $\lvert x-\mu\rvert^{2}$ |
|---|---|
| $6$ | $9$ |
| $2$ | |
| $3$ | |
| $1$ | |

### Step 3: Finding $\sum |x-\mu|^{2}$ in $\sqrt{\frac{\sum |x-\mu|^{2}}{N}}$

The symbol $\sum$ means "sum", so in this step we add up the four values we found in Step 2.
Fill in the blank.
$\sum |x-\mu {|}^{2}=$

### Step 4: Finding $\frac{\sum |x-\mu|^{2}}{N}$ in $\sqrt{\frac{\sum |x-\mu|^{2}}{N}}$

In this step, we divide our result from Step 3 by the variable $N$, which is the number of data points.
Fill in the blank.
$\frac{\sum |x-\mu {|}^{2}}{N}=$

### Step 5: Finding the standard deviation $\sqrt{\frac{\sum |x-\mu|^{2}}{N}}$

We're almost finished! Just take the square root of the answer from Step 4 and we're done.
Fill in the blank.
$\text{SD}=\sqrt{\frac{\sum |x-\mu|^{2}}{N}}\approx$

Yes! We did it! We successfully calculated the standard deviation of a small data set.

### Summary of what we did

We broke down the formula into five steps:
Step 1: Find the mean $\mu$.
$\mu =\frac{6+2+3+1}{4}=\frac{12}{4}=3$
Step 2: Find the square of the distance from each data point to the mean $|x-\mu {|}^{2}$.
| $x$ | $\lvert x-\mu\rvert^{2}$ |
|---|---|
| $6$ | $\lvert 6-3\rvert^{2}=3^{2}=9$ |
| $2$ | $\lvert 2-3\rvert^{2}=1^{2}=1$ |
| $3$ | $\lvert 3-3\rvert^{2}=0^{2}=0$ |
| $1$ | $\lvert 1-3\rvert^{2}=2^{2}=4$ |
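Steps 3, 4, and 5: Sum the squared distances, divide by $N$, and take the square root.
$\text{SD}=\sqrt{\frac{9+1+0+4}{4}}=\sqrt{\frac{14}{4}}=\sqrt{3.5}\approx 1.87$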

## Try it yourself

Here's a reminder of the formula:
$\text{SD}=\sqrt{\frac{\sum |x-\mu|^{2}}{N}}$
And here's a data set:
$1,4,7,2,6$
Find the standard deviation of the data set.
$\text{SD}=$
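
If you'd like to check your hand calculation without seeing the answer first, a small Python sketch along these lines could help. The function `check_answer`, its `tolerance` parameter, and the placeholder `my_answer` are hypothetical names introduced for this example. The comparison uses `statistics.pstdev`, the population standard deviation from Python's standard library.

```python
import math
import statistics

def check_answer(guess, data, tolerance=0.01):
    """Return True if guess is within tolerance of the population SD of data."""
    return math.isclose(guess, statistics.pstdev(data), abs_tol=tolerance)

data = [1, 4, 7, 2, 6]
my_answer = 0.0  # replace with the value you calculated by hand

if check_answer(my_answer, data):
    print("Correct!")
else:
    print("Not quite, try again.")
```

The tolerance allows for rounding your answer to two decimal places.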

## Want to join the conversation?

• What are the steps to finding the square root of 3.5? I can't figure out how to get to 1.87 without knowing the answer beforehand.
• Without knowing the square root beforehand, I'd say just use a graphing calculator.
• But what actually is standard deviation? I understand how to get it and all but what does it actually tell us about the data?
• The standard deviation is a measure of how close the numbers are to the mean. If the standard deviation is big, then the data is more "dispersed" or "diverse".

As an example let's take two small sets of numbers:
4.9, 5.1, 6.2, 7.8
and
1.6, 3.9, 7.7, 10.8
The average (mean) of both these sets is 6. But the second set is more dispersed: the numbers are further away from the mean.
This is reflected in the standard deviation: if I calculated correctly, the first set has a population standard deviation of about 1.15, and the second has about 3.52.
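
One way to double-check figures like these is to let Python's `statistics` module do the arithmetic. The sketch below is only a checking aid. The variable names `set_a` and `set_b` are made up here, and `statistics.pstdev` computes the population standard deviation (dividing by $N$).

```python
import statistics

set_a = [4.9, 5.1, 6.2, 7.8]
set_b = [1.6, 3.9, 7.7, 10.8]

for name, values in (("first", set_a), ("second", set_b)):
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD: divides by N
    print(name, "set: mean =", round(mean, 2), "SD =", round(sd, 2))
```

Both means come out the same, while the more spread-out second set gets the larger standard deviation.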
• I want to understand the significance of squaring the values, as is done in Step 2. Why do we actually square them?
• The important thing is that we want the deviations from the mean to always count as positive, so that a value one greater than the mean doesn't cancel out a value one less than the mean. There are two strategies for doing that: squaring the deviations (which leads to the variance) and taking their absolute values (which leads to the mean absolute deviation). Even though the absolute value is easier to work with by hand, it's easier to prove that the variance has a lot of pleasant properties, and those properties make a difference by the time you get to the end of the statistics playlist.
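
To make the cancellation concrete, here is a small Python sketch using the article's data set $6, 2, 3, 1$. The variable names are chosen just for this illustration.

```python
data = [6, 2, 3, 1]
mean = sum(data) / len(data)

# Raw deviations: the positive and negative ones cancel out.
deviations = [x - mean for x in data]
print(sum(deviations))  # 0.0

# Strategy 1: square each deviation, then average (the variance).
variance = sum(d ** 2 for d in deviations) / len(data)

# Strategy 2: take the absolute value of each deviation, then average
# (the mean absolute deviation).
mean_abs_dev = sum(abs(d) for d in deviations) / len(data)

print(variance, mean_abs_dev)  # 3.5 1.5
```

Either strategy keeps every deviation from counting as negative, but the two give different numbers for the same data.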
• In the class I'm taking, my professor has labeled this equation as the population standard deviation, which uses a different formula from the sample standard deviation. Is there a way to tell when to use the population formula and when to use the sample formula? Or does it depend on context, or on the problem directly asking for a given one? Why do we use two different types of standard deviation in the first place when the goal of both is the same?
• The population standard deviation is used when you have data for an entire population, like every box of popcorn from a specific brand. Collecting that much data is usually impractical and often impossible, which is why the sample standard deviation exists. It is used when your data set is only part of a population, like 20 bags of popcorn, which is much more realistic to collect.
• What is the formula for calculating the variance of a data set? Is it the same as the formula for standard deviation given in this article but without the square root?
In other words, is standard deviation the square root of the variance?
I remember vaguely that one of the two — SD and variance — is the square (or square root) of the other.
• Yes, the standard deviation is the square root of the variance.
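
A quick way to see this relationship is with Python's standard library, where `statistics.pvariance` and `statistics.pstdev` compute the population variance and population standard deviation. This is just a sketch using the article's data set.

```python
import math
import statistics

data = [6, 2, 3, 1]

variance = statistics.pvariance(data)  # sum of squared distances divided by N
sd = statistics.pstdev(data)           # population standard deviation

print(variance, sd)
print(math.isclose(sd, math.sqrt(variance)))  # True
```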
• If I have a set of data with repeating values, say 2,3,4,6,6,6,9, would you take the sum of the squared distance for all 7 points or would you only add the 5 different values?
• In the formula for the SD of a population, they use mu for the mean. Is there a difference from the x with a line over it in the SD for a sample?
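• They're calculated the same way (add up the values and divide by how many there are), but they aren't quite the same thing: μ denotes the mean of an entire population, while x̄ denotes the mean of a sample.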
• I didn't get any of it. I need help really badly. What does this stuff mean?
• It may look more difficult than it actually is, because all the different variables are just there to stand in for numbers. When you work through a problem, you replace each variable in the formula with the actual numbers from your data set.
• Hi,
How do I calculate the standard deviation of bivariate data by hand?
Thanks
Sean