Calculating standard deviation step by step

Introduction

In this article, we'll learn how to calculate standard deviation "by hand".
Interestingly, in the real world no statistician would ever calculate standard deviation by hand. The calculations involved are somewhat complex, and the risk of making a mistake is high. Also, calculating by hand is slow. Very slow. This is why statisticians rely on spreadsheets and computer programs to crunch their numbers.
So what's the point of this article? Why are we taking time to learn a process statisticians don't actually use? The answer is that learning to do the calculations by hand will give us insight into how standard deviation really works. This insight is valuable. Instead of viewing standard deviation as some magical number our spreadsheet or computer program gives us, we'll be able to explain where that number comes from.

Overview of how to calculate standard deviation

The formula for standard deviation (SD) is
$\text{SD} = \sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}}$
where $\sum$ means "sum of", $x$ is a value in the data set, $\mu$ is the mean of the data set, and $N$ is the number of data points in the population.
The standard deviation formula may look confusing, but it will make sense after we break it down. In the coming sections, we'll walk through a step-by-step interactive example. Here's a quick preview of the steps we're about to follow:
Step 1: Find the mean.
Step 2: For each data point, find the square of its distance to the mean.
Step 3: Sum the values from Step 2.
Step 4: Divide by the number of data points.
Step 5: Take the square root.
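
To make the five steps concrete, here is a minimal Python sketch that follows them one at a time. The function name `population_sd` and the data set are our own illustrative choices, not part of the article; the data set was picked only because its numbers stay whole.

```python
import math

def population_sd(data):
    """Population standard deviation, following the five steps above."""
    n = len(data)
    mean = sum(data) / n                              # Step 1: find the mean
    squared_dists = [(x - mean) ** 2 for x in data]   # Step 2: squared distance to the mean
    total = sum(squared_dists)                        # Step 3: sum the values from Step 2
    variance = total / n                              # Step 4: divide by the number of data points
    return math.sqrt(variance)                        # Step 5: take the square root

# Hypothetical data set (not from the article).
example = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(example))  # prints 2.0
```

Each line of the function maps to exactly one step, so the intermediate values can be inspected if a hand calculation disagrees with the result.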

An important note

The formula above is for finding the standard deviation of a population. If you're dealing with a sample, you'll want to use a slightly different formula (below), which uses $n - 1$ instead of $N$. The point of this article, however, is to familiarize you with the process of computing standard deviation, which is basically the same no matter which formula you use.
$\text{SD}_\text{sample} = \sqrt{\dfrac{\sum \lvert x - \bar{x} \rvert^2}{n - 1}}$
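
To get a feel for how much the choice of denominator matters, here is a small sketch using Python's standard `statistics` module, where `pstdev` divides by the number of data points and `stdev` divides by one less than that. The data set is a hypothetical one of our own, not taken from the article.

```python
import statistics

# Hypothetical data set (not from the article).
data = [2, 4, 4, 4, 5, 5, 7, 9]

population = statistics.pstdev(data)  # population formula: divides by N
sample = statistics.stdev(data)       # sample formula: divides by n - 1

print(round(population, 2))  # prints 2.0
print(round(sample, 2))      # prints 2.14
```

The sample version comes out slightly larger because the same sum of squared distances is divided by a smaller number.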

Step-by-step interactive example for calculating standard deviation

First, we need a data set to work with. Let's pick something small so we don't get overwhelmed by the number of data points. Here's a good one:
6, 2, 3, 1

Step 1: Finding $\mu$ in $\sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}}$

In this step, we find the mean of the data set, which is represented by the variable $\mu$.
Fill in the blank.
$\mu =$ ___

Step 2: Finding $\lvert x - \mu \rvert^2$ in $\sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}}$

In this step, we find the distance from each data point to the mean (i.e., the deviations) and square each of those distances.
For example, the first data point is 6 and the mean is 3, so the distance between them is 3. Squaring this distance gives us 9.
Complete the table below.
Data point $x$ | Square of the distance from the mean $\lvert x - \mu \rvert^2$
6 | 9
2 | ___
3 | ___
1 | ___

Step 3: Finding $\sum \lvert x - \mu \rvert^2$ in $\sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}}$

The symbol $\sum$ means "sum", so in this step we add up the four values we found in Step 2.
Fill in the blank.
$\sum \lvert x - \mu \rvert^2 =$ ___

Step 4: Finding $\dfrac{\sum \lvert x - \mu \rvert^2}{N}$ in $\sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}}$

In this step, we divide our result from Step 3 by the variable $N$, which is the number of data points.
Fill in the blank.
$\dfrac{\sum \lvert x - \mu \rvert^2}{N} =$ ___

Step 5: Finding the standard deviation $\sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}}$

We're almost finished! Just take the square root of the answer from Step 4 and we're done.
Fill in the blank.
Round your answer to the nearest hundredth.
$\text{SD} = \sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}} \approx$ ___

Yes! We did it! We successfully calculated the standard deviation of a small data set.

Summary of what we did

We broke down the formula into five steps:
Step 1: Find the mean $\mu$.
$\mu = \dfrac{6 + 2 + 3 + 1}{4} = \dfrac{12}{4} = 3$
Step 2: Find the square of the distance from each data point to the mean, $\lvert x - \mu \rvert^2$.
$x$ | $\lvert x - \mu \rvert^2$
6 | $\lvert 6 - 3 \rvert^2 = 3^2 = 9$
2 | $\lvert 2 - 3 \rvert^2 = 1^2 = 1$
3 | $\lvert 3 - 3 \rvert^2 = 0^2 = 0$
1 | $\lvert 1 - 3 \rvert^2 = 2^2 = 4$
Steps 3, 4, and 5:
$$
\begin{aligned}
\text{SD} &= \sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}} \\
&= \sqrt{\dfrac{9 + 1 + 0 + 4}{4}} \\
&= \sqrt{\dfrac{14}{4}} \qquad \text{Sum the squares of the distances (Step 3).} \\
&= \sqrt{3.5} \qquad \text{Divide by the number of data points (Step 4).} \\
&\approx 1.87 \qquad \text{Take the square root (Step 5).}
\end{aligned}
$$
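
For readers who would like to double-check this arithmetic, here is a short Python sketch that reproduces each intermediate value for the data set 6, 2, 3, 1. The variable names are our own; this is simply one way to trace the calculation, not how a statistician would normally compute it.

```python
import math

data = [6, 2, 3, 1]  # the data set from the worked example

mean = sum(data) / len(data)                # Step 1: the mean
squares = [(x - mean) ** 2 for x in data]   # Step 2: squared distances to the mean
total = sum(squares)                        # Step 3: sum of the squared distances
variance = total / len(data)                # Step 4: divide by the number of data points
sd = math.sqrt(variance)                    # Step 5: take the square root

# One row per data point: the value and its squared distance from the mean.
for x, sq in zip(data, squares):
    print(x, sq)

print(mean, total, variance, round(sd, 2))  # prints 3.0 14.0 3.5 1.87
```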

Try it yourself

Here's a reminder of the formula:
$\text{SD} = \sqrt{\dfrac{\sum \lvert x - \mu \rvert^2}{N}}$
And here's a data set:
1, 4, 7, 2, 6
Find the standard deviation of the data set.
Round your answer to the nearest hundredth.
$\text{SD} =$ ___

Want to join the conversation?

  • Tais Price
    What are the steps to finding the square root of 3.5? I can't figure out how to get to 1.87 without knowing the answer beforehand.
    (22 votes)
  • akanksha.rph
    I want to understand the significance of squaring the values, as is done in Step 2. Why do we actually square the values?
    (11 votes)
    • Matthew Daly
      The important thing is that we want to be sure that the deviations from the mean are always given as positive, so that a sample value one greater than the mean doesn't cancel out a sample value one less than the mean. There are two strategies for doing that: squaring the values (which gives you the variance) and taking the absolute value (which gives you a thing called the Mean Absolute Deviation). Even though taking the absolute value is simpler to do by hand, it's easier to prove that the variance has a lot of pleasant properties that make a difference by the time you get to the end of the statistics playlist.
      (20 votes)
  • Shannon
    But what actually is standard deviation? I understand how to get it and all, but what does it actually tell us about the data?
    (14 votes)
    • ZeroFK
      The standard deviation is a measure of how close the numbers are to the mean. If the standard deviation is big, then the data is more "dispersed" or "diverse".

      As an example, let's take two small sets of numbers:
      4.9, 5.1, 6.2, 7.8
      and
      1.6, 3.9, 7.7, 10.8
      The average (mean) of both these sets is 6. But the second set is more dispersed: the numbers are further away from the mean.
      This is reflected in the standard deviation: using the population formula from this article, the first set has a standard deviation of about 1.15 and the second about 3.52.
      (2 votes)
  • jkcrain12
    In the class that I am in, my professor has labeled this equation for finding standard deviation as the population standard deviation, which uses a different formula from the sample standard deviation. Is there a way to tell when to use the population formula and when to use the sample formula? Or would such a thing be based more on context, or on the question directly asking for a given one? Why do we use two different types of standard deviation in the first place when the goal of both is the same?
    (11 votes)
    • sarah ehrenfried
      The population standard deviation is used when you have the data set for an entire population, like every box of popcorn from a specific brand. Having this data is usually unrealistic and likely impossible to obtain. That's why the sample standard deviation is used. Sample standard deviation is used when you have part of a population for a data set, like 20 bags of popcorn. This is much more reasonable and easier to calculate.
      (2 votes)
  • origamidc17
    If I have a set of data with repeating values, say 2, 3, 4, 6, 6, 6, 9, would you take the sum of the squared distance for all 7 points or would you only add the 5 different values?
    (7 votes)
  • chung.k2
    In the formula for the SD of a population, they use $\mu$ for the mean. Is there a difference from the $\bar{x}$ (x with a line over it) in the SD for a sample?
    (5 votes)
  • ANGELINA569
    I didn't get any of it. I need help really badly. What does this stuff mean?
    (6 votes)
    • Sergio Barrera
      It may look more difficult than it actually is, because all the different variables that are used are just there to represent the numbers in your equation. Therefore, those variables are just examples of how to solve for Standard Deviation, and are not actually in the equation.
      (4 votes)
  • G. Tarun
    What is the formula for calculating the variance of a data set? Is it the same as the formula for standard deviation given in this article but without the square root?
    In other words, is standard deviation the square root of the variance?
    I remember vaguely that one of the two (SD and variance) is the square (or square root) of the other.
    (5 votes)
  • Madradubh
    Hi,
    How do I calculate the standard deviation of bivariate data by hand?
    Thanks
    Sean
    (7 votes)
    • cossine
      You would work with a covariance matrix, which is built from covariances (Cov).

      E.g. Cov(X, X) = Var(X) = standard_deviation_x^2

      Similarly, we could do the same thing for Y.

      We can also find Cov(X, Y); just use the definition. If you are not able to find it on Khan Academy, just go to Wikipedia.
      (1 vote)
  • Epifania Ortiz
    Why does the formula show n and not n-1?
    (7 votes)
    • cossine
      n is the denominator for population variance. In contrast, n-1 is the denominator for sample variance.

      Depending on the context, we use n or n-1. Using n can result in underestimation.

      Because we don't know the exact population mean, and the mean is used in the formula for variance, we will most likely underestimate the variance if we use n when we don't have population data.

      In contrast, using n-1 adjusts for this.

      I highly recommend you read Probability and Statistics for Engineering and the Sciences by Jay L. Devore. It goes through a proof showing that dividing by n-1 gives an unbiased estimate of the variance from sample data, in contrast to dividing by n.
      (1 vote)