# Variance and standard deviation of a discrete random variable

AP Stats: VAR‑5 (EU), VAR‑5.C (LO), VAR‑5.C.1 (EK), VAR‑5.C.2 (EK), VAR‑5.C.3 (EK), VAR‑5.D (LO), VAR‑5.D.1 (EK)

## Video transcript

- [Instructor] In a previous video, we defined this random variable x. It's a discrete random variable. It can only take on a finite number of values, and I defined it as the number of workouts I might do in a week. We calculated the expected value of our random variable x, which we can also call the mean of x and denote with the Greek letter mu, the same letter we use for a population mean. It's the probability-weighted sum of the various outcomes, and for this random variable with this probability distribution, we got an expected value, or mean, of 2.1.

What we're going to do now is extend this idea to measuring spread. We're going to think about the variance of this random variable, and then we can take the square root of that to find the standard deviation. The way we're going to do this has parallels with the way we've calculated variance in the past. For the variance of our random variable x, we take the difference between each outcome and the mean, square that difference, and multiply it by the probability of that outcome. Then we add those products up across all of the outcomes.

So for example, for the first outcome, you have zero minus 2.1, squared, times the probability of getting zero, which is 0.1. Then plus one minus 2.1, squared, times the probability that you get a one, 0.15. Then plus two minus 2.1, squared, times the probability that you get a two, 0.4. Then plus three minus 2.1, squared, times 0.25. And last but not least, plus four minus 2.1, squared, times 0.1. So once again: the difference between each outcome and the mean, we square it, and we multiply by the probability of that outcome.

Zero minus 2.1 is negative 2.1, and negative 2.1 squared is just 2.1 squared, so I'll write this as 2.1 squared times 0.1. That's the first term.
And then one minus 2.1 is negative 1.1, and when we square that, it's the same thing as 1.1 squared, which is 1.21, but I'll just write it out: plus 1.1 squared times 0.15. Then two minus 2.1 is negative 0.1, and negative 0.1 times negative 0.1 is 0.01, so that's plus 0.01 times 0.4. Then three minus 2.1 is 0.9, so this is plus 0.9 squared, which is 0.81, times 0.25. And we're almost there: four minus 2.1 is 1.9, so this is plus 1.9 squared times 0.1. Adding all of that up, we get 1.19, so the variance is equal to 1.19.

If we want the standard deviation of this random variable, we denote that with the Greek letter sigma. The standard deviation of the random variable x is equal to the square root of the variance, so the square root of 1.19. Let me get the calculator back here and take the square root of 1.19. That gives us approximately 1.09.

So let's see if this makes sense. Let me put this all on a number line right over here, with the outcomes zero, one, two, three, and four. You have a 10% chance of getting a zero, so I'll draw that like this; let's say this is a height of 10%. You have a 15% chance of getting a one, so that bar is one and a half times as high. You have a 40% chance of getting a two, so that's going to be like this. You have a 25% chance of getting a three, like this. And you have a 10% chance of getting a four, like that. So this is a visualization of this discrete probability distribution. I didn't draw the vertical axis here, but this would be 0.1, this would be 0.15, this would be 0.25, and that is 0.4. And we see that the mean is at 2.1, which makes sense.
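The variance and standard deviation calculation worked through above can also be written compactly. The block below is only a restatement in standard notation, not something shown in the video: it assumes the random variable is written as $X$, its outcomes as $x_i$, and their probabilities as $p_i$, with $\mu = 2.1$.

$$
\begin{aligned}
\operatorname{Var}(X) &= \sum_i (x_i - \mu)^2 \, p_i \\
&= (0 - 2.1)^2(0.1) + (1 - 2.1)^2(0.15) + (2 - 2.1)^2(0.4) + (3 - 2.1)^2(0.25) + (4 - 2.1)^2(0.1) \\
&= 0.441 + 0.1815 + 0.004 + 0.2025 + 0.361 \\
&= 1.19 \\
\sigma_X &= \sqrt{\operatorname{Var}(X)} = \sqrt{1.19} \approx 1.09
\end{aligned}
$$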
Even though this random variable only takes on integer values, you can have a mean that takes on a non-integer value. And then the standard deviation is 1.09. So 1.09 above the mean gets us close to 3.2, and 1.09 below the mean gets us close to one. So this all, at least intuitively, feels reasonable. The mean does seem to be indicative of the central tendency of this distribution, and the standard deviation does seem to be a decent measure of the spread.
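As a quick check on the numbers in this transcript, here is a minimal Python sketch that recomputes the mean, variance, and standard deviation from the probability distribution described above. The names `distribution`, `mean`, `variance`, and `std_dev` are chosen for this illustration and do not come from the video.

```python
import math

# Probability distribution from the transcript:
# number of workouts in a week -> probability of that outcome.
distribution = {0: 0.10, 1: 0.15, 2: 0.40, 3: 0.25, 4: 0.10}


def mean(dist):
    """Probability-weighted sum of the outcomes."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Probability-weighted sum of squared differences from the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def std_dev(dist):
    """Square root of the variance."""
    return math.sqrt(variance(dist))


mu = mean(distribution)
var = variance(distribution)
sigma = std_dev(distribution)

print(f"mean     = {mu:.2f}")
print(f"variance = {var:.2f}")
print(f"std dev  = {sigma:.2f}")
print(f"one std dev around the mean: {mu - sigma:.2f} to {mu + sigma:.2f}")
```

Run as written, this should print a mean of 2.10, a variance of 1.19, a standard deviation of 1.09, and a one-standard-deviation range of 1.01 to 3.19, which lines up with the "close to one" and "close to 3.2" estimates in the transcript.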