# Bernoulli distribution mean and variance formulas

## Video transcript

In the last video we figured out the mean, variance and standard deviation for our Bernoulli distribution with specific numbers. What I want to do in this video is generalize it: to figure out the formulas for the mean and the variance of a Bernoulli distribution if we don't have the actual numbers, if we just know that the probability of success is p and the probability of failure is 1 minus p.

So let's look at a population where the probability of success-- we'll define success as 1-- is p, and the probability of failure-- that's the value 0-- is 1 minus p, whatever this might be. And obviously, if you view them as percentages, these are going to add up to 100%. Or if you add up these two values, they are going to add to 1. And that needs to be the case because these are the only two possibilities that can occur. If this is a 60% chance of success, there has to be a 40% chance of failure. 70% chance of success, 30% chance of failure.

Now with this definition-- and this is the most general definition of a Bernoulli distribution, it's really exactly what we did in the last video-- I want to calculate the expected value, which is the same thing as the mean of this distribution, and I also want to calculate the variance, which is the same thing as the expected squared distance of a value from the mean. So let's do that.

So what is going to be the mean? Well, that's just the probability-weighted sum of the values that this could take on. There is a 1 minus p probability that we get failure, that we get 0, so that's 1 minus p times 0. And then there is a p probability of getting 1, so plus p times 1. Well, this is pretty easy to calculate. 0 times anything is 0, so that term drops out. And then p times 1 is just going to be p. So pretty straightforward. The mean, the expected value of this distribution, is p.
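
The probability-weighted sum described above can be sketched in a few lines of Python. This is only an illustration and not part of the video: the function name `bernoulli_mean` is hypothetical, and 0.6 is simply the success probability from the earlier example.

```python
def bernoulli_mean(p: float) -> float:
    """Mean of a Bernoulli distribution as a probability-weighted sum.

    Hypothetical helper, named here for illustration only.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    # Failure is the value 0 with probability 1 - p;
    # success is the value 1 with probability p.
    outcomes = [(0, 1.0 - p), (1, p)]
    return sum(value * prob for value, prob in outcomes)


print(bernoulli_mean(0.6))  # 0.6
```

The 0 term contributes nothing to the sum, so the result is just p, as in the transcript.
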
And p might be here or something, somewhere between 0 and 1. So once again it's a value that you cannot actually take on in this distribution, which is interesting. But it is the expected value.

Now what is going to be the variance of this distribution? Remember, that is the probability-weighted sum of the squared distances from the mean. Now what's the probability that we get a 0? We already figured that out. There's a 1 minus p probability that we get a 0. So that is the probability part. And what is the squared distance from 0 to our mean? Well, it's going to be 0, that's the value we're taking on, minus our mean, which is p, and all of that squared-- this is the squared distance, let me be very careful. Then plus the probability that we get a 1, which is just p, times the squared distance from 1 to the mean. And what's the difference between 1 and the mean? It's 1 minus our mean, which is going to be p over here. And we're going to want to square this as well. This right here is going to be the variance.

Now let's actually work this out. So this is going to be equal to 1 minus p times-- now 0 minus p is going to be negative p, and if you square it you're just going to get p squared. So it's 1 minus p times p squared. Then plus p times-- what's 1 minus p, quantity squared? It's going to be 1 squared, which is just 1, minus 2 times the product of 1 and p, so minus 2p right over here, and then plus negative p, squared, so plus p squared, just like that.

And now let's multiply everything out. This first term right over here, 1 minus p times p squared, is going to be p squared minus p to the third. And then this whole thing over here is going to be plus p times 1, which is p.
p times negative 2p is negative 2p squared. And then p times p squared is p to the third. Now we can simplify these. The negative p to the third cancels out with the p to the third. And then we have p squared minus 2p squared. So you have this p right over here, and when you add p squared to negative 2p squared you're left with negative p squared. So this is equal to p minus p squared. And if you want to factor a p out of this, this is going to be equal to p times-- if you take p divided by p you get a 1, and p squared divided by p is p-- so p times 1 minus p, which is a pretty neat, clean formula. So our variance is p times 1 minus p.

And if we want to take it to the next level and figure out the standard deviation, the standard deviation is just the square root of the variance, which is equal to the square root of p times 1 minus p.

And we could even verify that this actually works for the example from the last video. Our mean is p, the probability of success. We see that indeed it was 0.6. And we know that our variance is essentially the probability of success times the probability of failure. The probability of success in this example was 0.6, and the probability of failure was 0.4. You multiply the two, you get 0.24, which is exactly what we got in the last example. And if you take its square root for the standard deviation, it's about 0.49.

So hopefully you found that helpful, and we're going to build on this later on in some of our inferential statistics.
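
As a quick numerical check of the result, the sketch below computes the variance two ways: directly from the definition (the probability-weighted sum of squared distances from the mean) and from the closed form p times 1 minus p. It is an illustration rather than anything shown in the video, and the function names are hypothetical.

```python
import math


def bernoulli_variance_from_definition(p: float) -> float:
    """Probability-weighted sum of squared distances from the mean."""
    mean = p  # the mean of a Bernoulli distribution is p
    return (1.0 - p) * (0 - mean) ** 2 + p * (1 - mean) ** 2


def bernoulli_variance(p: float) -> float:
    """Closed-form variance: p times (1 - p)."""
    return p * (1.0 - p)


p = 0.6  # success probability from the example in the transcript
variance = bernoulli_variance(p)
std_dev = math.sqrt(variance)

# The definition and the closed form should agree up to floating-point rounding.
assert math.isclose(bernoulli_variance_from_definition(p), variance)

print(round(variance, 2), round(std_dev, 2))  # 0.24 0.49
```

The comparison uses `math.isclose` rather than `==` because the two expressions can differ by a tiny rounding error even though they are algebraically identical.
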