# Bernoulli distribution mean and variance formulas

## Video transcript

In the last video we figured
out the mean, variance and standard deviation for our
Bernoulli Distribution with specific numbers. What I want to do in this video
is to generalize it. To figure out really the
formulas for the mean and the variance of a Bernoulli
Distribution if we don't have the actual numbers. If we just know that the
probability of success is p and the probability of failure
is 1 minus p. So let's look at this, let's
look at a population where success-- we'll
define success as 1-- has a probability of p, and
failure, which we'll define as 0, has a probability
of 1 minus p. Whatever this might be. And obviously, if you add these
two up, if you view them as percentages, these are
going to add up to 100%. Or if you add up these
two values, they are going to add to 1. And that needs to be the case
because these are the only two possibilities that can occur. If this is 60% chance of success
there has to be a 40% chance of failure. 70% chance of success, 30%
chance of failure. Now, with this definition, which is the most
general definition of a Bernoulli Distribution and is really
exactly what we did in the last video with specific numbers,
I want to calculate the expected
value, which is the same thing as the mean of this
distribution, and I also want to calculate the variance, which
is the same thing as the expected squared distance of
a value from the mean. So let's do that. So what is the mean over here? What is going to be the mean? Well that's just the probability
weighted sum of the values that this
could take on. So there is a 1 minus p
probability that we get failure, that we get 0. So there's 1 minus
p probability of getting 0, so times 0. And then there is a p
probability of getting 1, plus p times 1. Well this is pretty
easy to calculate. 0 times anything is 0. So that cancels out. And then p times 1 is
just going to be p. So pretty straightforward. The mean, the expected value
of this distribution, is p. And p might be here
or something, somewhere between 0 and 1.
or something. So once again it's a value that
you cannot actually take on in this distribution,
which is interesting. But it is the expected value. Now what is going to
be the variance? What is the variance of
this distribution? Remember, that is the probability-weighted
sum of the squared distances from the mean. Now what's the probability
that we get a 0? We already figured that out. There's a 1 minus p probability
that we get a 0. So that is the probability
part. And what is the squared distance
from 0 to our mean? Well the squared distance from
0 to our mean-- let me write it over here-- it's going to be
0, that's the value we're taking on-- let me do that in
blue since I already wrote the 0-- 0 minus our mean-- let
me do this in a new color-- minus our mean. That's too similar
to that orange. Let me do the mean in white. 0 minus our mean, which is p, and that gets squared,
plus the probability that we get a 1, which is just p-- this
is the squared distance, let me be very careful. It's the probability weighted
sum of the squared distances from the mean. Now what's the distance-- now
we've got a 1-- and what's the difference between
1 and the mean? It's 1 minus our mean, which
is going to be p over here. And we're going to want to
square this as well. This right here is going
to be the variance. Now let's actually
work this out. So this is going to be
equal to 1 minus p, times something. Now 0 minus p is going
to be negative p. If you square it you're just
going to get p squared. So the first term is 1 minus p, times p squared. Then plus p times-- what's
1 minus p, squared? 1 minus p, squared, is going to be
1 squared, which is just 1, minus 2 times the
product of 1 and p. So this is going to be minus
2p right over here. And then plus negative
p, squared. So plus p squared
just like that. And now let's multiply
everything out. This is going to be, this term
right over here is going to be p squared minus p
to the third. And then this term over here,
this whole thing over here, is going to be plus
p times 1 is p. p times negative 2p is
negative 2p squared. And then p times p squared
is p to the third. Now we can simplify these. Negative p to the third cancels out
with p to the third. And then we have p squared
minus 2p squared. So this right here becomes,
you have this p right over here, so this is equal to p. And then when you add p squared
to negative 2p squared you're left with negative p
squared. So this is p minus p squared. And if you want to factor a p
out of this, this is going to be equal to p times, if you take
p divided by p you get a 1, p squared divided by p is p. So p times 1 minus p, which is
a pretty neat, clean formula. So our variance is p
times 1 minus p. And if we want to take it to the
next level and figure out the standard deviation, the
standard deviation is just the square root of the variance,
which is equal to the square root of p times 1 minus p. And we could even verify that
this actually works for the example that we did up here. Our mean is p, the probability
of success. We see that indeed it
was, it was 0.6. And we know that our variance is
essentially the probability of success times the probability
of failure. That's our variance
right over there. The probability of success
in this example was 0.6, probability of failure
was 0.4. You multiply the two, you get
0.24, which is exactly what we got in the last example. And if you take its square
root for the standard deviation, which is what we
do right here, it's about 0.49. So hopefully you found that
helpful, and we're going to build on this later on in some
of our inferential statistics.
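
If you'd like to check the two formulas yourself, here is a minimal sketch in Python. It is not from the video: the function name `bernoulli_stats` is made up for illustration, and exact fractions are used only so the comparison with p times 1 minus p is not blurred by rounding. It computes the mean and variance straight from the definitions used above, the probability-weighted sum of the values and the probability-weighted sum of the squared distances from the mean.

```python
from fractions import Fraction
from math import sqrt


def bernoulli_stats(p):
    """Return (mean, variance) of a 0/1 variable, straight from the definitions.

    Hypothetical helper for illustration; p is the probability of success (a 1).
    """
    outcomes = {0: 1 - p, 1: p}  # value -> probability of that value
    # Mean: probability-weighted sum of the values.
    mean = sum(value * prob for value, prob in outcomes.items())
    # Variance: probability-weighted sum of squared distances from the mean.
    variance = sum(prob * (value - mean) ** 2 for value, prob in outcomes.items())
    return mean, variance


p = Fraction(6, 10)  # the 60% chance of success from the earlier example
mean, variance = bernoulli_stats(p)
print(mean, variance, round(sqrt(variance), 2))  # 3/5 6/25 0.49
```

With p equal to 0.6 this gives a mean of 3/5, a variance of 6/25 (that is 0.24), and a standard deviation of about 0.49, matching the numbers from the example. Trying other values of p between 0 and 1 should give p for the mean and p times 1 minus p for the variance each time.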