# Mean and variance of Bernoulli distribution example

Sal calculates the mean and variance of a Bernoulli distribution (in this example the responses are either favorable or unfavorable). Created by Sal Khan.

## Want to join the conversation?

• Sal defines the variance as the probability-weighted sum of the squared distances from the mean, or equivalently the expected value of the squared distances from the mean. How does this formula relate to what we learned in earlier videos about calculating variance as the sum of the squared differences between each sample value and the sample mean, divided by (n-1)?
• You have all the right concepts in play; you just have to relate them. At the start of the video Sal remarks "[Imagine] we can survey every member of the population." This indicates we are computing the population variance, not the sample variance. In a previous video he stated that the population variance is the sum of the squared distances from the mean divided by N. This is much like the population mean, which is simply the sum of all the values divided by N. Now if you remember back to the video on expected value, we can express the population mean not only with the traditional formula (sum / N), but also as the sum of each value multiplied by its relative frequency (also called the weighted sum). This frees us from needing a fixed size for N, so we can take the expected value of an infinite set. Essentially the same process is happening here for the variance. Instead of dividing the sum of the squared distances by N to arrive at the variance, we multiply each squared distance by its weight (i.e. relative frequency, i.e. probability) in the distribution. With this method we can calculate the variance of an infinite population.
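To make the link between the two formulas explicit, here is a short derivation sketch. The symbols N (population size), n_x (the number of members whose value is x), and p(x) = n_x / N are notation introduced here for illustration; the 0.4 / 0.6 split and the 0 / 1 coding are the ones from this example.

```latex
\begin{align*}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2
          = \sum_{x}\frac{n_x}{N}\,(x-\mu)^2
          = \sum_{x} p(x)\,(x-\mu)^2 \\
         &= 0.4\,(0-0.6)^2 + 0.6\,(1-0.6)^2
          = 0.144 + 0.096 = 0.24
\end{align*}
```

Grouping equal values together is the only step: every member with the same value x contributes the same squared distance, so the sum over N members collapses to a sum over distinct values weighted by their share of the population.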
• So if you assigned unfavorable as 1 and favorable as 0, you'd end up with a different mean? How do you know what number to assign to each outcome?
• In fact, you could choose -1 for unfavorable and +1 for favorable. That way, a mean of 0 would represent a neutral overall favorability rating, a negative mean would indicate net unfavorable sentiment, and a positive mean would indicate net favorable sentiment. However, choosing 0 to represent one of the values simplifies the math.
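As a rough illustration of how the choice of numbers changes the result, the sketch below computes the mean and variance for three codings of the same 40% / 60% split. The helper `mean_and_variance` and the variable names are hypothetical, written only for this example; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def mean_and_variance(outcomes):
    """Return (mean, variance) of a discrete distribution.

    `outcomes` maps each numeric value to its probability.
    """
    mean = sum(p * x for x, p in outcomes.items())
    variance = sum(p * (x - mean) ** 2 for x, p in outcomes.items())
    return mean, variance


p_fav, p_unfav = Fraction(6, 10), Fraction(4, 10)

# Coding used in the video: unfavorable = 0, favorable = 1.
video_coding = mean_and_variance({0: p_unfav, 1: p_fav})
# Swapped coding: unfavorable = 1, favorable = 0.
swapped_coding = mean_and_variance({1: p_unfav, 0: p_fav})
# Symmetric coding from the reply: unfavorable = -1, favorable = +1.
symmetric_coding = mean_and_variance({-1: p_unfav, 1: p_fav})

for name, (m, v) in [
    ("0/1", video_coding),
    ("swapped", swapped_coding),
    ("-1/+1", symmetric_coding),
]:
    print(f"{name}: mean = {float(m)}, variance = {float(v)}")
```

Swapping the labels changes the mean (0.6 becomes 0.4) but not the variance, while the -1/+1 coding gives a mean of 0.2 and a variance four times larger, because the two values are now 2 apart instead of 1.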
• What if the population had a third choice? Let's say that part of the population didn't have a clear opinion about it and didn't vote. How would that affect the example mentioned above?
• What is the main difference between the Bernoulli and binomial distributions? Can I apply the binomial distribution to a Bernoulli case? For a coin flip there are also two options, heads or tails, just as a Bernoulli trial has two options, yes or no, right?
• I thought the mean is the sum of the numbers divided by the total number of data points. How can you use a mean that is not divided?
• When Sal says that 40% of the answers were unfavorable and 60% were favorable, that information is already calculated from the data points.
For example, suppose the population was 1000 people. Then to get 40% unfavorable, that means that 400 people answered unfavorable. Similarly, 600 people answered favorable. Then we could multiply 400*0 and add it to 600*1, then divide by 1000 to get 0.6.

If we know the percentage (or proportion) of the population in each category, that gives us enough information to calculate the mean even if we do not have access to the raw data. I can show you the algebra:
Let u be the number of people who answered unfavorable.
Let f be the number of people who answered favorable.
Let n be the number of people in the population.
We are given that u/n = 40% = 0.4 and f/n = 60% = 0.6.
We calculate the mean:
mu = (u*0 + f*1)/n = (u*0)/n + (f*1)/n = (u/n)*0 + (f/n)*1 = 0.4 *0 + 0.6 * 1 = 0 + 0.6 = 0.6.
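The algebra above can be checked numerically. This is a minimal sketch that assumes the hypothetical population of 1000 people from the answer; the variable names are made up for illustration.

```python
from statistics import pvariance

# Hypothetical population of 1000 responses: 0 = unfavorable, 1 = favorable.
responses = [0] * 400 + [1] * 600

# Mean from the raw data: sum of all values divided by N.
raw_mean = sum(responses) / len(responses)

# Mean from the proportions alone: each value times its share of the population.
weighted_mean = 0.4 * 0 + 0.6 * 1

# Population variance from the raw data (divides by N, not n - 1).
raw_variance = pvariance(responses)

print(raw_mean, weighted_mean, raw_variance)
```

Both means come out to 0.6, so the division by n has not disappeared; it is already built into the proportions.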
• So a Bernoulli distribution is just a situation where there are only 2 options? Like Yes and No, or Success and Failure, or Positive and Negative? And do they necessarily have to be opposites of each other? For example, if the question was "Do you like chocolate or vanilla ice cream better?", would the responses follow a Bernoulli distribution by definition, or not?
• What happened to the (n-1) value in the denominator?
• How can you just decide to define u and f as 0 and 1? Why did you choose those numbers?
• If the mean and variance of a binomial distribution are 3 and 1.5 respectively, find the probability of (1) at least one success and (2) exactly 2 successes.
• Nice problem!
If n represents the number of trials and p represents the success probability on each trial, the mean and variance are np and np(1 - p), respectively.
Therefore, we have np = 3 and np(1 - p) = 1.5.
Dividing the second equation by the first equation yields 1 - p = 1.5/3 = 0.5.
So p = 1 - 0.5 = 0.5, and n = 3/p = 3/0.5 = 6.

P(at least one success) = 1 - P(no successes) = 1 - (1 - p)^n = 1 - (0.5)^6 = 0.984375.
P(exactly 2 successes) = (n choose 2) p^2 (1-p)^(n-2) = [(6*5)/(1*2)] (0.5)^2 (0.5)^4 = 0.234375.

Have a blessed, wonderful day! 
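For anyone who wants to verify the binomial answer above, here is a small sketch using only the Python standard library. The function `binomial_pmf` and the variable names are hypothetical helpers written for this check.

```python
from math import comb

mean, variance = 3.0, 1.5

# Dividing np(1 - p) by np leaves 1 - p, so p = 1 - variance / mean.
p = 1 - variance / mean
# n = mean / p; round to get an integer number of trials.
n = round(mean / p)


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_at_least_one = 1 - binomial_pmf(0, n, p)
p_exactly_two = binomial_pmf(2, n, p)

print(n, p, p_at_least_one, p_exactly_two)
```

With these inputs the sketch recovers n = 6 and p = 0.5, and the two probabilities match the values 0.984375 and 0.234375 worked out by hand.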