
Probability without equally likely events

Up until now, we've looked at probabilities surrounding only equally likely events. What about probabilities when we don't have equally likely events? Say, we have unfair coins? Created by Sal Khan.

Want to join the conversation?

  • Amoeba
    At , he says that the probability of getting heads is 60%. How does he get the 60%? Or is it just an example?
    (10 votes)
  • leony16111710
    Assuming every time you get a job interview, your chance of getting an offer is 25%, how many interviews must you get before you can be certain of getting one job offer?
    (14 votes)
    • pfoomp
      (1/4)^n represents the chance of getting an offer from each one of n interviews e.g. n=3 -> 3 offers from 3 interviews. But we'd be satisfied with only at least 1 offer from n interviews. This question combines concepts of "unfair coins" from this video (not 50% chance) and "at least 1 heads" (at least 1 offer) from the previous video, wherein Khan explains why it is calculated as 1 - (3/4)^n. If N represents an outcome of "no offer" and O represents "an offer" then the desired outcome with n=3 interviews is reached by OOO, NOO, ONO, NNO, OON, NON, or ONN and only fails with outcome NNN. It is therefore much easier to calculate the probability of NNN and subtract this from 1.0 = 100% to calculate the chance of not failing than it is to calculate the chances of the 7 distinct possibilities for succeeding and adding them all together, which would also work.

      @glenn.searby I'm just getting into probability now, so I'm not familiar with the standard terminology, but wouldn't it make more sense to call this a 95% degree of confidence than a 5%? The 5% is the chance of being incorrect, 95% is the confidence level of correctness. If this is standard terminology, then I guess I'll just have to get used to it.
      (3 votes)
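The "at least one offer" calculation in this thread can be sketched in a few lines. This is a minimal illustration, assuming each interview is independent with a 25% chance of an offer; the function names and the 95% target are hypothetical choices for the example, not something from the video. Note that 1 - (3/4)^n never reaches exactly 1 for any finite n, so absolute certainty is not attainable; you can only pick a confidence level.

```python
def p_at_least_one_offer(n, p_offer=0.25):
    """Chance of at least one offer in n independent interviews."""
    # Complement rule: 1 minus the chance that every interview fails.
    return 1 - (1 - p_offer) ** n


def interviews_needed(target, p_offer=0.25):
    """Smallest n whose chance of at least one offer reaches `target` (target < 1)."""
    n = 0
    while p_at_least_one_offer(n, p_offer) < target:
        n += 1
    return n


for n in (1, 3, 10, 20):
    print(n, round(p_at_least_one_offer(n), 4))

# Number of interviews for a 95% chance of at least one offer (assumed target).
print(interviews_needed(0.95))  # 11
```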
  • Alma Ionescu
    The Gambler's fallacy states that in a large number of throws, the probability for each event remains the same, but a gambler will tend to believe that after, let's say, 3 heads, the probability of getting a tail increases (which is wrong). On the other hand, if we calculate the probability of getting the same thing many times in a row, this probability is 1/(2^n). Therefore the probability for the other event to happen increases. An argument would be that this happens because the Gambler's fallacy assumes the events are correlated when they are not. But calculating the probability for the tenth throw in a row assumes the same thing, correlated events. How does one get past this paradox?
    (10 votes)
    • Arun Kumar Nagarajan
      That's a very interesting question. I'll try explaining as best I can. Let's assume that we've tossed a fair coin five times already and that all were heads. Now, the paradox for the sixth try is between
      a) that we tend to believe that it is more likely to get a tails now, as we've got a series of heads already and because the probability of getting all heads in a row keeps getting smaller
      and
      b) the probability for each event remains the same (Gambler's fallacy)

      Now, to get past this paradox, instead of thinking that the probability of getting all heads 6 times is very small (1/64 to be exact) and so it's likely to be a tail the sixth time, we must think that we've already gotten 5 heads in a row, the probability of which is 1/32, and irrespective of the sixth try being a tails or a heads, the probability of the whole sequence will be 1/64. That is, the most unlikely component of the very small probability has already occurred.

      "But calculating the probability for the tenth throw in a row assumes the same thing, correlated events"

      This assumption is wrong. The probability for the tenth throw in a row does NOT assume correlated events. If you look at the sample space, the probability of any particular combination of tails and heads over ten throws is 1/1024, because there are 2^10 = 1024 equally likely sequences. That is, P(HHHHHHHHHH), P(HHHHHHHHHT), and even P(HTHTHHTTHT) are all the same 1/1024. Hope this answers your question.
      (9 votes)
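The argument in this reply can be checked by brute force. The sketch below, assuming a fair coin and independent flips, enumerates all 2^10 = 1024 sequences of ten flips, confirms that each one has the same probability, and then computes the chance of heads on the sixth flip given that the first five were heads. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)

# Every possible sequence of ten fair flips, with its probability.
sequences = list(product("HT", repeat=10))
prob = {seq: half ** len(seq) for seq in sequences}

# Restrict to sequences that start with five heads in a row.
first_five_heads = [s for s in sequences if s[:5] == tuple("HHHHH")]
p_five = sum(prob[s] for s in first_five_heads)
p_five_then_head = sum(prob[s] for s in first_five_heads if s[5] == "H")

# Conditional probability of heads on flip six, given five heads so far.
p_next_head_given_five = p_five_then_head / p_five

print(len(sequences))          # 1024
print(p_five)                  # 1/32
print(p_next_head_given_five)  # 1/2
```

The streak itself is unlikely (1/32), but once it has happened the next flip is still 50/50, which is the point of the reply.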
  • Forrest T
    I have been looking for a good mathematical explanation as to why these probabilities are multiplied. The two answers are always; "If you add, the answer eventually is greater than 1", or the other answer is, "because that's how you do it". Now, there are all sorts of formulas and functions in probability class, and some are very complex and well explained here, but the simple multiplication of probabilities is opaque. Why? It must be harder than it looks.
    (6 votes)
  • Nick
    What would the probability of a coin landing on its side be? Let's say the coin is a quarter, with a thickness of 1.75mm and a diameter of 24.3mm. Would you calculate surface area or would this involve some physics? The stickiness of the flat surface it lands on would also affect it landing on its edge. Thanks in advance!
    (2 votes)
    • ZeroFK
      Mathematically speaking, a coin is nothing but a cylinder. The surface area of a cylinder - excluding the top and bottom - is the circumference of the top circle (or bottom, they're equal of course) times the height of the cylinder.
      So, for your coin, the area of the side of the coin is pi*24.3 mm * 1.75 mm = 133.6 mm^2. The area of its top and bottom is pi*R^2, with R = radius = half the diameter: pi*12.15^2 = 463.8 mm^2.
      To calculate the actual probability of the coin landing on this side would take some fairly complicated physics though. A naive approximation would be this:
      The coin has a top and bottom, each of 463.8 mm^2, and a side area of 133.6 mm^2. The chance of landing on the side area is 133.6 / (2*463.8+133.6) = 0.1259, or 12.59%. Of course the real probability is much less, since this completely disregards things like equilibrium, kinetic energy, and all that fun stuff.
      (6 votes)
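The naive surface-area estimate from this reply is easy to reproduce. This is only a sketch of that approximation, assuming the chance of landing on a surface is proportional to its area; as the reply says, it ignores the physics, so it overstates the real probability. The function name is hypothetical.

```python
import math


def naive_edge_probability(diameter_mm, thickness_mm):
    """Side area of the cylinder divided by its total surface area."""
    radius = diameter_mm / 2
    face_area = math.pi * radius ** 2                  # one flat face
    side_area = math.pi * diameter_mm * thickness_mm   # circumference * height
    return side_area / (2 * face_area + side_area)


# The quarter from the question: 24.3 mm across, 1.75 mm thick.
print(round(naive_edge_probability(24.3, 1.75), 4))  # 0.1259
```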
  • hms99sun
    How would you calculate, given 13 flips of a fair coin, the probability of getting a palindromic flip?

    A palindrome reads the same both ways, forwards and backwards. Examples of 13-flip palindromes are:

    TTTTTTTTTTTTT
    TTTHHHTHHHTTT
    HHHHHHHHHHHHH
    HHHTTTHTTTHHH

    ...

    Is there an easier way to do this than calculating the probabilities of all 128 possibilities? I know you can write out and calculate for smaller numbers of flips, but what about much larger ones?

    Thanks in advance.
    (2 votes)
    • upretee.rudhir
      This problem can also be extended to the case of an unfair coin. Let's assume that the coin has:
      P(H) = 60% = 3/5 and
      P(T) = 40% = 2/5.

      Using the same logic as Glenn's, the 1st and 13th flip, 2nd and 12th flip etc should match for palindromic flip. For the 1st and 13th flip to match, it should be either HH or TT. So we need to calculate P(HH or TT).
      = P(HH or TT)
      = P(HH) + P(TT)- P(HH and TT)
      = [P(H).P(H)] + [P(T).P(T)] - 0
      = 3/5*3/5 + 2/5*2/5
      = 9/25 + 4/25
      = 13/25

      Hence the probability of palindromic flip in 13 coin flips
      = (13/25).(13/25).(13/25).(13/25).(13/25).(13/25)
      = 4826809/244140625
      = 1.977%.

      Please correct me if I am wrong.
      (4 votes)
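The result in this reply can be verified by enumerating every 13-flip sequence. The sketch below assumes independent flips with P(H) = 3/5 and P(T) = 2/5 as in the reply; `palindrome_probability` is a hypothetical helper written for this check. It sums the probability of each palindromic sequence and compares the total with the closed form (13/25)^6, where the middle flip is free and each of the six mirrored pairs must match.

```python
from fractions import Fraction
from itertools import product

# Unfair coin from the reply above.
P = {"H": Fraction(3, 5), "T": Fraction(2, 5)}


def palindrome_probability(n_flips, p=P):
    """Total probability of all length-n flip sequences that read the same both ways."""
    total = Fraction(0)
    for seq in product("HT", repeat=n_flips):
        if seq == seq[::-1]:
            prob = Fraction(1)
            for flip in seq:
                prob *= p[flip]  # independent flips multiply
            total += prob
    return total


brute = palindrome_probability(13)
closed = Fraction(13, 25) ** 6
print(brute == closed)  # True
print(float(brute))     # roughly 0.01977
```

With a fair coin the same function gives 1/64, since each of the six mirrored pairs matches with probability 1/2.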
  • ZenTeapot
    At around , I got a little bit confused. All probabilities should add up to 1, right? Is the reason that one specific example (such as the one at ) doesn't add up to 1 (or equal one on its own) because there is more than one combination in the sample space?
    (3 votes)
    • Agent Smith
      Here 3 coins are being flipped. Since they are unfair the calculation is slightly complicated.
      P(H) = 60% = 0.6 and P(T) = 40% = 0.4
      Possibilities are
      HHH P(HHH) = 0.6 x 0.6 x 0.6 = 0.216
      TTT P(TTT) = (0.4)^3 = 0.064
      HHT, HTH, THH P(2 heads and a tail) = 3 x (0.6)^2 x (0.4) = 0.432
      TTH, THT, HTT P(2 tails and a head) = 3 x (0.4)^2 x (0.6) = 0.288
      Add all the probabilities = 0.216 + 0.064 + 0.432 + 0.288 = 1
      The key is knowing which probabilities must add up to 1: the probabilities of all the outcomes in the sample space.
      Here we are flipping 3 coins, or the same coin 3 times, so the events and the sample space are different from those for a single flip.
      (3 votes)
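The table in this reply can be reproduced by listing all eight outcomes of three flips. This is a small sketch assuming independent flips with P(H) = 0.6 and P(T) = 0.4; `heads_distribution` is a hypothetical helper that groups outcomes by their number of heads, and exact fractions are used so the total comes out to exactly 1.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

P = {"H": Fraction(3, 5), "T": Fraction(2, 5)}


def heads_distribution(n_flips):
    """Map each possible number of heads to its total probability."""
    dist = defaultdict(Fraction)
    for seq in product("HT", repeat=n_flips):
        prob = Fraction(1)
        for flip in seq:
            prob *= P[flip]  # independent flips multiply
        dist[seq.count("H")] += prob
    return dict(dist)


dist = heads_distribution(3)
for heads in sorted(dist, reverse=True):
    print(heads, "heads:", float(dist[heads]))
print("total:", sum(dist.values()))  # total: 1
```

A single sequence such as HHT has probability 0.144 on its own; only the full set of outcomes sums to 1.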
  • supernova 1999
    Between -19, Sal talked about flipping the coin many times as a trial. But even if he did so and got more heads in the trial, what is the guarantee he would get more heads in the real event? I mean, a coin doesn't have any memory of coming up heads many times in a trial.
    (2 votes)
    • John Angelo
      Supernova: There is a statistical concept known as "The Law of Large Numbers". In layman's terms, it says that in this case, if you were to flip this coin 1,000,000 times and it came up heads 60% of the time, you could be VERY confident that this coin was biased towards heads and that the probability of flipping a heads is 60%.

      Think of a baseball player at the beginning of a season. Let's say he gets 10 hits in his first 20 at-bats. Would you say he is a .500 hitter? No way -- the sample size is way too small. When the number of at-bats starts getting large, you can then make the type of determination of what type of average this hitter truly is.

      It's the same with the coin. The coin may be biased where it will fall on heads 60% of the time. But if you flip it 10 times, it could reasonably fall on tails 7 times. 10 times is not enough. How much is enough? Well, the more the better, but you can usually say that 1000 or more is enough to give you a true picture of probability.
      (4 votes)
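To put a number on why a few flips are not enough, the exact chance of a misleading result can be computed from the binomial formula. This is a sketch assuming independent flips of a coin with P(tails) = 0.4; the helper name is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def prob_at_least_k_tails(n_flips, k, p_tails=0.4):
    """Exact chance of k or more tails in n independent flips."""
    return sum(
        comb(n_flips, j) * p_tails ** j * (1 - p_tails) ** (n_flips - j)
        for j in range(k, n_flips + 1)
    )


# 70% or more tails from a coin that favors heads:
print(prob_at_least_k_tails(10, 7))    # about 0.055, so it does happen
print(prob_at_least_k_tails(100, 70))  # far smaller with 100 flips
```

The same 70% share of tails goes from a roughly 1-in-18 event at 10 flips to a vanishingly rare one at 100 flips, which is the intuition behind waiting for a large sample.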
  • Sami
    Why do we multiply p(a) and p(b) when we want to determine p(ab), where a and b are two independent events and p(ab) is the probability of the succession of the events a and b?
    (3 votes)
  • Calvin Ly
    How come for each independent event we have to multiply the probability of each events happening to get the probability of all of them happening? For example, P(H,H), why do we multiply if the events are independent? Thank you
    (3 votes)

Video transcript

So far, we've been dealing with one way of thinking about probability, and that was the probability of A occurring is the number of events that satisfy A over all of the equally likely events. And this is all of the equally likely events. And so in the case of a fair coin, the probability of heads-- well, it's a fair coin. So there's two equally likely events, and we're saying one of them satisfies being heads. So there's a 1/2 chance of you having a heads. The same thing for tails. If you took a die, and you said the probability of getting an even number when you roll the die. Well, there's six equally likely events, and there's three even numbers you could get. You could get a 2, a 4, or a 6. So there's three even numbers. So once again, you have a 1/2 chance of that happening. And this is a really good model where you have equally likely events happening. Now I'm going to change things up a little bit. So I'm going to draw a line here because this was just one way of thinking about probability. Now we're going to introduce another one that's more helpful when we can't think about equally likely events. And in particular, I'm going to set up an unfair coin. So this right over here is going to be my unfair coin. So that is my coin. Well, I could draw the coin. So it's a gold coin this time. It is unfair. One side of that coin is a little heavier than the other, even though it's meant to look fair. So it still has that picture of some president or something on one side of it. So this is the head side. This is heads, and then, obviously, on the back, you have tails. But as I mentioned, this is an unfair coin. And I'm going to make an interesting statement about this unfair coin and one that really doesn't fit into the mold that I set up over here, and this interesting statement is that we have more than a 50/50 chance of getting heads or more than a 50% chance or more than a 1/2 chance of getting heads.
I'm going to say that the probability of getting heads for this coin right over here is 60%. Or another way to say it, it's 0.6. Or another way to say it, it is 6 out of 10. Or another way to say it, it is 3/5. And this might make intuitive sense to you and hopefully it does a little bit, but I want you to realize that this is fundamentally different than what we were saying before because now we can't say that there are two equally likely events. There are two possible events. You can either get heads or tails. We're assuming that the coin won't fall on its edge. That's impossible. So you're either going to get heads or tails, but they're not equally likely anymore. So we really can't do this kind of counting the number of events that satisfy something over all of the possible events. In this situation, in order to visualize the probability, we have to kind of take what's called a "frequentist approach" or think about it in terms of frequency probability. And the way to conceptualize a 60% chance of getting heads is to think, if we had a super large number of trials, if we were to just flip this coin a gazillion times, we would expect that 60% of those would come up heads. It's unclear how I determined that this is 60%. Maybe I ran a computer simulation. Maybe I know exactly all of the physics of this, and I could completely model how it's going to fall every time. Or maybe I've actually just run a ton of trials. I've flipped the coin a million times, and I said, wow, 60% of those, 600,000 of those, came up heads. And then, we could make a similar statement about tails. So if the probability of heads is 60%, the probability of tails-- well, there's only two possibilities, heads or tails. So if I say the probability of heads or tails, it's going to be equal to 1 because you're going to get one of those two things. You have 100% chance of getting a heads or a tails, and these are mutually exclusive events. You can't have both of them.
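The frequentist picture described above, flipping the coin a huge number of times and watching the share of heads, can be sketched as a short simulation. This is an illustration only, assuming a pseudo-random generator stands in for the unfair coin; the function name, the seed, and the flip counts are hypothetical choices.

```python
import random


def heads_frequency(n_flips, p_heads=0.6, seed=0):
    """Simulate n_flips of an unfair coin and return the fraction that came up heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips


# The observed share of heads settles near 0.6 as the number of flips grows.
for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```

With only 10 flips the share can land well away from 60%, while with 100,000 flips it stays very close to it.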
The probability of tails is going to be 100% minus the probability of getting heads, and this, of course, is 60%. So it's 100% minus 60%, or 40%, or as a decimal, 0.4, or as a fraction, 4/10, or as a simplified fraction, 2/5. So, once again, this probability is saying-- we can't say equally likely events. We could say that, if we're going to do a gazillion of these, we would expect, as we get more and more and more trials, more and more flips, 40% of those would be tails. Now, with that out of the way, let's actually do some problems with this. So let's think about the probability of getting heads on our first flip and heads on our second flip. So, once again, these are independent events. The coin has no memory. Regardless of what I got on the first flip, I have the same chance of getting heads on the second flip. It doesn't matter if I got heads or tails on the first. So this is the probability of heads on the first flip times the probability of heads on the second flip, and we already know. The probability of heads on any flip is going to be 60%. I'll write it as a decimal. It makes the math a little bit easier, 0.6, 0.6, and we can just multiply. I'll do it right over here. So this is 0.6 times 0.6. Now, it's always good to do a reality check. One way to think about it is I'm taking 6/10 of 6/10, so it should be a little bit more than half of 6/10 or probably a little bit more than 3/10. And we've explained this in detail where we talk about multiplying decimals, but we essentially just multiply the numbers, not thinking about the decimals at first. 6 times 6 is 36. And then you count the number of digits we have to the right of the decimal. We have one, two to the right of the decimal. So we're going to have two to the right of the decimal in our answer. So it is 0.36, and that makes sense. We're taking 60% of 0.6. We're taking 0.6 of 0.6, a little bit more than half of 0.6. And, once again, it's a little bit more than 0.3. So this also makes sense.
So it's 0.36. Or another way to think about it is there's a 36% probability that we get two heads in a row, given this unfair coin. Remember, if it was a fair coin, it would be 1/2 times 1/2, which is 1/4, which is 25%, and it makes sense that this is more than that. Now, let's think about a slightly more complicated example. Let's say the probability of getting a tails on the first flip, getting a heads on the second flip, and then getting a tails-- I'm going to do this in a new color-- and then getting a tails on the third flip. So this is going to be equal to the probability of getting a tails on the first flip because these are all independent events. If you know that you had a tail on the first flip, that doesn't affect the probability of getting a heads on the second flip. So times the probability of getting a heads on the second flip, and then that's times the probability of getting a tails on the third flip. And the probability of getting a tails on any flip we know is 0.4. The probability of getting a heads on any flip is 0.6, and then the probability of getting tails on any flip is 0.4. And so, once again, we can just multiply these. So 0.4 times 0.6 times 0.4. There's actually a couple of ways we can think about it. Well, we could literally say, look, we're multiplying 4 times 6 times 4, and then we have three numbers behind the decimal point. So let's do it that way. 4 times 6 is 24. 24 times 4 is 96. So we write a 96, but remember, we have three numbers behind the decimal point. So it's one to the right of the decimal there, one to the right of the decimal there, one to the right of the decimal there. So three to the right. So we need three to the right of the decimal in our answer. So one, two-- we need one more to the right of the decimal. So our answer is 0.096. Or another way to think about it is-- write an equal sign here-- this is equal to a 9.6% chance.
So there's a little bit less than 10% chance, or a little bit less than 1 in 10 chance, of, when we flip this coin three times, us getting exactly a tails on the first flip, a heads on the second flip, and a tails on the third flip.
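Both worked examples in the transcript follow the same rule: for independent flips, multiply the per-flip probabilities. A minimal sketch of that rule, assuming P(heads) = 3/5 and P(tails) = 2/5 as in the video, with a hypothetical helper name and exact fractions to avoid decimal-place bookkeeping:

```python
from fractions import Fraction

# Unfair coin from the video: 60% heads, 40% tails.
P = {"H": Fraction(3, 5), "T": Fraction(2, 5)}


def sequence_probability(flips):
    """Probability of one exact sequence of independent flips, e.g. "THT"."""
    prob = Fraction(1)
    for flip in flips:
        prob *= P[flip]
    return prob


print(sequence_probability("HH"), float(sequence_probability("HH")))    # 9/25 0.36
print(sequence_probability("THT"), float(sequence_probability("THT")))  # 12/125 0.096
```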