Prisoners' dilemma and Nash equilibrium
Why two not-so-loyal criminals would want to snitch each other out
- On the same day, police have made two at first unrelated arrests.
- They arrest a gentleman named Alan.
- They caught him red-handed selling drugs.
- So it's an open-and-shut case.
- And on the same day they catch a gentleman named Bill,
- and he is also caught red-handed dealing drugs.
- And they bring them separately to the police station
- and they tell them, "look, this is an open-and-shut case
- you're going to get convicted for drug dealing
- and you're going to get two years."
- And they tell this to each of them individually.
- They are selling the same type of drugs; it just happened to be that way.
- But they were doing it completely independently.
- Two years for drugs is what's going to happen,
- assuming nothing else.
- But then the District Attorney has the chance
- to chat with each of these gentlemen separately.
- And while he's chatting with them, he reinforces the idea that
- this is an open-and-shut case for the drug dealing.
- They're each going to get 2 years if nothing else happens.
- But then he starts to have a suspicion, for whatever reason,
- that these were the 2 characters that actually committed
- a much more serious offence, that they had committed
- a major armed robbery a few weeks ago.
- And all the District Attorney has to go on
- is his hunch, his suspicion. He has no hard evidence.
- So what he wants to do is try to get a deal
- with each of these guys, so that they have an incentive
- to, essentially, snitch on each other.
- So what he tells each of them is
- "look, you're gonna get two years for drug dealing,
- that's kind of guaranteed". But he says
- "look, if you confess, and the other doesn't,
- then you will get 1 year
- and the other guy will get 10 years".
- So he's telling Al, "look, we caught Bill too just randomly today,
- if you confess that it was you and Bill who performed that armed robbery,
- your term is actually going down from 2 years to 1 year.
- But Bill is obviously going to have to spend a lot more time in jail,
- especially because he's not cooperating with us,
- he's not confessing".
- But then the other statement is also true:
- If you deny and the other confesses
- now it switches around.
- You will get 10 years because you're not cooperating,
- and the other, your co-conspirator will get a reduced sentence,
- will get the 1 year. So this is like telling Al
- "look, if you deny that you were the armed robber
- and Bill snitches you out,
- then you're gonna get 10 years in prison
- and Bill is only going to get 1 year in prison".
- And if both of you confess,
- you will both get 3 years.
- So this scenario is called "The Prisoner's Dilemma".
- Because we'll see in a second
- there is a globally optimal scenario for them
- where they both deny, and they both get 2 years.
- But we'll see, based on their incentives,
- assuming they don't have any unusual loyalty to each other,
- and these are, you know, these are hardened criminals here.
- They're not brothers or related to each other in any way.
- They don't have any kind of loyalty pact.
- We'll see that they will rationally pick,
- or they might rationally pick, a non-optimal scenario.
- And to understand that I'm going to draw something
- called a "pay-off matrix".
- So let me do it right here for Bill.
- So Bill has two options, he can confess to the armed robbery
- or he can deny that he knows anything
- about the armed robbery.
- And Al has the same two options.
- Al can confess and Al can deny.
- And since it's called the pay-off matrix,
- let me draw some grids here.
- And let's think about all of the different scenarios
- and what the pay-offs would be.
- If Al confesses and Bill confesses then they're in scenario 4,
- and they both get 3 years in jail:
- 3 for Al, and 3 for Bill.
- Now, if Al confesses and Bill denies
- then we are in scenario 2, from Al's point of view,
- Al is only going to get 1 year,
- but Bill is going to get 10 years.
- Now if the opposite thing happens,
- that Bill confesses and Al denies
- then it goes the other way around.
- Al is going to get 10 years for not cooperating and
- Bill is going to have a reduced sentence of 1 year for cooperating.
- And if they both deny, they're in scenario 1, where
- they're both just going to get their time for the drug dealing.
- So Al would get 2 years and Bill would get 2 years.
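The grid described above can be written down as a small lookup table. This is only a minimal Python sketch: the names `PAYOFFS`, `CONFESS`, `DENY`, and `years` are illustrative assumptions, not something drawn in the video. Each entry maps (Al's choice, Bill's choice) to (Al's years, Bill's years), using the sentences stated above.

```python
# Years in jail for each combination of choices (lower is better).
# All names here are hypothetical, chosen only for this illustration.
CONFESS, DENY = "confess", "deny"

# Key: (Al's choice, Bill's choice) -> value: (Al's years, Bill's years)
PAYOFFS = {
    (CONFESS, CONFESS): (3, 3),   # both confess
    (CONFESS, DENY): (1, 10),     # Al confesses, Bill denies
    (DENY, CONFESS): (10, 1),     # Al denies, Bill confesses
    (DENY, DENY): (2, 2),         # both deny: drug dealing only
}


def years(al_choice, bill_choice):
    """Return (Al's years, Bill's years) for one cell of the grid."""
    return PAYOFFS[(al_choice, bill_choice)]


# Print every cell of the pay-off matrix.
for (al, bill), (al_years, bill_years) in PAYOFFS.items():
    print(f"Al: {al:8} Bill: {bill:8} -> Al {al_years} years, Bill {bill_years} years")
```

Storing both sentences in one tuple per cell keeps the table close to the drawn grid, where each box holds one number for Al and one for Bill.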
- Now I alluded to this earlier in the video:
- what is the globally optimal scenario for them?
- Well, it's this scenario, where
- they both deny having anything to do with the armed robbery,
- then they both get 2 years.
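One way to make "globally optimal" concrete is to add up the years for both prisoners in each cell and look for the smallest total. That reading (smallest combined jail time) is an assumption made for this sketch, and the names `PAYOFFS` and `total_years` are hypothetical.

```python
# Key: (Al's choice, Bill's choice) -> value: (Al's years, Bill's years)
PAYOFFS = {
    ("confess", "confess"): (3, 3),
    ("confess", "deny"): (1, 10),
    ("deny", "confess"): (10, 1),
    ("deny", "deny"): (2, 2),
}


def total_years(cell):
    """Combined jail time for both prisoners in one cell."""
    al_years, bill_years = PAYOFFS[cell]
    return al_years + bill_years


# Assumption: "globally optimal" means the smallest combined jail time.
best_cell = min(PAYOFFS, key=total_years)

for cell in PAYOFFS:
    print(cell, "total years:", total_years(cell))
print("Smallest total:", best_cell, "with", total_years(best_cell), "years")
```

Under that assumption, the cell where both deny comes out best, with 4 years in total compared with 6 if both confess and 11 if only one does.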
- But what we'll see is that it is actually somewhat rational,
- assuming that they don't have any strong loyalties to each other,
- a strong level of trust with the other party,
- to not go there, it's actually rational for both of them to confess.
- And both of them confessing is actually a "Nash equilibrium".
- And we'll talk more about this.
- But a Nash equilibrium is where each party has picked the optimal choice
- given the choice of the other party.
- Here confessing is even stronger than that: it is each party's optimal choice
- given whatever choice the other party picks.
- And so from Al's point of view he says, well look,
- I don't know whether Bill is confessing or denying,
- so let's say he confesses; what's better for me to do?
- If he confesses and I confess, then I get 3 years.
- If he confesses and I deny, I get 10 years.
- So if he confesses it's better for me to confess as well.
- So this is a preferable scenario to this one down here.
- Now I don't know that Bill confessed, he might deny.
- If I assume Bill denied, is it better for me to confess
- and get 1 year, or deny and get 2 years?
- Well once again, it's better for me to confess.
- And so, regardless of whether Bill confesses or denies,
- the optimal choice for Al to pick,
- taking into account Bill's choices, is to confess.
- If Bill confesses, Al's better off confessing,
- If Bill denies, Al's better off confessing.
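Al's reasoning can be spelled out as a tiny search: hold Bill's choice fixed, then pick whichever of Al's choices gives Al the fewest years. This is a sketch under assumed names (`PAYOFFS`, `CHOICES`, `al_best_response` are all hypothetical), using the sentences from the grid above.

```python
CONFESS, DENY = "confess", "deny"
CHOICES = (CONFESS, DENY)

# Key: (Al's choice, Bill's choice) -> value: (Al's years, Bill's years)
PAYOFFS = {
    (CONFESS, CONFESS): (3, 3),
    (CONFESS, DENY): (1, 10),
    (DENY, CONFESS): (10, 1),
    (DENY, DENY): (2, 2),
}


def al_best_response(bill_choice):
    """Al's choice that minimizes Al's own years, holding Bill's choice fixed."""
    return min(CHOICES, key=lambda al_choice: PAYOFFS[(al_choice, bill_choice)][0])


for bill_choice in CHOICES:
    print(f"If Bill chooses to {bill_choice}, Al does best to {al_best_response(bill_choice)}")

# Confessing is "dominant" for Al if it is his best response to every choice of Bill's.
confess_is_dominant_for_al = all(al_best_response(b) == CONFESS for b in CHOICES)
print("Confessing is always best for Al:", confess_is_dominant_for_al)
```

The check only looks at Al's own number in each cell (index 0), which mirrors how Al compares 3 against 10, and then 1 against 2.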
- Now we look at it from Bill's point of view,
- and it's completely symmetric.
- Bill says, well, I don't know if Al's confessing or denying.
- If Al confesses, I can confess and get 3 years,
- or I can deny and get 10 years.
- Well, 3 years in prison is better than 10,
- so I would go for the 3 years,
- if I know Al is confessing.
- But I don't know that Al's definitely confessing, he might deny.
- If Al's denying, I could confess and get 1 year,
- or I could deny and get 2 years.
- Well, once again, I would want to confess and get the 1 year.
- So for Bill, taking into account each of the choices that Al might make,
- it's always better for him to confess.
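The claim that Bill's view is "completely symmetric" can be checked directly: swapping who makes which choice should just swap the two sentences. The sketch below tests that and then repeats the best-response search from Bill's side. The names `is_symmetric` and `bill_best_response` are hypothetical, introduced only for this illustration.

```python
CONFESS, DENY = "confess", "deny"
CHOICES = (CONFESS, DENY)

# Key: (Al's choice, Bill's choice) -> value: (Al's years, Bill's years)
PAYOFFS = {
    (CONFESS, CONFESS): (3, 3),
    (CONFESS, DENY): (1, 10),
    (DENY, CONFESS): (10, 1),
    (DENY, DENY): (2, 2),
}


def is_symmetric(payoffs):
    """True if swapping the two players' choices just swaps their sentences."""
    return all(
        payoffs[(bill_choice, al_choice)] == (bill_years, al_years)
        for (al_choice, bill_choice), (al_years, bill_years) in payoffs.items()
    )


def bill_best_response(al_choice):
    """Bill's choice that minimizes Bill's own years, holding Al's choice fixed."""
    return min(CHOICES, key=lambda bill_choice: PAYOFFS[(al_choice, bill_choice)][1])


print("Game is symmetric:", is_symmetric(PAYOFFS))
for al_choice in CHOICES:
    print(f"If Al chooses to {al_choice}, Bill does best to {bill_best_response(al_choice)}")
```

Because the table is symmetric, Bill's search lands on the same answer Al's did: confess in both cases.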
- And so this is interesting.
- They're rationally deducing that they should get to this scenario,
- this Nash equilibrium state,
- as opposed to this globally optimal state.
- They're both getting 3 years by both confessing
- as opposed to both of them getting 2 years by both denying.
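To see which cells of the grid are Nash equilibria, one can test every cell and ask whether either prisoner could shorten his own sentence by changing only his own choice. This is a brute-force sketch for a two-by-two game under assumed names (`is_nash` and `equilibria` are hypothetical), not a general-purpose solver.

```python
CONFESS, DENY = "confess", "deny"
CHOICES = (CONFESS, DENY)

# Key: (Al's choice, Bill's choice) -> value: (Al's years, Bill's years)
PAYOFFS = {
    (CONFESS, CONFESS): (3, 3),
    (CONFESS, DENY): (1, 10),
    (DENY, CONFESS): (10, 1),
    (DENY, DENY): (2, 2),
}


def is_nash(al_choice, bill_choice):
    """True if neither prisoner can get fewer years by changing only his own choice."""
    al_years, bill_years = PAYOFFS[(al_choice, bill_choice)]
    # Could Al do better by switching while Bill stays put?
    al_can_improve = any(
        PAYOFFS[(alt, bill_choice)][0] < al_years for alt in CHOICES
    )
    # Could Bill do better by switching while Al stays put?
    bill_can_improve = any(
        PAYOFFS[(al_choice, alt)][1] < bill_years for alt in CHOICES
    )
    return not (al_can_improve or bill_can_improve)


equilibria = [cell for cell in PAYOFFS if is_nash(*cell)]
print("Nash equilibria:", equilibria)
```

With these sentences, the only cell that passes is the one where both confess, even though the cell where both deny has shorter sentences for each of them.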
- The problem with this one is that it is an unstable state.
- If one of them assumes that
- they're somehow in that state temporarily,
- he says "well, I can always improve my scenario
- by changing what I wanna do".
- If Al thought that Bill was definitely denying
- Al can improve his circumstance by moving out of that state
- and confessing and only getting 1 year.
- Likewise, if Bill thought that Al was likely to deny,
- he realizes that he can do better by moving in this direction:
- instead of denying and getting 2 and 2,
- he could confess and get 1 year himself.
- So this is an unstable optimal scenario,
- but this Nash equilibrium, this state right over here
- is actually very, very, very stable.
- It's better for each of them to confess
- regardless of what the other one does,
- and assuming the other actor has already chosen his strategy,
- there's no incentive for either of them to change.
- If you're Bill, you can go from the Nash equilibrium
- of confessing to denying,
- but you're worse off, so you won't wanna do that:
- you're going from 3 years to 10 years.
- Or you could move in this direction,
- where it would be Al changing his decision.
- But once again that gets a worse outcome for Al:
- he's going from 3 years to 10 years.
- So this is the equilibrium state, the stable state,
- where both people pick something
- that is not optimal globally.
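The difference between the unstable and the stable state can be put into numbers by asking how many years one prisoner saves if he alone switches his choice. The sketch below does that for both states; the function name `years_saved_by_switching` and the player labels are hypothetical, and a negative result means the switch makes that prisoner worse off.

```python
CONFESS, DENY = "confess", "deny"
SWITCH = {CONFESS: DENY, DENY: CONFESS}

# Key: (Al's choice, Bill's choice) -> value: (Al's years, Bill's years)
PAYOFFS = {
    (CONFESS, CONFESS): (3, 3),
    (CONFESS, DENY): (1, 10),
    (DENY, CONFESS): (10, 1),
    (DENY, DENY): (2, 2),
}


def years_saved_by_switching(al_choice, bill_choice, player):
    """Years `player` saves by switching alone (negative means worse off)."""
    before = PAYOFFS[(al_choice, bill_choice)]
    if player == "Al":
        after = PAYOFFS[(SWITCH[al_choice], bill_choice)]
        return before[0] - after[0]
    after = PAYOFFS[(al_choice, SWITCH[bill_choice])]
    return before[1] - after[1]


for state in [(DENY, DENY), (CONFESS, CONFESS)]:
    for player in ("Al", "Bill"):
        saved = years_saved_by_switching(*state, player)
        print(f"From {state}, {player} switching alone saves {saved} years")
```

From the state where both deny, each prisoner gains a year by switching, which is what makes it unstable. From the state where both confess, switching alone costs 7 extra years, so neither has a reason to move.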