Video transcript

Let's say you're some type of traffic engineer, and what you're trying to figure out is how many cars pass by a certain point on the street in a given amount of time. You want to figure out the probability that 100 cars pass, or that 5 cars pass, in a given hour. A good place to start is to define a random variable that represents what you care about. So let's say it's the number of cars that pass in some amount of time, let's say in an hour. Your goal is to figure out the probability distribution of this random variable. Once you know the probability distribution, you can figure out the probability that 100 cars pass in an hour, or the probability that no cars pass in an hour, and you'd be unstoppable.

Just as a little aside, to move forward in this video there are two assumptions we need to make, because we're going to study the Poisson distribution. The first is that any hour at this point on the street is no different than any other hour. We know that's probably false: during rush hour in a real situation you'd probably have more cars than in a non-rush hour. If you wanted to be more realistic, maybe we'd do it in a day... actually no, I shouldn't do a day. We have to assume that every hour is just like any other hour, and actually, even within the hour, there's really no differentiation from one second to the other in terms of the probability that a car arrives. That's a bit of a simplifying assumption that might not truly apply to traffic, but I think we can make that assumption.

The other assumption we need to make is that if a bunch of cars pass in one hour, that doesn't mean that fewer cars will pass in the next. In no way does the number of cars that pass in one period affect, or correlate with, or somehow influence the number of cars
that pass in the next. They're really independent.

Given that, we can at least try using the skills we have to model out some type of distribution. The first thing you do, and I'd recommend doing this for any distribution, is estimate the mean. Let's sit out on that curb and measure what this variable is over a bunch of hours, and then average it up. That's going to be a pretty good estimator for the actual mean of our population, or, since it's a random variable, the expected value of this random variable. Let's say you do that and you get your best estimate of the expected value of this random variable; I'll use the letter lambda. This could be 9 cars per hour, or it could be 9.3 cars per hour. You sat out there over hundreds of hours, you counted the number of cars each hour, and you averaged them all up. You said, on average there are 9.3 cars per hour, and you feel that's a pretty good estimate. So that's what you have there.

Let's see what we can do. We know the binomial distribution. The binomial distribution tells us that the expected value of a random variable is equal to the number of trials that the random variable is composed of (in the previous videos we were counting the number of heads in a series of coin tosses, so this would be the number of coin tosses) times the probability of success on each toss. That is, n times p. This is what we did with the binomial distribution. So maybe we can model our traffic situation in a similar way. This is the number of cars that pass in an hour. So maybe we could say lambda cars per hour is equal to... let's see. Let's make each experiment, or each toss of the coin, equal to whether a car passes in a given minute. There are 60 minutes per hour, so there would be 60 trials, and then the probability that we have success
in each of those trials, if we model this as a binomial distribution, would be lambda over 60 cars per minute. So the 60 would be n, and lambda over 60 would be the probability p, if we said that this is a binomial distribution. This probably wouldn't be that bad of an approximation. If you then said, oh, this is a binomial distribution, then the probability that our random variable equals some given value k (say, the probability that exactly three cars pass in a given hour) would be equal to n choose k, so 60 choose k, times the probability that a car passes in any minute, lambda over 60, raised to the number of successes we need, so to the k power, times the probability of no success, that no car passes, which is 1 minus lambda over 60, to the n minus k power. If we have k successes, we have to have 60 minus k failures: there are 60 minus k minutes where no car passed.

This actually wouldn't be that bad of an approximation, where you have 60 intervals and you say this is a binomial distribution, and you'd probably get reasonable results. But there's a core issue with modeling it as a binomial distribution: what happens if more than one car passes in a minute? The way we have it right now, we call it a success if a car passes in a minute, and it counts as only one success even if five cars pass in that minute.

So you say, OK Sal, I know the solution there. I just have to get more granular. Instead of dividing the hour into minutes, why don't I divide it into seconds? So instead of 60 intervals I'll use 3,600 intervals. The probability of k successful seconds (a second where a car is passing at that moment) out of 3,600 possible seconds is 3,600 choose k, times the probability that a car passes in any given
second. Well, that's the expected number of cars in an hour divided by the number of seconds in an hour, so lambda over 3,600, and that's raised to the k power because we're going to have k successes. And then these are the failures: the probability of failure is 1 minus lambda over 3,600, and you're going to have 3,600 minus k failures. This would be an even better approximation. This actually would not be so bad. But you still have the situation where two cars can come within half a second of each other. And you say, OK, I see the pattern here. We just have to get more and more granular; we have to make this number larger and larger and larger. And your intuition is correct. If you do that, you'll end up getting the Poisson distribution.

This is really interesting, because a lot of times people give you the formula for the Poisson distribution and you can just plug in the numbers and use it. But it's neat to know that it really is just the binomial distribution, and the binomial distribution really did come from the common sense of flipping coins. That's where everything is coming from.

But before we prove that, as we take the limit as this number right here, the number of intervals, approaches infinity, this becomes the Poisson distribution, I'm going to make sure we have a couple of mathematical tools in our belt. The first is something that you're probably reasonably familiar with by now, but I just want to make sure: the limit as x approaches infinity of (1 + a/x) to the x power is equal to e to the a. Just to prove this to you, let's make a little substitution here. Let's say that 1/n is equal to a/x. Then x would be equal to n times a (x times 1 is equal to n times a). And so, as x approaches infinity, what does a approach? a is...
sorry, as x approaches infinity, what does n approach? Well, n is x divided by a, so n would also approach infinity. So this is the same thing, just making our substitution, as the limit as n approaches infinity of (1 + 1/n), since a/x is 1/n by the substitution, to the power n times a, since x is n times a by this substitution. And this is the same thing as the limit as n approaches infinity of (1 + 1/n) to the n, all of that to the a. Since there's no n out here in the exponent a, we can just take the limit of the inner part and then take that to the a power. So that's going to be equal to the limit as n approaches infinity of (1 + 1/n) to the nth power, all of that to the a. And this is our definition of e, or one of the ways to get to e. If you watch the videos on compound interest and all that, this is how we got to e. If you try it out on your calculator, just try larger and larger n's here and you'll get e. So this inner part is equal to e, and we raise it to the a power, so it's equal to e to the a. Hopefully you're pretty satisfied that this limit is equal to e to the a.

Then the other tool I want in our belt, and I'll probably actually do the proof in the next video, is to recognize that x factorial over (x minus k) factorial is equal to x times (x minus 1) times (x minus 2), all the way down to (x minus k plus 1). We've done this a lot of times, but this is the most abstract we've ever written it. And just so you know, there will be exactly k terms here: this is the first term, second term, third term, all the way down, and this is the kth term. This is important to our derivation of the Poisson distribution. Just to make this concrete with real numbers: if I had 7 factorial over (7 minus 2) factorial, that's equal to 7 times 6 times 5 times 4 times 3 times 2 times 1, over... 7 minus 2 is 5, so it's over 5 times 4 times 3 times 2 times 1.
These cancel out and you just have 7 times 6. So the first term is 7, and the last term is 7 minus 2 plus 1, which is 6. In this example k was 2, and you had exactly two terms. So once we know those two things, we're now ready to derive the Poisson distribution, and I'll do that in the next video. See you soon.
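
As a numerical companion to the video, here is a minimal Python sketch of the three ideas above: the binomial model with finer and finer intervals, the limit of (1 + a/x)^x, and the k-term product for x!/(x - k)!. The function names are hypothetical, chosen only for illustration. The Poisson formula used for comparison, e^(-lambda) * lambda^k / k!, is the standard one and is not derived in this video. The value 9.3 is the example estimate of lambda mentioned above.

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Standard Poisson formula (assumed here, not derived in this video)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def falling_product(x, k):
    """x * (x - 1) * ... * (x - k + 1): exactly k terms."""
    result = 1
    for i in range(k):
        result *= x - i
    return result


lam = 9.3  # example estimate of cars per hour from the video
k = 3      # "exactly three cars pass in a given hour"

# Split the hour into minutes, seconds, then sixtieths of a second.
for n in (60, 3600, 216000):
    print(f"n = {n:>6}: binomial P(k = {k}) = {binomial_pmf(k, n, lam / n):.6f}")
print(f"Poisson      P(k = {k}) = {poisson_pmf(k, lam):.6f}")

# (1 + a/x)^x should approach e^a as x grows.
a = 2.0  # arbitrary illustrative value
for x in (10, 1000, 100000):
    print(f"x = {x:>6}: (1 + a/x)^x = {(1 + a / x) ** x:.6f}")
print(f"e^a = {math.exp(a):.6f}")

# 7!/(7 - 2)! has exactly two terms: 7 * 6.
print(falling_product(7, 2), math.factorial(7) // math.factorial(7 - 2))
```

Running it should show the binomial probabilities moving toward the Poisson value as the number of intervals grows, which is the pattern the video describes before taking the limit.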