# Conditional probability and independence

## Video transcript

- [Instructor] James is
interested in weather conditions and whether the downtown train he sometimes takes runs on time. For a year, James records
whether each day is sunny, cloudy, rainy or snowy, as
well as whether this train arrives on time or is delayed. His results are displayed
in the table below. Alright, this is interesting. The columns are on time,
delayed and the total, so for example, there's a total of 170
sunny days that year, on 167 of which the train was on time and on three of which the train was delayed, and we can look at that
by the different types of weather conditions, and
then they ask, for these days, are the events delayed
and snowy independent? So to think about this, and remember, we're only going to be able to figure out experimental probabilities,
and you should always view experimental probabilities
as somewhat suspect. The more experiments you're
able to run, the more likely the result is to approximate the
true theoretical probability, but there's always some
chance that they might be different or even quite different. Let's use this data to try to calculate the experimental probability. So the key question here is what is the probability that the train is delayed? And then we wanna think
about what is the probability that the train is delayed
given that it is snowy? If we knew the theoretical probabilities and if they were exactly the same, if the probability of being
delayed was exactly the same as the probability of
being delayed given snowy, then being delayed and being
snowy would be independent, but if we knew the
theoretical probabilities and the probability of
being delayed given snowy were different than the
probability of being delayed, then we would not say that
these are independent events. Now, we don't know the
theoretical probabilities. We're just going to calculate
the experimental probabilities and we do have a good
number of experiments here, so if these are quite different, I would feel confident saying
that they are dependent. If the experimental probabilities
are pretty close, I would say that it would be
hard to make the statement that they are dependent,
and that you would probably lean towards independence,
but let's calculate this. What is the probability that
the train is just delayed? Pause this video and
try to figure that out. Well, let's see. If we just think in general,
we have a total of 365 trials, or 365 experiments, and of them, the train was delayed 35 times, so this probability is 35/365. Now, what's the probability that the train is delayed given that it is snowy? Pause the video and
try to figure that out. Well, let's see. We have a total of 20 snowy days and we are delayed 12
of those 20 snowy days, and so this is going to be a probability of 12/20, which is the same thing
as 60/100 if we multiply both the numerator and the denominator by five, so this is a 60% probability, or
I could say a 0.6 probability of being delayed when it is snowy. This is, of course, an
experimental probability, which is much higher than the probability of just being delayed. That one is less than 10%, or less than 0.1. I could get a calculator
to calculate it exactly. It'll be nine point something percent, or 0.09 something, but clearly, at least from the experimental data, it seems like
a much higher proportion of your snowy days are delayed than of days in general,
and so based on this data, because the experimental probability of being delayed given
snowy is so much higher than the experimental probability
of just being delayed, I would make the statement
that these are not independent, so for these days, are the events delayed and snowy independent? No.
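
The comparison above can be checked with a short script. This is only a sketch: it uses just the four counts quoted in the transcript (365 days, 35 delayed days, 20 snowy days, 12 delayed snowy days), since the full table is not reproduced here, and the variable names are made up for illustration.

```python
from fractions import Fraction

# Counts quoted in the transcript; the rest of the table is not used.
total_days = 365
delayed_days = 35
snowy_days = 20
snowy_and_delayed_days = 12

# Experimental probability that the train is delayed on any given day.
p_delayed = Fraction(delayed_days, total_days)

# Experimental probability that the train is delayed, given that it is snowy.
p_delayed_given_snowy = Fraction(snowy_and_delayed_days, snowy_days)

# How many times more likely a delay is on a snowy day than on a day in general.
ratio = p_delayed_given_snowy / p_delayed

print(f"P(delayed)         = {float(p_delayed):.4f}")
print(f"P(delayed | snowy) = {float(p_delayed_given_snowy):.4f}")
print(f"ratio              = {float(ratio):.2f}")
```

If the two events were independent, the two probabilities would be about equal and the ratio would be close to 1. With this data the ratio is far from 1, which matches the conclusion that delayed and snowy are not independent. How close counts as "close enough" for experimental data is a judgment call that this sketch does not try to settle.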