Video transcript

- [Voiceover] "Giovanna usually takes bus B to work, but now she thinks that bus A gets her to work faster. She randomized 50 workdays between a treatment group and a control group. For each day from the treatment group, she took bus A; and for each day from the control group, she took bus B. Each day she timed the length of her drive." What she did here is really interesting and very important: she randomized the 50 workdays before she did this, instead of just waking up in the morning and deciding on her own which bus to take. That matters because humans are infamously bad at being random. Even when we think we're being random, we're actually not that random. She might inadvertently take bus A earlier in the week, when maybe the commute times are shorter. Or maybe she inadvertently takes bus A when the weather is better, when there's less traffic. Remember, there's a natural tendency for human beings to want to confirm their hypothesis. So, if she thinks that bus A is faster, maybe she'll want to pick the days where she'll get data to confirm her hypothesis. It's really important that she randomize the 50 workdays. What I could imagine she did is maybe she wrote each of the workdays, the dates, on a piece of paper. She would have 50 pieces of paper, and then she turned them all upside down, or maybe she closed her eyes, and she moved them all over her table. Then with her eyes closed she randomly moved them to either the left or the right of the table. If they moved to the left of the table, those are the days she takes bus A; if they moved to the right of the table, those are the days she takes bus B. That's how she can make sure that this is truly random. So then they tell us, and this is important, "The results of the experiment showed that the median travel duration for bus A is eight minutes less than the median travel duration for bus B." Or, one way to think about it: what if we took the treatment group median minus the control group median?
What would we get? Well, the treatment group median is eight minutes less than the control group median, right? This is A, this is B, so if this is eight less than this, then the difference is going to be equal to negative eight. This is just another way of restating what I have underlined right over here. Someone's car alarm went off, hope you're not hearing that. Anyway, I'll try to pay attention while it's going off (chuckles). "To test whether the results could be explained by random chance, she created the table below, which summarizes the results of 1000 re-randomizations of the data, with differences between medians rounded to the nearest five minutes." What is going on over here? You might say, "Well look, she got the result that she wanted to get. This data seems to confirm that bus A gets her to work faster. What's all this other business with re-randomization she's doing?" The important thing to realize, and she realizes this, is that she might have gotten this data that I underlined just by random chance. There's some chance that A and B are completely similar in terms of how long they take in reality, and she just happened to pick bus A on days where bus A got to work faster. Maybe bus B is even faster, but she just happened to take bus A on the days that it was faster, the days it just happened to have less traffic. What she's doing here is re-randomizing the data, and she wants to see, out of these 1000 re-randomizations, in what fraction of them do I get a result like this? Do I get a result where A is eight minutes or more faster? Or you could say, where the median travel duration for bus A is eight minutes less, or even less than that, than the median travel duration for bus B. So if it was nine minutes less, or 10 minutes less, or 15 minutes less, those are all the interesting ones. Those are the ones that confirm our hypothesis, that bus A gets to work faster. Let's look at this table. It's not below, it's actually to the right.
Let's just remind ourselves what she did here, because the first time you try to process this it can seem a little bit daunting. So, in her experiment, let me write this down, experiment... The car alarm outside, which you hopefully are not hearing, is actually a surprisingly pleasant sounding car alarm. It sounds like a slightly obnoxious bird, but anyway (laughs). Her experiment is, the way I described it, 25 days she would take bus A, 25 days she would take bus B. She would record all the travel times, so let's say that I have 25 data points in each column. Let's say for bus A she gets 12 minutes, 20 minutes, 25 minutes, and you just keep going, there's 25 data points. Let's just say that there are 12 data points less than 20 minutes and 12 data points more than 20 minutes. In this circumstance, her median time for bus A would be 20 minutes, and I just made this number up. So in order for this to be eight minutes less than the median time for bus B, the median for bus B would have to be 28, and maybe you have data points here. Maybe this is 18 and you have 12 that are less than 28. Then you have 12 more that are greater than 28. So the median time for bus B would be 28. Once again, I just made this data up. If you took the treatment group median, I'll just write TGM for short, TGM minus the control group median, what do you get? 20 minus 28 is negative eight. These are theoretical, hypothetical results for her actual experiment. Now what's all of this business over here? What she did is she took these times and she said, "You know what, let's just imagine a world where I could have gotten any of these times randomly on either bus." So she just randomly re-sorted them between A and B, and she did that a thousand times. The first time, the second time, the third time. She does this 1000 times.
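The re-randomization just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `median_difference` and `rerandomize` are hypothetical, and the travel times are made up (five per bus rather than 25) so that the medians come out to 20 and 28 as in the example above. They are not Giovanna's data.

```python
import random
from statistics import median


def median_difference(treatment, control):
    """Treatment group median minus control group median."""
    return median(treatment) - median(control)


def rerandomize(treatment, control, n_shuffles=1000, seed=0):
    """Pool all times, reshuffle them, split them back into two groups
    of the original sizes, and record the difference in medians each time."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    pooled = list(treatment) + list(control)
    n_treatment = len(treatment)
    differences = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        differences.append(
            median_difference(pooled[:n_treatment], pooled[n_treatment:])
        )
    return differences


# Made-up travel times in minutes (illustration only).
bus_a = [12, 18, 20, 22, 25]  # median 20
bus_b = [18, 24, 28, 30, 33]  # median 28

observed = median_difference(bus_a, bus_b)  # 20 - 28 = -8
differences = rerandomize(bus_a, bus_b)     # 1000 reshuffled differences
```

Each entry of `differences` plays the role of one re-randomization in her table; the actual values depend on the shuffles, so none are shown here.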
I'm assuming she used some type of computer program to do it, and each time, once again, she just took the data that she had and she rearranged it, she reshuffled it. Maybe in one reshuffle A gets this 18, then the 25, then a 30, and she's also reshuffling all these other data points that I just have as dots. So A has the 18, 25, 30, and maybe B gets 12, 20, and 28. She keeps doing this over and over again. In this random reshuffling, the treatment group median minus the control group median is going to be what? It's going to be equal to positive five. In this random shuffling, this hypothetical scenario, bus A's median would have been five minutes longer than bus B's. If she gets this result with this random re-sorting, she would have had a column here for five. Then she would have put one notch right over here. It looks like she classified the differences, or maybe she didn't even get that value, but she classified them by multiples of two. If she got this again, then she would have put a two here. Then she would have said, "Okay, in how many of these random reshufflings am I getting a scenario where there's a five minute difference, where the treatment group was five minutes longer?" So what is this table saying? For example, this is saying that in 18 out of the 1000 reshufflings, where she just randomly reshuffled the data, she found a scenario where her treatment group median was 10 minutes longer than her control group median. That is, in this hypothetical re-randomization, the treatment group is 10 minutes slower than the control group. There were 159 times where the treatment group... Once again, in her random reshuffling, these aren't based on observations, these are random reshufflings.
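As a quick check of the single reshuffle described above, here is a small Python sketch. It assumes only the three values per bus that were read out (the remaining data points, drawn as dots, are left out), and the `tally` counter is a hypothetical stand-in for the notches in her table.

```python
from collections import Counter
from statistics import median

# One hypothetical reshuffle, using only the values named in the transcript.
reshuffled_a = [18, 25, 30]  # median 25
reshuffled_b = [12, 20, 28]  # median 20

difference = median(reshuffled_a) - median(reshuffled_b)  # 25 - 20 = 5

# Record the result the way one notch would be added to a column of the table.
tally = Counter()
tally[difference] += 1
```

Repeating this for every reshuffle would build up the counts that the table summarizes.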
There's 159 times where her treatment group is four minutes slower than her control group. The whole reason for doing this is she says, "Okay, what's the probability of getting a result like this or better?" By "better" I mean one that even more strongly confirms her hypothesis that the treatment group is faster than the control group. Well, the scenario from her experiment is this one right over here, and then another one, where the treatment group is even faster, is this right over here. Here, the treatment group median is 10 less than the control group median. In how many of these scenarios, out of the thousand, is this occurring? Well, this one occurs 85 times, this one occurs eight. If you add these two together, you get 93 out of the thousand re-randomizations, or I guess you could say 9.3 percent of the time. In 9.3 percent of the 1000 re-randomizations, she got data that was as validating of her hypothesis as the actual experiment, or more. One way to think about this is that the probability of getting the results from her experiment, or better results, purely by chance is 9.3 percent. That's a reasonably low probability that this happened purely by chance. Now, a question is, "What's the threshold?" If it was 50 percent you'd say, "Okay, this was very likely to happen by chance." If it was 25 percent you'd say, "Okay, it's less likely to happen by chance, but it could happen." 9.3 percent is roughly 10 percent. So roughly one out of every 10 people who do an experiment like hers would get data like this even if it was just random. What typically happens amongst statisticians is they draw a threshold, and the threshold for statistical significance is usually five percent. One way to think about it: the probability of her getting this result, or a more extreme result, one that more strongly confirms her hypothesis, by chance is 9.3 percent.
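The arithmetic above can be written out as a short Python sketch. It uses only the two counts read out from the table (85 and 8); which count belongs to the -8 column and which to the -10 column is an assumption here, although the sum is the same either way. The five percent cut-off is the usual threshold mentioned above, not something the problem specifies.

```python
# Counts from the re-randomization table for differences of -8 or lower.
# Assumption: 85 belongs to the -8 column and 8 to the -10 column.
counts = {-8: 85, -10: 8}
total_rerandomizations = 1000
observed_difference = -8

# Reshuffles at least as favorable to bus A as the actual experiment.
as_extreme = sum(
    count for diff, count in counts.items() if diff <= observed_difference
)
probability = as_extreme / total_rerandomizations  # 93 / 1000 = 0.093

threshold = 0.05  # common cut-off for statistical significance
is_significant = probability <= threshold  # 0.093 > 0.05, so False
```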
If your cut-off for significance is five percent, if you said, "Okay, this has to be five percent or less," then you'd say, "Okay, this is not statistically significant." There's more than a five percent chance that I could have gotten this result purely through random chance. Once again, that just depends on where you set that threshold. When we go back, I think we've already answered the final question: "According to the simulations, what is the probability of the treatment group's median being lower than the control group's median by eight minutes or more?" Once again, eight minutes or more lower, that would be negative eight and negative 10. We just figured that out: that was 93 out of the 1000 re-randomizations, so it's a 9.3 percent chance. If you set five percent as your cut-off for statistical significance, you'd say, "Okay, this doesn't quite meet my cut-off, so maybe this is not a statistically significant result."