Inference and experiments
Statistical significance of experiment
Voiceover: In an experiment aimed at studying the effect of advertising on eating behavior in children, a group of 500 children, seven to 11 years old, were randomly assigned to two different groups. After randomization, each child was asked to watch a cartoon in a private room containing a large bowl of goldfish crackers. The cartoon included two commercial breaks. The first group watched food commercials, mostly snacks, while the second group watched non-food commercials, games and entertainment products. Once the child finished watching the cartoon, the conductors of the experiment weighed the cracker bowls to measure how many grams of crackers the child ate. They found that the mean amount of crackers eaten by the children who watched food commercials is 10 grams greater than the mean amount of crackers eaten by the children who watched non-food commercials.
Let's just think about what happens up to this point. They took 500 children and randomly assigned them to two different groups. You have group one over here and you have group two. The first group watched food commercials. We could call this the treatment group, because we're trying to see what the effect of watching food commercials is. Then they tell us the second group watched non-food commercials, so this is the control group. Once each child finished watching the cartoon, they weighed how much of the crackers that child ate, took the mean for each group, and found that the kids in the treatment group ate 10 grams more on average than the kids in the control group. Just looking at that data makes you believe that, okay, maybe something happened over here.
Maybe the treatment of watching the food commercials made the children eat more of the goldfish crackers. But the question you always have to ask yourself in a situation like this is: isn't there some probability that this would have happened by chance? Even if these were just two random groups and you made them all watch the same commercials, there's some chance that the mean of one group could be dramatically different from the mean of the other. It might just have happened in this experiment that the kids here ate 10 grams more on average. How do you figure out the probability that the 10 gram difference in mean amount eaten could have happened just by chance?
Well, the way you do it is what they do right over here. Using a simulator, they re-randomized the results into two new groups and measured the difference between the means of the new groups. They repeated the simulation 150 times and plotted the resulting differences, as given below.
What they did is they said, okay, we have 500 children, numbered one, two, three, all the way up to 500. For each child they measured the weight of the crackers that child ate. Maybe child one ate two grams, child two ate four grams, child three ate, I don't know, 12 grams, all the way to child number 500, who maybe didn't eat anything at all, zero grams. The first time around, the children were randomly assigned into the two groups. Let's say, just listing them like this, that the first half was in the treatment group and the second half was in the control group. What they're doing now is taking the same results and re-randomizing them.
Now they're saying, okay, let's maybe put this person in group number two, and this person in group number two, this person stays in group number two, and these people stay in group number one. They're completely mixing up all of the results that they had. Which new group a result lands in is completely random with respect to whether the child had watched the food commercials or the non-food commercials. Then they compute the mean of the new group one and the new group two, and they ask: what is the distribution of the differences in means?
When they did it this way, essentially just randomly taking these results and putting them into two new buckets, you get a bunch of cases where there's no difference in the means. Out of the 150 times that they repeated the simulation, let's count. The dots are so small that I'm having trouble counting them, I'm aging, but it looks like high teens, about 20 times, when there's actually no noticeable difference in the means of the groups where you just randomly allocate the results between the two groups.
When you look at this, if you just randomly put people into two groups, the situations where you get a 10 gram difference are actually very unlikely. The axis is the difference between the means of the new groups. It's not clear whether this is group one minus group two or group two minus group one, but in either case, the situations where you have a 10 gram difference in means are only two out of the 150 times. There are 150 dots here, so when you just randomly put these results into two groups, the means are this different only two out of 150 times.
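To make the re-randomization step concrete, here is a minimal Python sketch of the kind of simulation described above. It is not the simulator used in the exercise. The function name `rerandomized_differences`, the made-up cracker weights, and the even 250/250 split are all assumptions for illustration; only the 500 children and the 150 repetitions come from the problem.

```python
import random
from statistics import mean


def rerandomized_differences(amounts, n_group_one, n_repeats, seed=0):
    """Shuffle the recorded amounts into two new groups n_repeats times.

    Returns one difference in means per repetition
    (new group one minus new group two).
    """
    rng = random.Random(seed)
    pool = list(amounts)  # copy so the caller's list is left untouched
    diffs = []
    for _ in range(n_repeats):
        rng.shuffle(pool)              # mix up all of the results
        new_one = pool[:n_group_one]   # new group one
        new_two = pool[n_group_one:]   # new group two
        diffs.append(mean(new_one) - mean(new_two))
    return diffs


# Hypothetical data: the real cracker weights are not given in the
# transcript, so these 500 values are invented placeholders.
data_rng = random.Random(1)
amounts = [data_rng.uniform(0, 30) for _ in range(500)]

# 150 repetitions, as in the exercise; 250 per group is an assumption.
diffs = rerandomized_differences(amounts, n_group_one=250, n_repeats=150)
print(len(diffs))
```

Each value in `diffs` plays the role of one dot on the plot: the difference in means you would see if group membership had nothing to do with which commercials a child watched.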
Two out of 150 is on the order of 2%, or actually it's less than 2%; it's between 1% and 2%. Now take the specific situation we're talking about. Let's say that this is group one minus group two in terms of how much was eaten, so you're looking at this situation right over here, where group one's mean is 10 grams greater. That happened only one out of 150 times, less frequently than one in 100 times. So you say, well, if this was just random, the probability of getting the results that you got is less than 1%.
To me, and to most statisticians, that tells us that our experiment was significant. Consider the result that you got: the mean for the children who watched food commercials being 10 grams greater than the mean amount of crackers eaten by the children who watched non-food commercials. If you just randomly put 500 kids into two different buckets, then based on the simulation results, run 150 times, that only happened one out of the 150 times. It's very unlikely that this was purely due to chance. If this was just a chance event, it would only happen roughly one in 150 times, so the fact that it happened in your experiment makes you feel pretty confident that your result is significant. In most studies, in most experiments, the threshold they use for calling something statistically significant is that the probability of it happening by chance is less than 5%. This is less than 1%, so I would definitely say that the experiment is significant.
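The arithmetic behind that conclusion can be checked with a short Python sketch. The counts (one dot in the direction of interest, two dots in either direction, 150 dots total) and the 5% threshold come from the discussion above. The helper `simulated_proportion`, its "at least as large as the observed difference" convention, and the stand-in list of differences are assumptions for illustration, since the actual dot positions are not given.

```python
def simulated_proportion(diffs, observed, two_sided=False):
    """Fraction of simulated differences at least as extreme as observed."""
    if two_sided:
        hits = sum(1 for d in diffs if abs(d) >= abs(observed))
    else:
        hits = sum(1 for d in diffs if d >= observed)
    return hits / len(diffs)


# Stand-in for the dot plot: it reproduces only the counts mentioned
# in the transcript (one dot at +10, one at -10, 150 dots in total).
# The zeros are placeholders, not the real simulated differences.
stand_in = [10] + [-10] + [0] * 148

one_sided = simulated_proportion(stand_in, observed=10)
either_direction = simulated_proportion(stand_in, observed=10, two_sided=True)
threshold = 0.05  # the conventional 5% significance threshold

print(f"{one_sided:.4f}", one_sided < threshold)                # 0.0067 True
print(f"{either_direction:.4f}", either_direction < threshold)  # 0.0133 True
```

Either way of counting lands well under 5%: about 0.67% for one direction and about 1.33% for both directions.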