
Course: Statistics and probability > Unit 14 > Lesson 2: Chi-square tests for relationships

# Contingency table chi-square test

Sal uses the contingency table chi-square test to see if a couple of different herbs prevent people from getting sick. Created by Sal Khan.

## Want to join the conversation?

• Why do we use the data for both "sick" and "not sick" in computing our chi-square statistic? It seems like it will make our result look more deviant than it really is, since within each group the number of "not sick" people is directly determined by the number of "sick" people.

• Isn't there the potential for one herb to be really effective and the other to be ineffective? I feel like those scores could cancel each other out, leading to a Type II error (failing to reject the null hypothesis when it is false). I don't understand why, if you're interested in testing several conditions, it makes sense to mix all the data together.

• You wouldn't conduct a chi-square test to answer the question "are the herbs better than the placebo?" The correct null hypothesis in this example (one that a chi-square test can assess) is H0: there is no relationship between taking the pills and getting sick.
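Under that null hypothesis, the expected count in every cell comes from the pooled row and column totals, which is why both the "sick" and "not sick" rows enter the statistic. A minimal sketch in Python, using made-up counts rather than the video's actual data:

```python
# Chi-square statistic for a hypothetical 2 x 3 contingency table,
# computed by hand. Under H0 (no relationship between pill and sickness),
# each expected count is row_total * column_total / grand_total.

observed = {
    "sick":     [30, 20, 30],   # herb 1, herb 2, placebo (made-up counts)
    "not sick": [70, 80, 70],
}

groups = len(observed["sick"])                                # 3 columns
row_totals = {row: sum(counts) for row, counts in observed.items()}
col_totals = [sum(observed[row][j] for row in observed) for j in range(groups)]
grand = sum(row_totals.values())

chi2 = 0.0
for row, counts in observed.items():
    for j, obs in enumerate(counts):
        expected = row_totals[row] * col_totals[j] / grand
        chi2 += (obs - expected) ** 2 / expected

dof = (len(observed) - 1) * (groups - 1)                      # (rows-1)(cols-1)
print(round(chi2, 3), dof)                                    # 3.409, 2
```

With 2 degrees of freedom and a 5% significance level the critical value is about 5.991, so for these made-up counts we would fail to reject H0.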
• At one point Sal says that 21% did not get sick, and then writes 21% in the row labeled "sick". Did he make a mistake, or am I missing something here?

• Why use the chi-square statistic to address this problem instead of a Bernoulli-style proportion test, inferring from the data as before? Basically, how does one decide which approach to apply?

• Well, a Bernoulli hypothesis test on two proportions would work... if we had two samples. But in this case we have three samples (herb 1, herb 2, and placebo), and you can't compare three things with a single difference. If you say x1 − x2 = 0, it means x1 and x2 are the same; but if you say x1 − x2 − x3 = 0, you can't really say anything about them, since they could be any numbers that satisfy that equation. So the best way to handle it is a contingency table with a chi-square test.
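The answer above can be checked numerically: for exactly two groups, the pooled two-proportion z-test and the 2 x 2 chi-square test are the same test (the chi-square statistic equals z squared), and the chi-square version is what generalizes to three or more groups. A sketch with hypothetical counts:

```python
import math

# With only two groups, the pooled two-proportion z-test and the
# 2 x 2 chi-square test agree exactly: chi2 == z**2. Counts are made up.
sick   = [30, 20]      # group A, group B
totals = [100, 100]

# Pooled two-proportion z statistic under H0: p_A == p_B.
p1, p2 = sick[0] / totals[0], sick[1] / totals[1]
p_pool = sum(sick) / sum(totals)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / totals[0] + 1 / totals[1]))
z = (p1 - p2) / se

# Chi-square statistic for the same 2 x 2 table.
grand = sum(totals)
row_totals = [sum(sick), grand - sum(sick)]           # sick, not sick
chi2 = 0.0
for j in range(2):
    for row_total, obs in ((row_totals[0], sick[j]),
                           (row_totals[1], totals[j] - sick[j])):
        expected = row_total * totals[j] / grand
        chi2 += (obs - expected) ** 2 / expected

print(round(chi2, 4), round(z ** 2, 4))               # both 2.6667
```

The identity breaks down as soon as a third group appears, which is exactly when the contingency-table chi-square test earns its keep.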
• Throughout this hypothesis test the actual counts were used, resulting in a chi-square value of 2.53, but I tried the whole calculation from scratch using the percentages of each subgroup instead, and the result was a chi-square value of 2.08. This makes me wonder: is it possible to manipulate the parameters of the study (e.g., obtain a larger sample of the population) so that the result exceeds our critical chi-square value?

• Actually, the chi-square statistic is extremely sensitive to sample size, and not because larger samples are inherently better. The statistic always gets larger with a larger sample size and smaller with a smaller one (Daniel's math is correct). Thus you can have a strong statistical association but fail to find it significant with a chi-square test if you have a small sample size, and the reverse. The chi-square test has many limitations, yet it remains one of the most useful tests in social statistics; the key is to learn to use it appropriately and to interpret your findings in light of those limitations. For more information, see your friendly stats textbook. I suggest (because I use this book in my own stats classes and have it handy) The Essentials of Social Statistics for a Diverse Society, page 210, for a discussion of this specific issue (sample size and the chi-square statistic).
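The sample-size sensitivity described in the answer is easy to demonstrate: multiplying every cell of a table by a constant k, which keeps the proportions identical, multiplies the chi-square statistic by exactly k. A sketch with made-up counts:

```python
# The chi-square statistic scales linearly with sample size: scale every
# cell by k while keeping the same proportions and the statistic grows by
# exactly k, so a fixed association can cross the critical value purely
# by collecting more data. Counts below are hypothetical.

def chi2_stat(table):
    """Chi-square statistic for a table given as a list of rows of counts."""
    rows, cols = len(table), len(table[0])
    row_totals = [sum(r) for r in table]
    col_totals = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
    grand = sum(row_totals)
    stat = 0.0
    for i in range(rows):
        for j in range(cols):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

base = [[30, 20, 30],
        [70, 80, 70]]
scaled = [[10 * c for c in row] for row in base]      # same proportions, 10x the n

print(round(chi2_stat(base), 3), round(chi2_stat(scaled), 3))
# 3.409 vs 34.091: tenfold statistic, identical association
```

This is why effect-size measures (such as Cramer's V, which divides the sample size back out) are often reported alongside the raw chi-square statistic.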
• Shouldn't we be performing a two-tailed test here? The null hypothesis says the effect of the herbs is nothing, and the alternative hypothesis says the effect is not nothing.

• Why would you include the number of people who got sick while taking an herb in the expected percentage of people who would get sick with no intervention? I would think the whole point of having a control group is to measure the actual percentage of people who get sick with no intervention, and then test the observed counts for the two herb groups against it. Based on that test you could see whether the observed counts for an herb differ from what you'd expect with no intervention, and so answer whether the herb made a difference. What Sal did, it seems, was to say that if we assume there is no difference, we can fold the sick people from the herb groups into the percentage that gets sick with no intervention.

• But aren't we overcounting here, counting the same deviation twice (since the second row is just 100% minus the first row)?