Introduction to the chi-square test for homogeneity

We investigate a scenario that includes applying the chi-squared statistic for a test of homogeneity, setting up null and alternative hypotheses, calculating expected values, verifying conditions for conducting a chi-squared test, and determining degrees of freedom based on the structure of a contingency table.
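The steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts are made up for the example (they are not taken from the video), the function names are hypothetical, and the large-counts threshold of 5 is the usual rule of thumb rather than something this page specifies.

```python
# Hypothetical observed counts, invented for illustration.
# Rows are the independent samples; columns are the response categories.
observed = {
    "right-handed": {"STEM": 30, "Humanities": 15, "Equal": 15},
    "left-handed": {"STEM": 10, "Humanities": 25, "Equal": 5},
}


def expected_counts(table):
    """Expected count per cell under H0: row total * column total / grand total."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    return {
        row: {col: row_totals[row] * col_totals[col] / grand_total for col in col_totals}
        for row in table
    }


def chi_square_statistic(table, expected):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    return sum(
        (table[row][col] - expected[row][col]) ** 2 / expected[row][col]
        for row in table
        for col in table[row]
    )


def degrees_of_freedom(table):
    """(number of rows - 1) * (number of columns - 1)."""
    n_rows = len(table)
    n_cols = len(next(iter(table.values())))
    return (n_rows - 1) * (n_cols - 1)


def large_counts_ok(expected, minimum=5):
    """Assumed rule of thumb: every expected count is at least `minimum`."""
    return all(value >= minimum for cells in expected.values() for value in cells.values())


expected = expected_counts(observed)
print(expected)                                  # expected counts per cell
print(chi_square_statistic(observed, expected))  # 14.0625 for these made-up counts
print(degrees_of_freedom(observed))              # 2
print(large_counts_ok(expected))                 # True
```

Note that a single statistic is computed across all cells of both rows, because the null hypothesis is one statement about the whole table: the category distribution is the same in every population.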

• Hi, could someone explain how the chi-square test for homogeneity is conceptually DIFFERENT from the two-sample inference for the difference between groups? I am having a hard time wrapping my mind around how this test is different from the one that we learned before. Thank you!
• The first difference is the kind of data: chi-square tests work with CATEGORICAL variables summarized as CELL COUNTS, while two-sample t procedures compare the MEANS of a QUANTITATIVE variable (a two-sample z test for proportions also uses categorical data, but only compares a single proportion between two groups). Another difference is what gets compared: the chi-square test for homogeneity asks whether the whole DISTRIBUTION of a categorical variable is the same across two or more groups, using (observed-expected)^2/expected summed over every cell, where the expected counts come from pooling the samples rather than from a known value (comparing against a known distribution is the goodness-of-fit test). On the other hand, 2 sample t or z is used to see if one parameter of 2 separate groups is equal, greater, or smaller.
• Why do we calculate just one `χ²` that includes both the data for the left-handed and right-handed people? Coming from the previous videos, I would think we would have to compute two `χ²`'s, one for the right-handed data and one for the left-handed, and then compare those by taking the difference between them.
• Does someone have a resource that talks in detail about degrees of freedom? I understand what it is, but I don't exactly get why it's applicable in many situations.
• Is there a video/playlist explaining at length the reason/s for the large expected counts and 10% sample requirements?
• Is this 'special' Chi-square test considered a non-parametric test? Why or why not?
• If I'm understanding this correctly, the "expected" result essentially creates a baseline (or a baseline for each population). The test then calculates the distance each population has from that baseline.

If the distance is small enough to not be significant (i.e. the p-value is greater than the significance level), then the samples can be deemed to be homogeneous, i.e. the populations being tested don't have an impact on the variables being looked at.

Is this correct?
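
The decision rule this question is getting at can be sketched as follows. The comparison is between the p-value and a chosen significance level: a small statistic gives a large p-value, and a p-value above the significance level means we fail to reject the null hypothesis, which is weaker than showing the populations are homogeneous. The function names and the default level of 0.05 are assumptions for illustration, and the closed-form p-value used here is only valid when the degrees of freedom are even (for example a 2-by-3 table, where df = 2).

```python
import math


def chi_square_p_value(statistic, df):
    """Right-tail probability P(X >= statistic) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which holds only for even, positive degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles even, positive df")
    half = statistic / 2.0
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


def decide(statistic, df, alpha=0.05):
    """Return the p-value and the conclusion at significance level alpha."""
    p_value = chi_square_p_value(statistic, df)
    if p_value < alpha:
        return p_value, "reject H0"
    return p_value, "fail to reject H0"


# Hypothetical statistics, chosen only to show both outcomes.
print(decide(14.0625, 2))  # large distance from the expected counts
print(decide(1.0, 2))      # small distance from the expected counts
```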