
# Pearson's chi square test (goodness of fit)

Sal uses the chi square test to test the hypothesis that the owner's distribution is correct. Created by Sal Khan.
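The statistic from the video can be reproduced in a few lines. A minimal sketch in plain Python: the expected percentages match the owner's distribution quoted in the thread below, and the observed counts for Monday (30) and Tuesday (14) appear in the discussion; the remaining four observed counts are recalled from the video and are an assumption here, though the resulting statistic matches the 11.44 discussed below.

```python
# Chi-square goodness-of-fit statistic for the shop-owner example.
# Expected percentages: Mon..Sat, as hypothesized by the owner.
# Observed counts beyond Mon=30 and Tue=14 are assumed from the video.
expected_pct = [0.10, 0.10, 0.15, 0.20, 0.30, 0.15]
observed = [30, 14, 34, 45, 57, 20]   # customers counted per day
total = sum(observed)                  # 200 customers in all

expected = [p * total for p in expected_pct]  # 20, 20, 30, 40, 60, 30
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2_stat, 2))  # → 11.44
```

With 6 categories there are 6 − 1 = 5 degrees of freedom, and 11.44 exceeds the 5% critical value of 11.07 from a chi-square table, so the owner's distribution is rejected at that level.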

## Want to join the conversation?

• A few things were unclear to me here. First, is chi squared always calculated as a difference between expected and observed divided by expected? Where is the derivation or explanation of this?

Secondly, in what scenarios should we use chi-squared vs. other statistics? Is there a limit on the number of data points (or in other words, degrees of freedom) for this calculation? I think there should be more explanation of the use cases for this statistic and how it's calculated.

• I understand that if the chi-square value exceeds the appropriate minimum value in the chi-square distribution table, taking into account the degrees of freedom, you can reject the null hypothesis. (And the reverse is also true: if the chi-square value does not exceed that minimum value, you fail to reject the null hypothesis.) Can someone explain to me why this is? I do not understand the theory behind the minimum chi-square value well enough to see why we reject when the statistic exceeds the value in the distribution table.

• The question you answer with the test can be rephrased like this: "if the shop owner's theory is right (i.e., his claimed percentage of customers for each day), what is the probability of seeing the given observations (30 on Monday, 14 on Tuesday, etc.) or something even more unlikely?"
This is the question you answer with the test, and you can calculate that probability exactly (or you can use tables). In this case it is just below 5%.
So, if the hypothesis is right and you make observations for a week, then there is almost a 5% chance that you see what you see, or something even less likely.
The 5% significance criterion is a subjective choice. Some use 1%, some 5%, some 0.5%. If I generally trusted the shop owner, and knew that he had kept track of customers for a long period, and was a clever guy, then I would still believe his hypothesis. I mean, after all, 5% corresponds to about 1/20 - it is not a veeery rare observation. On the other hand, if I knew the shop owner was sloppy with numbers, and had a tendency to lie, etc., then I would be more likely to reject the hypothesis on the basis of my observations. However, before I confronted him, I think I would observe another week to get more certain knowledge.

A lot of talking, sorry! My point here: you get the probability from the test. That is your result! What significance level to choose depends on the situation. Sometimes it might be life changing - if it was a test for some disease, I would never be satisfied with a 5% risk. Say the doctor tells me: there is only a 5% chance that you have that life-threatening disease, given the test result, so you can go home. Then I would ask for another test! But if I was in the line for a super discount offer on Black Friday, and a clever person had calculated that there was only a 5% chance that I would get the item before it was sold out, then I would step out of the line immediately. It depends on consequences, risk, what I already know and many other things!

• Had we counted visitors' legs instead of visitors, and assuming each visitor has two, our chi-squared would be twice as big for the same effective statistical question. It is thus incorrect to count legs. (They indeed do not take odd values.) It also seems that quantities which are not discrete, such as the amount of food eaten each day, will give different results depending on the physical units of mass. Is it thus also incorrect to use continuous random variables for chi-square statistics?

• How do we get the formula chi-square = sum over all categories of (observed frequency − expected frequency)² / expected frequency?

• Can someone recommend a statistics book that is written in plain English, in a way that a novice like me can understand and apply in real, practical situations? If it can also give some insight and intuitive feeling for why statistical tests are the way they are, even better. Thanks a lot!

• Are the charts mentioned at the marker something that is given, or would I have to fill them in myself while solving the problem? If so, how would I go about doing that?

• Just food for thought.

Using a calculator, chi2_Cdf(11.44, infinity, 5) (the chance of the statistic being 11.44 or higher) gives a 4.33% chance. And if you think about it, if 11.07 gives 0.05, a higher number (namely 11.44) gives a lower p-value.
So at what point should we start approximating? Because in this example that fact results in a different answer.
• I'm not sure what the problem is, if in fact you're trying to point out a problem. It is natural that a higher chi-square value - in your case, 11.44 - has a lower p-value than, say, 11.07. This is because the p-value is the probability of getting a statistic at least as extreme as the one you calculated purely by chance, rather than because of real differences. In other words, you are taking the area to the right of the value, and since the density decreases as you move to the right, the area should also decrease as you move the left bound closer to the right.
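The tail areas in this exchange can be checked without a graphing calculator or a table. A minimal sketch in plain Python that approximates the right-tail area of the chi-square distribution (df = 5, as in the video) by numerically integrating its density; a library call such as SciPy's `chi2.sf` would do the same job in one line:

```python
import math

def chi2_pdf(x, df):
    # Chi-square density: x^(df/2 - 1) * e^(-x/2) / (2^(df/2) * Gamma(df/2))
    return x ** (df / 2 - 1) * math.exp(-x / 2) / (2 ** (df / 2) * math.gamma(df / 2))

def chi2_sf(x, df, upper=200.0, n=200_000):
    # Right-tail area P(X >= x), approximated with the trapezoidal rule;
    # the density beyond `upper` is negligible for small df.
    h = (upper - x) / n
    area = 0.5 * (chi2_pdf(x, df) + chi2_pdf(upper, df))
    for i in range(1, n):
        area += chi2_pdf(x + i * h, df)
    return area * h

p = chi2_sf(11.44, 5)
print(round(p, 4))  # → 0.0433, the 4.33% mentioned above
```

This also confirms the table value: the tail area at 11.07 comes out at almost exactly 0.05, which is why 11.07 is the 5% critical value for 5 degrees of freedom.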
• I want to know more.
I'm interested in goodness-of-fit tests for the Poisson and Normal distributions.
:D
• That's a great question! Think about the Poisson distribution for a bit: we have only non-negative integer values, right? So we'd hypothesize a mean, and then use that hypothesized value to calculate the probability of 0 events, 1 event, 2 events, 3 events, and so on. Then we'd tally how many times we observed a 0, or a 1, or a 2, and so on. In this way, we'd have our hypothesized probabilities and the observed values. From there we can pick up pretty much where the video starts, or at about the marker if you want to skip some of the initial explanation.

For the Normal distribution, the process is largely the same. You first hypothesize the distribution (normal with specified mean and standard deviation). But since the Normal distribution is continuous, you need to define bins for your random variable, such as 0-1, 1-2, 2-3, 3-4, etc., and then calculate the probability of those bins using the hypothesized mean and standard deviation.
• OK. But my problem is that I don't believe the answer: when you compare the observed and the expected, they seem "pretty close". For example, the order of the days is the same (Friday being the heaviest, Tuesday the lightest, and the days in between matching closely). Admittedly, my eyeballed "pretty close" is hardly a scientific test, and there is more to fit than a biggest-to-smallest arrangement. But still, something seems off, because typically what happens is the other way around: the numbers look crappy (they do not appear to fit the expected distribution), but come to find out, they do. In this case, if I were to see the provided data and the expected, it seems so fitting I wouldn't even test it!
• Here's the thing about hypothesis tests, including the chi-square test: they don't always lend themselves to 'eyeballing.' Sometimes, sure, you can look at the data (or the observed vs. expected) and see that the null hypothesis will be rejected. The same thing is true of a lot of procedures. Sometimes the descriptive statistics are clear enough that we can anticipate the results.

But it's not always so. And the formal hypothesis test provides an objective approach based on probabilities to make the decision.

In this case, while there's not one really big misfit, there are multiple smaller ones. Those smaller ones, taken together, are enough to make us think that the owner's hypothesized distribution is wrong. It might not be extremely wrong, but there's enough evidence to make us not believe it.

The owner over-estimated Tuesday and Saturday, while underestimating Monday, Wednesday, and Thursday. So he gave us this distribution:

10 10 15 20 30 15

Maybe in reality it should have been:
15 8 15 22 29 11

Or maybe like this:
12 8 17 23 29 11

The point is, the hypothesized distribution doesn't have to be radically different from the true one; it just has to be different enough.