## Chi-square test for goodness of fit

# Chi-square statistic for hypothesis testing

AP.STATS: DAT‑3 (EU), DAT‑3.I (LO), DAT‑3.I.1 (EK), DAT‑3.J (LO), DAT‑3.J.1 (EK), DAT‑3.J.2 (EK), VAR‑1 (EU), VAR‑1.J (LO), VAR‑1.J.1 (EK), VAR‑8 (EU), VAR‑8.A (LO), VAR‑8.A.1 (EK), VAR‑8.A.2 (EK), VAR‑8.A.3 (EK), VAR‑8.B (LO), VAR‑8.B.1 (EK), VAR‑8.C (LO), VAR‑8.C.1 (EK), VAR‑8.D (LO), VAR‑8.D.1 (EK), VAR‑8.F (LO), VAR‑8.F.1 (EK), VAR‑8.F.2 (EK), VAR‑8.G (LO), VAR‑8.G.1 (EK)

## Video transcript

- [Instructor] Let's say there's some type of standardized exam where every question on the test has four choices: choice A, choice B, choice C, and choice D. And the test makers assure folks that, over many years, there's an equal probability that the correct answer for any one of the items is A, B, C, or D. It essentially is a 25% chance of any of them. Now, let's say you have a hunch that, well, maybe it is skewed towards one letter or another. How could you test this?

Well, you could start with a null and alternative hypothesis, and then we can actually do a hypothesis test. So let's say that our null hypothesis is equal distribution of correct choices. Or another way of thinking about it is A would be correct 25% of the time, B would be correct 25% of the time, C would be correct 25% of the time, and D would be correct 25% of the time. Now, what would be our alternative hypothesis? Well, the alternative hypothesis would be not equal distribution.

Now, how are we going to actually test this? Well, we've seen this show before, at least the beginnings of the show. You have the population of all of your potential items here, and you could take a sample. And so let's say we take a sample of 100 items. So n is equal to 100. And let's write down the data that we get when we look at that sample. So this is the correct choice. And then this would be the expected number. And then this is the actual number. And if this doesn't make sense yet, we'll see it in a second. So there's four different
choices, A, B, C, and D, and a sample of 100. Remember, in any hypothesis test, we start by assuming that the null hypothesis is true. So the expected number where A is the correct choice would be 25% of this 100. So you would expect A to be the correct choice 25 times, B to be the correct choice 25 times, C to be the correct choice 25 times, and D to be the correct choice 25 times.

But let's say our actual results, when we look at these 100 items, are that A is the correct choice 20 times, B is the correct choice 20 times, C is the correct choice 25 times, and D is the correct choice 35 times. So if you just look at this, you might say, hey, maybe there's a higher frequency of D. But maybe you'd say, well, this is just a sample. And just through random chance, it might have just gotten more Ds than not. There's some probability of getting this result, even assuming that the null hypothesis is true.

And that's the goal of these hypothesis tests: to say what's the probability of getting a result at least this extreme? And if that probability is below some threshold, then we tend to reject the null hypothesis and accept an alternative. And those thresholds you have seen before. We've seen these significance levels. Let's say we set a significance level of 5%, 0.05. So if the probability of getting this result or something even more different than what's expected is less than the significance level, then we'd reject the null hypothesis. But this all leads to one
really interesting question. How do we calculate the probability of getting a result this extreme or more extreme? How do we even measure that? And this is where we're going to introduce a new statistic and also, for many of you, a new Greek letter, and that is the Greek letter chi, which might look like an X to you. But it's a little bit curvier, and you could look up more on that. You kind of curve that part of the X. But it's a chi, not an X.

And the statistic is called chi-squared, and it's a way of taking the difference between the actual and the expected and translating that into a number. And the chi-squared distribution is, well, I really should say distributions, are well studied. And we can use that to figure out what is the probability of getting a result this extreme or more extreme. And if that's lower than our significance level, we reject the null hypothesis, and it suggests the alternative. But how do we calculate the
chi-squared statistic here? Well, it's reasonably intuitive. What we do is, for each of these categories, in this case, for each of these choices, we look at the difference between the actual and the expected. So for choice A, we'd say 20 is the actual, minus the expected, 25. And then we're going to square that. And then we're going to divide by what was expected. And then we're going to do that for choice B. So we're going to say the actual was 20, expected is 25, so 20 minus 25 squared, over the expected, over 25. Plus then we do that for choice C: 25 minus 25, we know where that one will end up, squared, over the expected, over 25. And then finally, for choice D, which is going to get us 35 minus 25 squared, all of that over 25.

And now, let's see, if we calculate this, this is going to be negative five squared. So that's going to be 25. This is going to be 25. This is going to be zero. 35 minus 25 is 10, squared, that is 100. So this is one plus one, plus zero, plus four. So our chi-squared statistic, in this example, came out nice and clean (this won't always be the case) at six.

So what do we make of this? Well, what we can do is then look at a chi-squared distribution for the appropriate degrees of freedom, and we'll talk about that in a second, and say what is the probability of getting a chi-squared
statistic of six or larger? And to understand what a chi-squared distribution even looks like, these are multiple chi-squared distributions for different values of the degrees of freedom. And to calculate the degrees of freedom, you look at the number of categories. In this case, we have four categories, and you subtract one. Now, that makes a lot of sense because, if you knew how many As, Bs, and Cs there are, if you knew the proportions, even the assumed proportions, you can always calculate the fourth one. That's why it is four minus one degrees of freedom. So in this case, our degrees of freedom are going to be equal to three. Over here, sometimes you'll see it described as k, so k is equal to three.

So if we look at that little light blue one, we're looking at this chi-squared distribution where the degrees of freedom is three. And we want to figure out what is the probability of getting a chi-squared statistic that is six or greater? So we would be looking at this area right over here. And you could figure it out using a calculator, or, if you're taking some type of a test, like an AP Statistics exam, for example, you could use the tables they give you. And so a table like this
could be quite useful. Remember, we're dealing with the situation where we have three degrees of freedom. We had four categories, so four minus one is three. And we got a chi-squared value. Our chi-squared statistic was six. So this right over here tells us the probability of getting a 6.25 or greater for our chi-squared value is 10%. If we go back to this chart, we just learned that this probability from 6.25 and up, when we have three degrees of freedom, is 10%.

So the probability of getting a chi-squared value greater than or equal to six is going to be greater than 10%. And we could also view this as our P-value. And so our probability, assuming the null hypothesis is true, is greater than 10%. Well, it's definitely going to be greater than our significance level. And because of that, we will fail to reject the null hypothesis. And so this is an example where, even though in your sample you just happened to get more Ds, the probability of getting a result at least as extreme as what you saw is going to be a little bit over 10%.
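
The whole test walked through in the transcript can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original video: the variable names and the helper `chi2_sf_df3` are hypothetical, and the helper uses a closed-form tail probability that only holds for exactly three degrees of freedom, which is an assumption that happens to fit this example.

```python
import math

# Observed counts of correct answers in the sample of 100 items (from the transcript).
observed = {"A": 20, "B": 20, "C": 25, "D": 35}

# Null hypothesis: each choice is the correct answer 25% of the time.
n = sum(observed.values())
expected = {choice: 0.25 * n for choice in observed}

# Chi-squared statistic: sum over categories of (observed - expected)^2 / expected.
chi_squared = sum(
    (observed[c] - expected[c]) ** 2 / expected[c] for c in observed
)

# Degrees of freedom: number of categories minus one.
df = len(observed) - 1


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X >= x) for a chi-squared variable with
    exactly 3 degrees of freedom (hypothetical helper; not valid for other df)."""
    z = x / 2.0
    return math.erfc(math.sqrt(z)) + 2.0 / math.sqrt(math.pi) * math.sqrt(z) * math.exp(-z)


p_value = chi2_sf_df3(chi_squared)
alpha = 0.05  # significance level used in the transcript
reject_null = p_value < alpha

print(f"chi-squared = {chi_squared:.2f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0" if reject_null else "fail to reject H0")
```

Running it should report a statistic of 6 with 3 degrees of freedom and a p-value a little over 10%, so the null hypothesis is not rejected at the 5% level. For other numbers of categories, a library routine such as SciPy's `scipy.stats.chisquare` is probably a better choice than a hand-written tail formula.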

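For reference, the arithmetic the instructor does out loud can be written as a single formula. The symbols $O_i$ (observed count) and $E_i$ (expected count) are notation assumed here; the transcript itself only says "actual" and "expected".

```latex
\chi^2 = \sum_{i \in \{A, B, C, D\}} \frac{(O_i - E_i)^2}{E_i}
       = \frac{(20 - 25)^2}{25} + \frac{(20 - 25)^2}{25}
       + \frac{(25 - 25)^2}{25} + \frac{(35 - 25)^2}{25}
       = 1 + 1 + 0 + 4 = 6
```

With four categories, the degrees of freedom are $4 - 1 = 3$.
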
AP® is a registered trademark of the College Board, which has not reviewed this resource.