Course: AP®︎/College Statistics > Unit 12

Lesson 1: Chi-square test for goodness of fit

Chi-square goodness-of-fit example

Name: Chi-square goodness-of-fit example
Uploaded: 2018-04-17T15:33:01Z
Description: Chi-square goodness-of-fit example

Want to join the conversation?

Elizabeth
Posted 6 years ago.
Why is the expected value 5? How did he come up with that number?
(6 votes)
- Shishir Iyer
  Posted 6 years ago.
  The 5 is not the expected value; it is the threshold in the large counts condition, which states that the expected count in each category (supposing that the null hypothesis is true) must be at least five. In this case, the expected count in each category is 1/3 of 24, which is 8. Since 8 is at least 5, the sample meets the large counts condition.
  (6 votes)
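As a rough illustration of the check described in the answer above, here is a minimal Python sketch. The variable names are made up for this example; the numbers (24 games, three outcomes with probability 1/3 each under the null hypothesis) come from the video.

```python
from fractions import Fraction

# Kenny's sample from the video: 24 games, three equally likely outcomes under H0.
n_games = 24
null_probabilities = {
    "win": Fraction(1, 3),
    "loss": Fraction(1, 3),
    "tie": Fraction(1, 3),
}

# Expected count in each category = sample size * probability under H0.
expected = {outcome: n_games * p for outcome, p in null_probabilities.items()}

# Large counts condition: every EXPECTED count (not observed count) is at least 5.
large_counts_ok = all(count >= 5 for count in expected.values())

for outcome, count in expected.items():
    print(outcome, count)
print("large counts condition met:", large_counts_ok)
```

Note that the observed count of four wins never enters the check; only the expected counts do.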
Loay Gouda
Posted 4 years ago.
What should I do if the large counts condition is not met? Can I still solve it using chi-square?
(6 votes)
alexiawpy
Posted 5 years ago.
At 5:22, I don't understand why Sal said the P-value is the probability that chi-square is greater than or equal to 5.25, because the description says that Kenny wants to know if his games follow the pattern (win = lose = tie). Shouldn't it be P(chi-square ≠ 5.25), and shouldn't we look at the two-tailed probability?
(4 votes)
- fchisowsky
  Posted 5 years ago.
  We reject the null hypothesis if the differences between the observed and expected frequencies are large. Because each difference is squared, a deviation in either direction increases the chi-square statistic, so only large values of the statistic count as evidence against the null hypothesis. Thus the goodness-of-fit test is always an upper-tail test.
  (5 votes)
ricardoadam_
Posted 4 years ago.
In this example, couldn't we just perform a hypothesis test for the probability of winning being equal to 1/3?
H0: p = 1/3
Ha: p ≠ 1/3
And perform hypothesis testing for the distribution of proportions as we have been doing in later chapters?
(2 votes)
- Chuck B
  Posted 3 months ago.
  You could do that, although with the results you'd only be able to make meaningful inferences about the probability of winning. You wouldn't be able to say anything about the other two outcomes, losses and ties.
  (1 vote)
GSingh
Posted 2 years ago.
Wow everyone graduated 2 years ago.
(2 votes)
sachindubey2
Posted 4 years ago.
In terms of machine learning, is this test used in the model evaluation process? That is, do we train the model on the training dataset, run it on the testing dataset, and then analyze the chi-square value?
(2 votes)
prisha037
Posted 5 years ago.
What is the P-value, and what does it depend on? How do you get it?
(1 vote)
- Saivishnu Tulugu
  Posted 5 years ago.
  The p-value for any statistical test is the probability, assuming the null hypothesis is true, of getting a test statistic at least as extreme as the one observed. It is not the probability that the null hypothesis is true. For a chi-square GOF test, it is found by comparing the calculated chi-square test statistic to the chi-square distribution with k-1 degrees of freedom, where k is the number of categories. A chi-square table gives an approximate p-value. For a more accurate p-value, you can use a calculator or statistical software.
  (2 votes)
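To make the "statistical software" route concrete, here is a small sketch in plain Python. It relies on a closed form for the chi-square upper tail that holds only when the degrees of freedom are even, which covers Kenny's case (k = 3 categories, 2 degrees of freedom). The function name `chi_square_upper_tail` is made up for this example; for arbitrary degrees of freedom a library routine such as `scipy.stats.chi2.sf` would be the usual choice.

```python
import math


def chi_square_upper_tail(statistic, df):
    """P(chi-square with `df` degrees of freedom >= statistic), even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = statistic / 2
    # For df = 2m the upper tail is exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
    terms = (half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * sum(terms)


k = 3                 # categories: win, loss, tie
df = k - 1            # degrees of freedom
p_value = chi_square_upper_tail(5.25, df)
print(round(p_value, 4))
```

The result lands between 0.05 and 0.10, which agrees with the range read from the table in the video.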
E Q
Posted 6 years ago.
At 5:35, the degrees of freedom are two. But is there a situation where the chi-square test makes sense for DF = 1? Why is the DF 1 curve supplied for chi-square tables? Should we just revert to hypothesis testing on a z statistic for DF 1?
(1 vote)
- Savandreas
  Posted 6 years ago.
  With k categories, the test has k-1 degrees of freedom. For k = 2 (1 df), the chi-square statistic is the square of the two-tailed one-sample proportion z statistic, and the two tests reject exactly the same cases. (Sometimes the p-values differ a little because different approximations/statistics are used.)
  
  http://rinterested.github.io/statistics/chi_square_same_as_z_test.html
  (1 vote)
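The relationship in the answer above can be checked numerically. The sketch below is only an illustration: it collapses Kenny's data into two categories (4 wins versus 20 non-wins), which is an assumption made for this example and not something done in the video, and tests p = 1/3 both ways.

```python
import math

n = 24          # games in the sample
wins = 4        # observed wins; the other 20 games are "not a win"
p0 = 1 / 3      # null-hypothesis probability of a win

# Chi-square goodness-of-fit statistic with two categories (1 degree of freedom).
observed = [wins, n - wins]
expected = [n * p0, n * (1 - p0)]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# One-sample z statistic for a proportion.
p_hat = wins / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Two-tailed z p-value and upper-tail chi-square (1 df) p-value.
p_from_z = math.erfc(abs(z) / math.sqrt(2))
p_from_chi_square = math.erfc(math.sqrt(chi_square / 2))

print(chi_square, z ** 2)
print(p_from_z, p_from_chi_square)
```

Both statistics and both p-values agree up to floating-point rounding, so the two tests lead to the same decision here.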
Karina Jakku
Posted 9 months ago.
Hi, I'm confused by the chi-square table. The table that my university gave me is different from your chi-square table. For example, for two degrees of freedom, my table shows the value between 0.900 (4.605) and 0.950 (5.991).

Can you explain how to interpret it using their table?
(1 vote)
Minh Quan Le Hoang
Posted 4 years ago.
What do you mean that, assuming the null hypothesis is true, if the probability of getting results at least that extreme is LOW enough, then we can REJECT the null hypothesis? Logically, I thought it would have to be HIGHER, and therefore much more inaccurate, before we could reject.
(1 vote)
- Snoopydoggocrazy
  Posted 4 years ago.
  P stands for probability. If the probability of getting data this extreme, assuming the null hypothesis is true, is very low, we can conclude that there is an issue with that hypothesis. That means we can reject the null hypothesis (the idea that the discrepancies between the expected and observed data are due to chance, i.e., that the result is not statistically significant).
  (1 vote)
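One way to see what "the probability of getting results at least that extreme, assuming the null hypothesis is true" means is to simulate it. The sketch below is an illustration only, not the method used in the video: it plays many sets of 24 games in which win, loss, and tie really are equally likely, and counts how often the chi-square statistic comes out at least as large as Kenny's 5.25. The seed and the number of simulations are arbitrary choices.

```python
import random


def chi_square_statistic(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


random.seed(0)                      # fixed seed so the run is repeatable
n_games, n_simulations = 24, 20_000
categories = ("win", "loss", "tie")
expected = [8, 8, 8]                # 1/3 of 24 in each category under H0
observed_statistic = chi_square_statistic([4, 13, 7], expected)

at_least_as_extreme = 0
for _ in range(n_simulations):
    # Simulate 24 games in which the null hypothesis is true.
    outcomes = random.choices(categories, k=n_games)
    counts = [outcomes.count(c) for c in categories]
    if chi_square_statistic(counts, expected) >= observed_statistic:
        at_least_as_extreme += 1

simulated_p_value = at_least_as_extreme / n_simulations
print(observed_statistic, simulated_p_value)
```

If that fraction were very small, results like Kenny's would almost never happen when the null hypothesis holds, which is why a low value counts as evidence against the null hypothesis. The simulated fraction need not match the table value exactly, since the chi-square distribution is itself an approximation for a sample of this size.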

Video transcript

- [Instructor] In the game rock-paper-scissors, Kenny expects to win, tie, and lose with equal frequency. Kenny plays rock-paper-scissors often, but he suspects his own games were not following that pattern. So he took a random sample of 24 games and recorded their outcomes. Here are his results. So out of the 24 games, he won four, lost 13, and tied seven times. He wants to use these results to carry out a chi-squared goodness-of-fit test to determine if the distribution of his outcomes disagrees with an even distribution. What are the values of the test statistic, the chi-squared test statistic, and P-value for Kenny's test? So pause this video and see if you can figure that out.
Okay, so he's essentially just doing a hypothesis test using the chi-squared statistic, because it's a hypothesis that's thinking about multiple categories. So what would his null hypothesis be? Well, his null hypothesis would be that all of the outcomes have equal probability. And then his alternative hypothesis would be that his outcomes do not have equal probability. Remember, we assume that the null hypothesis is true. And then if, assuming the null hypothesis is true, the probability of getting a result at least this extreme is low enough, then we would reject our null hypothesis. Another way to think about it is if our P-value is below a threshold, we would reject our null hypothesis.
And so what he did is he took a sample of 24 games, so n is equal to 24. And then this was the data that he got. Now before we even calculate our chi-squared statistic, and figure out what's the probability of getting a chi-squared statistic that large or greater, let's make sure we meet the conditions for inference for a chi-squared goodness-of-fit test. So you've seen some of them, but some of them are a little bit different. One is the random condition. I'll write 'em up here. The random condition.
And that would be that there's truly a random sample of games. And it tells us right here, he took a random sample of 24 games. So we meet that condition.
The second condition, when we're talking about chi-squared hypothesis testing, is the large counts condition. And this is an important one to appreciate. This is that the expected number in each category of outcomes is at least equal to five. Now you might say, hey, wait, wait, Kenny only got four wins out of his sample of 24. But that does not violate the large counts condition. Remember, what is the expected number of wins, losses, and ties? Well, if you were assuming the null hypothesis, where the outcomes have equal probability, the expected proportions, I could write right over here, would be 1/3, 1/3, 1/3. And so 1/3 of 24 is 8, 8, and 8. That's what Kenny would expect. And since all of these are at least equal to five, we meet the large counts condition.
And then the last condition is the independence condition. If we aren't sampling with replacement, then we just have to feel good that our sample size is no more than 10% of the population. And he can definitely play more than 240 games in his life, so we would assume that we meet that condition as well.
And so with that out of the way, we can actually calculate our chi-squared statistic and try to make some inference based on it. And so, let's see, our chi-squared statistic is going to be equal to, so for each category, it's going to be the difference between what he got in that sample and the expected, squared, divided by the expected. So the first category is wins. So that's going to be four minus eight, squared, over an expected number of wins of eight, plus losses, so that's 13 minus eight. 13 is how many he lost, minus eight expected, squared, over the number expected, plus he got seven ties, he would have expected eight, so seven minus eight, squared, all of that over eight.
And so let's see, what is this? Four minus eight is negative four, you square that, you get 16. 13 minus eight is five, you square that, you get 25. Seven minus eight is negative one, you square that, you get one. And 16 divided by eight is going to be two. 25 divided by eight is going to be, let's see, that's three and 1/8th, so that's 3.125, and then 1/8th is 0.125. You add these together, so let's see, it's gonna be two plus 3.125, that's 5.125, plus another 0.125, so that's going to be 5.25. So our chi-squared statistic is 5.25.
And now to figure out our P-value, our P-value is going to be equal to the probability of getting a chi-squared statistic greater than or equal to 5.25. And you could use a chi-squared table for that. And we always have to think about our degrees of freedom. We have one, two, three categories. So our degrees of freedom is going to be one less than that, or three minus one, which is two. And that makes sense, because for a fixed number of games, if you know the number of wins and you know the number of losses, you can figure out the number of ties. Or if you know any two of these categories, you can always figure out the third. So that's why you have two degrees of freedom.
And so let's get out our chi-squared table. So we have two degrees of freedom, so we are in this row. And where is 5.25? So, 5.25 is right over there. And so our probability is going to be between 0.10 and 0.05. So our P-value is going to be greater than 0.05 and less than 0.10. And so for example, if ahead of time, and he should have done this ahead of time, he set a significance level of 5%, and our P-value here is greater than 5%, which we just saw, he would fail to reject the null hypothesis in this situation. But they're not asking us that here. All they're asking us is what is our chi-squared value and what range is our P-value in.
Well, let's see, both of these choices have 5.25, and we saw we got a P-value between 5% and 10%. So it is choice A right over there.
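
The arithmetic in the transcript can be reproduced with a short Python sketch. The variable names are made up for this example. The table lookup is replaced by computing the two critical values directly, using the fact that with 2 degrees of freedom the upper-tail probability of the chi-square distribution is exp(-x/2); that shortcut is specific to 2 degrees of freedom and is not part of the video.

```python
import math

# Kenny's observed results from the video.
observed = {"win": 4, "loss": 13, "tie": 7}
n_games = sum(observed.values())            # 24
expected_each = n_games / len(observed)     # 8 per category under H0

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (count - expected_each) ** 2 / expected_each for count in observed.values()
)
df = len(observed) - 1                      # 3 categories -> 2 degrees of freedom

# With df = 2, P(chi-square >= x) = exp(-x / 2), so the critical value
# for upper-tail area alpha is -2 * ln(alpha).
critical_10 = -2 * math.log(0.10)           # table entry for tail area 0.10
critical_05 = -2 * math.log(0.05)           # table entry for tail area 0.05

print(chi_square, df)
print(round(critical_10, 3), round(critical_05, 3))
print("P-value between 0.05 and 0.10:", critical_10 < chi_square < critical_05)
```

Because 5.25 falls between the two critical values, the P-value falls between 0.05 and 0.10, matching the range found in the video.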