# Idea behind hypothesis testing

## Video transcript

Let's say that you have a cholesterol test, and you somehow magically know that the probability that it is accurate, that it gives the correct result, is 99%. You have a 99 out of 100 chance that any time you apply this test, it is going to be accurate. You just magically know that; we're just assuming it.

Now let's say that you get a hundred folks into this room and you apply this test to all hundred of them. So you apply the test 100 times. What are some of the possible outcomes here? Is it for sure that exactly 99 out of the hundred are going to be accurate and that one out of the hundred is going to be inaccurate? Well, that's definitely a likely possibility, but it's also possible that you get a little lucky and all hundred are accurate, or that you get a little unlucky and 98 are accurate and two are inaccurate.

I calculated the probabilities ahead of time. The goal of this video isn't to go into the probability and combinatorics of it, but if you're curious, there are a lot of good videos on probability and combinatorics on Khan Academy. If you have something that has a 99% chance of being accurate and you apply it 100 times, the probability that it is accurate 100 out of the hundred times is approximately 36.6%. I rounded to the nearest tenth of a percent. So there's a little better than a one-in-three chance that all of the people are going to get an accurate result, even though for any one of them there's a 99% chance that it's accurate.

And we can keep going. I'm just going to put quotes here so I don't have to rewrite "accurate" over and over again. The probability that it is accurate 99 out of a hundred times, which I also calculated ahead of time, is approximately 37.0%. So this is what you would expect. Getting 100
out of 100 doesn't seem that unlikely if each time you apply the test it has a 99% chance of being accurate, but it makes sense that you would expect 99 out of 100 to be slightly more likely.

We can of course keep going. The probability that it is accurate 98 out of 100 times is approximately 18.5%. I'm just going to do a few more. The probability that it is accurate 97 out of a hundred times (once again, I calculated all of these ahead of time) is about six percent. So it's definitely in the realm of possibility, but the probability is much lower than that of having 99 out of 100 or 100 out of 100 be accurate. The probability that it is accurate 96 out of 100 times is approximately 1.5%.

I'll just do one more. I could keep going: there's some probability that, even though each test has a 99% chance of being accurate, you just get super unlucky and very few of them are accurate. But you see what's happening to the probabilities: as we have fewer and fewer of them being accurate, the outcome becomes less and less probable. The probability that 95 out of 100 are accurate is approximately 0.3%.

So this was just, I guess you could say, a thought experiment. If we had a test that we know for sure is accurate with probability 99% every time you administer it, then these are the probabilities that, if you administered it a hundred times, you get 100 out of 100 accurate, 99 out of 100 accurate, and so on and so forth. Let's keep that in mind, and then think a little bit about hypothesis testing and how we can use this framework.

So let's put all that in the back of our minds, and let's say that you have devised a new test. You have a new cholesterol test, and you
don't know how accurate it is. You know that in order for it to be approved by whatever governing body, the probability of it being accurate has to be 99%. You don't know if this is true; you just know that that's what it needs to be.

So you have your test, and let's say you set up a hypothesis. Your hypothesis could be a lot of things, and once you get deeper into statistics there are null hypotheses and alternative hypotheses, but let's just start with a simple hypothesis. You're hopeful, so your hypothesis is that the probability that your new test is accurate is 99%. You want that to be your hypothesis because, if you feel good about it, then you're like, "OK, maybe I'll get approved by the appropriate governing body."

So then you go off and you apply your new test 100 times. You don't know the actual probability of it being accurate. Let's say that you're able to use some other test, some super accurate test that you know is right for sure, to verify your own test's results, and you see that your test is accurate 95 out of the hundred times.

So the question you have is: does the hypothesis make sense? Will you accept this hypothesis? Well, what you say is, "If my hypothesis were true, if the probability of my test being accurate is 99%, what's the probability of me getting this outcome?" Well, we figured that out: if it really were accurate 99% of the time, then the probability of getting this outcome is only 0.3%. So if you assume the hypothesis
(I'll just write "hyp") is true, the probability of the observed outcome is approximately 0.3%.

And so you say, "Look, it's definitely possible that I just got very, very unlucky, but based on this I probably should reject my hypothesis, because the probability of me getting this outcome if the hypothesis were true is very, very low." As we go deeper into statistics, you'll see that there are thresholds that people often set: if the probability of something happening or not happening is above or below some threshold, then we might reject a certain hypothesis.

But in this world, you could see that if my test really were accurate 99% of the time, then for it to be accurate only 95 out of 100 times when I apply it to 100 people, there's only a 0.3% chance that I would have seen this observation. So based on that, it might be completely reasonable to say, "You know what, I might reject my hypothesis and look for a new test. I don't feel good about this new cholesterol test that I constructed."
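
The video quotes its probabilities without showing the calculation. One way to reproduce them is sketched below. It assumes the 100 test applications are independent, so the number of accurate results follows a binomial distribution with n = 100 and p = 0.99. The transcript does not state that assumption or name a formula, and the names `prob_exactly_accurate`, `percentages`, and `observed` are hypothetical, chosen only for this illustration.

```python
from math import comb


def prob_exactly_accurate(k: int, n: int = 100, p: float = 0.99) -> float:
    """Probability that exactly k of n applications are accurate.

    Assumption (not stated in the transcript): the applications are
    independent, each accurate with probability p, so the count of
    accurate results is binomial.
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def percentages(counts, n: int = 100, p: float = 0.99) -> dict:
    """Map each count k to its probability as a percentage, to one decimal."""
    return {k: round(100 * prob_exactly_accurate(k, n, p), 1) for k in counts}


# Thought experiment: a test known to be 99% accurate, applied 100 times.
table = percentages(range(100, 94, -1))
for k, pct in table.items():
    print(f"{k} out of 100 accurate: about {pct}%")

# Hypothesis check: the new test was accurate 95 out of 100 times.
# If the hypothesis (p = 0.99) were true, how likely is that exact outcome?
observed = 95
print(f"P({observed} accurate | hypothesis true) is about {table[observed]}%")
```

Under these assumptions, the values for 100 down to 95 accurate results round to 36.6%, 37.0%, 18.5%, 6.1%, 1.5%, and 0.3%. The 97 case is the one the video quotes as six percent. The sketch reports only the probability of the exact observed outcome, as the video does. It does not apply a rejection threshold, since the video leaves thresholds for later.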