# Hypothesis testing and p-values

## Video transcript

A neurologist is testing the effect of a drug on response time by injecting 100 rats with a unit dose of the drug, subjecting each to a neurological stimulus, and recording its response time. The neurologist knows that the mean response time for rats not injected with the drug is 1.2 seconds. The mean of the 100 injected rats' response times is 1.05 seconds, with a sample standard deviation of 0.5 seconds. Do you think that the drug has an effect on response time?

So to do this we're going to set up two hypotheses. The first hypothesis we're going to call the null hypothesis, and that is that the drug has no effect on response time. And your null hypothesis is always going to be-- you can view it as a status quo. You assume that whatever you're researching has no effect. So the drug has no effect. Or another way to think about it is that the mean response time of the rats taking the drug-- let me write it this way-- the mean is still going to be 1.2 seconds even with the drug. So that's essentially saying it has no effect, because we know that if you don't give the drug, the mean response time is 1.2 seconds.

Now, what you want is an alternative hypothesis. The hypothesis is no, I think the drug actually does do something. So the alternative hypothesis, right over here, is that the drug has an effect. Or another way to think about it is that the mean does not equal 1.2 seconds when the drug is given.

So how do we think about this? How do we know whether we should accept the alternative hypothesis or whether we should just default to the null hypothesis because the data isn't convincing? And the way we're going to do it in this video, and this is really the way it's done in pretty much all of science, is you say OK, let's assume that the null hypothesis is true. If the null hypothesis was true, what is the probability that we would have gotten these results with the sample?
And if that probability is really, really small, then the null hypothesis probably isn't true. We could probably reject the null hypothesis and we'll say well, we kind of believe in the alternative hypothesis.

So let's think about that. Let's assume that the null hypothesis is true. So if we assume the null hypothesis is true, let's try to figure out the probability that we would have actually gotten this result, that we would have actually gotten a sample mean of 1.05 seconds with a standard deviation of 0.5 seconds. And actually what we're going to do is not just figure out the probability of this exact result, but the probability of getting something like this or even more extreme than this. So how likely of an event is that?

To think about that, let's just think about the sampling distribution if we assume the null hypothesis. So the sampling distribution is like this. It'll be a normal distribution. We have a good-sized sample, we have 100 rats here. So this is the sampling distribution. It will have a mean. Now if we assume the null hypothesis, that the drug has no effect, the mean of our sampling distribution will be the same thing as the mean of the population distribution, which would be equal to 1.2 seconds.

Now, what is the standard deviation of our sampling distribution? The standard deviation of our sampling distribution should be equal to the standard deviation of the population distribution divided by the square root of our sample size, so divided by the square root of 100. We do not know what the standard deviation of the entire population is. So what we're going to do is estimate it with our sample standard deviation. And it's a reasonable thing to do, especially because we have a nice sample size. The sample size is 100. So this is going to be a pretty good approximation for this over here.
So we could say that this is going to be approximately equal to our sample standard deviation divided by the square root of 100. Our sample standard deviation is 0.5 seconds, and we want to divide that by the square root of 100, which is 10. So 0.5 divided by 10 is 0.05. So the standard deviation of our sampling distribution is going to be-- and we'll put a little hat over it to show that we approximated the population standard deviation with the sample standard deviation-- it is going to be equal to 0.5 divided by 10. So 0.05.

So what is the probability-- so let's think about it this way. What is the probability of getting 1.05 seconds? Or another way to think about it is, how many standard deviations away from this mean is 1.05 seconds, and what is the probability of getting a result at least that many standard deviations away from the mean? So let's figure out how many standard deviations away from the mean that is.

Now essentially we're just figuring out a Z-score, a Z-score for this result right over there. So let me pick a nice color-- I haven't used orange yet. So our Z-score-- you could even call it the Z-statistic, since it's being derived from these other sample statistics. So our Z-statistic, how far are we away from the mean? Well, the mean is 1.2. And we are at 1.05, so I'll subtract that from 1.2 just so that it'll be a positive distance. So that's how far away we are. And if we wanted it in terms of standard deviations, we want to divide it by our best estimate of the sampling distribution's standard deviation, which is this 0.05. So what is this going to be equal to? 1.2 minus 1.05 is 0.15. So this is 0.15 in the numerator divided by 0.05 in the denominator, and so this is going to be 3. So this result right here is 3 standard deviations away from the mean. So let me draw this. This is the mean.
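
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation in the transcript; the variable names (`mu_0`, `x_bar`, `s`, `n`) are assumptions chosen for this example and do not come from the video.

```python
import math

# Values taken from the problem statement.
mu_0 = 1.2    # mean response time without the drug (seconds)
x_bar = 1.05  # sample mean of the 100 injected rats (seconds)
s = 0.5       # sample standard deviation (seconds)
n = 100       # sample size

# Estimated standard deviation of the sampling distribution of the mean,
# using the sample standard deviation in place of the unknown population one.
standard_error = s / math.sqrt(n)

# Distance of the sample mean from the hypothesized mean, in standard errors.
# The absolute value mirrors the transcript's "positive distance".
z = abs(x_bar - mu_0) / standard_error

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.2f}")
```

Floating-point subtraction makes `z` very slightly different from 3 internally, which is why the values are printed rounded to two decimals.
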
If I did 1 standard deviation, 2 standard deviations, 3 standard deviations-- that's in the positive direction. Actually, let me draw it a little bit different than that. This wasn't a nicely drawn bell curve, but I'll do 1 standard deviation, 2 standard deviations, and then 3 standard deviations in the positive direction. And then we have 1 standard deviation, 2 standard deviations, and 3 standard deviations in the negative direction. So this result right here, 1.05 seconds, that we got for our 100 rat sample is right over here, 3 standard deviations below the mean.

Now what is the probability of getting a result this extreme by chance? And when I talk about this extreme, it could be either a result less than this or a result that extreme in the positive direction, more than 3 standard deviations above the mean. So if we think about the probability of getting a result more extreme than this result right over here, we're thinking about this area under the bell curve, both in the negative direction and in the positive direction. What is the probability of that?

Well, we know from the empirical rule that 99.7% of the probability is within 3 standard deviations. You can look it up on a Z-table as well, but 3 standard deviations is a nice clean number that doesn't hurt to remember. So we know that this area right here, which I'm doing in just reddish-orange, that area right over here is 99.7%. So what is left for these two magenta or pink areas? Well, if this is 99.7%, then both of these combined are going to be 0.3%. Or if we wrote it as a decimal, it would be 0.003 of the total area under the curve.

So to answer our question, if we assume that the drug has no effect, the probability of getting a sample this extreme or actually more extreme than this is only 0.3%, less than 1 in 300.
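
The 0.3% figure comes from the rounded empirical rule. As a rough check, the two-tailed area can be computed directly from the standard normal distribution. The sketch below uses Python's standard-library `math.erfc`; the helper name `two_sided_p_value` is made up for this example.

```python
import math

def two_sided_p_value(z):
    """Probability that a standard normal variable lands at least |z|
    standard deviations from the mean, in either direction.

    Uses the identity P(|Z| >= z) = erfc(z / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))

p = two_sided_p_value(3.0)
print(f"two-tailed area beyond 3 standard deviations = {p:.4f}")
```

This gives roughly 0.0027, a little under the 0.003 obtained from the empirical rule, so the conclusion in the transcript is unchanged.
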
So if the null hypothesis was true, there's only about a 1 in 300 chance that we would have gotten a result this extreme or more. So at least from my point of view, this result seems to favor the alternative hypothesis. I'm going to reject the null hypothesis. I don't know for 100% sure. But if the null hypothesis was true, there's only about a 1 in 300 chance of getting this. So I'm going to go with the alternative hypothesis.

And just to give you some of the names or labels you might see in statistics or in some research papers, this value, the probability of getting a result this extreme or more extreme given the null hypothesis, is called a P-value. So the P-value here, and that really just stands for probability value, the P-value right over here is 0.003. So there's a very, very small probability that we could have gotten this result if the null hypothesis was true, so we will reject it.

And in general, most people have some type of a threshold here. If you have a P-value less than 5%, which means less than a 1 in 20 shot, you say, you know what, I'm going to reject the null hypothesis. There's less than a 1 in 20 chance of getting that result. Here we got much less than 1 in 20. So this is a very strong indicator that the null hypothesis is incorrect and that the drug has some effect.
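
The threshold rule described above can be written as a tiny decision function. This is a minimal sketch: the name `reject_null` and the parameter `alpha` are assumptions for illustration, with `alpha` defaulting to the 5% threshold mentioned in the transcript.

```python
def reject_null(p_value, alpha=0.05):
    """Return True when the P-value falls below the chosen threshold."""
    return p_value < alpha

# P-value from the rat experiment, as stated in the transcript.
p_value = 0.003
decision = reject_null(p_value)
print("reject the null hypothesis" if decision else "do not reject the null hypothesis")
```

With a P-value of 0.003 the function returns `True`, matching the decision reached in the video. A P-value of, say, 0.2 would not clear the 5% threshold.
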