
## Statistics and probability


Lesson 5: More significance testing videos

# One-tailed and two-tailed tests

Sal extends his discussion of the drug's effect to one-tailed and two-tailed hypothesis tests. Created by Sal Khan.

## Want to join the conversation?

- Shouldn't the sum of the rejection regions in your two-tailed test (0.3%) be the same as the rejection region in your one-tailed test (0.3%), and not the 0.15% stated in the video? So the area of the rejection region in a one-tailed test is alpha, but in a two-tailed test each tail is alpha/2? Also note that the two hypotheses have to be mutually exhaustive. In other words, the null hypothesis should be that the mean is greater than or equal to the default mean of 1.2 seconds, not simply equal to it. Thanks!(13 votes)
- Yes, I noticed this problem too. The sum of the two tails should be 0.3%, but when using a one-tailed test, the region of rejection (alpha) should equal 0.3% as well.(13 votes)
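
The alpha versus alpha/2 point raised in this thread can be checked numerically. The following is a minimal sketch, assuming a significance level of 0.05 chosen only for illustration (the video does not fix one) and a standard normal sampling distribution. It uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level, for illustration only
std_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed (lower) test: the whole of alpha sits in one tail.
one_tailed_cutoff = std_normal.inv_cdf(alpha)

# Two-tailed test: alpha is split, with alpha/2 in each tail.
two_tailed_cutoff = std_normal.inv_cdf(alpha / 2)

print(f"one-tailed: reject if z < {one_tailed_cutoff:.3f}")
print(f"two-tailed: reject if |z| > {abs(two_tailed_cutoff):.3f}")
```

For a fixed alpha, the two-tailed cutoff lies farther from the mean than the one-tailed cutoff, because each tail only holds half of alpha.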

- To me, it seems like we would get the same answer if we had the alternative hypothesis that the drug is raising response time. But this seems very counter-intuitive, since the sample mean from the rats on drugs is lower than the population mean when the rats are not on drugs. Wouldn't this indicate that it is more probable that the drug lowers response time rather than raising it? Is P(drug lowers response time) = P(drug raises response time)? And why/why not?(7 votes)
- I totally agree, if we use the same logic we would end up with the same probability that drug raises response time.

Any idea how to resolve this problem?(4 votes)

- So, would it be fair to say that doing a one-tailed test leaves you more room to conclude that the alternative hypothesis is true? That is, when you did the one tailed test the p value wound up being even more extreme than with the two tailed test, so we could have actually had a slower response time (closer to the mean) and still have had the same p-value as in the two tailed test.(5 votes)
- Off the bat it could be said that a one-tailed test leaves more room to conclude that the alternative hypothesis is true. To decide if a one-tailed test can be used, one has to have some extra information about the experiment to know the direction from the mean (H1: drug lowers the response time). If the direction of the effect is unknown, a two tailed test has to be used, and the H1 must be stated in a way where the direction of the effect is left uncertain (H1: the drug has an effect on the response time).

A one tailed test does not leave more room to conclude that the alternative hypothesis is true. The benefit (increased certainty) of a one tailed test doesn't come free, as the analyst must know "something more", which is the direction of the effect, compared to a two tailed test.(3 votes)

- How do you tell a z-test from a t-test?(1 vote)
- The t-test is used when the sample size is less than 30. A z-test can still be used when you only have the sample standard deviation, as long as the sample size is 30 or more.(5 votes)

- Hi, everyone!

Quick question: What if we decided that our alternative hypothesis claimed that the drug increases the response time? Wouldn't we still have had a p-value of 0.0015 and, thus, rejected the null hypothesis and accepted the alternative? Isn't this inconsistent? I appreciate all the help that I can get! Thanks!(4 votes)
- I don't feel confused about this. If we were that wrong about our experiment, I think we ought to simply change to a two-tailed framing of the problem. That is: what's the probability that the drug has some effect (which allows for either an increase or a decrease in response time)?(1 vote)
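
One way to look at the question above is to compute the p-value under each one-sided alternative. This is a minimal sketch, assuming the sampling distribution is normal and that the observed sample mean sits exactly 3 standard deviations below 1.2 seconds, as in the video. The variable names are illustrative.

```python
from statistics import NormalDist

z = -3.0  # observed sample mean (1.05 s) is 3 standard deviations below 1.2 s

# H1: the drug lowers response time -> area to the LEFT of the observed z.
p_if_lowers = NormalDist().cdf(z)

# H1: the drug raises response time -> area to the RIGHT of the observed z.
p_if_raises = 1 - NormalDist().cdf(z)

print(f"p-value if H1 is 'lowers': {p_if_lowers:.4f}")
print(f"p-value if H1 is 'raises': {p_if_raises:.4f}")
```

Under the "raises" alternative, "as extreme or more extreme" means a sample mean at least as high as the one observed, which covers almost the whole distribution. The p-value is then close to 1, so the null hypothesis would not be rejected. The exact lower tail is a little smaller than the 0.0015 quoted in the video, which comes from rounding 99.7% for 3 standard deviations.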

- How would you know when to use a left-tailed test or a right-tailed test?

For example, suppose a company claims a drug makes you lose at least 20 pounds in a month. A sample of 20 people was used, with mean 15 and standard deviation 4. Test the company's claim at the 1 percent level.

How would I know it's a left-tailed test, and how would you write the null hypothesis and solve it?(3 votes)
- The words "at least" mean that the value will be greater than or equal to 20, and the alternative hypothesis only uses a strict sign such as "<" or ">", not an equals sign. So the given information tells us that the hypotheses are:

Null hypothesis: μ ≥ 20

Alternative hypothesis: μ < 20(3 votes)
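
The thread stops at the hypotheses, so here is a rough sketch of how the left-tailed test itself might be carried out, using the 20-pound claim from the question. It assumes a one-sample t-test is appropriate (small sample, sample standard deviation only) and takes the critical value from a standard t table rather than computing it. The names are illustrative.

```python
import math

claimed_mean = 20.0  # company's claim: at least 20 pounds lost
sample_mean = 15.0
sample_sd = 4.0
n = 20

# Standard error of the sample mean.
standard_error = sample_sd / math.sqrt(n)

# t statistic: how many standard errors the sample mean is from the claim.
t_stat = (sample_mean - claimed_mean) / standard_error

# Assumed table value: lower-tail critical t for 19 degrees of freedom at 1%.
t_critical = -2.539

# Left-tailed test: reject H0 (mu >= 20) if t falls below the critical value.
reject_null = t_stat < t_critical
print(f"t = {t_stat:.3f}, critical = {t_critical}, reject H0: {reject_null}")
```

It is a left-tailed test because only sample means well below 20 count as evidence against the claim.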

- When comparing one-tailed and two-tailed p-values, would the area under the curve for the one-tailed not be 0.3 and then the two-tailed be 0.15? Why would your one sided only be 0.15 if the actual total area under the curve is equivalent in both one-tailed and two-tailed?

Just confused because in class we were taught that if your significance level is 0.05 for a two-tailed test, the areas in the two tails are 0.025 each, which together make up 0.05. Thanks!(3 votes)
- The two tails are one for z < -3 and one for z > 3. Which is larger: both added together, or either one by itself? Which is larger: .0030 or .0015?(1 vote)

- What can be said about the two-sided p-value for testing the null hypothesis of no change in cholesterol levels, if on average after three months the cholesterol levels among 100 patients decreased by 15.0 and the standard deviation of the changes in cholesterol was 40?

thanks.(2 votes)
- Sample standard deviation = 40

Standard deviation of the sampling distribution = 40/√100 = 4

Sample mean change = 15/4 = 3.75 standard deviations below the mean change of 0 assumed by the null hypothesis

That is, the result lies more than 3 standard deviations from the mean, inside the critical region, so one can reject the null hypothesis. The one-sided p-value is less than .0013, so the two-sided p-value is less than .0026.(2 votes)
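
The arithmetic in this answer can be reproduced with a short sketch. It assumes a normal approximation is acceptable with 100 patients and uses the sample standard deviation in place of the unknown population value. Variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 100
mean_change = -15.0  # average decrease in cholesterol
sd_change = 40.0     # standard deviation of the changes

# Standard deviation of the sampling distribution of the mean change.
standard_error = sd_change / math.sqrt(n)

# z-score of the observed mean change under H0: mean change = 0.
z = mean_change / standard_error

p_one_sided = NormalDist().cdf(z)  # area in the lower tail only
p_two_sided = 2 * p_one_sided      # both tails, by symmetry

print(f"z = {z:.2f}")
print(f"one-sided p = {p_one_sided:.6f}, two-sided p = {p_two_sided:.6f}")
```

Because 3.75 standard deviations is farther out than 3, both p-values come out well below the .0013 and .0026 bounds that apply at exactly 3 standard deviations.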

- Why is the standard deviation of the mean different than the mean of the sample? Isn't sampling distribution just a distribution of the sample? why does that change the mean?(2 votes)
- I'm not sure what you are asking. I'll answer the question "Why is the standard deviation of the mean different than the mean of the sample?" and maybe you can rephrase what you are asking if this doesn't answer it.

The mean of the sample is a measure of the average response time in the particular sample. So they took the response time of each individual, added them together and divided by 100.

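The standard deviation of the sample is a measure of how much individual measurements vary around the sample mean. The standard deviation of the sampling distribution of the mean measures something else: how much the sample mean itself would vary from one sample to the next.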

They are related, the standard deviation is calculated using the mean, but they aren't the same thing.(1 vote)

- 7 out of 8,500 people vaccinated against a certain disease later developed the disease. 18 out of 10,000 people given a placebo later developed the disease. Test the claim that the vaccine is effective in lowering the incidence of the disease. Use a significance level of 0.01.(2 votes)
- These are two samples. Using 1 for "did not develop disease" and 0 for "did", the difference between the sample means is .0009765. The only way I can think of to test significance is to use "re-randomization" (permutation test) described in a previous video ( see https://www.khanacademy.org/math/probability/statistical-studies/hypothesis-test/v/statistical-significance-experiment ). I'm not prepared to do that. I personally suspect that the results (meager as they are) are insignificant - that if you re-randomize 150 times, and plot the results, the tails of the resulting PDF will be quite fat. (Also, this experiment doesn't include any information about side effects.)(1 vote)
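
Besides the permutation approach mentioned in the reply, another common option is a pooled two-proportion z-test. This is not the method the reply proposes. It is a swapped-in technique shown here as a sketch, and it relies on a normal approximation that is only rough with so few disease cases (7 and 18). Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Counts taken from the question above.
vaccine_cases, vaccine_n = 7, 8500
placebo_cases, placebo_n = 18, 10_000

p_vaccine = vaccine_cases / vaccine_n
p_placebo = placebo_cases / placebo_n

# Pooled proportion under H0: both groups share one disease rate.
p_pooled = (vaccine_cases + placebo_cases) / (vaccine_n + placebo_n)
standard_error = math.sqrt(
    p_pooled * (1 - p_pooled) * (1 / vaccine_n + 1 / placebo_n)
)

# One-tailed test, H1: the vaccine group has a LOWER disease rate.
z = (p_vaccine - p_placebo) / standard_error
p_value = NormalDist().cdf(z)

alpha = 0.01
reject_null = p_value < alpha
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0 at {alpha}: {reject_null}")
```

Under these assumptions z is about -1.8, which is not extreme enough to reject the null hypothesis at the 0.01 level. That agrees with the reply's suspicion that the result is not significant.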

## Video transcript

In the last video, our
null hypothesis was the drug had no effect. And our alternative hypothesis
was that the drug just has an effect. We didn't say whether the drug
would lower the response time or raise the response time. We just said the drug had an
effect, that the mean when you have the drug will not
be the same thing as the population mean. And then the null hypothesis
says no, your mean with the drug's going to be the same
thing as the population mean, it has no effect. In this situation where we're
really just testing to see if it had an effect, whether an
extreme positive effect, or an extreme negative effect,
would have both been considered an effect. We did something called a
two-tailed test. This is called a two-tailed test.
Because frankly, a super high response time, if you had a
response time that was more than 3 standard deviations,
that would've also made us likely to reject the
null hypothesis. So we were dealing with
kind of both tails. You could have done a similar
type of hypothesis test with the same experiment where you
only had a one-tailed test. And the way we could have done
that is we still could have had the null hypothesis be that
the drug has no effect. Or that the mean with the drug--
the mean, and maybe I could say the mean with the
drug-- is still going to be 1.2 seconds, our mean
response time. Now if we wanted to do a
one-tailed test, but for some reason we already had maybe a
view that this drug would lower response times, then our
alternative hypothesis-- and just so you get familiar with
different types of notation, some books or teachers will
write the alternative hypothesis as H1, sometimes
they write it as H alternative, either
one is fine. If you want to do a one-tailed
test, you could say that the drug lowers response time. Or that the mean with the drug
is less than 1.2 seconds. Now if you do a one-tailed test
like this, what we're thinking about is, what we want
to look at is, all right, we have our sampling
distribution. Actually, I can just use the
drawing that I had up here. You had your sampling
distribution of the sample mean. We know what the mean of that
was, it's 1.2 seconds, same as the population mean. We were able to estimate its
standard deviation using our sample standard deviation, and
that was reasonable because it had a sample size of greater
than 30, so we can still kind of deal with a normal
distribution for the sampling distribution. And using that we saw that the
result, the sample mean that we got, the 1.05 seconds,
is 3 standard deviations below the mean. So if we look at it-- let me
just re-draw it with our new hypothesis test. So this is
the sampling distribution. It has a mean right over
here at 1.2 seconds. And the result we got
was 3 standard deviations below the mean. 1, 2, 3 standard deviations
below the mean. That was what our 1.05
seconds were. So when you set it up like this
where you're not just saying that the drug has an
effect-- in that case, and that was the last video, you'd
look at both tails. But here we're saying all we
care about is, does the drug lower our response time? And just like we did before, you
say OK, let's say the drug doesn't lower our
response time. If the drug doesn't lower our
response time, what was the probability or what is the
probability of getting a lowering this extreme
or more extreme? So here it will only be one
of the tails that we could consider when we set our
alternative hypothesis like that, that we think it lowers. So if our null hypothesis is
true, the probability of getting a result more extreme
than 1.05 seconds, now we are only considering this tail
right over here. Let me just put it this way. More extreme than 1.05 seconds,
or let me say, lower. Because in the last video we
cared about more extreme because even a really high
result would have said, OK, the mean's definitely
not 1.2 seconds. But in this case we care about
means that are lower. So now we care about the
probability of a result lower than 1.05 seconds. That's the same thing as
sampling-- of getting a sample from the sampling distribution
that's more than 3 standard deviations below the mean. And in this case, we're only
going to consider the area in this one tail. So this right here would be a
one-tailed test where we only care about one direction
below the mean. If you look at the one-tailed
test-- this area over here-- we saw last time that both of
these areas combined are 0.3%. But if you're only considering
one of these areas, if you're only considering this one over
here it's going to be half of that, because the normal
distribution is symmetric. So this one right here is going
to be 0.15%, or if you express it as a decimal, this
is going to be 0.0015. So once again, if you set up
your hypotheses like this, you would have said, if your null
hypothesis is correct, there would have only been a 0.15%
chance of getting a result lower than the result we got. So that would be very unlikely,
so we will reject the null hypothesis and go
with the alternative. And in this situation
your P-value is going to be the 0.0015.
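
The calculation walked through in the transcript can be sketched in a few lines. The standard deviation of the sampling distribution (0.05 seconds) is not stated in this transcript. It is an assumption inferred from 1.05 seconds being 3 standard deviations below 1.2 seconds. The sketch also uses the exact normal tail area, whereas the video rounds with the 99.7% rule to get 0.3% and 0.15%.

```python
from statistics import NormalDist

population_mean = 1.2  # mean response time without the drug (seconds)
sample_mean = 1.05     # mean response time observed with the drug (seconds)
standard_error = 0.05  # assumed: implied by "3 standard deviations below the mean"

# How many standard deviations the sample mean lies from the population mean.
z = (sample_mean - population_mean) / standard_error

# One-tailed p-value: only the lower tail counts (H1: mean < 1.2 s).
p_one_tailed = NormalDist().cdf(z)

# Two-tailed p-value: both tails count (H1: mean != 1.2 s).
p_two_tailed = 2 * p_one_tailed

print(f"z = {z:.2f}")
print(f"one-tailed p = {p_one_tailed:.5f}")
print(f"two-tailed p = {p_two_tailed:.5f}")
```

The one-tailed p-value is exactly half the two-tailed one because the normal distribution is symmetric, which is the halving step described above.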