Introduction to power in significance tests
Want to join the conversation?
- Hey, I don't quite understand why two overlapping distribution graphs are shown?
- If the null hypothesis is in fact correct, then the hypothesized and actual sampling distributions are one and the same, centered on μ₁. In this event, there is only one distribution. If however the null hypothesis is false, then there must exist a different [second] sampling distribution, centered on μ₂, which is likely to overlap to some extent with the hypothetical distribution centered on μ₁.
- At 4:20, shouldn't Sal write α/2 when marking a rejection region, since we are doing a two-tailed test and the total probability corresponding to our significance level is split between the two tails?
- At 6:52, how does increasing the alpha level increase power? More area for the alpha level means less for power, right?
- Because if you draw the two curves, where they intersect there is a little triangle. If you fill most of that triangle with alpha (the significance level), you end up with little room to make a Type II error (failing to reject H₀ when it is false).
- Will there ever be a time when a large sample size has negative effects on statistical analysis?
- In some very specific situations that's possible. If you currently have a p-value that would lead you to the correct conclusion, sampling more might change it to one where you would draw the wrong conclusion. But don't use that as a reason to keep your sample small: larger samples will generally only increase the quality of statistical analyses.
- Why isn't the power of a test against a specific alternative always equal to 100 percent, even if the specified alternative is clearly different from the null hypothesis value and supports the alternative hypothesis? For example, let's say that the null hypothesis of a population proportion is 0.13. A specified alternative is 0.17. Why is the power of a test against 0.17 not equal to 100 percent?
- The significance test gives the probability of getting the sample result, given the distribution of sample means, were the population mean to equal whatever value is ascribed to it by the null hypothesis. Most importantly, the shape of the distribution of sample means, which influences the probability, is affected by your actual sample size, from which your study result is taken. Have I got this right?
- So then is power contingent on the two graphs intersecting at the α-value?
- Hello, I want to know whether the graph of the two curves where you explain the power and the errors is required memorization for the AP exam. By memorize, I mean: am I required to fully and deeply understand how it works, and will I face questions that use graphs about the errors and power?
- In this video "Not rejecting
H₀
given that it is false", should be, "Not rejectingH₀
given that it isμ₂
". If instead you wanted to calculate "Not rejectingH₀
given that it is false", you would need to integrate over all possibleμ
, given some prior (and this would still involve you assuming the distribution is from the normal family).(1 vote) - Can anyone explain the graphical representation of power ()? 5:54
- I need to know why P(not rejecting H₀ | H₀ is false) is the alpha end of the curve; shouldn't it be 1 − alpha?
Video transcript
- [Instructor] What we are going to do in this video is talk about the idea of power when we are dealing with significance tests. Power is an idea that you might encounter in a first-year statistics course. It turns out that it's fairly difficult to calculate, but it's interesting to know what it means and what the levers are that might increase or decrease the power of a significance test.

So just to cut to the chase, power is a probability. You can view it as the probability that you do the right thing when the null hypothesis is not true, and the right thing is to reject the null hypothesis when it's not true. So it's the probability of rejecting your null hypothesis given that the null hypothesis is false; we can view it as a conditional probability like that. But there are other ways to conceptualize it. We can connect it to Type II errors. For example, you could say this is equal to one minus the probability of not rejecting the null hypothesis given that the null hypothesis is false. And that thing I just described, not rejecting the null hypothesis given that the null hypothesis is false, is the definition of a Type II error. So you could view power as just the probability of not making a Type II error, or one minus the probability of making a Type II error. So what are the things that would actually drive power?
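In symbols, power = P(reject H₀ | H₀ false) = 1 − P(Type II error). As a minimal sketch of that definition (not from the video), here is a Monte Carlo simulation in Python of a two-sided z-test; the test type and the values mu_1, mu_2, sigma, n, and alpha are all hypothetical choices for illustration:

```python
# Estimate power by simulation: repeatedly sample from a world where H0 is
# false (true mean is mu_2) and count how often the test rejects H0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu_1, mu_2 = 100.0, 103.0      # hypothesized vs. actual population mean (assumed)
sigma, n, alpha = 10.0, 25, 0.05

trials = 100_000
samples = rng.normal(mu_2, sigma, size=(trials, n))      # H0 is false here
z = (samples.mean(axis=1) - mu_1) / (sigma / np.sqrt(n)) # z-statistic per trial
p_values = 2 * stats.norm.sf(np.abs(z))                  # two-sided p-values

power = (p_values < alpha).mean()   # fraction of rejections = P(reject H0 | H0 false)
print(f"estimated power = {power:.3f}")  # equivalently 1 - P(Type II error)
```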
To help us conceptualize that, I'll draw two sampling distributions: one where we assume that the null hypothesis is true, and one where we assume that the null hypothesis is false and the true population parameter is something different from what the null hypothesis says. So for example, let's say that we have a null hypothesis that our population mean is equal to, let's just call it, mu one, and we have an alternative hypothesis, H sub a, that says, "Hey, no, the population mean is not equal to mu one." If you assume a world where the null hypothesis is true, what would be our sampling distribution? Remember what we do in significance tests: we have some population, and our hypotheses make some statement about a parameter in that population. To test it, we take a sample of a certain size and calculate a statistic, in this case the sample mean, and we ask: if we assume that our null hypothesis is true, what is the probability of getting that sample statistic? And if that's below a threshold, which we call a significance level, we reject the null hypothesis.
So in the world that we have been living in, where you assume the null hypothesis is true, you might have a sampling distribution that looks something like this. If the null hypothesis is true, then the center of your sampling distribution would be right over here at mu one, and given your sample size, you would get a certain sampling distribution for the sample means. If your sample size increases, this will be narrower; if it decreases, it is going to be wider. And you set a significance level, which is essentially your probability of rejecting the null hypothesis even if it is true. You can view your significance level, as we've talked about, as the probability of making a Type I error. So your significance level is some area, let's say this area that I'm shading in orange right over here. If you took a sample, calculated its sample mean, and happened to fall in that area, then you would reject your null hypothesis. Now if the null hypothesis actually was true, you would be committing a Type I error without knowing it.
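Since this is a two-tailed test (H_a: mu ≠ mu_1), that shaded significance area is split between the two tails, α/2 in each. Here is a minimal sketch of the resulting rejection region, again with hypothetical mu_1, sigma, n, and alpha:

```python
# The two-sided rejection region implied by a significance level alpha.
import numpy as np
from scipy import stats

mu_1, sigma, n, alpha = 100.0, 10.0, 25, 0.05   # assumed values
se = sigma / np.sqrt(n)                          # standard error of the sample mean

# Critical sample means: reject H0 if x_bar falls outside (lower, upper).
lower = stats.norm.ppf(alpha / 2, loc=mu_1, scale=se)
upper = stats.norm.ppf(1 - alpha / 2, loc=mu_1, scale=se)
print(f"reject H0 if x_bar < {lower:.2f} or x_bar > {upper:.2f}")

# If H0 is actually true, the probability of landing in that region is alpha:
type_i = stats.norm.cdf(lower, mu_1, se) + stats.norm.sf(upper, mu_1, se)
print(f"P(Type I error) = {type_i:.3f}")         # equals alpha by construction
```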
But for power, we are concerned with a Type II error, so here it's a probability conditional on our null hypothesis being false. Let's construct another sampling distribution for the case where the null hypothesis is false. Imagine a world where the null hypothesis is false and our mean is actually mu two, and let's say that mu two is right over here. In that reality, our sampling distribution might look something like this. Once again, it will be for a given sample size: the larger the sample size, the narrower this bell curve would be. So in this world, we should be rejecting the null hypothesis. But for which samples do we fail to reject the null hypothesis even though we should? Well, we're not going to reject the null hypothesis if we get a sample here, or here, or here: a sample that, if you assume the null hypothesis is true, isn't that unlikely. So the probability of making a Type II error, where we should reject the null hypothesis but we don't, is actually this area right over here. And the power, the probability of rejecting the null hypothesis given that it's false, which would be this red distribution, is the rest of this area right over here.
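Putting those two areas together, here is one way power could be computed exactly for the same hypothetical two-sided z-test (a sketch, not the video's method): beta, the Type II error probability, is the area of the mu_2 curve that lands inside the null's non-rejection region, and power is 1 − beta.

```python
# Exact power for a two-sided z-test with known sigma (assumed setup).
import numpy as np
from scipy import stats

def power(mu_1, mu_2, sigma, n, alpha):
    se = sigma / np.sqrt(n)
    lower = stats.norm.ppf(alpha / 2, mu_1, se)      # rejection-region bounds,
    upper = stats.norm.ppf(1 - alpha / 2, mu_1, se)  # set by the null curve
    # beta: area of the mu_2 curve inside the non-rejection region
    beta = stats.norm.cdf(upper, mu_2, se) - stats.norm.cdf(lower, mu_2, se)
    return 1 - beta                                  # power = 1 - P(Type II error)

print(f"power = {power(100.0, 103.0, 10.0, 25, 0.05):.3f}")
```

With these numbers the result agrees closely with the Monte Carlo estimate from the earlier sketch.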
So how can we increase the power? Well, one way is to increase our alpha, our significance level. The significance level is an area, so if we increase that area, expanding the rejection region, we have increased the power, because now this yellow area is larger; we've pushed its boundary to the left. Now you might say, "Oh, well, hey, if power sounds like a good thing, why don't we just always increase alpha?" The problem is that if you take alpha, your significance level, and increase it, that will increase the power, but it's also going to increase your probability of a Type I error, because remember, that's one way to conceptualize what alpha is. It's the probability of a Type I error.
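You can see the trade-off with the hypothetical power() sketch from above: each step up in alpha buys more power, but alpha itself is the Type I error probability.

```python
# Reusing the power() sketch above: larger alpha -> larger power,
# but also a larger chance of a Type I error.
for a in (0.01, 0.05, 0.10):
    print(f"alpha = {a:.2f} -> power = {power(100.0, 103.0, 10.0, 25, a):.3f}")
```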
Now what are other ways to increase your power? Well, if you increase your sample size, then both of these sampling distributions are going to get narrower, and that situation where you are not rejecting your null hypothesis even though you should is going to have a lot less area; one way to think about it, there's going to be a lot less overlap between these two sampling distributions. So another way: if you increase n, your sample size, that's going to increase your power, and this is generally a good thing if you can do it.
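Again with the hypothetical power() sketch: holding everything else fixed, a larger n narrows both curves, shrinks their overlap, and raises power.

```python
# Reusing the power() sketch above: larger n -> narrower sampling
# distributions -> more power, all else equal.
for n in (25, 50, 100):
    print(f"n = {n:3d} -> power = {power(100.0, 103.0, 10.0, n, 0.05):.3f}")
```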
Now, other things that may or may not be under your control: the less variability there is in the dataset, measured by the variance or standard deviation of your underlying data, the narrower the sampling distributions, and that would also increase the power. Another thing that would increase the power is if the true parameter is further away from what the null hypothesis is saying. So these two are not typically under your control, but the sample size is, and the significance level is. With the significance level there's a trade-off, though: if you increase the power that way, you're also increasing the probability of a Type I error. So a lot of researchers might say, "Hey, if a Type II error is worse, I'm willing to make this trade-off: I'll increase the significance level. But if a Type I error is actually what I'm afraid of, then I wouldn't want to use this lever." But in any case, increasing your sample size, if you can do it, is going to be a good thing.
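For completeness, the same hypothetical power() sketch shows the two levers that usually aren't under your control: less underlying variability, and a true mean farther from the null value, both raise power.

```python
# Reusing the power() sketch above.
for sigma in (10.0, 5.0):            # less spread -> more power
    print(f"sigma = {sigma:4.1f} -> power = {power(100.0, 103.0, sigma, 25, 0.05):.3f}")
for mu_2 in (103.0, 106.0):          # bigger effect -> more power
    print(f"mu_2 = {mu_2:5.1f} -> power = {power(100.0, mu_2, 10.0, 25, 0.05):.3f}")
```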