
## Statistics and probability

### Course: Statistics and probability > Unit 12

Lesson 2: Error probabilities and power

# Consequences of errors and significance

Practice thinking about which type of error has more serious consequences and adjusting the significance level to prevent that type of error.

## Introduction

Significance tests often use a significance level of $\alpha =0.05$, but in some cases it makes sense to use a different significance level. Changing $\alpha$ impacts the probabilities of Type I and Type II errors. In some tests, one kind of error has more serious consequences than the other. We may want to choose different values for $\alpha$ in those cases.

## Review: Error probabilities and $\alpha$‍

A Type I error is when we reject a true null hypothesis. Lower values of $\alpha$ make it harder to reject the null hypothesis, so choosing lower values for $\alpha$ can reduce the probability of a Type I error. The consequence here is that if the null hypothesis is false, it may be more difficult to reject using a low value for $\alpha$. So using lower values of $\alpha$ can increase the probability of a Type II error.
A Type II error is when we fail to reject a false null hypothesis. Higher values of $\alpha$ make it easier to reject the null hypothesis, so choosing higher values for $\alpha$ can reduce the probability of a Type II error. The consequence here is that if the null hypothesis is true, increasing $\alpha$ makes it more likely that we commit a Type I error (rejecting a true null hypothesis).
Let's look at a few examples to see why it might make sense to use a higher or lower significance level.
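The trade-off described above can be checked with a small simulation. The sketch below is not from the original article: it assumes a one-sided z-test for a mean with known standard deviation, and the sample size, effect size, and trial count are illustrative choices. It estimates both error probabilities at several significance levels.

```python
import random
from statistics import NormalDist

random.seed(0)

def one_sided_z_test(sample, mu0, sigma):
    """p-value for H0: mu = mu0 vs Ha: mu > mu0, with sigma known."""
    n = len(sample)
    z = (sum(sample) / n - mu0) * (n ** 0.5) / sigma
    return 1 - NormalDist().cdf(z)

def error_rates(alpha, trials=10_000, n=30, mu0=0.0, mu_alt=0.5, sigma=1.0):
    # Type I: H0 is true (mu = mu0), but we reject it (p-value < alpha).
    type1 = sum(
        one_sided_z_test([random.gauss(mu0, sigma) for _ in range(n)], mu0, sigma) < alpha
        for _ in range(trials)
    ) / trials
    # Type II: H0 is false (mu = mu_alt), but we fail to reject it.
    type2 = sum(
        one_sided_z_test([random.gauss(mu_alt, sigma) for _ in range(n)], mu0, sigma) >= alpha
        for _ in range(trials)
    ) / trials
    return type1, type2

for alpha in (0.01, 0.05, 0.10):
    t1, t2 = error_rates(alpha)
    print(f"alpha={alpha:.2f}  P(Type I)~{t1:.3f}  P(Type II)~{t2:.3f}")
```

The Type I error rate tracks $\alpha$ itself, while the Type II error rate moves in the opposite direction as $\alpha$ changes, which is exactly the trade-off described above.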

## Example 1

Employees at a health club do a daily water quality test in the club's swimming pool. If the level of contaminants is too high, they temporarily close the pool to perform a water treatment.
We can state the hypotheses for their test as ${H}_{0}:$ The water quality is acceptable vs. ${H}_{\text{a}}:$ The water quality is not acceptable.
Question A (Example 1)
What would be the consequence of a Type I error in this setting?

Question B (Example 1)
What would be the consequence of a Type II error in this setting?

Question C (Example 1)
In terms of safety, which error has the more dangerous consequences in this setting?

Since one error involves greater safety concerns, the club is considering using a value for $\alpha$ other than $0.05$ for the water quality significance test.
Question D (Example 1)
What significance level should they use to reduce the probability of the more dangerous error?

## Example 2

Seth is starting his own food truck business, and he's choosing cities where he'll run his business. He wants to survey residents and test whether or not the demand is high enough to support his business before he applies for the necessary permits to operate in a given city. He'll only choose a city if there's strong evidence that the demand there is high enough.
We can state the hypotheses for his test as ${H}_{0}:$ The demand is not high enough vs. ${H}_{\text{a}}:$ The demand is high enough.
Question A (Example 2)
What would be the consequence of a Type I error in this setting?

Question B (Example 2)
What would be the consequence of a Type II error in this setting?

Seth has determined that a Type I error is more costly to his business than a Type II error. He wants to use a significance level other than $\alpha =0.05$ to reduce the likelihood of a Type I error.
Question
Which of these significance levels should Seth choose?

## Want to join the conversation?

• Hello folks, very helpful article. At some point I would be very interested in a discussion of the difference between a significance level and a confidence level. I have read that they are complements: perhaps a 95% confidence level is a 5% significance level, and increasing one decreases the other. Or would it be safer to restrict the term confidence level to the confidence-interval context? Best wishes to all.
• I think you are right to consider a 95% confidence level as a 5% significance level, even in the discussion of confidence intervals. However, remember that a 95% confidence interval will mean a different range of values depending on whether it is a two-tailed test, a left-tailed test, or a right-tailed test.
• In Example 1: what if we interchanged H0 and Ha, so that H0: The water quality is NOT acceptable vs. Ha: The water quality IS acceptable? In this case, the Type I error would be the more dangerous one. Am I right here?
• Yup, and that would be more intuitive IMO, because a "null" hypothesis means a "lack" of something or something "not" being found: acceptable swimming conditions, in this case.
• I might be wrong, but I believe there is a small mistake in Example 2 regarding the hypotheses:

The null hypothesis contains a condition of inequality!
• In general I'm fairly sure Ho only has an equality, but there's no strict rule that says it can't contain an inequality! (Although I'm having trouble imagining what a sampling distribution for an Ho like that would be.)

*EDIT:* So I'm back and I understand it a little better now; ignore what I said above.

Strictly, Ho and Ha should be complementary, so Ho is supposed to be an inequality (for a one-sided significance test). However, in practice the sampling distribution we use for Ho is built from an equality. Here's an example:

p = pop proportion
p^ = sample proportion

Ho: p ≤ 0.5
Ha: p > 0.5

Here, Ho and Ha are complementary; p is always going to be in one or the other.

But now we want to actually do the test. What sampling distribution do we use, if Ho is an inequality?
Well, we can choose any p that is ≤ 0.5.

If we choose a low p, say 0.25, there's a higher chance of accepting our Ha even when Ho is true (a Type I error).
Imagine the true p was 0.4, making our Ho true and Ha false. If we take a sample and get something like p^ = 0.55, then under an assumed p = 0.25 that might still have a p-value below alpha, making us accept Ha even though it was false.

If we choose 0.5, we're the most confident that getting a p-value below our alpha means Ha is true, because we got a p^ that high even under the largest value of p allowed by Ho (0.5). Imagine the previous scenario: true p = 0.4, p^ = 0.55. Now we'll get a much higher p-value for p^, meaning we cannot reject Ho. This is good, because in reality Ho was true!
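To make this concrete, here's a small sketch computing the exact one-sided binomial p-value for an observed p^ = 0.55 under different null values of p. It is only illustrative: the sample size n = 100 and the candidate null values are assumptions, not from the article.

```python
from math import comb

def binom_p_value(successes, n, p0):
    """Exact P(X >= successes) for X ~ Binomial(n, p0):
    one-sided p-value against Ha: p > p0."""
    return sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
               for k in range(successes, n + 1))

n, observed = 100, 55  # p^ = 0.55
for p_null in (0.25, 0.40, 0.50):
    print(f"null p={p_null:.2f}  p-value={binom_p_value(observed, n, p_null):.4f}")
```

The boundary value p = 0.5 yields the largest p-value of any p ≤ 0.5, so testing at the boundary is the conservative choice: if we can reject Ho there, we could reject it at every smaller p as well.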
• The second question is possibly wrong. You should not start a null hypothesis with an inequality.
• It depends on how you translate "not high enough." For instance, you can state that the null hypothesis says the mean demand is less than or equal to a certain proportion, while the alternative hypothesis claims that the mean demand is greater than that proportion.
• Hi everybody,
I am confused about calculating the power of the test, so I want to know if there is a difference in the power calculation when Ho is true versus when it is false.

Thanks
• In fact, whether H0 or Ha is actually true doesn't matter for the values of power and alpha.

What changes with the actual result is which type of error we can end up making, and you may know which error goes with which case.

So what does change power? Alpha. The larger it is, the larger the power, decreasing the risk of a Type II error, though increasing the risk of a Type I error at the same time.
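The relationship described above (larger alpha, larger power) can be sketched numerically with a normal-approximation z-test for a proportion. The specific values below, null p₀ = 0.5, true p = 0.6, and n = 100, are illustrative assumptions, not from the article.

```python
from statistics import NormalDist

def power_one_sided_prop(p0, p_true, n, alpha):
    """Approximate power of the z-test for H0: p = p0 vs Ha: p > p0
    when the true proportion is p_true (normal approximation)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    se_null = (p0 * (1 - p0) / n) ** 0.5          # SE under H0, sets the cutoff
    se_true = (p_true * (1 - p_true) / n) ** 0.5  # SE at the true proportion
    cutoff = p0 + z_crit * se_null                # rejection threshold for p-hat
    z = (cutoff - p_true) / se_true
    return 1 - NormalDist().cdf(z)                # P(p-hat exceeds the cutoff)

for alpha in (0.01, 0.05, 0.10):
    pw = power_one_sided_prop(p0=0.5, p_true=0.6, n=100, alpha=alpha)
    print(f"alpha={alpha:.2f}  power~{pw:.3f}  P(Type II)~{1 - pw:.3f}")
```

Raising alpha lowers the rejection cutoff, so more samples clear it when Ha is true: power goes up and the Type II error probability goes down, at the cost of more Type I errors.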
• The null hypothesis and the alternative hypothesis could be swapped, couldn't they?