Hypothesis test for difference of means
- In the last video, we came up with a 95% confidence interval
- for the mean weight loss between the low-fat group and
- the control group.
- In this video, I actually want to do a hypothesis test,
- really to test if this data makes us believe that the
- low-fat diet actually does anything at all.
- And to do that let's set up our null and alternative hypotheses.
- So our null hypothesis should be that this
- low-fat diet does nothing.
- And if the low-fat diet does nothing, that means that the
- population mean on our low-fat diet minus the population mean
- on our control should be equal to zero.
- And this is a completely equivalent statement to saying
- that the mean of the sampling distribution of our low-fat
- diet minus the mean of the sampling distribution of our
- control should be equal to zero.
- And that's because we've seen this multiple times.
- The mean of your sampling distribution is going to be
- the same thing as your population mean.
- So this is the same thing as that.
- That is the same thing as that.
- Or, another way of saying it is, if we think about the mean
- of the distribution of the difference of the sample
- means, and we focused on this in the last video, that that
- should be equal to zero.
- Because this thing right over here is the same thing as that
- right over there.
- So that is our null hypothesis.
- And our alternative hypothesis,
- I'll write over here.
- It's just that it actually does do something.
- And let's say that it actually has an improvement.
- So that would mean that we have more weight loss.
- So if we have the mean of Group One, the population mean
- of Group One minus the population mean of Group Two
- should be greater than zero.
- So this is going to be a one-tailed test.
- Or another way we can view it, is that the mean of the
- distribution of the difference of the sample means, x1 minus x2, is
- going to be greater than zero.
- These are equivalent statements.
- Because we know that this is the same thing as this, which
- is the same thing as this, which is what I
- wrote right over here.
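Written out in symbols, the two hypotheses described above might look like the following. This is only a notational sketch: subscript 1 is assumed to label the low-fat group and subscript 2 the control group, and the second form on each line restates the hypothesis in terms of the mean of the distribution of the difference of the sample means.

```latex
\begin{aligned}
H_0 &: \mu_1 - \mu_2 = 0
  \quad\Longleftrightarrow\quad \mu_{\bar{x}_1 - \bar{x}_2} = 0 \\
H_1 &: \mu_1 - \mu_2 > 0
  \quad\Longleftrightarrow\quad \mu_{\bar{x}_1 - \bar{x}_2} > 0
\end{aligned}
```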
- Now, to do any type of hypothesis test, we have to
- decide on a level of significance.
- What we're going to do is, we're going to assume that our
- null hypothesis is correct.
- And then with that assumption that the null hypothesis is
- correct, we're going to see what is the probability of
- getting this sample data right over here.
- And if that probability is below some threshold, we will
- reject the null hypothesis in favor of the alternative hypothesis.
- Now, that probability threshold, and we've seen this
- before, is called the significance level, sometimes
- called alpha.
- And here, we're going to decide for a significance
- level of 95%.
- Or another way to think about it, assuming that the null
- hypothesis is correct, we want there to be no more than a 5%
- chance of getting this result here.
- Or no more than a 5% chance of incorrectly rejecting the null
- hypothesis when it is actually true.
- Or that would be a type one error.
- So if there's less than a 5% probability of this happening,
- we're going to reject the null hypothesis.
- Less than a 5% probability given the null hypothesis is
- true, then we're going to reject the null hypothesis in
- favor of the alternative.
- So let's think about this.
- So we have the null hypothesis.
- Let me draw a distribution over here.
- The null hypothesis says that the mean of the differences of
- the sampling distributions should be equal to zero.
- Now, in that situation, what is going to be our critical
- region here?
- Well, we need a result, so we're going to need some
- critical value here.
- Because this isn't a normalized normal distribution.
- But there's some critical value here.
- The hardest thing in statistics is getting the
- wording right.
- There's some critical value here that the probability of
- getting a sample from this distribution above that value
- is only 5%.
- So we just need to figure out what this critical value is.
- And if our value is larger than that critical value, then
- we can reject the null hypothesis.
- Because that means the probability of getting this is
- less than 5%.
- We could reject the null hypothesis and go with the
- alternative hypothesis.
- Remember, once again, we can use Z-scores, and we can
- assume this is a normal distribution because our
- sample size is large for either of those samples.
- We have a sample size of 100.
- And to figure that out, the first step, if we just look at
- a normalized normal distribution like this, what
- is your critical Z value?
- Getting a result above that Z value
- only has a 5% chance.
- So this is actually cumulative.
- So this whole area right over here is
- going to be 95% chance.
- We can just look at the Z table.
- We're looking for 95%.
- We're looking at the one tailed case.
- So let's look for 95%.
- This is the closest thing.
- We want to err on the side of being a little bit maybe to
- the right of this.
- So let's say 0.9505 is pretty good.
- So that's 1.65.
- So this critical Z value is equal to 1.65.
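The table lookup above can be cross-checked numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the video.

```python
from statistics import NormalDist

alpha = 0.05  # significance level for the one-tailed test

# Find the z value with 95% of the standard normal distribution to its left,
# so that only 5% of the area lies above it.
z_critical = NormalDist().inv_cdf(1 - alpha)

print(round(z_critical, 3))  # 1.645
```

The exact value is slightly below the 1.65 read from the table, which is consistent with erring a little to the right.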
- Or another way to view it is, this distance right here is
- going to be 1.65 standard deviations.
- I know my writing is really small.
- I'm just saying the standard deviation of that distribution.
- So what is the standard deviation of that distribution?
- We actually calculated it in the last video, and I'll
- recalculate it here.
- The standard deviation of our distribution of the difference
- of the sample means is going to be equal to the square root
- of the variance of our first population.
- Now, the variance of our first population, we don't know it.
- But we could estimate it with our sample standard deviation.
- If you take your sample standard deviation, 4.67 and
- you square it, you get your sample variance.
- And so this is the variance.
- This is our best estimate of the variance of the population.
- And we want to divide that by the sample size.
- And then plus our best estimate of the variance of
- the population of group two, which is 4.04 squared.
- The sample standard deviation of group two squared.
- That gives us variance divided by 100.
- I did this before in the last video. Maybe it's still sitting on my calculator.
- Yes, it's still sitting on the calculator.
- It's this quantity right up here.
- It's the square root of 4.67 squared divided by 100 plus 4.04
- squared divided by 100.
- So it's 0.617.
- So this right here is going to be 0.617.
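The calculator step can be reproduced in a few lines. This is a sketch using the figures quoted in the transcript (sample standard deviations of 4.67 and 4.04, each sample of size 100); the variable names are illustrative assumptions.

```python
import math

s1, n1 = 4.67, 100  # group one: sample standard deviation, sample size
s2, n2 = 4.04, 100  # group two: sample standard deviation, sample size

# Standard deviation of the distribution of the difference of sample means,
# estimated by plugging sample variances in for the unknown population variances.
se_diff = math.sqrt(s1**2 / n1 + s2**2 / n2)

print(round(se_diff, 3))  # 0.617
```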
- So this distance right here, is going to
- be 1.65 times 0.617.
- So let's figure out what that is.
- So let's take 0.617 times 1.65.
- So it's 1.02.
- This distance right here is 1.02.
- So what this tells us is, if we assume that the diet
- actually does nothing, there's only a 5% chance of the
- difference between the means of these two samples being
- more than 1.02.
- There's only a 5% chance of that.
- Well, the difference between the sample means that we actually got is 1.91.
- So that's sitting out here someplace.
- So it definitely falls in this critical region.
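The comparison against the critical region can be sketched as follows, using the rounded values from the transcript (a critical Z of 1.65, a standard deviation of 0.617, and an observed difference of 1.91). The variable names are illustrative.

```python
z_critical = 1.65     # critical Z value read from the table
se_diff = 0.617       # standard deviation of the difference of sample means
observed_diff = 1.91  # difference between the two sample means

# Smallest difference that lands in the upper 5% tail if the null hypothesis holds.
critical_diff = z_critical * se_diff

print(round(critical_diff, 2))        # 1.02
print(observed_diff > critical_diff)  # True
```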
- The probability of getting this, assuming that the null
- hypothesis is correct, is less than 5%.
- So it's a smaller probability than our significance level.
- Actually, let me be very clear.
- The significance level, this alpha right
- here, needs to be 5%.
- Not the 95%.
- I think I might have said here.
- But I wrote down the wrong number there.
- I subtracted it from one by accident.
- Probably in my head.
- But anyway, the significance level is 5%.
- The probability given that the null hypothesis is true, the
- probability of getting the result that we got, the
- probability of getting that difference, is less than our
- significance level.
- It is less than 5%.
- So based on the rules that we set out for ourselves of
- having a significance level of 5%, we will reject the null
- hypothesis in favor of the alternative that the diet
- actually does make you lose more weight.
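The whole test can also be run end to end. The sketch below goes one step beyond the video: instead of only comparing against a critical value, it computes the tail probability itself (usually called the p-value) under the same normal approximation. It assumes Python 3.8 or later for `statistics.NormalDist`, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

observed_diff = 1.91  # difference between the two sample means
s1, n1 = 4.67, 100    # group one: sample standard deviation, sample size
s2, n2 = 4.04, 100    # group two: sample standard deviation, sample size
alpha = 0.05          # significance level

# Standard deviation of the difference of sample means under the null hypothesis.
se_diff = math.sqrt(s1**2 / n1 + s2**2 / n2)

# How many standard deviations the observed difference sits above zero.
z = observed_diff / se_diff

# One-tailed probability of a difference at least this large if the diet does nothing.
p_value = 1 - NormalDist().cdf(z)

reject_null = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject null: {reject_null}")
```

With these numbers the observed difference sits roughly three standard deviations above zero, so the tail probability is far below 5% and the decision matches the one reached in the transcript.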