# Two-sample t test for difference of means

AP.STATS: DAT‑3.G (LO), DAT‑3.G.1 (EK), DAT‑3.H (LO), DAT‑3.H.1 (EK), DAT‑3.H.2 (EK), VAR‑7 (EU), VAR‑7.F (LO), VAR‑7.F.1 (EK), VAR‑7.G (LO), VAR‑7.G.1 (EK), VAR‑7.I (LO), VAR‑7.I.1 (EK)

## Video transcript

Kaito grows tomatoes in two separate fields. When the tomatoes are ready to be picked, he is curious as to whether the sizes of his tomato plants differ between the two fields. He takes a random sample of plants from each field and measures the heights of the plants. Here is a summary of the results.

So what I want you to do is pause this video and conduct a two-sample t test here. Let's assume that all of the conditions for inference are met: the random condition, the normal condition, and the independence condition. And let's assume that we are working with a significance level of 0.05. So pause the video and conduct the two-sample t test to see whether there's evidence that the sizes of tomato plants differ between the fields.

All right, now let's work through this together. Like always, let's first construct our null hypothesis, and that's going to be the situation where there is no difference between the mean sizes. So that would be that the mean size in field A is equal to the mean size in field B.

Now what about our alternative hypothesis? Well, he wants to see whether the sizes of his tomato plants differ between the two fields. He's not saying whether A is bigger than B or whether B is bigger than A. And so his alternative hypothesis would be around his suspicion that the mean of A is not equal to the mean of B, that they differ.

To do this two-sample t test, we assume our null hypothesis, and remember, we're assuming that all of our conditions for inference are met. Then we want to calculate a t statistic based on the sample data that we have. Our t statistic is going to be equal to the difference between the sample means, all of that over our estimate of the standard deviation of the sampling distribution of the difference of the sample means. That estimate will be the square root of the sample standard deviation from sample A squared over the sample size from A, plus the sample standard deviation from sample B squared over the sample size from B.

And let's see, we have all the numbers here to calculate it. This numerator is going to be equal to 1.3 minus 1.6, all of that over the square root of, let's see, the sample standard deviation from field A is 0.5. If you square that, you're going to get 0.25, and then that's going to be over the sample size from field A, over 22. Plus 0.3 squared, so that is 0.09, all of that over the sample size from field B, over 24. The numerator is just going to be negative 0.3. Negative 0.3 divided by the square root of 0.25 divided by 22 plus 0.09 divided by 24 gets us approximately negative 2.44.

And so if you think about a t distribution, and we'll use our calculator to figure out this probability, this is a t distribution right over here, and this would be the assumed mean of our t distribution. We got a t statistic of negative 2.44, so we're right over here. And so we want to say: what is the probability, from this t distribution, of getting something at least this extreme? It would be this area, and it would also be this area, if we got 2.44 above the mean. So what I could do is use my calculator to figure out this probability right over here, and then I'm just going to multiply that by two to get this one as well.

So the probability of getting a t value whose absolute value is greater than or equal to 2.44 is going to be approximately equal to, well, I'm going to go to 2nd, distribution, and then to the cumulative distribution function for our t distribution. Since I want to think about this tail probability here, and I'm just going to multiply it by two, the lower bound is a very, very negative number; you can view that as functionally negative infinity. The upper bound is negative 2.44.

And now what's our degrees of freedom? Well, if we take the conservative approach, it'll be the smaller of the two sample sizes minus one. The smaller of the two sample sizes is 22, and 22 minus one is 21, so I put 21 in there. Now I can paste, and I get that number right over there. That just gives me the probability of getting something lower than negative 2.44, but I also want to think about the probability of getting something 2.44 or more above the mean of our t distribution. So I multiply it by two, and that is approximately 0.024.

What I want to do then is compare this to my significance level. This right over here is our p-value, and our p-value in this situation is clearly less than our significance level. Because of that, we say: assuming the null hypothesis is true, we got a result with a pretty low probability, below our threshold, so we are going to reject our null hypothesis. This suggests the alternative hypothesis, that there is indeed a difference between the sizes of the tomato plants in the two fields.
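
The calculation in the transcript can be checked with a short script. What follows is a minimal sketch, not the method used in the video: the function names (`t_pdf`, `two_sided_p`, `two_sample_t`) are made up for illustration, and the calculator's cumulative distribution function is replaced here by a Simpson's-rule integration of the t density so that only the Python standard library is needed. It uses the same conservative degrees of freedom as the transcript (the smaller sample size minus one).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=10_000):
    """Two-sided p-value P(|T| >= |t|), using Simpson's rule on [0, |t|].

    `steps` must be even for Simpson's rule.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3       # P(0 <= T <= |t|)
    return 2 * (0.5 - central)    # both tails, by symmetry


def two_sample_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t statistic, conservative df, two-sided p-value)."""
    # Estimated standard deviation of the difference of sample means.
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    t = (mean_a - mean_b) / se
    df = min(n_a, n_b) - 1        # conservative choice, as in the transcript
    return t, df, two_sided_p(t, df)


# Summary statistics quoted in the transcript (field A, then field B).
t, df, p = two_sample_t(1.3, 0.5, 22, 1.6, 0.3, 24)
print(f"t = {t:.2f}, df = {df}, p = {p:.3f}")
print("reject H0" if p < 0.05 else "fail to reject H0")
```

Run with the summary statistics from the transcript, this should agree, up to rounding, with the values worked out above: a t statistic of about negative 2.44 on 21 degrees of freedom and a two-sided p-value of roughly 0.024, which is below the 0.05 significance level.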
AP® is a registered trademark of the College Board, which has not reviewed this resource.