Confidence interval for a mean with paired data

Video transcript

A group of friends wondered how much faster they could snap their fingers on one hand versus the other hand. A very important question in life. Each person snapped their fingers with their dominant hand for 10 seconds and their non-dominant hand for 10 seconds, where if you're right-handed, your right hand would be your dominant hand, and if you're left-handed, your left hand would be your dominant hand. Each participant flipped a coin to determine which hand they would use first, because if you always used your dominant hand first, maybe you're tired by the time you're doing your non-dominant hand, or there's something else. So here it's random which one you use first.

Here are the data for how many snaps they performed with each hand, the difference for each participant, and summary statistics. This is actually real data from the Khan Academy content team. You see for each of the participants: Jeff, right over here, was able to do 44 snaps in 10 seconds on his dominant hand, which is impressive, more than I think I could do, and he was even able to do 35 on his non-dominant hand. So the difference here, the dominant hand minus the non-dominant, was 9. Then they tabulated this data for all five members.

Now, they also calculated summary statistics, but this is the really interesting thing right over here: the difference between the dominant and the non-dominant hand. For the mean difference, they took this row right over here and calculated the mean, which they got to be 6.8. Then they calculated the standard deviation of these differences, which they got to be approximately 1.64.

Then we are asked: create and interpret a 95% confidence interval for the mean difference in number of snaps for these participants. So pause this video and see if you can make some headway here; see if you can think about how to approach this.

So what's interesting here is we're not trying to construct a
confidence interval for just the mean number of snaps for the dominant hand or the mean number of snaps for the non-dominant hand. We're constructing a 95% confidence interval for a mean difference. Now, you might say, I have two different samples here, and then this third set of data is somehow constructed from these other two. But one way to think about it is that this is a matched pairs design. In a matched pairs design, for each member in your sample, you make them do both the control and the treatment. So, for example, you could view the control as how many snaps they can do with the dominant hand in 10 seconds, and the treatment as how many they can do with the non-dominant hand. In a matched pairs design you're really concerned about the difference, so you can really view this as just one sample of size five, for which you are calculating the difference for each member of that sample, and then the mean and standard deviation of those differences across the entire sample.

Now, before we calculate the confidence interval, let's remind ourselves of the conditions we like to think about when we are constructing confidence intervals.

The first condition is whether our sample is random. If we were trying to make some type of judgment about all human beings and their snapping ability, this would not be a random sample. These people all work at Khan Academy; maybe somehow in our interview process we select for people who snap particularly well. But whatever inferences we make, we can say, hey, this is roughly true about this group of friends.

The next condition is the normal condition. There are a couple of ways to think about it. If we had a sample size of 30 or larger, the central limit theorem says the sampling distribution of the sample means would be roughly normal. But obviously our sample size is much smaller than that. One way to think about it: we could
just plot our data points and see whether they seem to be skewed in any way. If we do a little dot plot right over here, with an axis marked 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, we have one data point where the difference was 9, one where the difference is 5, one where the difference is 8, one where the difference is 6, and another where the difference is 6. This doesn't look massively skewed in any way. Our mean difference, right over here, is about 6.8, and it looks roughly symmetric, so we can feel okay about the normal condition.

This isn't the best study that one could conduct. This is obviously a small sample size, and it's not a random sample of the entire population, but maybe we could go with it. Also, when you think about biological processes like how well someone snaps, which is a product of a lot of things happening in a human body, the sum of many, many processes, those things also tend to have a roughly normal distribution. I won't go into too much depth there. Once again, this isn't a super robust study, but it is a fun thing for friends to do when they have nothing else to do.

All right, now the third condition is independence, and this one we can actually feel pretty good about, because Jeff's difference really shouldn't impact David's difference, and David's difference really shouldn't impact Kim's difference, especially if they're not observing each other. Let's just say, for the sake of argument, that they did it all independently in a closed room with an independent observer, so they weren't trying to get competitive or something like that.

Needless to say, this isn't a super robust study, but we can still calculate a 95% confidence interval. So how do we do that? Well, we've done this so many times. Our confidence interval would be our sample mean, so the mean of our differences, plus or minus... now, we don't know the population
standard deviation, so we're going to use our sample standard deviation. Since we're using a sample standard deviation and this confidence interval is all about the mean, our critical value here is going to be based on a t-table, on the t statistic. We're going to multiply that by the sample standard deviation of the differences divided by the square root of our sample size, so divided by the square root of 5.

Now, we know most of this data here, so let me just write it down. We know the sample mean right over here is 6.8, so it's going to be 6.8 plus or minus... and now, what will be our critical value? Well, we want a 95% confidence interval. And what's our degrees of freedom? It's one less than our sample size, so our degrees of freedom right over here is equal to 4. So we're ready to use a t-table.

This is a truncated t-table that I could fit on my screen here. There's a couple of ways to think about it. Here they actually give us the confidence level, and the reason why that corresponds to a tail probability of 0.025 is that if you take the middle 95% of a distribution, you're going to have 2.5% on either side, and that's going to be your tail probability. That's all that's going on over there. So we're going to be in this column right over here. And which degrees of freedom do we use? It's going to be 4 degrees of freedom: our sample size is 5, and 5 minus 1 is 4. So this is going to be our critical value, 2.776.

So we have 2.776 as our critical value, times our sample standard deviation. The sample standard deviation for our differences right over here is 1.64, and then we divide that by the square root of our sample size. We wrote a 5 in there; sometimes I just write n there. So what is this going to be equal to? First let's calculate just the margin of error right over here. This is going to
be 2.776 times 1.64 divided by the square root of 5, and we get a margin of error of approximately 2.036. So this is going to be 6.8 plus or minus 2.036, approximately, where this is our margin of error.

If we actually wanted to write out the interval, we could just take 6.8 minus this and 6.8 plus that. So let's do that with the calculator: 6.8 minus 2.036 is equal to 4.764. So our confidence interval starts at 4.764, approximately, and it goes to, let's see, I can actually do this one in my head: if I add 2.036 to 6.8, that is going to be 8.836.

Now, how would we interpret this confidence interval? One way to interpret it is to say that we are 95% confident that this interval captures the true mean difference in snaps for these friends. We could also say that there appears to be a difference in the mean number of snaps, since zero is not captured in this interval. And since the entire interval is above zero, it seems that this group of friends at Khan Academy can snap faster with their dominant hands.
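
The calculation walked through in the video can be reproduced with a short Python sketch. This is not part of the original video; it is only an illustration. It uses the five differences read off in the transcript (9, 5, 8, 6, 6) and the critical value 2.776 taken from the t-table for 4 degrees of freedom. The variable names are made up for this example. One caveat: the code keeps the unrounded sample standard deviation, so its margin of error comes out near 2.04 rather than the 2.036 obtained in the video, where the standard deviation is rounded to 1.64 first.

```python
import math
import statistics

# Differences (dominant minus non-dominant) for the five participants,
# as read off in the transcript.
differences = [9, 5, 8, 6, 6]

n = len(differences)
mean_diff = statistics.mean(differences)   # sample mean of the differences
sd_diff = statistics.stdev(differences)    # sample standard deviation (divides by n - 1)

# Critical value for 95% confidence with n - 1 = 4 degrees of freedom.
# Assumption: hard-coded from the t-table shown in the video, not computed here.
T_CRITICAL = 2.776

margin_of_error = T_CRITICAL * sd_diff / math.sqrt(n)
lower = mean_diff - margin_of_error
upper = mean_diff + margin_of_error

print(f"mean difference: {mean_diff:.1f}")
print(f"sample standard deviation: {sd_diff:.2f}")
print(f"margin of error: {margin_of_error:.3f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

Because the whole interval sits above zero, the sketch leads to the same reading as the video: for this group of friends, the dominant hand appears to be the faster one.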