Making a t interval for paired data

AP.STATS: UNC‑4 (EU), UNC‑4.O (LO), UNC‑4.O.3 (EK), UNC‑4.R (LO), UNC‑4.R.1 (EK), UNC‑4.R.2 (EK)
In some studies, we make two observations on the same individual. For instance, we might look at each student's pre-test and post-test scores in a course. In other studies, we might make an observation on each of two similar individuals. For example, some medicine trials involve pairing similar subjects so one receives the medicine and the other receives a placebo.
In both types of studies, we're working with paired data, and whenever we're working with paired data, we're typically interested in the difference between each pair—for example, the difference between the pre-test and the post-test data, or the difference between the medicine and the placebo data.
If certain conditions are met, we can construct a t interval to estimate the mean of these differences and draw conclusions.
In this article, we'll be going through two examples of making a t interval for paired data. Importantly, you'll have a chance to work through the second example on your own to ensure you've picked up on the main ideas.

Example 1

A running magazine wanted to review two watches (watch A and watch B) that use global positioning systems (GPS) to calculate the distance someone runs. They noticed that the watches didn't usually agree on the distance someone traveled in a given run.
The magazine took a random sample of 5 subscribers and asked them to run a 10 kilometer route wearing both watches at the same time (they all agreed to participate). At the end of their runs, the participants recorded the distance each watch said they traveled. Here are the data (all distances are in kilometers):
| Runner | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Watch A | 9.8 | 9.8 | 10.1 | 10.1 | 10.2 |
| Watch B | 10.1 | 10.0 | 10.2 | 9.9 | 10.1 |

Construct a 95% confidence interval to estimate the mean difference in the distances reported by these watches. Does the interval suggest that there is a difference between the two watches?

Step 1: Calculate the differences

Even though it appears we have two sets of data (watch A and watch B), these data didn't come from two independent samples. The magazine took a single sample of 5 runners, and each runner wore both watches, so this is a matched pairs design. The one set of data we're interested in is the difference between watch A and watch B for each runner. Let's define this variable as $\text{difference} = \text{B} - \text{A}$ and calculate the difference for each runner:

| Runner | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Watch A | 9.8 | 9.8 | 10.1 | 10.1 | 10.2 |
| Watch B | 10.1 | 10.0 | 10.2 | 9.9 | 10.1 |
| Difference (B − A) | 0.3 | 0.2 | 0.1 | −0.2 | −0.1 |

Key idea: When dealing with paired data, we're most interested in the distribution of the differences.
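
As a minimal sketch of this step (the variable names are illustrative, not from the article), the differences can be computed pair by pair in Python. Rounding to one decimal place is an assumption made here only to hide floating-point noise, since the watch readings are reported to one decimal.

```python
# Distances (km) reported by each watch for runners 1 through 5.
watch_a = [9.8, 9.8, 10.1, 10.1, 10.2]
watch_b = [10.1, 10.0, 10.2, 9.9, 10.1]

# One difference per runner, defined as B - A to match the article.
differences = [round(b - a, 1) for a, b in zip(watch_a, watch_b)]

print(differences)  # [0.3, 0.2, 0.1, -0.2, -0.1]
```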

Step 2: Check conditions

We want to use these $n = 5$ differences to construct a confidence interval for the mean difference. Since we don't know the population standard deviation of the differences, we'll have to use the sample standard deviation in its place. This makes it appropriate to use a t interval instead of a z interval to estimate the mean difference. Let's check the conditions for making a t interval.
  • Random: The magazine took a random sample of their subscribers.
  • Normal: Since our sample of $n = 5$ runners is small, we need to plot the data. The differences are roughly symmetric with no outliers, so it should be safe to proceed.
  • Independent: It's reasonable to assume independence between each runner's measurements. They were randomly selected, and they shouldn't influence each other's results.
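
The Normal condition asks us to plot the differences. As a rough sketch, a text dot plot is enough to eyeball symmetry and outliers in a sample this small. The helper name `text_dotplot` and the 0.1 km bin width are assumptions chosen for illustration; any plotting tool would do.

```python
from collections import Counter

differences = [0.3, 0.2, 0.1, -0.2, -0.1]  # B - A, in km

def text_dotplot(values, step=0.1):
    """Return a text dot plot with one row per bin of width `step`."""
    # Convert each value to an integer bin index so float noise cannot split bins.
    counts = Counter(round(v / step) for v in values)
    rows = []
    for k in range(min(counts), max(counts) + 1):
        rows.append(f"{k * step:+.1f} | " + "*" * counts.get(k, 0))
    return "\n".join(rows)

plot = text_dotplot(differences)
print(plot)
```

Each asterisk is one runner. A plot like this only supports a visual judgment; it is not a formal test of normality.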

Step 3: Construct the interval

Here are the data:
| Runner | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Watch A | 9.8 | 9.8 | 10.1 | 10.1 | 10.2 |
| Watch B | 10.1 | 10.0 | 10.2 | 9.9 | 10.1 |
| Difference (B − A) | 0.3 | 0.2 | 0.1 | −0.2 | −0.1 |

Here are the summary statistics:

| | Mean | Standard deviation |
|---|---|---|
| Watch A | $\bar x_{\text{A}} = 10.00$ | $s_{\text{A}} \approx 0.19$ |
| Watch B | $\bar x_{\text{B}} = 10.06$ | $s_{\text{B}} \approx 0.11$ |
| Difference (B − A) | $\bar x_{\text{Diff}} = 0.06$ | $s_{\text{Diff}} \approx 0.21$ |

Since we want to construct a confidence interval for the mean difference, we only need the summary statistics for the differences.
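
For readers wondering where the summary statistics for the differences come from, here is a small sketch using Python's standard `statistics` module. The variable names are illustrative; `stdev` computes the sample standard deviation, which divides by $n - 1$.

```python
from statistics import mean, stdev

differences = [0.3, 0.2, 0.1, -0.2, -0.1]  # B - A, in km

mean_diff = mean(differences)   # sample mean of the differences
sd_diff = stdev(differences)    # sample standard deviation (n - 1 divisor)

print(round(mean_diff, 2), round(sd_diff, 2))  # 0.06 0.21
```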
We'll use the formula for a one-sample t interval for a mean:
$$
\begin{aligned}
(\text{statistic}) &\pm \left({\text{critical}\atop\text{value}}\right)\left({\text{standard deviation} \atop\text{of statistic}}\right) \\\\
\bar x_{\text{Diff}} &\pm\ t^{*} \cdot \dfrac{s_{\text{Diff}}}{\sqrt n}
\end{aligned}
$$
Components of formula:
Our statistic is the sample mean $\bar x_{\text{Diff}} = 0.06\text{ km}$.
Our sample size is $n = 5$ runners.
Our sample standard deviation is $s_{\text{Diff}} = 0.21\text{ km}$.
Our degrees of freedom are $\text{df} = 5 - 1 = 4$, so for 95% confidence our critical value is $t^{*} = 2.776$.
Computations:
$$
\begin{aligned}
\bar x_{\text{Diff}} &\pm\ t^{*} \cdot \dfrac{s_{\text{Diff}}}{\sqrt n} \\\\
0.06 &\pm 2.776 \cdot \dfrac{0.21}{\sqrt{5}} \\\\
0.06 &\pm (2.776)(0.094) \\\\
0.06 &\pm 0.261 \\\\
0.06 &- 0.261 = -0.201 \\\\
0.06 &+ 0.261 = 0.321
\end{aligned}
$$

Interval $\approx (-0.20, 0.32)$
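
The same arithmetic can be sketched in Python. The function name `paired_t_interval` is hypothetical, and the critical value $t^{*} = 2.776$ is passed in from the article rather than computed, so the sketch needs only the standard library.

```python
from math import sqrt

def paired_t_interval(mean_diff, sd_diff, n, t_star):
    """One-sample t interval applied to the paired differences."""
    margin = t_star * sd_diff / sqrt(n)  # critical value times standard error
    return mean_diff - margin, mean_diff + margin

# Rounded summary statistics and critical value from the article (df = 4, 95%).
lower, upper = paired_t_interval(mean_diff=0.06, sd_diff=0.21, n=5, t_star=2.776)

print(f"({lower:.2f}, {upper:.2f})")  # (-0.20, 0.32)
```

Using the rounded summary statistics keeps the result in line with the hand computation; feeding in unrounded values would shift the endpoints only slightly.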

Step 4: Interpret the interval

Does the interval suggest that there is a difference between the two watches?
We're 95% confident that the interval $(-0.20, 0.32)$ captures the mean difference between the distances (in kilometers) reported by the watches on this sort of run. Notice that the interval contains $0\text{ km}$, which represents no difference, so it's plausible that there is no difference between the distances reported by Watch A and Watch B.
If the entire interval had been above 0 (all positive values), or if it had been entirely below 0 (all negative values), then it would have suggested a difference between the two watches.
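
The decision rule described here can be written as a short check. This is only a sketch; the function name `interpret_interval` and the wording of its messages are made up for illustration.

```python
def interpret_interval(lower, upper):
    """Describe what a confidence interval for a mean difference suggests."""
    if lower > 0:
        return "entirely above 0: suggests a positive mean difference"
    if upper < 0:
        return "entirely below 0: suggests a negative mean difference"
    return "contains 0: no difference is plausible"

print(interpret_interval(-0.20, 0.32))  # contains 0: no difference is plausible
```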

Example 2—Try it!

An educational website offers a practice program for the Law School Admission Test (LSAT). Users of the program take a pretest and posttest. Here are the scores and gains for a random sample of 6 users:
| User | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Pre | 140 | 152 | 153 | 159 | 150 | 146 |
| Post | 150 | 159 | 170 | 164 | 148 | 166 |
| Gain (post − pre) | 10 | 7 | 17 | 5 | −2 | 20 |

Here are summary statistics:

| | Mean | Standard deviation |
|---|---|---|
| Pre | $\bar x_{\text{pre}} = 150$ | $s_{\text{pre}} \approx 6.48$ |
| Post | $\bar x_{\text{post}} = 159.5$ | $s_{\text{post}} \approx 8.89$ |
| Gain (post − pre) | $\bar x_{\text{gain}} = 9.5$ | $s_{\text{gain}} \approx 8.07$ |

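
If you'd like to confirm the gain row and its summary statistics before attempting the problems, a short sketch along these lines should work. The variable names are illustrative, and the interval itself is deliberately left for you to compute.

```python
from statistics import mean, stdev

pre = [140, 152, 153, 159, 150, 146]
post = [150, 159, 170, 164, 148, 166]

# One gain per user, defined as post - pre to match the table.
gains = [after - before for before, after in zip(pre, post)]

mean_gain = mean(gains)
sd_gain = stdev(gains)  # sample standard deviation (n - 1 divisor)

print(gains)                          # [10, 7, 17, 5, -2, 20]
print(mean_gain, round(sd_gain, 2))   # 9.5 8.07
```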
Problem A (Example 2)
Based on this sample, which is a 95% confidence interval for the mean gain for users of this program?

Problem B (Example 2)
Is it plausible that users of this program have no mean gain?

The makers of the website say that this interval provides strong evidence that using their program will cause an increase in a user's LSAT score.
Problem C (Example 2)
Is this a valid conclusion?

Want to join the conversation?

  • Makara.p
    What is the difference between z* and t*? When to use z* and when to use t*?
    (4 votes)
  • ahmedmetiaz
    "Since there was no control group for comparison we shouldn't say there's a causal relationship here" I didn't understand this explanation
    (1 vote)
    • MFogleman
      The only way to tell if something causes something else (a causal relationship) is with a controlled experiment, where you can make sure that only one variable changes. In this example, we did an observational study instead of an experiment, so we don't know for sure if the program causes better performance. There could be a lurking variable in the mix, so we would need to assign treatments and perform an experiment to know more.

      A control group would be a group that doesn't use the program at all between the pretest and post test, so you can compare the performance between both groups and know that the only change was whether someone used the program or not. You would also need to take some more steps to make sure that this experiment is ironclad, which would let you do more with the results. I would recommend checking out the videos on experimental design, observational studies, and surveys, just to get a bit more depth.
      Hope this helps.
      (13 votes)
  • Hieu Le
    How did you find the standard deviation in the table in step 2?
    (5 votes)
    • deka
      Using the familiar general formula for the sample standard deviation:
      $s = \sqrt{\dfrac{\sum_i (x_i - \bar x)^2}{n - 1}}$
      This measures how much the data points typically vary from their mean.

      By the way, you can't use the formula for the standard deviation of the difference of two independent random variables, $\sqrt{s_1^2 + s_2^2}$, here, because the two data sets (pre-test and post-test) are not independent; they are paired. The standard deviation of their differences may therefore be less than what that formula gives for two independent data sets, and here it indeed is.

      Please work through the numbers yourself. It will help you grasp the concepts of variance, standard deviation, and variables more clearly.
      (1 vote)
  • pfarheen
    You made a typo. You wrote "here are the data" instead of "here is the date." It's at Step 3.
    (0 votes)
  • jk3109
    In my opinion, Step 2 of Example 1 shouldn't be considered independent because I don't believe they "shouldn't influence each other's results". They are running at the same time so isn't it likely they are pressured to keep up at the group pace and thus influenced by each other? If this type of question were on the AP exam, should I state this opinion or assume they aren't going to be influenced and go through with the calculations?
    (1 vote)
    • Jesse Johnson
      I hope this helps,

      It states that they are wearing both watches at the same time, not that they are running in a group together at the same time. Since the runners were randomly selected, and the runners according to the study were not running together, they weren't able to reasonably influence each other's results.
      (4 votes)
  • phillicia makanatleng
    How were the standard deviations in Example 1 calculated?
    (2 votes)
  • B H
    How did you calculate the standard deviation of the difference between means? You show 8.07; I get 4.49 = [6.48/SQRT(6) + 8.89/SQRT(6)]. Thanks.

    (1 vote)
  • Mike Papadakis
    Is it okay to use the difference watch A − watch B? This results in negative numbers, but if one keeps in mind we talk about the difference in time intervals, no problem should come up. Isn't that right? (Example 1)
    (1 vote)
  • shadiaroberts.srr
    How did he get the standard deviation if it wasn't given?
    (1 vote)
  • kartikeya.gupta4
    For Problem B (Example 2), should the correct answer be "Yes, since we are only 95% confident of the interval"?
    (0 votes)
    • Najib Bouhout
      Yes, but we can't attribute the gain to practice on the website. We need the scores of those who don't practice to compare with. An experimental design with a control group would ensure that the gain is indeed caused by practice on the website.
      So, the confidence interval tells us that the true value for mean gain for those who practice lies within that interval, but not that it is caused by practice on the website. I hope this helps.
      (1 vote)