Simulation showing value of t statistic

See why we use t statistics when building confidence intervals for a mean using the sample standard deviation in place of the population standard deviation.

Video transcript

- In a previous video we talked about trying to estimate a population mean with a sample mean and then constructing a confidence interval about that sample mean. And we talked about different scenarios. We could use a z-table plus the true population standard deviation, and that actually would construct pretty valid confidence intervals, but the problem is you usually don't know the population standard deviation. And so you might try to use a z-table to find your critical values plus the sample standard deviation, but what we talked about is that this doesn't actually do a good job of calculating our confidence intervals, and we're gonna see that experimentally in a few seconds. And so instead we have something called a t-statistic, where if we want our critical value we use a t-table instead of a z-table. And then we use that in conjunction with our sample standard deviation, and then all of a sudden we are actually going to have pretty good confidence intervals.

To make this a little bit more real, let's look at a simulation. So this is a scratch pad on Khan Academy, made by Khan Academy user Charlotte Allen. And the whole point there is to see what our confidence intervals look like with these different scenarios. So let's say we have a true population mean of 2.0; let's say it's the mean number of apples people eat a day. The true population mean is two. That seems high, but maybe it's in a certain country that has a lot of apples. And let's say that we know that the population standard deviation is 0.5. And we're going to create confidence intervals with a goal of having a 95 percent confidence level, and we're gonna take samples of size 12.

So first, we can construct our confidence intervals using z and sigma, which is a legitimate way to do it. And so let's just draw a bunch of samples here. And so we do see that the hit rate looks like it is roughly 95 percent.
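To keep the three scenarios straight, here is one way they might be written out. The notation is an assumption and does not come from the video: x-bar is the sample mean, s is the sample standard deviation, n is the sample size, and z* and t* are the critical values read from a z-table and a t-table for the chosen confidence level.

```latex
% Scenario 1: z critical value with the true population standard deviation
\bar{x} \pm z^{*} \frac{\sigma}{\sqrt{n}}

% Scenario 2: z critical value with the sample standard deviation
\bar{x} \pm z^{*} \frac{s}{\sqrt{n}}

% Scenario 3: t critical value (n - 1 degrees of freedom) with the sample standard deviation
\bar{x} \pm t^{*} \frac{s}{\sqrt{n}}

% With the numbers in this simulation (sigma = 0.5, n = 12, 95 percent level, z^* \approx 1.96),
% the margin of error in scenario 1 is the same for every sample:
1.96 \cdot \frac{0.5}{\sqrt{12}} \approx 0.283
```

In scenarios 2 and 3 the margin changes from sample to sample because s does. For 11 degrees of freedom the standard table value is t* of about 2.201, which is larger than 1.96, so the t intervals come out wider than the z intervals built from the same s.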
As we keep drawing these samples and constructing these confidence intervals, about 95 percent of the time the intervals contain our true population mean. So these look like good confidence intervals, but what we've talked about is that normally when you're doing this type of thing, this type of inferential statistics, you don't know the population standard deviation. You don't know sigma. So instead, you might be tempted to use z with the sample standard deviation.

But if you look at that for these exact same samples we just drew, notice: now that we've done this over and over again, 625 times, in this scenario where we keep calculating the confidence intervals with z and s, the true population mean is contained in the intervals only 92.2 percent of the time. Now we could keep going. So we have a lower hit rate than we would hope for, lower than what we got when we were actually using z and sigma.

Now what's neat is, if we use t, use a t-table, notice, the hit rate gets much closer to 95 percent, and this is neat! Because with a t-table, and something we can actually get from the sample, the sample standard deviation, we're actually able to have a hit rate pretty close to what we would have if we actually knew the population standard deviation. So that's the value of t and t-statistics, and we're going to give more and more examples, including using a t-table, in future videos.
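For anyone who wants to try the experiment away from the scratch pad, here is a minimal sketch in Python. It is not Charlotte Allen's program. The function name `coverage`, the seed, and the number of samples are made up for illustration, and the sketch assumes the population is normally distributed, which the video does not state. The critical values 1.960 and 2.201 are the standard table values for a 95 percent level (z, and t with 11 degrees of freedom).

```python
import random
import statistics

# Values taken from the transcript.
MU = 2.0      # true population mean (apples per day)
SIGMA = 0.5   # population standard deviation
N = 12        # sample size

# Two-sided 95 percent critical values from standard tables.
Z_STAR = 1.960  # standard normal
T_STAR = 2.201  # t distribution with N - 1 = 11 degrees of freedom


def coverage(num_samples=20000, seed=0):
    """Return, for each method, the fraction of intervals that contain MU."""
    rng = random.Random(seed)
    hits = {"z with sigma": 0, "z with s": 0, "t with s": 0}
    for _ in range(num_samples):
        # Assumption: a normal population; the transcript does not say.
        sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
        x_bar = statistics.mean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
        margins = {
            "z with sigma": Z_STAR * SIGMA / N ** 0.5,
            "z with s": Z_STAR * s / N ** 0.5,
            "t with s": T_STAR * s / N ** 0.5,
        }
        # An interval x_bar +/- margin contains MU when |x_bar - MU| <= margin.
        for method, margin in margins.items():
            if abs(x_bar - MU) <= margin:
                hits[method] += 1
    return {method: count / num_samples for method, count in hits.items()}


if __name__ == "__main__":
    for method, rate in coverage().items():
        print(f"{method}: {rate:.3f}")
```

Under these assumptions the "z with sigma" and "t with s" hit rates should come out close to 95 percent, while "z with s" should fall short, in the neighborhood of the 92 percent seen in the video. Because all three methods are applied to the same samples and the t critical value is the larger one, the "t with s" rate can never be below the "z with s" rate in this sketch.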