Introduction to t statistics

An introduction to why we use t statistics.

Want to join the conversation?

  • Sandhya S.
    What is the difference between a statistic and a parameter? Please explain like you would to someone who barely knew anything about statistics.
    (8 votes)
    • Alex Nagirny
      We use a 'statistic' to approximately estimate a 'parameter'.

      Let's say we want to know what percent of the entire male population of the USA (or any other country) goes jogging in the morning. This percent is called a 'parameter'.
      Can we really survey and analyze every male in the USA?
      Well, maybe we can, but it would be too costly to do so in terms of time, money, or human rights infringements on those who don't want to share what they do in the morning.

      So, in practice, we just randomly select some men from all over the country and count what percent of them run in the morning. This percent is called a 'statistic', which approximately estimates the 'parameter'.
      (26 votes)
  • chris
    This explanation of the distinction seems really confusing. If the population is Bernoulli distributed then the population proportion and population mean are the same thing! And yet we can estimate one with a Z-stat but the other needs a T-stat?

    Also, when Sal calculates confidence intervals for the sample mean he uses the sample variance, which is presumably Bessel corrected and therefore less biased. But when he calculates the intervals for the sample proportion there's no Bessel correction!

    Again, the population proportion and population mean are the same for a Bernoulli distribution. And the sample proportion and sample mean are also the same. Yet, when calculating confidence intervals, why do we use Z-stats for one and T-stats for the other? Why do we use Bessel's correction for one, and not for the other?

    Finally, why is there no mention of the sample size? I thought that small n is the determining factor for when to use T-stats instead of Z-stats.
    (13 votes)
  • ju lee
    Why don't we use a t score when calculating p hat (the sample proportion)?
    (11 votes)
    • szechun33
      When calculating phat, we know sigma. However, now we don't, as mentioned in the video, so we use a thing called a t score.

      EDIT:
      Sorry for my original unclear answer. Looking at the Edexcel S3 and S4 manuals, I am pleased to confirm that JW and chris are correct. When n is large (>30 for IAL), the Student's t-distribution tends toward a normal distribution. Also remember that the t- and z-statistics are basically the same thing (s^2 is an unbiased estimate of \sigma^2), and the difference is that in one case s (the sample standard deviation) is also an r.v., and in the other it's not, because of extra data given. So which one to use ultimately depends on whether you want to make the approximation that s==\sigma (which is reasonably accurate when n>30).

      PS this vid is an intro to t-score so presumably he wants to connect the z- and t-scores first.
      (3 votes)
  • Aditya Roongta
    Why is the expression described as 'not so good'? Where can I read the math behind calculating z and t?
    (8 votes)
  • Zev Oster
    What is the difference between z and t that fixes the problem?
    (7 votes)
  • NerdLord28
    Sal claims that using z* to build the confidence interval for a sample mean actually leads to an interval that is too narrow (an underestimate). Why is that?
    (4 votes)
    • John Bennett
      The actual sampling distribution of means doesn't really follow a normal distribution (which is what z is based on). The sampling distribution of means has more "extreme" values than the normal distribution does, particularly when you use small samples to estimate the mean. This means more of the samples will have means further from the population mean than they would if the sampling distribution were normal. So the confidence interval is narrower than it should be, and the intervals don't contain the parameter the "correct" proportion of the time. The t-distribution accounts for these "fatter tails".
      (2 votes)
  • Bastian Widanski
    here I'm a bit stuck...
    p is the proportion of something in the population.
    p_hat is the proportion of the same parameter in the sample we take. So to speak our statistic.
    So isn't p "just" the population_mean (of the something) / N?
    And isn't p_hat the sample proportion: p_hat = sample_mean / n ?
    All this by definition?

    What am I missing?
    Is the X_mean we are searching for the mean of the sampling distribution of the sample means? But wouldn't that be mu = p.... so back to the beginning of my question...
    And if it is the real mean value, so not the proportion wouldn't it be just p_hat * n ?
    So if we have a mean, but not the proportion, then why can't we just do mean / n to get the p_hat. And from here go the old way with p*(1-p)... ?
    (4 votes)
  • kc331155
    Why does using the sample standard deviation lead to an underestimate?
    It's the square root of the sample variance, right? And the sample variance is divided by "n-1" rather than "n", so it seems to have a larger value. Why doesn't it lead to an overestimate?
    (2 votes)
    • John Bennett
      It's not about the sample standard deviation (the standard error); it's about the shape of the sampling distribution (all the possible means for a particular sample size). This distribution is not a normal distribution, particularly if you have small samples. It actually follows what's called Student's t-distribution. This distribution has "fatter tails" (i.e., more values that are far from the mean) than the normal distribution, and this is what causes the underestimation.
      (3 votes)
  • 24tinat
    Where does sigma over square root of n come from? Why and how did we put it there?
    (2 votes)
    • Ian Pulizzotto
      Interesting question! In this discussion, we use theoretical (or population) standard deviation and variance.

      To derive this, we use the following properties:
      1) The variance of a sum of independent random variables is the sum of their variances.
      2) When a random variable is multiplied by a factor that doesn't depend on the random variable, the variance is multiplied by the square of this factor.
      3) The standard deviation is the (non-negative) square root of the variance, and so the variance is the square of the standard deviation.

      Let the random variables X_1, X_2, X_3, ... , X_n represent a random sample of n data values, each of which has standard deviation sigma >= 0 (and therefore variance sigma^2). If we assume the population is very large, then it's reasonable to call these n random variables independent.

      The sample mean is (X_1 + X_2 + X_3 + ... + X_n)/n, and the standard error of the mean is the standard deviation of the sample mean. Therefore, the standard error of the mean is

      standard deviation[(X_1 + X_2 + X_3 + ... + X_n)/n]
      = sqrt{variance[(X_1 + X_2 + X_3 + ... + X_n)/n]}
      = sqrt[variance(X_1 + X_2 + X_3 + ... + X_n)/(n^2)]
      = sqrt{[variance(X_1) + variance(X_2) + variance(X_3) + ... + variance(X_n)]/(n^2)}
      = sqrt[n sigma^2 / (n^2)]
      = sqrt(sigma^2 / n)
      = sigma/sqrt(n).

      Have a blessed, wonderful day!
      (3 votes)
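
The derivation above can be checked numerically. The following is a rough simulation sketch, not part of the original answer: it assumes a normal population with sigma = 2 and samples of size n = 25 (all hypothetical values), draws many samples, and compares the spread of the sample means with sigma/sqrt(n).

```python
import random
from math import sqrt
from statistics import fmean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

sigma = 2.0     # assumed population standard deviation
n = 25          # assumed sample size
trials = 20000  # number of simulated samples

# Draw many independent samples and record each sample mean.
sample_means = [
    fmean([rng.gauss(0.0, sigma) for _ in range(n)])
    for _ in range(trials)
]

# Standard deviation of the simulated sample means (the standard error).
empirical_se = pstdev(sample_means)
theoretical_se = sigma / sqrt(n)

print(f"simulated: {empirical_se:.4f}, sigma/sqrt(n): {theoretical_se:.4f}")
```

The two printed values should be close to each other, with the gap shrinking as the number of simulated samples grows.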
  • rakesh11aggarwal94
    Hey,
    can anyone explain the difference between the true population proportion and the true population mean?
    ...
    I am a bit confused.
    (3 votes)

Video transcript

- [Instructor] We have already seen a situation multiple times where there is some parameter associated with a population. Maybe it's the proportion of a population that supports a candidate, or maybe it's the mean of a population, such as the mean height of all the people in a city. And we've determined that it's impractical, or there's no way, for us to know the true population parameter. But we can try to estimate it by taking a sample of size n and then calculating a statistic based on that sample. We've also seen that not only can we calculate the statistic, which is trying to estimate this parameter, but we can also construct a confidence interval about that statistic based on some confidence level. That confidence interval would look something like this: the value of the statistic that we have just calculated, plus or minus some margin of error. The margin of error starts with a critical value, which we'll often call z star, based on the number of standard deviations we want to go above and below that statistic. Then we multiply that by the standard deviation of the sampling distribution for that statistic.
Now, what we'll see is that we often don't know this standard deviation. To know it, you oftentimes even need to know the parameter itself. For example, take the situation where the parameter that we're trying to estimate and construct confidence intervals for is the population proportion: what percentage of the population supports a certain candidate? In that world, the statistic is the sample proportion. So we would have the sample proportion plus or minus z star times the standard deviation of the sampling distribution. Well, we can't calculate this unless we know the population proportion, so instead we estimate it with the standard error of the statistic, which, in this case, is the square root of p hat times one minus p hat, over n: the sample proportion times one minus the sample proportion, divided by our sample size, all under a square root.
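
As a minimal sketch of the proportion interval described above, the following Python computes the sample proportion plus or minus z star times the standard error. The poll numbers (56 supporters out of 100) and the 95% confidence level are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical poll: 56 of 100 sampled voters support the candidate.
supporters = 56
n = 100
confidence = 0.95  # assumed confidence level

p_hat = supporters / n

# Standard error: estimates the standard deviation of the sampling
# distribution by using p_hat in place of the unknown population proportion.
standard_error = sqrt(p_hat * (1 - p_hat) / n)

# Critical value z*: leaves (1 - confidence) / 2 in each tail of the normal.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin_of_error = z_star * standard_error
lower, upper = p_hat - margin_of_error, p_hat + margin_of_error
print(f"p_hat = {p_hat:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

Note that the square root covers the whole fraction, which is easy to lose when the formula is only spoken aloud.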
If the parameter we're trying to estimate is the population mean, then our statistic is going to be the sample mean. So, in that scenario, we're going to be looking at our sample mean plus or minus z star times the standard deviation of the sampling distribution of the sample mean. Now, if we knew the standard deviation of this population, we would know what that is: it would be equal to the standard deviation of our population divided by the square root of our sample size. But we often will not know this. In fact, it's very unusual to know this. And so, sometimes, you'll say, okay, if we don't know this, let's just figure out the sample standard deviation of our sample here. So, instead, we'll take our sample mean plus or minus z star times the sample standard deviation, which we can calculate, divided by the square root of n. Now, this might seem pretty good if we're trying to construct a confidence interval for our mean, but it turns out that this is not so good, because this right over here is going to actually underestimate the true margin of error you need for your confidence level. And so, that's why statisticians have invented another statistic. Instead of using z, they call it t, and instead of using a z-table, they use a t-table. We're going to see this in future videos. And so, if you are actually trying to construct a confidence interval for a population mean, and you don't know the true standard deviation of your population, which is normally the case, instead of doing this, we're going to take our sample mean plus or minus our critical value, which we'll call t star, times our sample standard deviation, which we can calculate, divided by the square root of n. And so, the real, functional difference is that this is going to give us a confidence interval that actually has the level of confidence that we want.
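
To make the difference between the two margins of error concrete, here is a small sketch that computes both for the same data. The five measurements are hypothetical, and the t star value 2.776 is the 95% critical value for 4 degrees of freedom as read from a standard t-table rather than computed here.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

# Hypothetical sample of n = 5 measurements.
sample = [5.1, 4.9, 5.6, 5.8, 5.0]
n = len(sample)

x_bar = fmean(sample)
s = stdev(sample)  # sample standard deviation (divides by n - 1)
standard_error = s / sqrt(n)

# 95% critical values: z* from the normal distribution, t* from a t-table
# with n - 1 = 4 degrees of freedom.
z_star = NormalDist().inv_cdf(0.975)
t_star = 2.776

z_margin = z_star * standard_error
t_margin = t_star * standard_error

print(f"z interval: {x_bar:.3f} +/- {z_margin:.3f}")
print(f"t interval: {x_bar:.3f} +/- {t_margin:.3f}")
```

With a sample this small, the t-based margin comes out noticeably wider than the z-based one, which is the correction the transcript describes.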
If we want a 95% level of confidence, and we keep computing this over and over again for multiple samples, then roughly 95% of the time this interval will contain our true population mean. And to functionally do it, as we'll do in future videos, you really just have to look up a t-table instead of a z-table.
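
The "over and over again" claim can be explored with a rough simulation. This sketch assumes a normal population with mean 10 and standard deviation 2 and samples of size n = 5 (all hypothetical values), and it again takes t star = 2.776 for 4 degrees of freedom from a t-table. It counts how often each kind of interval contains the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev


def coverage(critical_value, n=5, trials=20000, mu=10.0, sigma=2.0, seed=0):
    """Fraction of simulated intervals x_bar +/- critical_value * s / sqrt(n)
    that contain the true population mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        margin = critical_value * stdev(sample) / sqrt(n)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


z_star = NormalDist().inv_cdf(0.975)  # 95% critical value from the normal
t_star = 2.776  # 95% critical value for 4 degrees of freedom (t-table)

z_coverage = coverage(z_star)
t_coverage = coverage(t_star)
print(f"coverage with z*: {z_coverage:.3f}")
print(f"coverage with t*: {t_coverage:.3f}")
```

Under these assumptions the z-based intervals should capture the true mean clearly less often than 95% of the time, while the t-based intervals should land close to 95%.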