# T-statistic confidence interval

Sal computes a confidence interval for the emission from an engine with a new design. Created by Sal Khan.

## Want to join the conversation?

• Hello Sal. I looked up a t-distribution table and found that the degrees of freedom went up to 120. Why would we need that many when we only use the t-distribution when n < 30?
• When do we use sigma and when do we use s?
• Sigma is the standard deviation of a population, and s is the standard deviation of a sample. My tip for remembering it is that the population is unknown and mysterious but the sample is very clear data, so that's why we use mysterious Greek letters like mu and sigma to describe population parameters but familiar Latin letters like x-bar and s to describe sample statistics.
• I may have missed this somewhere and a site search didn't seem to find it: where might the t statistic videos be? Thanks.
• I couldn't find any video specifically describing this way of doing a t statistic either. But I guess he means the videos about the t statistic in general, like "Introduction to t statistics" and stuff.

The formula is basically the same, just written in another way. The formula we were given in the videos to get your confidence interval is:
x_bar ± t* · sigma/root(n)
Using this you can conclude that:
x_bar - t* · sigma/root(n) < mu < x_bar + t* · sigma/root(n)

Subtract x_bar from all three parts:
=> -t* · sigma/root(n) < mu - x_bar < t* · sigma/root(n)

Divide all three parts by sigma/root(n):
=> -t* < (mu - x_bar)/(sigma/root(n)) < t*

Multiply all three parts by -1, which flips the inequality signs:
=> t* > (x_bar - mu)/(sigma/root(n)) > -t*
<=> -t* < (x_bar - mu)/(sigma/root(n)) < t*

And here you have the formula he used in this video.
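
The rearrangement above can be checked numerically. Here is a minimal Python sketch using the sample mean (17.17) and sample standard deviation (2.98) mentioned in this thread, with s standing in for sigma. The sample size n = 10 and the table value t* = 2.262 (two-sided 95%, 9 degrees of freedom) are assumptions that are not stated anywhere in the thread, and the function names are made up for illustration.

```python
import math

def t_confidence_interval(x_bar, s, n, t_star):
    """Return (low, high) for x_bar ± t* · s/sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def t_statistic(x_bar, mu, s, n):
    """Return (x_bar - mu) / (s/sqrt(n))."""
    return (x_bar - mu) / (s / math.sqrt(n))

x_bar, s = 17.17, 2.98   # values quoted in this thread
n, t_star = 10, 2.262    # ASSUMED: sample size and t-table value for 9 df

low, high = t_confidence_interval(x_bar, s, n, t_star)
print(f"95% CI: ({low:.2f}, {high:.2f})")

# Plugging the interval endpoints in for mu gives a t statistic of
# -t* at the upper endpoint and +t* at the lower endpoint, which is
# the rearranged inequality -t* < (x_bar - mu)/(s/sqrt(n)) < t*.
t_at_high = t_statistic(x_bar, high, s, n)
t_at_low = t_statistic(x_bar, low, s, n)
print(t_at_high, t_at_low)
```

Under those assumptions the interval comes out to roughly 15.04 to 19.30, which agrees with the endpoints quoted elsewhere in this thread.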
• Sort of like Katoriak's question: why do you use the degrees of freedom for anything? I'm not making an intuitive connection.

Mattson's answer makes sense... but why do we replace 'n' with the dof?
• You use (n-1) degrees of freedom because all the values leading up to the last one can be anything, but the last one must fit in just right so that everything adds up to the value on the other side of the equals sign. Let's say I have one hundred toys and 10 buckets. I am free to choose how many toys go in only 9 of the buckets (10 buckets - 1 bucket = 9 buckets). Whether or not those buckets hold equal numbers of toys, the last bucket must bring the total number of toys to 100. So I can put ten toys in every bucket (10 toys × 10 buckets = 100 toys), or 99 toys in the first bucket and zero toys in the middle buckets, but then the last bucket must hold 1, because 99 toys + 1 toy = 100 toys.
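
The bucket picture can be made concrete with a small Python sketch. The helper name and the sample data are hypothetical, chosen only for illustration. The idea it shows: once the total (or, equivalently, the mean) is fixed, only n - 1 values are free and the last one is forced. For a sample, the same constraint shows up as the deviations from the sample mean always summing to zero.

```python
def last_value_given_mean(free_values, mean, n):
    """Given n-1 freely chosen values and a fixed mean, return the forced n-th value."""
    if len(free_values) != n - 1:
        raise ValueError("expected exactly n - 1 free values")
    return n * mean - sum(free_values)

# Bucket analogy: 10 buckets, 100 toys in total (a mean of 10 toys per bucket).
free_buckets = [99, 0, 0, 0, 0, 0, 0, 0, 0]   # 9 buckets filled however we like
last_bucket = last_value_given_mean(free_buckets, mean=10, n=10)
print(last_bucket)  # 1

# Same idea for a sample: deviations from the sample mean sum to zero,
# so knowing n-1 of the deviations pins down the last one.
sample = [4.0, 7.0, 1.0, 8.0]                  # hypothetical data
x_bar = sum(sample) / len(sample)
deviations = [x - x_bar for x in sample]
print(sum(deviations))  # 0.0
```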
• Around the end of the video, Sal talks about how there's a 95% chance that our real population mean is between 15.04 and 19.3. I don't want to confuse anyone, but what I learnt in class is that a 95% confidence interval instead means that, when sampling from the population, 95% of the time we're going to get a mean between those two values.

It relates more to sampling a certain number of individuals from a population multiple times and getting different sample means, which could all be right.

It's hard to explain, and it's a small distinction, but it could be important when writing a report.

Or am I mistaken?
• So my teacher always told us we want to "reject the null hypothesis", and if we can't, we have to state that we "could not reject the null hypothesis". Why was that?
• Wait, how'd you get s to be 2.98?
• In the beginning of the video Sal refers to another video with the same problem. Where can I find that video? Thanks.
• At one point in the video he explains what t* is equal to. He says we've seen this multiple times, but I don't remember this being explained before.
Would it be this?
t* = (x_bar - mu)/(s/sqrt(n))
It seems very similar to the z-score, but instead of dividing by the standard deviation of the sampling distribution, sigma/sqrt(n), it divides by s/sqrt(n).
Is this explained somewhere else?
• Why can't/didn't we assume that the mean of the sampling distribution of the sample mean is 17.17? Sal assumed it would be 20. But can't we assume that it is 17.17 and build our confidence interval around that, and also do the small-sample hypothesis test with it?

I solved it and got the same answer using 17.17 as the mean. I just want to understand Sal's logic behind it. Also, in previous lessons, like the one with the 7 patients and the one with the apple weights, we assumed the sample's mean is mu_x_bar. So I thought mu_x_bar should be 17.17 here.
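
One way to see where the 20 and the 17.17 each enter is to compute the hypothesis test and the confidence interval side by side. The sketch below is only illustrative. The values 20, 17.17 and 2.98 are the ones quoted in this thread, while n = 10 and t* = 2.262 (two-sided 95%, 9 degrees of freedom) are assumptions that the thread does not state. In the hypothesis test, the hypothesized mean 20 is the center of the sampling distribution and 17.17 is the observation being tested; in the confidence interval, 17.17 is the center.

```python
import math

x_bar, s = 17.17, 2.98   # sample mean and sample standard deviation from the thread
mu_0 = 20.0              # hypothesized population mean
n, t_star = 10, 2.262    # ASSUMED: sample size and t-table value for 9 df

standard_error = s / math.sqrt(n)

# Hypothesis test: assume the mean is mu_0 and ask how far x_bar is from it.
t_stat = (x_bar - mu_0) / standard_error
reject = abs(t_stat) > t_star

# Confidence interval: center on x_bar and ask which means are plausible.
low = x_bar - t_star * standard_error
high = x_bar + t_star * standard_error
mu_0_outside = not (low < mu_0 < high)

print(f"t = {t_stat:.2f}, reject at the 5% level: {reject}")
print(f"CI = ({low:.2f}, {high:.2f}), 20 outside the CI: {mu_0_outside}")
```

Both routes use the same standard error and the same t*, so "the t statistic is beyond ±t*" and "20 falls outside the interval" are the same statement, which would explain getting the same answer either way.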