# When to use z or t statistics in significance tests

## Want to join the conversation?

• When n (the sample size) is greater than or equal to 30, we can use z statistics because the sampling distribution of the sample mean is approximately normal, right? If that is the case, why does the t table contain rows where the degrees of freedom are 100, 1000, etc. (where degrees of freedom = n - 1)? If n is greater than or equal to 30, we would be using a z table anyway, so are the rows in the t table with more than 30 degrees of freedom redundant?
• I guess it's to show that the t table approaches the normal z table when n is large. In the t tables I've seen, the rows skip more and more degrees of freedom after 30 and end with the z-table values.
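The convergence described above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`t_pdf`, `t_cdf`, `t_critical`) are hypothetical, and the t critical values are approximated with Simpson's rule and bisection rather than taken from a statistics library.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus the lower half (0.5)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(prob, df):
    """Approximate t* with P(T <= t*) = prob by bisection (assumes 0.5 < prob and t* < 50)."""
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Critical value that leaves 2.5% in the upper tail (a two-sided 95% level).
z_star = NormalDist().inv_cdf(0.975)
for df in (5, 30, 100, 1000):
    print(f"df={df:>4}: t* = {t_critical(0.975, df):.3f}   (z* = {z_star:.3f})")
```

The printed t* values shrink toward the z* value of about 1.96 as the degrees of freedom grow, which is why printed t tables can afford to skip rows at large degrees of freedom.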
• This video would have been so helpful waaay back when we were first introduced to sampling distributions.
• I agree. He calls this a "primer", which Webster's defines as an introduction. Shouldn't this have come first?
• But when Sal used the simulation technique to calculate the p-value, the answer was very different.

For the problem in this previous video, *"Estimating a P-value from a simulation"* https://www.khanacademy.org/math/statistics-probability/significance-tests-one-sample/idea-of-significance-tests/v/estimating-p-value-from-simulation
I calculated the p-value with the formula Sal just described. However, the answer is very different.

Simulation p-value: 7.5%
Formula p-value: 0.16%

I have calculated it three times. Can anyone explain this difference in p-values?
• Yes, I also get a p-value of 0.16% with the formula.
I think the problem here is that we do not meet all the conditions for inference. Specifically, the normal condition fails: n < 30, and n * p is far less than 10. We can see the same problem in the simulation performed in that video, where the distribution of dots is strongly skewed to the right and far from normal.
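One way to see how much a failed normal condition can matter is to compare an exact binomial tail probability (which a simulation approximates) with the p-value from the normal approximation. The sketch below uses hypothetical numbers (n = 20, p0 = 0.05, 4 successes), not the values from Sal's problem, and the function names are made up for illustration.

```python
import math
from statistics import NormalDist

def exact_binomial_p_value(n, p0, successes):
    """P(X >= successes) for X ~ Binomial(n, p0): an exact upper-tail p-value."""
    return sum(math.comb(n, k) * p0**k * (1 - p0)**(n - k)
               for k in range(successes, n + 1))

def z_test_p_value(n, p0, successes):
    """Upper-tail p-value from the normal approximation (one-proportion z test)."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return 1 - NormalDist().cdf(z)

def normal_condition_met(n, p0):
    """Common rule of thumb: expect at least 10 successes and 10 failures."""
    return n * p0 >= 10 and n * (1 - p0) >= 10

# Hypothetical example values, chosen so the normal condition clearly fails.
n, p0, successes = 20, 0.05, 4
exact = exact_binomial_p_value(n, p0, successes)
approx = z_test_p_value(n, p0, successes)
print(f"normal condition met: {normal_condition_met(n, p0)}")
print(f"exact binomial p-value:   {exact:.4f}")
print(f"normal-approx p-value:    {approx:.4f}")
```

With these assumed numbers the exact tail probability is many times larger than the normal-approximation p-value, the same direction of disagreement reported in the thread.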
• Sal seems to suggest that the z method should be used for sample proportions and the t method for sample means. Can anyone confirm? I thought I saw an earlier video that said a z score gives more accurate results if the sample size is greater than 30, while a t score is advised if the sample size is less than 30.
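As a rough illustration of the two cases in this question, here is a minimal sketch of both test statistics. One common way to frame the distinction: for a proportion, the hypothesized value p0 fixes the standard deviation of the sample proportion, so a z statistic is used; for a mean, the population standard deviation is usually unknown and is estimated by the sample standard deviation, so a t statistic with n - 1 degrees of freedom is used. The function names are hypothetical.

```python
import math
from statistics import mean, stdev

def z_statistic_proportion(p_hat, p0, n):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n), using the hypothesized p0."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def t_statistic_mean(sample, mu0):
    """t = (x_bar - mu0) / (s / sqrt(n)); returns (t, degrees of freedom)."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    return t, n - 1

# Made-up inputs, for illustration only.
print(z_statistic_proportion(p_hat=0.6, p0=0.5, n=100))
print(t_statistic_mean([1, 2, 3, 4, 5], mu0=2))
```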
• Why does Sal introduce another variable, p_0, when he could just use p_1 in calculating the z statistic?
• Let's say 𝐻₀: 𝑝 < 𝑝₁

Then there is no assumed population proportion, we just assume that the true population proportion is less than whatever value 𝑝₁ is, and 𝑝₀ is the true population proportion given that 𝐻₀ is true.

By convention we always treat 𝑝₀ and 𝑝₁ as separate quantities regardless of what 𝐻₀ says.
• Does anyone know what the standard deviation of the sampling distribution of the sample mean is? If so, could you explain it?
• Say we have a population P.
We are interested in knowing the population mean, but we can't access the whole population.

We take a sample, called p1, and find its sample mean.
We take another sample, say p2, and find its sample mean.
We repeat this many times, say k times, so we have k sample means.
If we then plot those means, we have a sampling distribution of the sample mean.

By the Central Limit Theorem, the means are approximately normally distributed when the sample size is large enough (just an aside).

The standard deviation of the sampling distribution we just plotted is what Sal referred to as the standard deviation of the sampling distribution of the sample mean. It's quite a mouthful. Hopefully you can wrap your head around it now.
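The procedure described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the population is a made-up, right-skewed set of 10,000 values, samples are drawn with replacement, and the function name is hypothetical. It compares the simulated standard deviation of the sample means with the textbook value σ/√n.

```python
import random
import statistics

def sd_of_sample_means(population, sample_size, num_samples, seed=0):
    """Draw many samples, record each sample mean, and return the SD of those means."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.choices(population, k=sample_size))
             for _ in range(num_samples)]
    return statistics.stdev(means)

# Hypothetical skewed population (exponential values), for illustration only.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

n = 25                                   # size of each sample
sigma = statistics.pstdev(population)    # population standard deviation
simulated = sd_of_sample_means(population, n, num_samples=5_000)
theoretical = sigma / n ** 0.5           # sigma / sqrt(n)

print(f"simulated SD of sample means:  {simulated:.4f}")
print(f"sigma / sqrt(n):               {theoretical:.4f}")
```

The two printed numbers should be close, which shows that the spread of the sample means is much smaller than the spread of the population itself and shrinks as the sample size n grows.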

H_L.
• This whole video is great for explaining when we use z statistics and when we use t statistics. How do we know when to use chi-square statistics versus F statistics versus ANOVA versus linear regression versus multiple regression?
I guess I need a chart that compares all the requirements for each type of test, similar to the two columns we end up with in the video!