- [Tutor] What I wanna do in
this video is give a primer on thinking about when to use a z statistic versus a t statistic when we are doing significance tests. So there are two major scenarios that we will see in an
introductory statistics class, one is when we are
dealing with proportions, so I'll write that on the
left side right over here and the other is when we
are dealing with means. In the proportion case, when we are doing our significance test, we will set up some null hypothesis, that usually deals with
the population proportion, we might say it is equal to some value, let's just call that P sub zero and then maybe you have
an alternative hypothesis, that, well, no, the population proportion is greater than that or less than that or it's just not equal to that, so let me just go with that one, it's not equal to P sub zero. Then, to actually do the significance test, we take a sample from the population, it's going to have a sample size of n, we need to make sure that we feel good about making the inference, we've talked about the
conditions for inference in previous videos multiple times, but from this we calculate
the sample proportion and then from this, we
calculate the P value. Remember, the P value is the probability, assuming the null hypothesis is true, of getting a sample proportion
at least this extreme, and if it's below some threshold, we reject the null hypothesis
and suggest the alternative. The way we do that is we find an associated z value for that sample proportion, and the way that we calculate it, we say, okay look, our z is going to be how many of the sampling distribution's standard deviations are
we away from the mean and remember the mean of
the sampling distribution is going to be the population proportion, so here we've got this sample statistic, this sample proportion, the difference between that
and the assumed proportion, remember when we do
these significance tests, we try to figure out the probability assuming the null hypothesis is true and so when we see this P sub zero, this is the assumed proportion
from the null hypothesis, so that's the difference
between these two, the sample proportion and
the assumed proportion and then you'd wanna divide
it by what's often known as the standard error of the statistic, which is just the standard deviation of the sampling distribution
of the sample proportion and this works out well
for our proportions, because in proportions, I
can figure out what this is, this is going to be
equal to the square root of the assumed population
proportion times one minus the assumed population
proportion, all of that over n and then I would use this z statistic to figure out the P
value and in this case, I would look at both
tails of the distribution, because I care about how far I am either above or below the
assumed population proportion. Now with means, there's
definitely some similarities here, you will make a null hypothesis, maybe you assume the population
mean is equal to mu zero and then there's going to be
an alternative hypothesis, that maybe your population
mean is not equal to mu zero and you're gonna do something very similar, you take your population,
you take a sample of size n and instead of calculating
a sample proportion, you calculate a sample mean and actually you can
calculate other things, like a sample standard deviation, but now you have an issue, you say, well ideally I
would use a z statistic and you could, if you were able to say, well I could take the difference
between my sample mean and the assumed mean
in the null hypothesis, so that would be this right over here, that's what that zero means, the assumed mean from the null hypothesis and I would then divide by the
standard error of the mean, which is another way of
saying the standard deviation of the sampling distribution
of the sample mean, but this is not so easy to figure out. This is going to be the standard deviation of the underlying population divided by the square root of n. We know what n is going to
be, if we took a sample, but we don't know what
the population standard deviation is, so instead what we do is we estimate this and so we'll take the sample mean, we subtract from that the
assumed population mean from the null hypothesis and we divide by an estimate of this, which is going to be our
sample standard deviation divided by the square root of n, but because this is an estimate, we actually get a better result if, instead of saying, hey,
this is an estimate of our z statistic, we
call this our t statistic and as we will see, we will
then look this up in a t table and this will give us a better
sense of the probability.
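
To make the two cases concrete, here is a minimal Python sketch of the two statistics described in this video. The function names and the example numbers are made up for illustration, they are not from the video. It uses only the standard library, so it computes a two-sided P value for the z statistic from the normal distribution, but for the t statistic it stops at the statistic itself, which you would then look up in a t table (with n minus one degrees of freedom).

```python
import math
from statistics import NormalDist, mean, stdev


def z_statistic_proportion(p_hat, p0, n):
    """z for a one-sample proportion test.

    The standard error uses the ASSUMED proportion p0 from the null
    hypothesis: sqrt(p0 * (1 - p0) / n).
    """
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


def two_sided_p_value_from_z(z):
    """Probability of a z at least this extreme in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def t_statistic_mean(sample, mu0):
    """t for a one-sample mean test.

    The population standard deviation is unknown, so the standard error
    is estimated with the sample standard deviation: s / sqrt(n).
    """
    n = len(sample)
    sample_mean = mean(sample)
    sample_sd = stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))


# Hypothetical proportion example: 60 successes in a sample of 100,
# null hypothesis p0 = 0.5.
z = z_statistic_proportion(p_hat=0.6, p0=0.5, n=100)
p_value = two_sided_p_value_from_z(z)
print("z =", round(z, 3), "two-sided P value =", round(p_value, 4))

# Hypothetical mean example: a small made-up sample, null hypothesis mu0 = 5.0.
t = t_statistic_mean([5.1, 4.9, 5.6, 5.8, 6.0, 5.4], mu0=5.0)
print("t =", round(t, 3), "with", 6 - 1, "degrees of freedom")
```

Notice the only structural difference between the two functions: the z statistic divides by a standard error that follows directly from the null hypothesis, while the t statistic divides by a standard error estimated from the sample.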