AP.STATS: UNC‑4 (EU), UNC‑4.V (LO), UNC‑4.V.1 (EK), UNC‑4.V.2 (EK), UNC‑4.X (LO), UNC‑4.X.1 (EK), UNC‑4.X.2 (EK), UNC‑4.Y (LO), UNC‑4.Y.1 (EK), UNC‑4.Y.2 (EK)

Video transcript

Let's say that we have two populations. That's the first population, and this is the second population right over here. We are going to think about the means of these populations. Let's say this first population is a population of golden retrievers and this second population is the population of Chihuahuas, and the mean that we're going to think about is maybe the mean weight. So mu 1 would be the true mean weight of the population of golden retrievers, and mu 2 would be the true mean weight of the population of Chihuahuas.

Now, what we want to think about is: what is the difference between these two population means, between these two population parameters? Well, if we don't know this, all we can do is try to estimate it and maybe construct some type of confidence interval, and that's what we're going to talk about in this video.

So how do we go about doing it? Well, we've seen this or similar things before. What you would do is take a sample from both populations. From population one, I would take a sample of size n sub 1, and from that I can calculate a sample mean, x-bar sub 1. This is a statistic that is trying to estimate mu 1. I can also calculate a sample standard deviation, s sub 1. And I can do the same thing in the population of Chihuahuas, if that's what our population two is all about. So I could take a sample, and actually this sample size does not have to be the same as n sub 1, so I'll call it n sub 2. It could be the same, but it doesn't have to be. And from that I can calculate a sample mean, x-bar sub 2, and a sample standard deviation, s sub 2.

So now, assuming that our conditions for inference are met (we have the random condition, the normal condition, and the independence condition, and we talked about those in other videos for means), let's think about how we can construct a confidence interval.

You might say, all right, that would be the difference of my sample means, x-bar sub 1 minus x-bar sub 2, plus or minus some z value times the standard deviation of the sampling distribution of the difference of the sample means, x-bar sub 1 minus x-bar sub 2. And you might say, well, where do I get my z from? Our confidence level would determine that. If our confidence level is 95%, that would determine our z.

Now, this would not be incorrect, but we face a problem, because we are going to need to estimate what the standard deviation of the sampling distribution of the difference between our sample means actually is. To make that clear, let me write it this way. The variance of the sampling distribution of the difference of our sample means is going to be equal to the variance of the sampling distribution of sample mean 1 plus the variance of the sampling distribution of sample mean 2. Now, if we knew the true underlying standard deviations of this population and this population, then we could actually come up with these. In that case, this right over here would be equal to the variance of population 1 divided by our sample size n sub 1, plus the variance of the underlying population 2 divided by its sample size n sub 2. But we don't know these variances, so we estimate them with our sample standard deviations. So we say this is going to be approximately equal to our first sample standard deviation squared over n sub 1, plus our second sample standard deviation squared over n sub 2. And so we can say that an estimate of the standard deviation of the sampling distribution of the difference between our sample means is going to be approximately equal to the square root of s sub 1 squared over n sub 1 plus s sub 2 squared over n sub 2.

But the problem is, once we use this estimate, a critical z value isn't going to be as good as a critical t value. So instead you would say my confidence interval is going to be x-bar sub 1 minus x-bar sub 2, plus or minus a critical t value instead of a z value, because that works better when you are estimating the standard deviation of the sampling distribution of the difference between the sample means. And so you have t star times our estimate of this, which is going to be equal to the square root of s sub 1 squared over n sub 1 plus s sub 2 squared over n sub 2.

And then you might say, well, what determines our t star? Once again, you would look it up on a table using your confidence level. And you might be saying, wait, hold on, when I look up a t value, I don't just care about a confidence level, I also care about degrees of freedom. What is going to be the degrees of freedom in this situation? Well, there's a simple answer and a complicated answer. Once we think about the difference of means, there are fairly sophisticated formulas that computers can use to get a more precise degrees of freedom. But what you will typically see in a statistics class is a conservative view of degrees of freedom, where you take the lower of n sub 1 and n sub 2 and subtract 1 from that. So the degrees of freedom here is going to be the lower of n sub 1 minus 1 or n sub 2 minus 1. In future videos, we will work through examples that do this.
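
To make the recipe in the transcript concrete, here is a minimal Python sketch of the interval: the difference of the sample means, plus or minus t star times the square root of s sub 1 squared over n sub 1 plus s sub 2 squared over n sub 2, with the conservative degrees of freedom. The function names are hypothetical and the sample statistics (dog weights in pounds) are made up for illustration; none of them come from the video. The critical value is passed in by hand, as if read off a t table: 2.064 is the usual table entry for 95% confidence with 24 degrees of freedom.

```python
import math


def standard_error(s1, n1, s2, n2):
    """Estimate the standard deviation of the sampling distribution
    of the difference of the sample means: sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


def conservative_df(n1, n2):
    """Conservative degrees of freedom: the lower of n1 - 1 and n2 - 1."""
    return min(n1, n2) - 1


def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Return (low, high) for (xbar1 - xbar2) +/- t_star * standard error."""
    diff = xbar1 - xbar2
    margin = t_star * standard_error(s1, n1, s2, n2)
    return diff - margin, diff + margin


# Made-up sample statistics, for illustration only.
xbar1, s1, n1 = 65.0, 8.0, 30   # golden retriever sample
xbar2, s2, n2 = 5.0, 1.2, 25    # Chihuahua sample

df = conservative_df(n1, n2)    # lower of 29 and 24
t_star = 2.064                  # t-table value for 95% confidence, df = 24

low, high = two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star)
print(f"df = {df}")
print(f"95% interval for mu 1 minus mu 2: ({low:.2f}, {high:.2f})")
```

Statistical software would typically use the more precise degrees-of-freedom formula the transcript alludes to, which generally gives a somewhat narrower interval than this conservative version.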
AP® is a registered trademark of the College Board, which has not reviewed this resource.