# Sampling distribution of the sample mean (part 2)

AP.STATS: UNC‑3 (EU), UNC‑3.H (LO), UNC‑3.H.2 (EK), UNC‑3.H.3 (EK), UNC‑3.H.5 (EK)

## Video transcript

We hopefully now have a respectable working knowledge of the sampling distribution of the sample mean. What I want to do in this video is explore a little bit more how that distribution changes as we change our sample size, n. I'll write n down right here: our sample size, n.

Just as a bit of review, we saw before that we could start off with any crazy distribution. Maybe it looks something like this. I'll do a discrete distribution; really, to model anything, at some point you have to make it discrete. It could be a very granular discrete distribution, but let's say it's something crazy that looks like this. This is clearly not a normal distribution. But we saw in the first video what happens if you take, let's say, sample sizes of four. So you take four random numbers from this distribution, where this is the probability of a 1, 2, 3, 4, 5, 6, 7, 8, or 9, and you average them. Let me do that here.

Let's say we use this distribution to generate four random numbers. We're very likely to get a 9. We're definitely not going to get a 7 or an 8, and we're definitely not going to get a 4. We might get a 1 or a 2. A 3 is also very likely, and a 5 is very likely. So we use this function to essentially generate random numbers for us, we take samples of four, and then we average them. Let's say our first sample is a 9, a 5, another 9, and then a 1. So what is that? That's 14 plus 10, which is 24, divided by 4. The average for this first trial, this first sample of four, is going to be 6. So we would plot it right here: our average was 6 that time.

And we'll just keep doing it. We've seen in the past that if you keep doing this, it's going to start looking something like a normal distribution. So maybe we do it again and the average is 6 again. Maybe we do it again and the average is 5. We do it
again and the average is 7. We do it again and the average is 6. And if you do this a ton of times, your distribution might look very much like a normal distribution. These boxes are really small, so if we do a bunch of these trials, at some point it might look a lot like a normal distribution. Obviously it won't be a perfect normal distribution, because you can't ever get anything less than 0, or really anything less than 1, as an average (you can't get 0 as an average), and you can't get anything more than 9. So it's not going to have infinitely long tails, but at least for the middle part of it, a normal distribution might be a good approximation.

In this video, what I want to think about is what happens as we change n. In this case, n was 4. n is our sample size: every time we did a trial, we took four numbers, took their average, and plotted it. We could have had n equal 10. We could have taken 10 samples from this population, or, you could say, from this random variable, averaged them, and then plotted the average here.

In the last video we ran the simulation, and I'm going to go back to that simulation in a second. We saw a couple of things, and I'll show them to you in a little more depth this time. When n is pretty small, the sampling distribution doesn't approach a normal distribution that well. Let's take the extreme case: what happens when n is equal to 1? That literally just means I take one instance of this random variable and average it. Well, the average is just going to be that one value. So if I take a bunch of trials from this thing and plot them over time, what's it going to look like? Well, it's definitely not going to look like a normal distribution. You're going to have a couple of 1s, a couple of 2s, more 3s like that, no 4s, a bunch of 5s, some 6s that
look like that, and a bunch of 9s. So your sampling distribution of the sample mean for an n of 1, no matter how many trials you do, is not going to look like a normal distribution. So the central limit theorem, although I said that if you do a bunch of trials it looks like a normal distribution, definitely doesn't work for n equals 1.

As n gets larger, though, it starts to make sense. Let's see what happens if we have n equals 2. I'm just doing this in my head; I don't know what the actual distributions would look like. It would still be difficult for it to become an exact normal distribution. You might get values from all of the above, but you only get two numbers in each of the baskets that you're averaging. So, for example, you're never going to get an 8 in your sampling distribution of the sample mean for n equals 2, because that would take a 7 and a 9 or two 8s, and it's impossible to get a 7 and it's impossible to get an 8. So when you plot it, maybe it looks like this, but there'll be a gap at 8 because that's impossible. Maybe it looks something like that. So it still won't be a normal distribution when n is equal to 2.

So there are a couple of interesting things here. One thing, and I didn't mention this the first time because I really wanted you to get the gut sense of what the central limit theorem is: the central limit theorem says that as n approaches infinity, that is when you get the real normal distribution. But in everyday practice, you don't have to get that much beyond n equals 2. If you get to n equals 10 or n equals 15, you're getting very close to a normal distribution. So this converges to a
normal distribution very quickly.

Now, the other thing is that you obviously want many, many trials. So this is your sample size: that's the size of each of your baskets. In the very first video I did on this, I took samples of size 4. In the simulation I did in the last video, we did sample sizes of 4 and 10 and whatever else. This is a sample size of 1. So that's our sample size, and as it approaches infinity, your actual sampling distribution of the sample mean will approach a normal distribution.

Now, in order to actually see that normal distribution, and to prove it to yourself, you would have to do this many, many times. Remember, this is essentially the population, or the random variable, that tells you all of the possibilities. In real life, we seldom know all of the possibilities. In fact, in real life we seldom know the pure probability generating function, only if we're writing it ourselves, say in a computer program. Normally we're taking samples and trying to estimate things. So normally there's some random variable, we take a bunch of samples, we take their means, we plot them, and we get some type of normal distribution. Let's say we take samples of 100 and average them; we're going to get some normal distribution. And in theory, as we take those averages hundreds or thousands of times, our data set is going to more closely approximate that pure sampling distribution of the sample mean.

This thing is a real distribution, with a real mean. It has a pure mean. So the mean of the sampling distribution of the sample mean, we'll write it like that. Notice I didn't write it as just the x. What this is actually saying is that this is a real population mean. This is a
real random variable mean. If you looked at every possibility of all of the samples that you can take from your original distribution, from some random original distribution, and, let's say we're dealing with a world where the sample size is 10, you took all of the combinations of 10 samples from that original distribution and averaged them, this would describe that function. Of course, in reality, if you don't know the original distribution, you can't take infinitely many samples from it, so you won't know every combination. But if you did the trial a thousand times, so a thousand times you took 10 samples from some distribution, took a thousand averages, and then plotted them, you're going to get pretty close.

Now, the next thing I want to touch on is what happens as n changes. We know that as n approaches infinity, it becomes more of a normal distribution, but as I said already, n equals 10 is pretty good and n equals 20 is even better. But we saw something in the last video that, at least, I find pretty interesting. Let's say we start with this crazy distribution up here; it really doesn't matter what distribution we're starting with. We saw in the simulation that when n is equal to, say, 5, and we take samples of 5, average them, and do it 10,000 times, our graph looks something like this. It's kind of wide, like that. And then when we did n equal to 10, our graph was actually squeezed in a little bit more, like that. So not only was it more normal, and that's what the central limit theorem tells us, because we're taking larger sample sizes, but it had a smaller standard deviation, or smaller variance. The mean is going to be the same in either case, but when our
sample size was larger, our standard deviation became smaller. In fact, our standard deviation became smaller than that of our original population distribution, our original probability density function. Let me show you that with a simulation.

So let me clear everything. This distribution is as good as any. The first thing I want to show you is that an n of 2 is really not that good. Let's compare an n of 2 to, let's say, an n of 16. Let's do it once. So it takes two samples and averages them, and then it takes 16, averages them, and plots the average down here. Now let's do that 10,000 times. Notice that when you took an n of 2, even though we did it 10,000 times, this is not approaching a normal distribution. You can actually see it in the skew and kurtosis numbers. It has a rightward, positive skew, which means it has a longer tail to the right than to the left. And it has a negative kurtosis, which means it has shorter tails and smaller peaks than a standard normal distribution.

Now, when n is equal to 16, you do the same trick. Every time, we took 16 samples from this distribution function up here and averaged them, and each of these dots represents an average. We did it 10,001 times here. Notice the mean is the same in both places, but here, all of a sudden, our kurtosis is much smaller and our skew is much smaller, so we are more normal in this situation. But an even more interesting thing is that our standard deviation is smaller. This is more squeezed in than that is, and it's definitely more squeezed in than our original distribution.

Now let me do it with two... let me clear everything again. I like this distribution because it's a very non-normal distribution. It looks like a bimodal distribution of some kind. And let's take a scenario where I take, let's take
two good n's. Let's take an n of 16, that's a nice healthy n, and let's take an n of 25, and let's compare them a little bit. I'll do one trial animated, since it's always nice to see it. So first it's going to take 16 samples and average them, and there we go. Then it's going to take 25 samples and average them, and there we go. Now let's do what I just did, animated, 10,000 times. Miracles of computers.

Now notice something, and this is after 10,000 times. These are both pretty good approximations of normal distributions. The n equals 25 one is more normal: it has slightly less skew than n equals 16, and it has slightly less kurtosis, which means it's closer to being a normal distribution than n equals 16. But even more interesting, it's more squeezed in. It has a lower standard deviation: the standard deviation here, for n equals 25, is 2.1, and the standard deviation here, for n equals 16, is 2.64.

I touched on that in the last video, and it kind of makes sense. For every sample you average, the more you put in that sample, the smaller the standard deviation. Think of the extreme case. If, instead of taking 16 samples from our distribution every time, or instead of taking 25, I were to take a million samples from this distribution every time, that sample mean is always going to be pretty darn close to my mean. If I try to estimate a mean by taking a million samples, I'm going to get a pretty good estimate of that mean. The probability that a bunch of the million numbers are all out here is very low. So if n is a million, of course all of my sample means are going to be really tightly focused around
the mean itself. So hopefully that kind of makes sense to you as well. If it doesn't, just think about it, or even use this tool and experiment with it, just so you can trust that that is really the case.

And it actually turns out that there's a very clean formula that relates the standard deviation of the original probability distribution function to the standard deviation of the sampling distribution of the sample mean. As you can imagine, it is a function of your sample size, of how many samples you take in every basket before you average them. I'll go over that in the next video.
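
If you don't have the simulation tool handy, the experiment described in this video can be roughly reproduced in a few lines of Python. This is only a sketch under stated assumptions: the weights below are made up to loosely resemble the hand-drawn distribution (no 4s, 7s, or 8s, and lots of 9s), and the names `sample_means`, `skewness`, and `summarize` are hypothetical helpers, not part of the tool shown in the video. It takes 10,000 sample means for several values of n and reports the mean, standard deviation, and skew of each set of averages.

```python
import random
import statistics

# Hypothetical weights for the values 1..9 (an assumption, not the video's
# actual numbers): 4, 7 and 8 never occur, and 9 is the most likely value.
VALUES = [1, 2, 3, 4, 5, 6, 7, 8, 9]
WEIGHTS = [1, 1, 3, 0, 3, 2, 0, 0, 4]


def sample_means(n, trials, rng):
    """Return `trials` sample means, each the average of n random draws."""
    return [
        statistics.fmean(rng.choices(VALUES, weights=WEIGHTS, k=n))
        for _ in range(trials)
    ]


def skewness(data):
    """Population skewness: the average cubed z-score (0 for a symmetric shape)."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return statistics.fmean(((x - mu) / sigma) ** 3 for x in data)


def summarize(trials=10_000, seed=0):
    """Map each sample size n to (mean, standard deviation, skew) of its sample means."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    results = {}
    for n in (1, 2, 16, 25):
        means = sample_means(n, trials, rng)
        results[n] = (
            statistics.fmean(means),
            statistics.pstdev(means),
            skewness(means),
        )
    return results


if __name__ == "__main__":
    for n, (mean, sd, skew) in summarize().items():
        print(f"n={n:>2}  mean={mean:.2f}  sd={sd:.2f}  skew={skew:+.2f}")
```

Running it should show the pattern from the simulation: the mean stays about the same for every n, while the standard deviation of the sample means gets smaller as n grows. Try changing `WEIGHTS` or the list of sample sizes to experiment with other starting distributions.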