Inferring population mean from sample mean

Much of statistics is based upon using data from a random sample that is representative of the population at large. From the mean of that sample, we can infer things about the mean of the greater population. We'll explain. Created by Sal Khan.

Video transcript

Let's say you're trying to design some type of product for men, one that is somehow based on their height. And the product is for the United States. So ideally, you would like to know the mean height of men in the United States. Let me write this down. So how would you do that? And when I talk about the mean, I'm talking about the arithmetic mean. If I were to talk about some other type of mean-- and there are other types of means, like the geometric mean-- I would specify it. But when people just say mean, they're usually talking about the arithmetic mean.

So how would you go about finding the mean height of men in the United States? Well, the obvious way is to go and measure every man in the United States. Take their heights, add them all together, and then divide by the number of men there are in the United States. But the question you'd ask yourself is whether that is practical. Because you have on the order-- let's see, there are about 300 million people in the United States. Roughly half of them will be men, or at least they'll be male, and so you will have roughly 150 million men in the United States. So if you wanted the true mean height of all of the men in the United States, you would have to somehow survey-- or not even survey. You would have to be able to go and measure all 150 million men. And even if you did try to do that, by the time you're done, many of them might have passed away, new men will have been born, and so your data will go stale immediately. So it is seemingly impossible, or almost impossible, to get the exact height of every man in the United States in a snapshot of time.

And so, instead, what you do is say, well, look, OK, I can't get every man, but maybe I can take a sample. I could take a sample of the men in the United States. And I'm going to make an effort that it's a random sample. I don't want to just go sample 100 people who happen to play basketball, or played basketball for their college.
I don't want to go sample 100 people who are volleyball players. I want to randomly sample. Maybe the first person who comes out of the mall in a random town, or in several towns, or something like that. Something that should not be based in any way, or skewed in any way, by height.

So you take a sample, and from that sample you can calculate a mean of at least the sample. And you'll hope that it is indicative-- especially if this was a reasonably random sample-- of the mean of the entire population. And what you're going to see is that much of statistics is all about using information, using things that we can calculate about a sample, to infer things about a population, because we can't directly measure the entire population.

So for example, let's say-- and if you're actually trying to do this, I would recommend at least 100 data points, or 1,000, and later on we'll talk about how you can think about whether you've measured enough or how confident you can be. But let's just say you're a little bit lazy, and you just sample five men. And so you get their five heights. Let's say one is 6.2 feet. Let's say one is 5.5 feet-- 5.5 feet would be 5 foot, 6 inches. Let's say one ends up being 5.75 feet. Another one is 6.3 feet. Another is 5.9 feet.

Now, if these are the ones that you happen to sample, what would you get for the mean of this sample? Well, let's get our calculator out. And we get 6.2 plus 5.5 plus 5.75 plus 6.3 plus 5.9. The sum is 29.65. And then we want to divide by the number of data points we have. So we have five data points. So let's divide 29.65 by 5, and we get 5.93 feet. So here, our sample mean-- and I'm going to denote it with an x with a bar over it-- is 5.93 feet. This is our sample mean, or, if we want to make it clear, sample arithmetic mean.
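
The calculator step above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic from the example; the variable names (`heights_ft`, `sample_mean`) are chosen for illustration and are not part of the lesson.

```python
from statistics import mean

# Heights in feet of the five sampled men, taken from the example above.
heights_ft = [6.2, 5.5, 5.75, 6.3, 5.9]

n = len(heights_ft)      # number of data points in the sample
total = sum(heights_ft)  # add up all of the data points
sample_mean = total / n  # divide by the number of data points

print(f"n = {n}")
print(f"sum = {total:.2f} feet")
print(f"sample mean = {sample_mean:.2f} feet")

# Cross-check with the standard library's arithmetic mean.
assert abs(mean(heights_ft) - sample_mean) < 1e-9
```

Rounded to two decimal places, this should report a sum of 29.65 feet and a sample mean of 5.93 feet, matching the numbers worked out by hand.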
And when we make this calculation based on a sample, and we're trying to use it to estimate something for the entire population, we call the result a statistic. Now, you might be saying, well, what notation do we use if, somehow, we are able to measure it for the population? Let's say we can't even measure it for the population, but we at least want to denote what the population mean is. Well, if you want to do that, the population mean is usually denoted by the Greek letter mu. And so a lot of statistics is calculating a sample mean in an attempt to estimate this thing that you might not know, the population mean. And these values for the entire population-- sometimes you might be able to calculate them, oftentimes you will not-- are called parameters. So what you're going to find is that much of statistics is all about calculating statistics for a sample in order to estimate parameters for an entire population.

Now the last thing I want to do is introduce you to some of the notation that you might see in a statistics textbook that looks very math-y and very difficult. But hopefully, after the next few minutes, you'll appreciate that it's really just doing exactly what we did here-- adding up the numbers and dividing by the number of numbers you add. If you had to do the population mean, it's the exact same thing. It's just many, many more numbers in this context. You have to add up 150 million numbers and divide by 150 million.

So how do mathematicians talk about an operation like that-- adding up a bunch of numbers and then dividing by the number of numbers? Let's first think about the sample mean, because that's where we actually did the calculation. So a mathematician might call each of these data points-- let's say they'll call this first one right over here x sub 1. They'll call this one x sub 2. They'll call this one x sub 3.
They'll call this one-- when I say sub, I'm really saying subscript 1, subscript 2, subscript 3. They could call this x subscript 4. They could call this x subscript 5. And so if you had n of these you would just keep going. x subscript 6, x subscript 7, all the way to x subscript n.

And so to take the sum of all of these, they would denote it as-- let me write it right over here. So they will say that the sample mean is equal to the sum of all my x sub i's-- so the way you can conceptualize it, these i's will change. The i's are going to start at 1 and go up to the size of our actual sample. So all the way until n. In this case n was equal to 5. So this is literally saying this is equal to x sub 1 plus x sub 2 plus x sub 3, all the way to the nth one. Once again, in this case, we only had five.

Now, are we done? Is this what the sample mean is? Well, no, we aren't done. We don't just add up all of the data points. We then have to divide by the number of data points there are. So this might look like very fancy notation, but it's really just saying, add up your data points and divide by the number of data points you have. And this capital Greek letter sigma literally means sum. Sum all of the x sub i's, from x sub 1 all the way to x sub n, and then divide by n, the number of data points you have.

Now let's think about how we would denote the same thing but, instead of for the sample mean, doing it for the population mean. So the population mean, they will denote it with mu, we already talked about that. And here, once again you're going to take the sum, but this time it's going to be the sum of all of the elements in your population. So your x sub i's-- and you'll still start at i equals 1. But to denote that you're taking the whole population, they'll often put a capital N right over here, to show that this is a bigger number than the smaller n of the sample. But once again, we are not done.
We have to divide by the number of data points that we are actually summing. And so this, once again, is the same thing as x sub 1 plus x sub 2 plus x sub 3-- all the way to x sub capital N, all of that divided by capital N. And once again, in this situation, we found the sample mean practical to calculate. We found the population mean impractical. We can debate whether we took enough data points for our sample mean. But we're hoping that it's at least somehow indicative of our population mean.
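
Written out, the notation described above looks roughly like the following. This is a sketch in LaTeX of the two formulas as the transcript describes them, with the five-man sample plugged into the first one; the symbols are the ones named in the video (x bar, mu, lowercase n for the sample size, capital N for the population size).

```latex
% Sample mean (a statistic): sum the n sampled values, then divide by n.
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}

% The five-man sample from the example, with n = 5.
\bar{x} = \frac{6.2 + 5.5 + 5.75 + 6.3 + 5.9}{5} = \frac{29.65}{5} = 5.93 \text{ feet}

% Population mean (a parameter): the same operation over all N members.
\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}
```

The only difference between the two formulas is what is being summed: the n values in the sample, or all N values in the population.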