Sampling distribution
Sampling Distribution of the Sample Mean
The central limit theorem and the sampling distribution of the sample mean
- In the last video, we learned about what is, quite
- possibly, the most profound idea in statistics.
- And that's the central limit theorem.
- And the reason why it's so neat is we can start with any
- distribution that has a well defined mean and variance.
- Actually I made this-- I wrote the standard
- deviation in the last few.
- That should be the mean.
- And let's say it has some variance.
- I could write it like that.
- Or I could write the standard deviation there.
- But as long as it has a well defined mean and standard
- deviation, I don't care what the distribution looks like.
- What I can do is take samples, in the last
- video, of say size 4.
- So that means I take, literally, four instances
- of this random variable.
- This is one example.
- I take their mean.
- And I consider this the sample mean from my first trial.
- Or, you could almost say, for my first sample.
- I know it's very confusing because you can
- consider that a sample.
- The set to be a sample.
- Or you can consider each of its members of the-- Each member
- of the set as a sample.
- So that can be a little bit confusing there.
- But I have this first sample mean.
- And then I keep doing that over and over.
- In my second sample, my sample size is 4.
- I got four instances of this random variable.
- I average them.
- I have another sample mean.
- And the cool thing about the central limit theorem is, as I
- keep plotting the frequency distribution of my sample
- means, it starts to approach something that approximates
- the normal distribution.
- And it's going to do a better job of approximating that
- normal distribution as n gets larger.
- And just so we have a little terminology under our belt, this
- frequency distribution right here that I plotted out.
- Or here or up here, that I started plotting out.
- That is called-- And it's kind of confusing because we use
- the word sample so much.
- That is called the sampling distribution
- of the sample mean.
- And let's dissect this a little bit.
- Just so that this long description of this
- distribution starts to make a little bit of sense.
- When we say it's the sampling distribution, that's telling us
- that it's being derived from-- It's the distribution of some
- statistic, which in this case, happens to be the sample mean.
- And we're deriving it from samples of an original
- distribution.
- So each of these-- So this is my first sample.
- My sample size is 4.
- I'm using the statistic the mean.
- I actually could have done it with other things.
- I could have done the mode or the range or other statistics.
- But the sampling distribution of the sample mean is
- the most common one.
- It's probably, in my mind, the best place to start learning about
- the central limit theorem.
- And even, frankly, sampling distribution.
- So that's what it's called.
- And just as a little bit of background-- And I'll prove
- this to you experimentally, not mathematically.
- But I think the experimental is, on some levels, more
- satisfying than statistics.
- That this will have the same mean as your original
- distribution right here.
- So it has the same mean.
- But we'll see in the next video that this is actually going
- to be-- It's going to start approximating a
- normal distribution.
- Even though my original distribution that this is
- kind of generated from is completely non-normal.
- So let's do that with this app right here.
- And just to give proper credit where credit is due, this is--
- I think it was developed at Rice University.
- This is from onlinestatbook.com.
- And this is their app, which I think is a really neat app
- because it really helps you to visualize what a sampling
- distribution of the sample mean is.
- So I can literally create my own custom distribution here.
- So let me make something kind of crazy.
- So you can do this in theory with a discrete or a continuous
- probability density function.
- But what they have here could take on 1 of 32 values.
- And I'm just going to set the different probabilities of
- getting any of those 32 values.
- So clearly this right here is not a normal distribution.
- It looks a little bit bimodal, but it doesn't have long tails.
- But what I want to do is first just use a simulation to
- understand, or to better understand, what the sampling
- distribution is all about.
- So what I'm going to do I'm going to take-- We'll
- start with 5 at a time.
- So my sample size is going to be 5.
- And so when I click animate, what it's going to do is it's
- going to take five samples from this probability
- distribution function.
- It's going to take five samples and you're going to see
- them when I click animate.
- It's going to average them and plot the average down here.
- And then I'm going to click it again.
- It's going to do it again.
- So there you go.
- I got five samples from there.
- It averaged them.
- And it hit there.
- What did I just do?
- I clicked-- Oh.
- I wanted to clear that.
- Let me make this bottom one none.
- So let me do that over again.
- So I'm going to take 5 at a time.
- So I took five samples from up here.
- And then it took its mean.
- And plotted the mean there.
- Let me do it again.
- Five samples from this probability distribution
- function, plotted it right there.
- I could keep doing-- It'll take some time, but, as you can see,
- I plotted it right there.
- Now, I could do this a thousand times.
- It's going to take forever.
- Let's say I just wanted to do it 1,000 times.
- So it's-- This program, just to be clear, it's actually
- generating the random numbers.
- This isn't like a rigged program.
- It's actually going to generate the random numbers according
- to this probability distribution function.
- It's going to take five at a time, find their means
- and plot the means.
- So if I click 10,000, it's going to do that 10,000 times.
- So it's going to take 5 numbers from here 10,000 times.
- And find their means 10,000 times.
- And then plot the 10,000 means here.
- So let's do that.
- So there you go.
- Notice, it's already looking a lot like a normal distribution.
- And, like I said, the original mean of my crazy
- distribution here was 14.45.
- And the mean of, after doing 10,000 samples or 10,000
- trials, my mean here is 14.42.
- So I'm already getting pretty close to the mean there.
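A rough Python sketch of what the applet is doing at this point, under some loud assumptions: the 32 values and their weights below are invented for illustration (they are not the ones drawn in the video), and `population_mean` and `sample_means` are hypothetical helper names. It draws 5 numbers at a time, averages them, repeats 10,000 times, and compares the mean of those averages with the mean of the original distribution.

```python
import random
import statistics

# Hypothetical stand-in for the custom distribution drawn in the applet:
# 32 possible values (0..31) with made-up, roughly bimodal weights.
VALUES = list(range(32))
WEIGHTS = [8, 9, 7, 5, 3, 2, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3,
           3, 2, 2, 1, 1, 1, 1, 1, 2, 3, 5, 7, 9, 10, 9, 8]


def population_mean(values, weights):
    """Mean of the discrete distribution defined by values and weights."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def sample_means(n, trials, rng):
    """Draw `trials` samples of size `n` and return the mean of each one."""
    return [
        statistics.fmean(rng.choices(VALUES, weights=WEIGHTS, k=n))
        for _ in range(trials)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(n=5, trials=10_000, rng=rng)

print("population mean:      ", round(population_mean(VALUES, WEIGHTS), 2))
print("mean of sample means: ", round(statistics.fmean(means), 2))
```

The two printed numbers should land close together, which is the same thing the 14.45 versus 14.42 comparison shows in the video. Plotting a histogram of `means` would give the lower graph in the applet.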
- My standard deviation, you might notice,
- is less than that.
- We'll talk about that in a future video.
- And this skew and kurtosis.
- These are ideas-- These are things that help us measure
- how normal a distribution is.
- And I've talked a little bit about it in the past.
- And let me actually just diverge a little bit.
- Just so it's interesting.
- And they're fairly straightforward concepts.
- Skew literally tells-- So if this is-- Let me do
- it in a different color.
- If this is a perfect normal distribution, and clearly
- my drawing is very far from perfect.
- If that's a perfect distribution, this would
- have a skew of 0.
- If you have a positive skew, that means you have a
- larger right tail than you would otherwise expect.
- So something with a positive skew might look like this.
- It would have a large tail to the right.
- So this would be a positive skew, which makes it a
- little less than ideal for a normal distribution.
- And a negative skew would look like this.
- It has a long tail to the left.
- So negative skew might look like that.
- So that is a negative skew.
- If you have trouble remembering it, just remember which
- direction the tail is going.
- This tail is going towards the negative direction.
- This tail is going to the positive direction.
- So if something has no skew, that means that it's nice and
- symmetrical around its mean.
- Now kurtosis, which sounds like a very fancy word, is similarly
- not that fancy of an idea.
- Kurtosis.
- So, once again, if I were to draw a perfect normal
- distribution-- Remember, there is no one normal distribution.
- You could have different means and different
- standard deviations.
- Let's say that's a perfect normal distribution.
- If I have positive kurtosis, what's going to happen is, I'm
- going to have fatter tails.
- Let me draw it a little nicer than that.
- I'm going to have fatter tails, but I'm going to
- have a more pointy peak.
- I didn't have to draw it that pointy.
- Let me draw it like this.
- I'm going to have fatter tails, and I'm going to have a
- more pointy peak than a normal distribution.
- So this, right here, is positive kurtosis.
- So something that has positive kurtosis, depending on how
- positive it is, it tells you it's a little bit more pointy
- than a real normal distribution.
- Positive kurtosis.
- And negative kurtosis has smaller tails, but it's
- smoother near the middle.
- So it's like this.
- So something like this would have negative kurtosis.
- So maybe in future videos, we'll explore that
- in more detail.
- But in the context of the simulation, it's just
- telling us how normal this distribution is.
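For anyone who wants numbers behind these two ideas, here is a minimal sketch using the usual moment-based definitions. The function names `skewness` and `excess_kurtosis` are hypothetical, and it is an assumption that the applet reports kurtosis in the "excess" form (normal distribution equals 0), although the talk of negative kurtosis in the video suggests it does.

```python
import random


def skewness(data):
    """Third standardized moment: 0 for a symmetric distribution."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(data):
    """Fourth standardized moment minus 3, so a normal distribution gives 0."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(1)
normal_like = [rng.gauss(0.0, 1.0) for _ in range(50_000)]     # symmetric bell curve
right_tailed = [rng.expovariate(1.0) for _ in range(50_000)]   # long tail to the right

print("normal:      ", skewness(normal_like), excess_kurtosis(normal_like))
print("exponential: ", skewness(right_tailed), excess_kurtosis(right_tailed))
```

The normal draws should give both values near 0, while the exponential draws should show clearly positive skew and positive kurtosis, matching the long right tail described above.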
- So when our sample size was n equals 5, and we did 10,000
- trials, we got pretty close to a normal distribution.
- Let's do another 10,000 trials just to see what happens.
- It looks even more like a normal distribution.
- Our mean is now the exact same number.
- But we still have a little bit of skew and a
- little bit of kurtosis.
- Now let's see what happens if we were to do the same thing
- with a larger sample size.
- And we could actually do them simultaneously.
- So here's n equals 5.
- Let's do here n equals 25.
- Let me clear them.
- I'm going to do the sample-- sampling distribution
- of the sample mean.
- As I'm going to run 10,000 trials-- So I'll do one
- animated trial, just so you remember what's going on.
- So I'm literally taking first 5 samples from up here.
- Find their mean.
- Now I'm taking 25 samples from up here.
- Find its mean.
- And then plotting it down here.
- So here the sample size is 25.
- Here it's 5.
- I'll do it one more time.
- I take 5, get the mean, plot it.
- Take 25, get the mean, and then plot it down there.
- This is a larger sample size.
- Now that thing that I just did, I'm going to do 10,000 times.
- And that's interest-- Remember, our first distribution was just
- this really crazy, very non-normal distribution.
- But once we did it-- whoops.
- I didn't want to make it that big.
- But once we-- Scroll up a little bit.
- So here, what's interesting.
- They both look a little normal.
- But if you look at the skew and the kurtosis when our
- sample size is larger, it's more normal.
- This has a lower skew than when our sample size was only 5.
- And it has a less negative kurtosis than when our
- sample size was 5.
- So this is a more normal distribution.
- And, one thing that we're going to explore further in a future
- video, is not only is it more normal in its shape, but it's
- also tighter fit around the mean.
- And you can even think about why that kind of makes sense.
- When your sample size is larger, your odds of getting
- really far away from the mean are lower.
- Because it's very low likelihood if you're taking 25
- samples or 100 samples that you're just going to get a
- bunch of stuff way out here, a bunch of stuff way out here.
- You're very likely to get a reasonable spread of things.
- So it makes sense that your mean-- your sample mean is
- less likely to be far away from the mean.
- We're going to talk a little bit more about
- that in the future.
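The "tighter fit" can be checked with a small experiment. This is only a sketch under stated assumptions: the population is a fair six-sided die (a hypothetical choice, not the distribution from the video), and `spread_of_sample_means` is a made-up helper name. It compares the observed spread of the sample means with the population standard deviation divided by the square root of the sample size, the relationship quoted in the comments further down this page.

```python
import random
import statistics

rng = random.Random(2)

# Hypothetical non-normal population: a fair six-sided die.
FACES = [1, 2, 3, 4, 5, 6]
SIGMA = statistics.pstdev(FACES)  # population standard deviation, about 1.71


def spread_of_sample_means(n, trials=10_000):
    """Standard deviation of `trials` sample means, each from a sample of size n."""
    means = [statistics.fmean(rng.choices(FACES, k=n)) for _ in range(trials)]
    return statistics.pstdev(means)


for n in (5, 25):
    observed = spread_of_sample_means(n)
    predicted = SIGMA / n ** 0.5
    print(f"n={n:2d}  observed={observed:.3f}  sigma/sqrt(n)={predicted:.3f}")
```

With n = 25 the spread should come out noticeably smaller than with n = 5, and both should sit close to the predicted value.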
- But hopefully this kind of satisfies you, at
- least experimentally.
- I haven't proven it to you with mathematical rigor, which
- hopefully we'll do in the future.
- But hopefully this satisfies you, at least experimentally,
- that the central limit theorem really does apply to
- any distribution.
- I mean this is a crazy distribution.
- I encourage you to use this applet at onlinestatbook.com
- and experiment with other crazy distributions to
- believe for yourself.
- But the interesting things are that we're approaching a normal
- distribution, but as my sample size got larger, it's a better
- fit for a normal distribution.
Where on the onlinestatbook site is this little software toy?
Thanks, John
John,
Go to http://onlinestatbook.com/, click on "content" in the upper left, then "List of Simulations and Demonstrations", scroll to the bottom of the page to "Simulations from the Rice Virtual Lab in Statistics", and it's the second one down: "Sampling Distribution Simulation". Hope that helps.
This is not about the vid (srry), but does your name, Firstjohn26, have anything to do with 1 John 2:6 (Whoever claims to live in him must live as Jesus did)?
I have a practice question that I just can't figure out. It is: "Eighteen subjects are randomly selected and given proficiency tests. The mean for this group is 492.3 and the standard deviation is 37.6. Construct the 98% confidence interval for the population standard deviation."
I don't know how to figure out the confidence interval for a standard deviation. Can you please help. Thanks. Katie
We already know that:
A range from -1 std.dev. to 1 std.dev. contains 68.3% of outcomes.
A range from -2 std.dev. to 2 std.dev. contains 95.4% of outcomes.
A range from -3 std.dev. to 3 std.dev. contains 99.7% of outcomes.
So the question is, how many Std.Dev's do we have to move away from the mean in both directions on the graph to contain 98% of outcomes. Not 95.4%, Not 99.7%, exactly 98%. Right away you know the answer will be between 2 and 3 std.dev's, as 98% is between 95.4% and 99.7%
To
N=18, mean=492.3, std deviation=37.6. The z-score for a 98% confidence interval is 2.325 (approx). The formula for a confidence interval is X (sample mean) +/- Z*(std deviation)/sqrt(N). Thus, the confidence interval at 98% is 492.3+/-(2.325*37.6/sqrt(18)) or 492.3 +/- 20.6. What this really means is that the mean of the population will fall within 471.7 and 512.9 98% of the time.
If I'm wrong please let me know!
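For checking the arithmetic in the reply above, here is a short sketch using Python's standard library. It only reproduces that reply's z-based interval for the mean; the variable names are my own. Note that the original question asks about the population standard deviation, which is normally handled with a chi-square based interval rather than this formula, and that is not shown here.

```python
from statistics import NormalDist

n = 18
sample_mean = 492.3
sample_sd = 37.6
confidence = 0.98

# Two-sided interval: leave (1 - confidence) / 2 in each tail,
# so look up the 99th percentile of the standard normal.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sample_sd / n ** 0.5

print(f"z = {z:.3f}")
print(f"interval = ({sample_mean - margin:.1f}, {sample_mean + margin:.1f})")
```

The `inv_cdf(0.99)` call plays the same role as the Excel `NormsInv(0.99)` lookup mentioned elsewhere in this thread.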
Well, I'm not sure exactly what language the answer must be in, but hopefully I can help with the theory:
If we assume a normal distribution, which is a pivotal assumption and is not explicitly stated, (if the distribution is not normal, the question does not give enough info for an answer) the question seems to be asking:
How wide would a range of the distribution, centered on the mean, have to be to contain 98% of outcomes?
We already know that:
A range from -1 std.dev. to 1 std.dev. c
Hi... um, aren't we supposed to use the t-distribution instead of z, because n (the sample size) is less than 30, and one of the conditions for using the t-table is that n is less than 30?
Because the distribution is symmetrical, you know that 100 - 99 = 1% will be below the negative of the std.dev. level Excel gives you. And then you have the std.dev. levels between which 98% of outcomes lie. Then, of course multiply your std.dev level by the std.dev given in the question, subtract it from the mean for the lower end of the range and add it to the mean for the upper end of the range. That's it. That's your 98% confidence interval.
To find the exact answer, pull up Excel and pick a random cell. Type in "=NormsInv(0.99)"
Why not 0.98? NormsInv is a cumulative function, so there is no way to specify only between -x std.dev's and x std.dev's. It will only give you the std.dev. below which y% of outcomes fall. So, this will tell you the std.dev. level below which 99% of outcomes fall. Because the distribution is symmetrical, you know that 100 - 99 = 1% will be below the negative of the std.dev. level Excel gives you. And then
If we know the mean and the standard deviation of the population, then why are we taking samples, if we already have the data?
Thanks in advance.
Learning statistics can be a little strange. It almost seems like you're trying to lift yourself up by your own bootstraps. Basically, you learn about populations working under the assumption that you know the mean/stdev, which is silly, as you say, but later you begin to drop these assumptions and learn to make inferences about populations based on your samples.
Once you have some version of the Central Limit Theorem, you can start answering some interesting questions, but it takes a lot of study just to get there!
Is there any difference if I take 1 "sample" with 100 "instances", or I take 100 "samples" with 1 "instance"?
(By sample I mean the S_1 and S_2 and so on. With instances I mean the numbers, [1,1,3,6] and [3,4,3,1] and so on.)
There is a difference. Your "samples" (random selections of values "x") that are made up of "instances" (referred to as the variable "n") provide what will essentially be the building blocks of your Sampling Distribution of the Sample Mean. Because your "instances" determine the value of the mean of "x", your size of "n" determines the value of "x"'s mean, and the Sampling Distribution of the Sample Mean's standard deviation (Defined as The original dataset's standard deviation divided by the square root of "n").
For example: If you were to take 1 "sample" with 100 "instances", you would get only one piece of data regarding the mean of 100 items [1,1,3,6,3,6,3,1,1,1,1,1...] from your original data. Your sampling distribution of the Sample mean's standard deviation would have a value of ((The original sample's S.D.)/(The square root of 100)), but that wouldn't really matter, because your data will likely be very close to your original data's mean, and you'd only have one sample.
Now if you take 100 samples with 1 instance [3], you'll get many pieces of data, but no change in standard deviation from your first sample: ((The original sample's S.D.)/(The square root of 1)). Functionally, with enough samples taken like this, you'll re-create your original dataset! You won't be creating a useful sampling distribution of the sample mean because "x" will equal the mean of "x". With 100 "samples" of 1 "instance", you're randomly picking 100 values of "x" and re-plotting them.
I hope that helps.
Sal goes over this better than I do in the next video as well!
Could you define a measure of skewness as (mean-median)/standard deviation? An advantage of this would be that it is easier to calculate, and it can only take values between -1 and 1
There were two cases talked about; n=5 and n=25. It was said that after 10,000 samples the n=25 was a closer fit to the normal distribution than the n=5 case. What I want to know is, if there were infinite samples, would the n=5 and the n=25 cases both be a perfect normal distribution?
If this is so: As the number of samples tends to infinity, does the n=25 case converge to the normal distribution faster than the n=5 case?
This is answered in the next video in the series.
I'm having some issues with this question.
3. For the general population, mean IQ is 100 with a standard deviation of 15. A sample of 100 people is selected at random from the population, with a sample mean of 102. This sample mean comes from a distribution of sample means with the following properties:
a. a mean of 100 and a standard error of 1.5
b. a mean of 102 and a standard error of 1.5
c. a mean of 100 and a standard error of 15
d. a mean of 102 and a standard error of 15
I think that the answer is either a or b, because you would divide the SD 15 by the square root of the original mean 10, which gives 1.5. But I have no idea what to do about the mean 100/102? Can anyone explain why it is one or the other?
The general population is known to have a mean IQ of 100. That means that the distribution of sample means also has a mean of 100.
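Written out as a worked equation (the symbols and the bar notation are my own labels, not part of the question), with population mean 100, population standard deviation 15 and sample size n = 100:

```latex
\mu_{\bar{x}} = \mu = 100, \qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{100}} = \frac{15}{10} = 1.5
```

So the 10 in the denominator is the square root of the sample size, not of the mean, and the result matches option a. The observed sample mean of 102 is just one draw from this distribution and does not move its center.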
I have a question I'm failing to solve: "A population has a mean of 200 and a standard deviation of 50. A simple random sample of size 100 will be taken and the sample mean x will be used to estimate the population mean. Show the sampling distribution of the sample mean."
My friend Callum and I have been experimenting with the sampling distribution program on Online Stat Book used by Sal (http://onlinestatbook.com/stat_sim/sampling_dist/index.html). However, we found a result we can neither explain nor rationalise: when we ask for a sample size of 2 for the median distribution of any population, it approximates the population distribution and not a 'bell curve'. I am very disturbed by this, because surely the median of 2 numbers is the same as the mean of 2 numbers and, according to the central limit theorem, should approximate a normal distribution. Is this assumption correct? Is the programme wrong? Or is there something we fail to understand?
What I don't understand is when you have a large Binary distribution, for example, and you approximate it using a Normal distribution. If you only have one sample consisting of x values, you haven't really got a standard deviation. We always have those kinds of questions on the exam, but I always get the formula wrong.
As long as you know all the values in the sample, you can do the series of calculations described under "basic examples" here http://en.wikipedia.org/wiki/Standard_deviation to figure out what the sample's standard deviation is. Of course, with samples you have to divide by N-1, as the Wikipedia article (as well as Sal's video on standard deviation) explains; otherwise it's exactly the same. Perhaps you are limiting your definition of "standard deviation" to "standard deviation of the population", which you of course can't figure out with just one sample of values? If it's not specified that the population's SD is asked for in the exam question you're describing, it's safe to assume that they are asking for the sample's SD.
Does only the mean follow the CLT?
What would be the difference between the distribution of a sample variable and the sampling distribution of the mean? I'm so confused between these two terms.
Are the sample mean and the population mean the same? While solving questions on confidence intervals, why do we always subtract the sample mean from the value when the formula includes the population mean?
9:08, how do you get five samples from the non-normally distributed probability function? How do you get a set of data from the probability function?
Computers can quite easily simulate uniform distributions (for example, the rand() function in MATLAB gives a number between 0 and 1 according to a uniform distribution). With that number you can simulate all sorts of other distributions.
For example, if you want to simulate a fair die, you can do:
x = rand(1);   % uniform random number between 0 and 1
if x < 1/6, y = 1;
elseif x < 2/6, y = 2;
elseif x < 3/6, y = 3;
elseif x < 4/6, y = 4;
elseif x < 5/6, y = 5;
else, y = 6;
end
This is how you can easily simulate discrete distributions.
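The same idea generalizes to any discrete distribution, including a lopsided one like the distribution in the video. Below is a minimal Python sketch, with the caveat that `make_sampler` is a hypothetical helper and the four values and probabilities are invented for illustration: it turns a uniform random number into a draw by walking the cumulative probabilities, just as the die example walks 1/6, 2/6, and so on.

```python
import bisect
import itertools
import random


def make_sampler(values, probabilities, rng):
    """Return a function that draws one value by inverting the cumulative distribution."""
    cumulative = list(itertools.accumulate(probabilities))

    def draw():
        u = rng.random() * cumulative[-1]  # uniform in [0, total probability)
        # First position whose cumulative probability exceeds u;
        # the min() guards against floating-point edge cases at the top end.
        index = min(bisect.bisect_right(cumulative, u), len(values) - 1)
        return values[index]

    return draw


rng = random.Random(3)
# Hypothetical lopsided distribution over four values.
draw = make_sampler([10, 20, 30, 40], [0.5, 0.1, 0.1, 0.3], rng)

sample = [draw() for _ in range(5)]   # one sample of size 5
print(sample, sum(sample) / len(sample))
```

Calling `draw()` five times and averaging the results is one "trial" in the applet's sense. Python's built-in `random.choices` with a `weights` argument does the same job in one call.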
I'm a little confused about what you're doing at 04:40. Let's say the PDF represents the 32 species of animals on a small island. So that application selects 5 types of animals, let's say zebras, goats, penguins, gorillas and porcupines, and plots their mean on the graph below. How the hell can you get the mean of a set of 5 species of animals? I don't get it.
@cnidoblast, selecting 5 types of animals invalidates the CLT. One of the assumptions of the most common CLT (there are actually many versions, this one is the most common) is that the observations, what Mr. Khan calls samples, are independent and identically distributed instances of a random variable. A random variable is a function that converts an observation from a random process into a number. Your animals are not numbers, so it's meaningless to sum them, much less find the mean. If you're talking about averaging their weights then it still fails the CLT assumptions because the weights that you're averaging do not come from an identical distribution. That is, the distribution of weights of zebras is very different from the distribution of weights of goats. Hope this helps! :)
@Rwlantz I could be wrong, but I think I disagree with the part of your answer where you say "If you're talking about averaging their weights then it still fails the CLT assumptions because the weights that you're averaging do not come from an identical distribution".
Let's say there are 2 types of animals to make things simple (20 zebras and 12 birds). The probability of the weight being 5 kg is 37.5%, and the probability of the weight being 330 kg is 62.5%. (Even if the zebras all have different weights, you can still do this, but I'm making it simple and assuming they have identical weights for the sake of the example.) Then randomly select 5 trials with that distribution. The sample might give a result of S1 = [5,330,330,330,5]; you'd take the average of that sample and then repeat n times, and as n approaches infinity the CLT should show a distribution with the mean approaching the population mean of ~208 kg (5 kg * 37.5% + 330 kg * 62.5%).
If every zebra and every bird had a different weight, then instead of probabilities of 62.5% and 37.5%, each animal would have a probability of 1/32 and you'd do the same process as above.
Let me know if you can spot any errors with my answer.
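The numbers in this reply are easy to check with a quick simulation. This is only a sketch of the scenario as described (12 birds at 5 kg, 20 zebras at 330 kg, samples of 5 drawn with replacement); the variable names are my own.

```python
import random
import statistics

rng = random.Random(4)

# Weights from the reply above: 12 birds at 5 kg and 20 zebras at 330 kg.
weights_kg = [5] * 12 + [330] * 20
population_mean = statistics.fmean(weights_kg)   # 208.125

# Each trial picks 5 animals at random (with replacement) and averages their weights.
sample_means = [
    statistics.fmean(rng.choices(weights_kg, k=5)) for _ in range(10_000)
]

print("population mean:     ", population_mean)
print("mean of sample means:", round(statistics.fmean(sample_means), 1))
```

One thing worth keeping separate: repeating more trials fills in the histogram of sample means more smoothly, while it is the sample size (5 here) that governs how close to normal its shape gets.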
There's a major difference between categorical and numerical data. Categorical data are things that cannot be quantified, like the name of a species. Numerical data are things that can be represented by numbers, like the mass of an animal or the number of offspring it has (the former is an example of continuous numerical data, the latter is an example of discrete numerical data). From what I understand, the central limit theorem (and therefore the sampling distribution) only applies to numerical data, not categorical.
A manufacturer knows that their items have a normally distributed lifespan, with a mean of 2.6 years, and standard deviation of 0.5 years.
If you randomly purchase 25 items, what is the probability that their mean life will be longer than 3 years?
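One possible way to set this up, sketched with Python's standard library and assuming the sampling distribution of the sample mean is normal with the same mean and a standard deviation of the population value divided by the square root of n (which holds here because the lifespans themselves are stated to be normal). The variable names are my own.

```python
from statistics import NormalDist

mu = 2.6       # population mean lifespan, years
sigma = 0.5    # population standard deviation, years
n = 25         # number of items purchased

# Sampling distribution of the sample mean: same mean, standard deviation sigma / sqrt(n).
standard_error = sigma / n ** 0.5
sampling_dist = NormalDist(mu, standard_error)

p_longer_than_3 = 1 - sampling_dist.cdf(3.0)
print(f"standard error = {standard_error:.2f}")
print(f"P(sample mean > 3) = {p_longer_than_3:.2e}")
```

A mean of 3 years is 4 standard errors above 2.6, so the probability comes out very small.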