Statistics and probability
Interpreting confidence level example.
- I understand that answer C is the best answer. But I don't understand why answer B is a false one.
If this method produces intervals that capture the true parameter in roughly 90% of the intervals constructed, why can't I say that any one of these intervals I pick, including (341,359), has a 90% chance of containing the true parameter?
In the previous video "Confidence intervals and margin of error" at 5:55, if I got it right, it was said that these two statements are equivalent:
(i) The probability that p̂=0.54 is within 2*sd(p̂) is 95%.
(ii) There is a 95% probability that p is within 2*sd(p̂) of p̂=0.54.
In that case, (0.54-2*sd(p̂) , 0.54+2*sd(p̂)) is one of the many intervals that repeated sampling could produce.
Isn't (ii) analogous to saying that there is a 90% probability that the true mean is within (341,359)? (48 votes)
- Think of it this way: if you have a single sample whose interval doesn't contain the true population parameter, would it be reasonable to say that this particular interval contains the true parameter 90% of the time? Obviously not, because no matter how many times you repeat the experiment, this particular interval will never capture the true parameter. That's why you take multiple samples, each with its own interval, and in about 90% of those samples the interval contains the true parameter. (27 votes)
- Example 2 in the next lesson "Interpreting confidence levels and confidence intervals" addresses choice B. It seems that the problem is with the wording. As far as I understand, according to that example we should say "We're 90% confident the interval captured the true mean" as opposed to "There's a 90% chance the interval captured the true mean". In my mind the two wordings are equivalent, though.
If choice C is a better interpretation and choice B is not necessarily false I think it'd be better if the exercise asked to choose the best interpretation, not the "correct" one.
"Can we say there is a 95% chance that the true mean is between 110 and 120 kilometers per hour?
We shouldn't say there is a 95% chance that this specific interval contains the true mean, because it implies that the mean may be within this interval, or it may be somewhere else. This phrasing makes it seem as if the population mean is variable, but it's not. This interval either captured the mean or didn't. Intervals change from sample to sample, but the population parameter we're trying to capture does not.
It's safer to say we're 95% confident that this interval captured the mean, since this phrasing more closely agrees with the long-term capture rate of confidence levels."
- I think you shouldn't use the words "chance/probability" instead of "confident", because it implies that the true mean is variable.
There is a 16.67% chance of getting a 1 when you roll a die.
This means whenever you roll a die, the number on top of the die will change or vary each time (unlike the true mean, which is fixed).
Like you said, the problem is with the wording. It is one of my weaknesses in mathematics :((4 votes)
- A 90% confidence interval extends 1.65 standard deviations on each side. If the 90% interval is (341, 359), then how can the standard deviation be 25? (1.65 * 25 = 41.25) (9 votes)
- When you calculate a confidence interval, you use the standard deviation of the sample mean (the standard error), so you need to divide the sample standard deviation by the square root of the sample size. For more details, please review the videos on "Sampling distribution of a sample mean" (https://www.khanacademy.org/math/statistics-probability/sampling-distributions-library) (2 votes)
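To make the arithmetic in this reply concrete, here is a minimal Python sketch using the numbers from the exercise (n = 30, sample mean 350, sample SD 25). It assumes a z multiplier for 90% confidence, which is the approach the question above uses; the variable names are only illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 30               # days sampled
sample_mean = 350.0  # kg
sample_sd = 25.0     # kg

# The standard error is the sample SD divided by the square root of n.
standard_error = sample_sd / sqrt(n)

# z multiplier that leaves 5% in each tail (about 1.645); an assumption,
# since a t multiplier is often preferred when only the sample SD is known.
z_star = NormalDist().inv_cdf(0.95)
margin = z_star * standard_error

interval = (sample_mean - margin, sample_mean + margin)
print(f"standard error  = {standard_error:.2f} kg")
print(f"margin of error = {margin:.2f} kg")
print(f"interval        = ({interval[0]:.1f}, {interval[1]:.1f})")
```

Under these assumptions the margin is roughly 7.5 kg rather than 41.25 kg. That is somewhat narrower than the (341, 359) quoted in the exercise, so the exercise's interval may be rounded or illustrative.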
- In choice B, the words "probability/chance" are applied as if to an event that has not been determined yet (such as the probability of rolling a die and getting a 4). But both the true mean and the interval have already been determined, so it is not about probability anymore.
Am I correct?(2 votes)
- Don't we need to know the population SD to create a confidence interval from the sampling distribution of sample means?
Here we have a sample where
n = 30
sample mean = 350
sample SD = 25
And the sampling dist of sample means where n = 30 is approx normal since n is big enough, and has a mean of the true population mean
Using a z-table, we know that there is a 90% chance that a sample mean will land within 1.64 SD of sample means from the true pop mean.
So our 90% confidence interval will look like this:
(350 - 1.64*SD, 350 + 1.64*SD)
The SD of sample means is the SD of the population divided by the square root of n (√30). But we don't have the SD of the population, only a sample SD of 25, so we can use the sample SD instead. But the sample SD is a biased estimator of the population SD, so won't our confidence interval be biased as well? (2 votes)
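One way to probe this question empirically is a small simulation. The sketch below is only an illustration under loudly stated assumptions: daily food amounts are taken to be normally distributed with a hypothetical true mean of 350 kg and true SD of 25 kg, and the function name `coverage` is made up for this example. It builds intervals from the sample SD and counts how often they capture the true mean, once with the z multiplier 1.645 and once with 1.699, the t-table value for 29 degrees of freedom.

```python
import random
from math import sqrt

def coverage(multiplier, n=30, true_mean=350.0, true_sd=25.0,
             trials=20000, seed=0):
    """Share of simulated intervals mean +/- multiplier * s / sqrt(n)
    that contain the (hypothetical) true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        m = sum(sample) / n
        # Sample SD with the usual n - 1 denominator.
        s = sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
        margin = multiplier * s / sqrt(n)
        if abs(m - true_mean) <= margin:
            hits += 1
    return hits / trials

z_rate = coverage(1.645)  # z multiplier for 90% confidence
t_rate = coverage(1.699)  # t-table value for 29 degrees of freedom
print(f"z multiplier with sample SD: {z_rate:.3f}")
print(f"t multiplier with sample SD: {t_rate:.3f}")
```

In a run like this, the z-based intervals tend to capture the true mean slightly less often than 90%, while the slightly wider t-based intervals land closer to 90%. With n = 30 the gap is small, which is why the z multiplier is often treated as a reasonable approximation here.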
- I honestly don't get why answer D is wrong. If I were to do repeated sampling from the same population, I would indeed get a sample mean from 341 kg to 359 kg 90 percent of the time (roughly). (1 vote)
- Eugene, the question is about the interpretation of confidence intervals, and confidence intervals are about the population mean. Besides, you are wrong, because (341,359) lies inside the one-standard-deviation interval, which contains 68% of values. (1 vote)
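A short simulation can show why choice D does not follow from the confidence level. The sketch below assumes normally distributed daily amounts with an SD of 25 kg and tries three hypothetical values for the true mean (350, 345 and 341 kg); none of these true values is given in the exercise, and the function name is invented for illustration. For each one it estimates how often a sample mean of 30 days falls between 341 and 359 kg.

```python
import random

def share_of_means_in_range(true_mean, lo=341.0, hi=359.0, n=30,
                            true_sd=25.0, trials=10000, seed=1):
    """Share of simulated sample means that land inside [lo, hi]."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, true_sd) for _ in range(n)) / n
        if lo <= sample_mean <= hi:
            inside += 1
    return inside / trials

# Hypothetical true means; the real one is unknown to the zookeeper.
shares = {mu: share_of_means_in_range(mu) for mu in (350.0, 345.0, 341.0)}
for mu, share in shares.items():
    print(f"true mean {mu:.0f} kg: {share:.1%} of sample means in (341, 359)")
```

Under these assumptions the share comes out near 95%, 81% and 50% respectively. The fraction of sample means landing in one fixed range depends on where the unknown true mean sits, so it is not pinned to 90% by the confidence level.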
- 30 days out of an elephant's life is surely less than 10%, so we have a normal sampling distribution of the sample mean. The sample mean is 350, as given, and the sample standard deviation is 25.
Then, on one hand, the 90% confidence interval extends ±1.65 standard deviations, which means from about 310 to 390.
On the other hand, if we repeated the process, then 90% of the samples would include the true mean within their 90% confidence intervals. Is that true? If so, what is meant by "the 90% confidence interval was 341-359"? How does this interval match the standard deviation of this sample? (1 vote)
- [Instructor] We are told that a zookeeper took a random sample of 30 days and observed how much food an elephant ate on each of those days. The sample mean was 350 kilograms, and the sample standard deviation was 25 kilograms. The resulting 90% confidence interval for the mean amount of food was from 341 kilograms to 359 kilograms. Which of the following statements is a correct interpretation of the 90% confidence level? So like always, pause this video and see if you can answer this on your own.
So before we even look at these choices, let's just make sure we're interpreting the statement correctly. A zookeeper is trying to figure out the true expected amount of food an elephant would eat in a day. You could view that as the mean amount of food that an elephant would eat in a day. If you view all the possible days as the population, you could view this as the population mean for the amount of food per day. Now, the zookeeper doesn't know that, and so instead they're trying to estimate it by sampling 30 days.
So let's think about it this way. Let's say that this is the true population mean, the true mean amount of food that an elephant will eat in a day. What the zookeeper can try to do is take a sample. In this case, they took a sample of 30 days. And they calculated a sample statistic, in this case, the sample mean of 350 kilograms. I don't know if it's actually to the right of the true parameter, but just for visualization purposes let's say it is. So let's say sample mean, and this is their first sample, it was 350 kilograms. And then using the sample, they were able to construct a confidence interval from 341 to 359 kilograms. And so the confidence interval, I'll draw it like this. We actually aren't sure if it overlaps with the true mean like I'm drawing here, but just for the sake of visualization, let's say that this one happened to.
The whole point of a 90% confidence level is what happens if I kept doing this. So this is our first sample and the interval associated with that first sample. And then if I did another sample, let's say this is the mean of that next sample, so that's sample mean two, and I have an associated confidence interval. And for that interval, not only will the start and end points change, but the actual width of the interval might change depending on what my sample looks like. What a 90% confidence level means is that if I keep doing this, 90% of my confidence intervals should overlap with the true parameter, with the true population mean.
So, now, with that out of the way, let's see which of these choices are consistent with that interpretation. Choice A: "The elephant ate between 341 kilograms and 359 kilograms on 90% of all of the days." No, that is definitely not what is going on here. We're not talking about what's happening on 90% of the days, so let's rule this choice out.
Choice B: "There is a 0.9 probability that the true mean amount of food is between 341 kilograms and 359 kilograms." So this one is interesting, and it is a tempting choice, because when we do this one sample, you can kind of say, all right, if I did a bunch of these samples, 90% of them, if we have a 90% confidence level, should overlap with this true mean, with the population parameter. The reason why this is a little bit uncomfortable is that it makes the true mean sound almost like a random variable, as if it could kind of jump around, and it's the true mean that is either gonna jump into this interval or not jump into this interval, so it causes a little bit of unease. So I'm just gonna put a question mark here.
Choice C: "In repeated sampling," okay, I like the way that this is starting. "In repeated sampling, this method produces intervals." Yep, that's what it does. Every time you sample, you produce an interval. "That capture the population mean in about 90% of samples." Yeah, that's exactly what we're talking about. If we just kept doing this, if we have well-constructed 90% confidence intervals, 90% of these constructed intervals should overlap with the true mean. So I like this choice.
But let's just read choice D to rule it out. "In repeated sampling, this method produces a sample mean between 341 kilograms and 359 kilograms in about 90% of samples." No, the confidence interval does not imply that 90% of the time you will have a sample mean between these values. It is not trying to do that. It is definitely choice C.
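The picture drawn in the video, several samples each with its own interval, can be sketched in a few lines of Python. This is only an illustration under assumed values: the true mean (350 kg), the true SD (25 kg), the normal shape of daily amounts and the z multiplier 1.645 are all hypothetical choices, not facts from the exercise, and `draw_intervals` is a made-up helper name.

```python
import random
from math import sqrt

def draw_intervals(count=20, n=30, true_mean=350.0, true_sd=25.0,
                   z_star=1.645, seed=2):
    """Simulate `count` samples of n days and return one
    (low, high, captured) tuple per sample."""
    rng = random.Random(seed)
    rows = []
    for _ in range(count):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        m = sum(sample) / n
        s = sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
        margin = z_star * s / sqrt(n)
        lo, hi = m - margin, m + margin
        rows.append((lo, hi, lo <= true_mean <= hi))
    return rows

rows = draw_intervals()
for lo, hi, captured in rows:
    label = "captured" if captured else "missed"
    print(f"({lo:6.1f}, {hi:6.1f})  width {hi - lo:5.1f}  {label}")
```

Each printed row plays the role of one interval in the drawing: the endpoints and the width both change from sample to sample, while the true mean stays fixed. Over many more samples, the share of rows marked "captured" should settle near the confidence level.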