Interpreting confidence level example

AP.STATS: UNC‑4.F (LO), UNC‑4.F.1 (EK), UNC‑4.F.2 (EK), UNC‑4.F.3 (EK), UNC‑4.F.4 (EK)

Want to join the conversation?

  • Ricardo Koubik Saldanha
    I understand that answer C is the best answer, but I don't understand why answer B is false.
    If this method produces intervals that capture the true parameter in roughly 90% of the intervals constructed, why can't I say that any one of these intervals I pick, including (341, 359), has a 90% chance of containing the true parameter?

    In the previous video, "Confidence intervals and margin of error", if I got it right, it was said that these two statements are equivalent:
    (i) The probability that p̂=0.54 is within 2*sd(p̂) of p is 95%.
    (ii) There is a 95% probability that p is within 2*sd(p̂) of p̂=0.54.
    In that case, (0.54-2*sd(p̂), 0.54+2*sd(p̂)) is one of the many intervals that repeated sampling could produce.
    Isn't (ii) analogous to saying that there is a 90% probability that the true mean is within (341, 359)?
    (44 votes)
    • Yan K
      Think of it this way: suppose a single sample gives an interval that does not contain the true population parameter. Would it be reasonable to say that this particular interval contains the true parameter 90% of the time? Obviously not, because no matter how many times you look at this particular interval, it will never capture the true parameter. That is why the 90% describes the method: you take many samples, each with its own interval, and about 90% of those intervals contain the true parameter.
      (23 votes)
  • Anwar
    Example 2 in the next lesson, "Interpreting confidence levels and confidence intervals", addresses choice B. It seems that the problem is with the wording. As far as I understand, according to that lesson we should say "We're 90% confident the interval captured the true mean" as opposed to "There's a 90% chance the interval captured the true mean". In my mind the two wordings are equivalent, though.

    If choice C is a better interpretation and choice B is not necessarily false I think it'd be better if the exercise asked to choose the best interpretation, not the "correct" one.

    "Can we say there is a 95% chance that the true mean is between 110 and 120 kilometers per hour?

    We shouldn't say there is a 95% chance that this specific interval contains the true mean, because it implies that the mean may be within this interval, or it may be somewhere else. This phrasing makes it seem as if the population mean is variable, but it's not. This interval either captured the mean or didn't. Intervals change from sample to sample, but the population parameter we're trying to capture does not.
    It's safer to say we're 95% confident that this interval captured the mean, since this phrasing more closely agrees with the long-term capture rate of confidence levels."
    https://www.khanacademy.org/math/statistics-probability/confidence-intervals-one-sample/introduction-to-confidence-intervals/a/interpreting-confidence-levels-and-confidence-intervals
    (11 votes)
    • Nathan Young
      I think you shouldn't use the words "chance" or "probability" instead of "confident", because they imply that the true mean is variable.
      Consider this:
      There is a 16.67% chance of getting a 1 when you roll a die.
      This means that whenever you roll a die, the number on top can vary each time (unlike the true mean, which is fixed).

      Like you said, the problem is with the wording. It is one of my weaknesses in mathematics :(
      (3 votes)
  • Kamyar Nazeri
    A 90% confidence interval extends about 1.65 standard deviations on either side of the mean. If the 90% interval is (341, 359), how can the standard deviation be 25? (1.65 * 25 ≈ 41.25)
    (8 votes)
  • Akshay L Aradhya
    All the answers seem right to me :/
    (5 votes)
  • Mohamed Ibrahim
    I wish Sal would make a video on why choice B is false.
    (4 votes)
  • Nathan Young
    In choice B, the words "probability" and "chance" are applied as if to an event that has not been determined yet (such as rolling a die and getting a 4). But both the true mean and the interval have already been determined, so it is not about probability anymore.

    Am I correct?
    (2 votes)
  • Bryan
    Don't we need to know the population SD to create a confidence interval from the sampling distribution of sample means?

    Here we have a sample where
    n = 30
    sample mean = 350
    sample SD = 25

    The sampling distribution of sample means for n = 30 is approximately normal, since n is big enough, and its mean is the true population mean.

    Using a z-table, we know that there is a 90% chance that a sample mean will land within 1.64 SDs (of the sampling distribution of sample means) of the true population mean.
    So our 90% confidence interval will look like this:
    (350 - 1.64*SD, 350 + 1.64*SD)

    The SD of the sample means is the population SD divided by the square root of n (√30). But we don't have the population SD, only a sample SD of 25, so we can use the sample SD instead. But the sample SD is a biased estimator of the population SD, so won't our confidence interval be biased as well?
    (2 votes)
  • Ahmed Nasret
    30 days is surely less than 10% of an elephant's life, so we can treat the sampling distribution of the sample mean as normal. The sample mean is 350, as given, and the sample standard deviation is 25.
    On one hand, 90% confidence extends ±1.65 standard deviations, which means from about 310 to 390.
    On the other hand, if we repeated the process, then 90% of the samples would include the true mean within their 90% confidence intervals. Is that true? If so, what is meant by "the 90% confidence interval was 341-359"? How does this interval match the standard deviation of this sample?
    (1 vote)
  • ilya112358
    Wait! In the previous videos the idea of confidence interval was introduced for a sampling distribution of sample proportion. But this example suddenly jumps to a sampling distribution of sample mean. The idea of confidence interval for that was not yet introduced. Confusing.
    (1 vote)
  • Isaac Friesen
    Suppose 10% of the tubes produced by a machine are defective. If a sample of 100 tubes is inspected at random, find the following.

    a) The expected proportion of defectives in the sample.
    (1 vote)
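Several questions above ask how a sample standard deviation of 25 can go with an interval as narrow as 341 to 359. The margin of error is built from the standard error of the mean (the sample SD divided by the square root of n), not from the SD of individual days. The following is a minimal sketch of that arithmetic, not the method used in the exercise, which the video does not state. The t critical value of 1.699 for 29 degrees of freedom is a standard table value hardcoded here as an assumption, and the resulting endpoints need not match the quoted 341 and 359 exactly.

```python
import math
from statistics import NormalDist

# Summary statistics quoted in the problem.
n = 30
sample_mean = 350.0
sample_sd = 25.0

# Standard error of the sample mean: the SD of the sampling distribution,
# estimated with the sample SD because the population SD is unknown.
standard_error = sample_sd / math.sqrt(n)

# Critical values for a two-sided 90% interval.
z_star = NormalDist().inv_cdf(0.95)  # normal critical value, about 1.645
t_star = 1.699                       # assumed table value for t with 29 degrees of freedom


def interval(critical_value):
    """Return (low, high) for sample_mean +/- critical_value * standard_error."""
    margin = critical_value * standard_error
    return sample_mean - margin, sample_mean + margin


z_interval = interval(z_star)
t_interval = interval(t_star)

print(f"standard error: {standard_error:.2f} kg")
print(f"z interval: ({z_interval[0]:.1f}, {z_interval[1]:.1f})")
print(f"t interval: ({t_interval[0]:.1f}, {t_interval[1]:.1f})")
```

The standard error comes out to a few kilograms, so the margin of error is of the same order as the quoted interval's half-width rather than the 41 kilograms obtained from 1.65 * 25. Using the t critical value instead of z is the usual way to account for estimating the population SD from the sample.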

Video transcript

- [Instructor] We are told that a zookeeper took a random sample of 30 days and observed how much food an elephant ate on each of those days. The sample mean was 350 kilograms, and the sample standard deviation was 25 kilograms. The resulting 90% confidence interval for the mean amount of food was from 341 kilograms to 359 kilograms. Which of the following statements is a correct interpretation of the 90% confidence level? So like always, pause this video and see if you can answer this on your own.

So before we even look at these choices, let's just make sure we're interpreting the statement correctly. A zookeeper is trying to figure out the true expected amount of food an elephant would eat in a day. You could view that as the mean amount of food that an elephant would eat in a day. If you view all the possible days as the population, you could view this as the population mean for the amount of food per day. Now, the zookeeper doesn't know that, and so instead they're trying to estimate it by sampling 30 days.

So let's think about it this way. Let's say that this is the true population mean, the true mean amount of food that an elephant will eat in a day. What the zookeeper can try to do is take a sample. In this case, they took a sample of 30 days, and they calculated a sample statistic, in this case the sample mean of 350 kilograms. I don't know if it's actually to the right of the true parameter, but just for visualization purposes let's say it is. So let's say this is their first sample, and the sample mean was 350 kilograms. And then using the sample, they were able to construct a confidence interval from 341 to 359 kilograms. And so the confidence interval, I'll draw it like this. We actually aren't sure if it overlaps with the true mean like I'm drawing here, but just for the sake of visualization, let's say that this one happened to.

The whole point of a 90% confidence level is what happens if I kept doing this. So this is our first sample and the interval associated with that first sample. And then if I took another sample, let's say this is the mean of that next sample, so that's sample mean two, and I have an associated confidence interval. And not only will the start and end points of that interval change, but the actual width of the interval might change depending on what my sample looks like. What a 90% confidence level means is that if I keep doing this, 90% of my confidence intervals should overlap with the true parameter, with the true population mean.

So, now, with that out of the way, let's see which of these choices are consistent with that interpretation. Choice A: "The elephant ate between 341 kilograms and 359 kilograms on 90% of all of the days." No, that is definitely not what is going on here. We're not talking about what's happening on 90% of the days, so let's rule this choice out.

Choice B: "There is a 0.9 probability that the true mean amount of food is between 341 kilograms and 359 kilograms." So this one is interesting, and it is a tempting choice, because when we do this one sample, you can kind of say, all right, if I did a bunch of these samples, 90% of them, if we have a 90% confidence level, should overlap with this true mean, with the population parameter. The reason why this is a little bit uncomfortable is that it makes the true mean sound almost like a random variable, as if it could jump around, and as if it's the true mean that is either gonna jump into this interval or not jump into this interval. So it causes a little bit of unease. So I'm just gonna put a question mark here.

Choice C: "In repeated sampling," okay, I like the way that this is starting. "In repeated sampling, this method produces intervals," yep, that's what it does. Every time you sample, you produce an interval. "That capture the population mean in about 90% of samples." Yeah, that's exactly what we're talking about. If we have well-constructed 90% confidence intervals and we kept doing this, 90% of these constructed intervals should overlap with the true mean. So I like this choice.

But let's just read choice D to rule it out: "In repeated sampling, this method produces a sample mean between 341 kilograms and 359 kilograms in about 90% of samples." No, the confidence interval does not put a constraint on the sample mean, that 90% of the time you will have a sample mean between these values. It is not trying to do that. It is definitely choice C.
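The repeated-sampling interpretation in choice C can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the "true" daily food amount is taken to be normally distributed with a hypothetical mean of 350 kg and SD of 25 kg (values the zookeeper would not actually know), and each interval uses an assumed table value of 1.699 as the t critical value for 29 degrees of freedom.

```python
import math
import random
import statistics

# Hypothetical population values, chosen only so the simulation has a known truth.
TRUE_MEAN = 350.0
TRUE_SD = 25.0
N_DAYS = 30        # days per sample, as in the problem
T_STAR = 1.699     # assumed table value: 90% two-sided, 29 degrees of freedom
N_SAMPLES = 10_000


def confidence_interval(sample):
    """Return (low, high) for a 90% t interval built from one sample."""
    mean = statistics.fmean(sample)
    margin = T_STAR * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - margin, mean + margin


def capture_rate(n_samples, seed=0):
    """Fraction of simulated intervals that contain TRUE_MEAN."""
    rng = random.Random(seed)
    captured = 0
    for _ in range(n_samples):
        sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N_DAYS)]
        low, high = confidence_interval(sample)
        if low <= TRUE_MEAN <= high:
            captured += 1
    return captured / n_samples


rate = capture_rate(N_SAMPLES)
print(f"Share of intervals that captured the true mean: {rate:.3f}")
```

Each simulated interval either contains the fixed true mean or does not; the 90% figure describes how often the method succeeds across many samples, which is the distinction between choice C and choice B.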