Statistics and probability
More Empirical Rule and Z-score practice. Created by Sal Khan.
- It seems to be completely arbitrary when one must subtract the z-score table value from 1. Sometimes you do both numbers, sometimes neither, sometimes only one of the two. What am I missing??(8 votes)
- When you're starting (and sometimes after you're used to it), it helps a lot to draw a picture of a bell curve and shade in the part that you're trying to measure. Remember (and they usually draw a picture on the z-score table to drive this point home) that the value in the z-table is the whole area to the left of your z-score. So that's great if you want the left tail of a distribution, but if you want a right tail then you need to look up the left tail and subtract it from 1 (which is the area under the entire standard normal curve). If you want the area between two points, look up the left tail of the higher number and subtract the left tail of the lower number to rule out the part of the curve that you don't want to measure. And if you want the combined area of both tails, look up the left tail of the higher number, subtract that from 1 to get the right tail beyond the higher number, and then add the left tail of the lower number.(19 votes)
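The four cases described above can be sketched in Python. This is a minimal illustration, assuming Python 3.8+ so that `statistics.NormalDist` is available; the helper names (`left_tail`, `right_tail`, `between`, `both_tails`) are made up for this example and do not come from the video.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist()

def left_tail(z):
    """Area to the left of z, which is what a z-table lists."""
    return Z.cdf(z)

def right_tail(z):
    """Area to the right of z: 1 minus the left tail."""
    return 1 - Z.cdf(z)

def between(z_low, z_high):
    """Area between two z-scores: left tail of the higher minus left tail of the lower."""
    return Z.cdf(z_high) - Z.cdf(z_low)

def both_tails(z_low, z_high):
    """Area below z_low plus area above z_high."""
    return Z.cdf(z_low) + (1 - Z.cdf(z_high))

print(round(left_tail(1.0), 4))         # 0.8413
print(round(right_tail(1.0), 4))        # 0.1587
print(round(between(-1.0, 1.0), 4))     # 0.6827
print(round(both_tails(-1.0, 1.0), 4))  # 0.3173
```

Note that `between` and `both_tails` always add up to 1 for the same pair of z-scores, which is a quick sanity check on whichever one you shaded.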
- Sal and Team, why is a z-table value (that corresponds to a z-score) represented as say 0.5786 instead of the final 57.86% we have to give as an answer in the exercises? Why not represent the percentage value from the get go, so we don't have to multiply it by a hundred to get the % value? Just curious. Thanks.(4 votes)
- I think it's mostly because the table value is an intermediate step in a math problem, and the custom of expressing decimals as percentages is an end-of-the-problem step to put the probability in a form that is familiar to people. If tables gave probabilities as percentages, for most of the problems you did you would have to turn them back into decimals to finish whatever problem you were solving.(5 votes)
- Why does the probability computed from the z-table differ from the percentage calculated using the empirical rule? P(Z > 2) = 0.5 - P(0 <= Z <= 2) = 0.5 - 0.4772 = 0.0228, which is 2.28%, not 2.5%. Did I make a mistake in my calculations? I understand if this is a little pedantic, but my mathematics professors demand precision and 2.3% is not equal to 2.5%. Any clarification would be greatly appreciated.(4 votes)
- The Empirical Rule is just an approximation. It's meant to be a rough, easily calculable rule of thumb. I think it's really meant to be something that people can remember, think of, and assess "on the fly": it's much easier to multiply something by 2 in your head than by 1.96!(5 votes)
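To see how rough the rule of thumb is, here is a small sketch comparing the 68-95-99.7 figures with the areas computed from the standard normal curve. It assumes Python 3.8+ for `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Empirical-rule figures for the area within k standard deviations of the mean.
rule_of_thumb = {1: 0.68, 2: 0.95, 3: 0.997}

exact = {}
for k, approx in rule_of_thumb.items():
    exact[k] = Z.cdf(k) - Z.cdf(-k)
    print(f"within {k} SD: rule {approx:.3f}, computed {exact[k]:.4f}")

# The z-score that leaves exactly 95% in the middle (2.5% in each tail).
z_95 = Z.inv_cdf(0.975)
print(round(z_95, 2))  # 1.96
```

The computed area within 2 standard deviations is about 95.45% rather than 95%, which is why a table lookup and the empirical rule disagree slightly in the tails.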
- How can we use info like z-scores and averages to calculate the probability of success or failure?
For instance I know the average proficiency and average rate of improvement in mathematics of a batch of students, and I want to know what is the probability that they will pass a math exam in the near future or the percentage of students that will most likely pass that exam.(4 votes)
- If you have a specific cut-off that indicates passing the exam, use the probabilities in the normal distribution table (z-table).(4 votes)
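As a rough sketch of that idea: the numbers below (a mean score of 70, a standard deviation of 10, and a passing cut-off of 60) are entirely hypothetical, and the calculation assumes exam scores are approximately normally distributed, which would need to be checked for real data.

```python
from statistics import NormalDist

# Hypothetical figures, chosen only for illustration.
mean_score = 70.0
std_dev = 10.0
passing_cutoff = 60.0

scores = NormalDist(mu=mean_score, sigma=std_dev)

# z-score of the cut-off: how many standard deviations it sits from the mean.
z_cutoff = (passing_cutoff - mean_score) / std_dev

# Probability of scoring at or above the cut-off (right tail).
pass_probability = 1 - scores.cdf(passing_cutoff)

print(z_cutoff)                    # -1.0
print(round(pass_probability, 4))  # 0.8413
```

Under these assumptions, the same number can be read as the expected share of students who pass.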
- I'm having trouble learning something and I don't know which video I need to watch.
An example question from my stats book says "find the indicated area under the standard normal curve - To the left of z = 1.54". How do I do this? Thanks! -Avery(2 votes)
- What you need is a z-table, which lists the area under the bell curve to the left of a z-value. Many z-tables only list positive z-values, but if you have a negative one you can work out the area anyway using symmetry. Here's a good one though that shows negative z-values too: http://lilt.ilstu.edu/dasacke/eco148/ztable.htm
In your case you would scroll down until you see 1.5 in the first column, and then go 5 steps to the right, to the 0.04 column, because 1.50 + 0.04 = 1.54. In that box you see the area under the curve to the left of z = 1.54, which is exactly what you were looking for =) To help you find the right one, it says 0.9382.(5 votes)
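The table lookup above can be cross-checked with a couple of lines of Python. This is only a sketch, assuming Python 3.8+; `row` and `column` are illustrative names that mirror how the z-table is laid out.

```python
from statistics import NormalDist

row = 1.5      # value in the first column of the z-table
column = 0.04  # column header giving the second decimal place
z = row + column

# Area under the standard normal curve to the left of z.
area_left = NormalDist().cdf(z)
print(round(area_left, 4))  # 0.9382
```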
- What does a z-score of 3 mean?(1 vote)
- Z-scores measure the number of standard deviations a particular data value is away from the mean, so a z-score of 3 means its associated data value is 3 standard deviations above the mean.(5 votes)
- What if the numbers that are given to me aren't as perfect as in these examples? Say they ask for a kid as tall as 160 cm, how would I measure that?(3 votes)
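- Let me answer that. Correct me if I'm wrong.
Mean: 143.5
SD (standard deviation): 7.1
Taller than: 160
(160 - 143.5) / 7.1 = approximately 2.3
Above 2 SD is 2.5% by the empirical rule.
Between 2 SD and 3 SD is 2.35%, so interpolating linearly for the extra 0.3 SD, we need to find:
2.5% - Y = answer
2.35/Y = 1/0.3
Y = 0.705%
2.5% - 0.705%
Answer = approximately 1.795%. This is only a rough estimate, because the area is not spread evenly between 2 and 3 SD; a z-table gives about 1% for z = 2.32.(1 vote)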
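For heights that do not land exactly on a whole number of standard deviations, the right-tail area can be computed directly instead of interpolated. The sketch below uses the mean (143.5 cm) and standard deviation (7.1 cm) from the video's problem and assumes Python 3.8+ for `statistics.NormalDist`.

```python
from statistics import NormalDist

mean_height = 143.5  # cm, from the problem in the video
std_dev = 7.1        # cm, from the problem in the video
height = 160.0       # cm, the value asked about in the question

# Number of standard deviations that 160 cm lies above the mean.
z = (height - mean_height) / std_dev
print(round(z, 2))  # 2.32

# Probability of a height above 160 cm: 1 minus the left-tail area.
p_taller = 1 - NormalDist(mu=mean_height, sigma=std_dev).cdf(height)
print(f"{p_taller:.2%}")
```

The result comes out close to 1%, noticeably below a linear interpolation between the empirical-rule figures, because the normal curve drops off quickly between 2 and 3 standard deviations.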
- In the CFA Level 1 textbooks it says you cannot use the z-score if the distribution is non-normal, the variance is unknown (in Sal's example the variance was known), and the sample size is smaller than 30 (n < 30). I guess if Sal was using a population instead of a sample, the answer to the first question would be correct. Has anyone come across this before?(3 votes)
- I cannot figure out how to answer this question from any of these videos:
Find the Z-scores that separate the middle 68% of the distribution from the area in the tails of the standard normal distribution. Can someone please explain?(2 votes)
- A z-score is the number of standard deviations a value lies from the mean (a z-score of 0 is the mean itself). By the empirical rule, about 68% of a normal distribution lies within one standard deviation of the mean in each direction, so the z-scores that separate the middle 68% from the tails are approximately -1 and 1.(2 votes)
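Going from a middle percentage back to z-scores is the inverse of a table lookup. A minimal sketch, assuming Python 3.8+ where `statistics.NormalDist.inv_cdf` is available; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

middle = 0.68
tail = (1 - middle) / 2  # area in each tail, 0.16

# z-scores with 16% of the area below and 16% of the area above.
z_lower = Z.inv_cdf(tail)
z_upper = Z.inv_cdf(1 - tail)

print(round(z_lower, 3), round(z_upper, 3))
```

The computed values are very close to -1 and 1, which matches the empirical rule's rounded figure of 68% within one standard deviation.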
- I'm learning this right now, and I'm learning about probability with normal distribution. Is there a formula where you give it the z-score and it returns the probability? I have to look on a chart, but it's hard to use.(1 vote)
- There is no simple closed-form formula, no. The area under the normal curve cannot be written in terms of elementary functions that you can plug a number into, which is precisely why the table was constructed.
That being said, there are alternatives to the table. Many scientific calculators have a normalcdf command (I use a TI-84). Common spreadsheets such as MS Excel, Google Spreadsheets, etc, also have commands which will calculate the probability for you.
For any of these ways to bypass the table, however, it is still important to understand normal probabilities in terms of the graph (i.e. draw a normal curve, shade the area that you want). As the saying goes, "Garbage in, garbage out." If you have a fancy tool or gadget, you also need to know how to operate it before it will be of any use.(4 votes)
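Programming languages offer the same kind of shortcut as a calculator's normalcdf command. The left-tail area can be written in terms of the error function erf, which Python provides as `math.erf`. The function name `phi` below is just a label chosen for this sketch.

```python
import math

def phi(z):
    """Area under the standard normal curve to the left of z.

    Uses the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2.
    """
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(phi(0.0))             # 0.5: half the area lies left of the mean
print(round(phi(1.54), 4))  # 0.9382, matching the z-table entry for 1.54
print(round(1 - phi(2.0), 4))  # right tail beyond z = 2
```

As with the calculator, it still helps to sketch the curve first so you know whether you need `phi(z)`, `1 - phi(z)`, or a difference of two values.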
It never hurts to get a bit more practice. So this is problem number five from the normal distribution chapter from ck12.org's AP statistics FlexBook. So they're saying, the 2007 AP statistics examination scores were not normally distributed with a mean of 2.8 and a standard deviation of 1.34. They cite some College Board stuff here. I didn't copy and paste that. What is the approximate z-score? Remember, z-score is just how many standard deviations you are away from the mean. What is the approximate z-score that corresponds to an exam score of 5?
So we really just have to figure out-- this is a pretty straightforward problem. We just need to figure out how many standard deviations is 5 from the mean? Well, you just take 5 minus 2.8, right? The mean is 2.8. Let me be very clear, mean is 2.8. They give us that. Didn't even have to calculate it. So the mean is 2.8. So 5 minus 2.8 is equal to 2.2. So we're 2.2 above the mean. And if we want that in terms of standard deviations, we just divide by our standard deviation. You divide by 1.34. Divide by 1.34. I'll take out the calculator for this. So we have 2.2 divided by 1.34 is equal to 1.64. So this is equal to 1.64. And that's choice C.
So this was actually very straightforward. We just have to see how far away we are from the mean if we get a score of 5-- which hopefully you will get if you're taking the AP statistics exam after watching these videos. And then you divide by the standard deviation to say, how many standard deviations away from the mean is the score of 5? It's 1.64.
I think the only tricky thing here might have been, you might have been tempted to pick choice E, which says, the z-score cannot be calculated because the distribution is not normal. And I think the reason why you might have had that temptation is because we've been using z-scores within the context of a normal distribution. But a z-score literally just means how many standard deviations you are away from the mean.
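The arithmetic for problem five can be written as a one-line function. This is a small sketch; `z_score` is a name chosen for illustration, and nothing in it depends on the distribution being normal.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Problem five: exam score of 5, mean 2.8, standard deviation 1.34.
z = z_score(5, 2.8, 1.34)
print(round(z, 2))  # 1.64
```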
It could apply to any distribution that you could calculate a mean and a standard deviation for. So E is not the correct answer. A z-score can apply to a non-normal distribution. So the answer is C. And I guess that's a good point of clarification to get out of the way.
And I thought I would do two problems in this video, just because that one was pretty short. So problem number six. The height of fifth grade boys in the United States is approximately normally distributed-- that's good to know-- with a mean height of 143.5 centimeters. So it's a mean of 143.5 centimeters and a standard deviation of about 7.1 centimeters. What is the probability that a randomly chosen fifth grade boy would be taller than 157.7 centimeters?
So let's just draw out this distribution like we've done in a bunch of problems so far. They're just asking us one question, so we can mark this distribution up a good bit. Let's say that's our distribution. And the mean here, the mean they told us is 143.5. They're asking us taller than 157.7. So we're going in the upwards direction. So one standard deviation above the mean will take us right there. And we just have to add 7.1 to this number right here. We're going up by 7.1. So 143.5 plus 7.1 is what? 150.6. That's one standard deviation. If we were to go another standard deviation, we'd go 7.1 more. What's 7.1 plus 150.6? It's 157.7, which just happens to be the exact number they ask for.
They're asking for the probability of getting a height higher than that. So they want to know, what's the probability that we fall under this area right here? Or essentially more than two standard deviations from the mean. Or above two standard deviations. We can't count this left tail right there. So we can use the empirical rule. If we do our standard deviations to the left, that's one standard deviation, two standard deviations. We know what this whole area is. Let me pick a different color.
So we know what this area is, the area within two standard deviations. The empirical rule tells us. Or even better, the 68, 95, 99.7 rule tells us that this area-- because it's within two standard deviations-- is 95%, or 0.95. Or it's 95% of the area under the normal distribution. Which tells us that what's left over-- this tail that we care about and this left tail right here-- has to make up the rest of it, or 5%. So those two combined have to be 5%. And these are symmetrical. We've done this before. This is actually a little redundant from other problems we've done. But if these are added, combined 5%, and they're the same, then each of these are 2.5%. Each of these are 2.5%.
So the answer to the question, what is the probability that a randomly chosen fifth grade boy would be taller than 157.7 centimeters? Well, that's literally just the area under this right green part. Maybe I'll do it in a different color. This magenta part that I'm coloring right now. That's just that area. We just figured out it's 2.5%. So there's a 2.5% chance we'd randomly find a fifth grade boy who's taller than 157.7 centimeters, assuming this is the mean, the standard deviation, and we are dealing with a normal distribution.
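The reasoning for problem six can be followed step by step in code. The sketch below reproduces the empirical-rule answer and, for comparison, the right-tail area computed from the normal curve itself; it assumes Python 3.8+ for `statistics.NormalDist`, and the variable names are illustrative.

```python
from statistics import NormalDist

mean_height = 143.5  # cm
std_dev = 7.1        # cm
threshold = 157.7    # cm

# How many standard deviations above the mean is the threshold?
z = (threshold - mean_height) / std_dev
print(round(z, 2))  # 2.0

# Empirical rule: 95% within 2 SD, so 5% in the two tails, split evenly.
empirical_tail = (1 - 0.95) / 2
print(f"{empirical_tail:.1%}")  # 2.5%

# Right-tail area computed from the normal distribution directly.
computed_tail = 1 - NormalDist(mu=mean_height, sigma=std_dev).cdf(threshold)
print(f"{computed_tail:.2%}")
```

The computed tail is a little under 2.3%, slightly smaller than the 2.5% from the rule of thumb, since the rule rounds the area within two standard deviations down to 95%.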