Chi-square probability distribution
Chi-Square Distribution Introduction
- In this video we'll talk about what the chi-square distribution is,
- sometimes called the chi-squared distribution. In the next few videos, we'll actually use it to test how well
- theoretical distributions explain observed ones, or how good a fit
- observed results are for theoretical distributions. So let's think about it a little bit. Let's say
- I have some random variables, and each of them is an independent standard
- normally distributed random variable. Let me remind you what that means. Say
- I have the random variable X. If X is standard normally distributed,
- we could write that X is a normal random variable with a mean
- of 0 and a variance of 1. Or you could say that the expected value of X is equal to 0
- and that the variance of our random variable X is equal to 1. Or, just to visualize it,
- when we take an instantiation of this variable,
- we are sampling from a standardized normal
- distribution that looks like this: a mean of 0 and a variance of 1, which also means a standard
- deviation of 1.
- So this distance could be the variance or the standard deviation; both are equal to 1.
- So for a chi-squared distribution, take just one of these random variables.
- Let me define it this way:
- let me define a new random variable Q that is equal to sampling
- from this standard normal distribution and squaring whatever number you got.
- So Q is equal to this random variable X squared.
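The definition of Q above can be mimicked with a short simulation. This is a minimal sketch, not something shown in the video: the function name `sample_q1`, the seed, and the sample size are illustrative assumptions, and it uses only Python's standard `random` module.

```python
import random

def sample_q1(rng):
    """Draw one value of Q = X^2, where X is standard normal (mean 0, variance 1)."""
    x = rng.gauss(0.0, 1.0)  # sample X from the standard normal distribution
    return x * x             # square whatever number we got

rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatability
q1_samples = [sample_q1(rng) for _ in range(50_000)]

# Squares are never negative, and most draws land below 1 because
# X falls within one standard deviation of 0 about 68% of the time.
fraction_below_1 = sum(q < 1.0 for q in q1_samples) / len(q1_samples)
print(min(q1_samples) >= 0.0, round(fraction_below_1, 2))
```

Each call to `sample_q1` is one "instantiation" of Q in the sense used above; a histogram of `q1_samples` would approximate the shape of its distribution.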
- The distribution for this random variable right here is going to be an example
- of the chi-squared distribution. Actually, what we're going to see in this video is that the
- chi-squared distribution is a set of distributions, depending on how many squared variables you sum.
- Right now we only have 1 random variable that we are squaring, so this is just one example;
- we'll talk more about the others in a second. We could write that Q is a chi-squared distributed random variable,
- or we could use this notation right here.
- This isn't an X anymore; this is the Greek letter chi, although it looks a lot like a curvy X.
- So Q is a member of chi-squared, and since we are only taking the sum of
- 1 independent standard normally distributed variable, we say
- that this has only 1 degree of freedom, and we write that over here.
- So this right here is our degree of freedom.
- We have one degree of freedom over there. Now
- let's call this Q1, and let's say we have another random variable. We need a different color, so let me do Q2 in blue.
- Let's say Q2 is defined as follows: I have 1 independent standard normally distributed variable, I'll call that
- X1, and I square it. Then I have another independent standard normally
- distributed variable X2, and I square it. You could imagine that both of these have distributions like
- this, and they are independent. So to sample Q2, you sample X1 from this distribution and square that value,
- sample X2 from the same distribution and square that value, then add the two, and you get Q2.
- So where over here we wrote Q1, here we would write that Q2 is a chi-squared distributed random variable with 2 degrees of freedom.
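The recipe for sampling Q2 (sample, square, sample again, square, add) generalizes to any number of squared variables. The following is a rough sketch under assumed names (`sample_chi_squared`, the seed, and the sample size are not from the video). It also checks a standard fact not stated in the transcript: the average of a chi-squared variable equals its degrees of freedom.

```python
import random

def sample_chi_squared(k, rng):
    """Sum of k squared independent standard normal draws (k degrees of freedom)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(1)  # fixed seed (an arbitrary choice) for repeatability

# Q2 = X1^2 + X2^2, sampled many times.
q2_samples = [sample_chi_squared(2, rng) for _ in range(50_000)]

# The sample mean should be close to 2, the number of degrees of freedom.
mean_q2 = sum(q2_samples) / len(q2_samples)
print(round(mean_q2, 2))
```

Calling `sample_chi_squared(1, rng)` reproduces Q1, so the single function covers the whole family by changing `k`.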
- That's two degrees of freedom. Just to visualize the set of chi-squared distributions,
- let's look at this over here. I got this off of Wikipedia. It shows us the probability density
- functions for some of the chi-squared distributions. This first one over here is for k equal to 1;
- k is the degrees of freedom, so this is essentially our Q1. This is our probability density
- function for Q1, and notice it really spikes close to 0. That makes sense: if you sample just once from this standard normal distribution,
- there's a very high likelihood that you're going to get something pretty close to 0.
- And if you square something close to 0 (remember, these are decimals,
- less than 1 in absolute value and pretty close to 0), it's going to become even smaller.
- So you have a high probability of getting a very small value. You have a high probability of getting
- values less than some threshold. I guess this is 1 right here,
- so this is one half. And you have a very low probability of getting a large number. To get
- a 4 you would have to sample a 2 (or a negative 2) from this distribution, and we know that 2 is
- two standard deviations from the mean, so it's less likely.
- And that's just to get a 4; getting even larger numbers is going to be even less likely.
- So that's why you see this shape over here. Now when you have 2
- degrees of freedom, it moderates a little bit. This blue line right here is the
- shape of Q2. Notice that you are a little bit less likely to get values close to 0
- and a little bit more likely to get numbers further out,
- but it's still weighted towards small numbers.
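The comparison between the k = 1 and k = 2 curves can be checked numerically. The video only shows the plotted curves; the sketch below assumes the standard textbook formula for the chi-squared probability density, and the function name `chi_squared_pdf` is illustrative.

```python
import math

def chi_squared_pdf(x, k):
    """Density of the chi-squared distribution with k degrees of freedom at x > 0."""
    if x <= 0:
        raise ValueError("this sketch only evaluates the density for x > 0")
    half_k = k / 2.0
    # f(x; k) = x^(k/2 - 1) * e^(-x/2) / (2^(k/2) * Gamma(k/2))
    return x ** (half_k - 1.0) * math.exp(-x / 2.0) / (2.0 ** half_k * math.gamma(half_k))

# Near 0 the k = 1 curve is higher; further out the k = 2 curve is higher.
for x in (0.1, 1.0, 3.0):
    print(x, round(chi_squared_pdf(x, 1), 3), round(chi_squared_pdf(x, 2), 3))
```

The first column of densities (k = 1) starts above the second (k = 2) and then drops below it, which matches the description of Q2 being less concentrated near 0.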
- Then suppose we had another chi-squared distributed random variable,
- let's say Q3, and let's define it as the sum of the squares of three independent variables,
- each with a standard normal distribution: X1 squared plus X2 squared plus X3 squared. Then
- our Q3 (this one here was Q2) has a chi-squared distribution with 3 degrees of freedom. So this guy right over here
- will be this green line. Maybe I should have written it
- in green. Notice that it is becoming a little more likely
- that you get values in this range over here,
- because you are taking the sum. Each of these is going to be a pretty small value, but you are taking the
- sum, so it starts to shift over to the right. So the more degrees of freedom you have,
- the further this lump moves to the right and, to some degree,
- the more symmetric it gets. What's interesting about this,
- and I guess it's different from most of the distributions
- we have looked at, although we have looked at others that have
- this property as well, is that you can't have a value below 0,
- because we are always squaring these values. Each of
- these guys can have values below zero; they are normally distributed, so they can have negative values.
- But since we are squaring and taking the sum of squares,
- this is never going to be negative.
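One way to see why the lump moves to the right is to compute the mean. This short derivation is added for illustration and is not part of the video; the notation Q_k for the sum of k squared variables is an assumption that extends the Q1, Q2, Q3 naming used above.

```latex
\begin{aligned}
E[X_i^2] &= \operatorname{Var}(X_i) + \left(E[X_i]\right)^2 = 1 + 0^2 = 1, \\
E[Q_k] &= E\!\left[\sum_{i=1}^{k} X_i^2\right] = \sum_{i=1}^{k} E[X_i^2] = k, \\
Q_k &= \sum_{i=1}^{k} X_i^2 \ge 0 .
\end{aligned}
```

So each extra degree of freedom adds 1 to the average value, while the sum itself can never fall below 0.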
- The place where this is going to be useful, and we're going to see this in the next
- few videos, is in measuring error from an expected value.
- If you take the total error, you can figure out the probability of getting that error
- if you assume some parameters.
- We'll talk more about that in the next video.
- Now with that said, I just want to show you how to read
- a chi-squared distribution table.
- Say this is our distribution; let me pick this blue one right here,
- so we have 2 degrees of freedom because we're adding two of these guys.
- If I were to ask you, what is the probability of
- Q2 being greater than
- 2.41? And I'm picking that value for a reason.
- So I want the probability of Q2 being greater than 2.41.
- What I do is look at a chi-squared table
- like this. Q2 is a version of chi-squared with 2 degrees of freedom,
- so I look at this row right here, for 2 degrees of freedom,
- and I want the probability of getting a value
- above 2.41. I picked 2.41 because it is actually
- in this table. The reason why most of these chi-squared
- tables have these weird numbers
- like this, instead of whole numbers
- or easy-to-read fractions, is that the table is
- driven by the p-value, by the probability
- of getting something larger than that value.
- So normally you would look at it the other way.
- You would ask: for what chi-squared value
- with 2 degrees of freedom is there a 30% chance of getting
- something larger than that? Then I would look up 2.41.
- But I'm doing it the other way just for the sake of this video.
- So if I want the probability of this random variable right here
- being greater than 2.41, or its p-value,
- we read it right here: it is 30%. Just to visualize it on this chart,
- on this chi-squared distribution for Q2, the blue one
- over here, 2.41 is going to sit, let's see, this is 3,
- this is 2.5, so 2.41 is going to be someplace right around
- here. So essentially what that table is telling us
- is that this entire area under this blue
- line to the right of 2.41 is going
- to be 0.3, or you can view
- it as 30% of the entire area under this curve,
- because all the probabilities have to add up to 1.
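For the 2-degrees-of-freedom case specifically, the table lookup can be checked by hand, because the area to the right of x under that curve has the closed form e^(-x/2). The sketch below relies on that standard result, which is not stated in the video. The function names are illustrative, and the formula does not carry over to other degrees of freedom.

```python
import math

def upper_tail_2df(x):
    """P(Q2 > x) for a chi-squared variable with 2 degrees of freedom, x >= 0."""
    return math.exp(-x / 2.0)

def critical_value_2df(p):
    """The value x with P(Q2 > x) = p, for 0 < p <= 1 (the usual table direction)."""
    return -2.0 * math.log(p)

print(round(upper_tail_2df(2.41), 2))      # about 0.30
print(round(critical_value_2df(0.30), 2))  # about 2.41
```

`upper_tail_2df` reads the table the way the video does (value in, probability out), while `critical_value_2df` goes in the usual direction (probability in, value out).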
- So that's our intro to the chi-squared distribution. In the next video we are
- actually going to use it to make some, or to test some