Video transcript

In this video we'll talk a little bit about what the chi-square distribution is, sometimes called the chi-squared distribution. Then in the next few videos we'll actually use it to test how well theoretical distributions explain observed ones, or how good a fit observed results are for theoretical distributions.

So let's say I have some random variables, and each of them is an independent, standard normally distributed random variable. Let me remind you what that means. Say I have the random variable X. If X is standard normally distributed, we could write that X is a normal random variable with a mean of 0 and a variance of 1. Or you could say that the expected value of X is equal to 0 and that the variance of X is equal to 1. To visualize it: when we take an instantiation of this variable, we're sampling from a standardized normal distribution that looks like this, with a mean of 0 and a variance of 1, which of course also means a standard deviation of 1.

Now let me define a new random variable Q: you're essentially sampling from this standard normal distribution and then squaring whatever number you got. So Q is equal to that random variable X squared. The distribution for this random variable is going to be an example of the chi-squared distribution. What we're actually going to see in this video is that the chi-squared distribution is
actually a set of distributions, depending on how many squared variables you are summing. Right now we only have one random variable that we're squaring, so this is just one of the examples, and we'll talk more about the others in a second.

We could write that Q is a chi-squared distributed random variable, or we could use this notation: Q is a member of chi-squared. This isn't an X anymore; it is the Greek letter chi, although it looks a lot like a curvy X. And since we're only taking the sum of one independent, standard normally distributed variable, we say that this has only one degree of freedom, and we write that over here. So this right here is our degrees of freedom: we have one degree of freedom.

Let's call this Q1. Now let's say I have another random variable, Q2 (let me do it in blue), defined like this: I have one independent, standard normally distributed variable, I'll call it X1, and I square it. Then I have another independent, standard normally distributed variable, X2, and I square it. So Q2 is X1 squared plus X2 squared. You can imagine both of these have distributions like this, and they're independent. So to sample Q2, you sample X1 from this distribution and square that value, sample X2 from the same distribution and square that value, and then add the two. We would write that Q2 is a chi-squared distributed random variable with two degrees of freedom.

Just to visualize the set of chi-squared distributions, let's look at this over here. I got this off of Wikipedia.
This shows us some of the probability density functions for some of the chi-squared distributions. This first one over here, for k equal to 1 (that's the degrees of freedom), is essentially our Q1. This is the probability density function for Q1, and notice it really spikes close to 0. That makes sense, because if you're sampling just once from this standard normal distribution, there's a very high likelihood that you're going to get something pretty close to 0. And if you square something close to 0 (remember, these are decimals less than 1), it becomes even smaller. So you have a high probability of getting a very small value. You have high probabilities of getting values less than some threshold; this is 1 right here, so say less than 1/2.

And you have a very low probability of getting a large number. To get a 4, you would have to sample a 2 from this distribution, and we know that 2 is two standard deviations from the mean, so it's less likely. And that's just to get a 4; even larger numbers are going to be even less likely. So that's why you see this shape over here.

Now when you have two degrees of freedom, it moderates a little bit. This blue line right here is the shape of Q2. Notice you're a little bit less likely to get values close to 0 and a little bit more likely to get numbers further out, but it is still heavily weighted towards small numbers.

And then let's say we have another chi-squared distributed random variable, Q3, and let's define it as the sum of three of these independent variables, each of which has a standard normal distribution: X1 squared plus X2 squared plus X3 squared.
Then our Q3 (this is Q2 right here) has a chi-squared distribution with three degrees of freedom. That will be this green line over here; maybe I should have written this in green. Notice that now it's starting to become a little bit more likely that you get values in this range over here, because you're taking the sum. Each of these is going to be a pretty small value, but you're taking the sum, so it starts to shift a little over to the right. So the more degrees of freedom you have, the further this lump moves to the right and, to some degree, the more symmetric it gets.

What's interesting about this, and I guess it's different than almost every other distribution we've looked at (although we've looked at others that have this property as well), is that you can't have a value below zero, because we're always squaring these values. Each of these variables can have values below zero; they're normally distributed, so they could be negative. But since we're squaring and taking the sum of squares, the result is never going to be negative.

The place where this is going to be useful, as we're going to see in the next few videos, is in measuring error from an expected value. If you take this total error, you can figure out the probability of getting that total error if you assume some parameters. We'll talk more about that in the next video.

Now, with that said, I just want to show you how to read a chi-squared distribution table. Let me pick this blue distribution right here, so we have two degrees of freedom, because we're adding two of these squared variables. If I were to ask you: what is the probability
of Q2 being greater than 2.41? I'm picking that value for a reason. To find the probability of Q2 being greater than 2.41, I'll look at a chi-square table like this. Q2 is a version of chi-squared with two degrees of freedom, so I look at this row right here, for two degrees of freedom, and I want the probability of getting a value above 2.41. I picked 2.41 because it's actually in this table.

The reason why we have these weird numbers like this, instead of whole numbers or easy-to-read fractions, is that the table is driven by the p-value: by the probability of getting something larger than that value. So normally you would look at it the other way. You would say: for what chi-squared value, with two degrees of freedom, is there a 30% chance of getting something larger than that? Then I would look up 2.41. But I'm doing it the other way just for the sake of this video. So if I want the probability of this random variable being greater than 2.41, or its p-value, we read it right here: it is 30%.

Just to visualize it on this chart, this chi-squared distribution for Q2 is the blue one over here. This is 3, this is 2.5, so 2.41 is going to sit someplace right around here. What that table is telling us is that this entire area under the blue line to the right of 2.41 is going to be 0.3. Or you could view it as 30% of the entire area under this curve, because all the probabilities have to add up to 1.

So that's our intro to the chi-squared distribution. In the next video we're going to use it to test some inferences.
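
As a rough check on the ideas in this video, here is a minimal Python sketch. It is not something shown in the video. It samples Q_k exactly the way the video describes (draw k independent standard normal values, square each, add them up) and estimates the probability that Q2 is greater than 2.41. The function names, the sample count, and the seed are all illustrative assumptions. The closed-form tail exp(-x/2) applies only to the two-degrees-of-freedom case and is a standard fact that the video does not derive.

```python
import math
import random


def sample_chi_squared(k, rng):
    """Draw one value of Q_k: the sum of k squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def estimate_tail_probability(k, threshold, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P(Q_k > threshold). Sample count and seed are arbitrary."""
    rng = random.Random(seed)
    hits = sum(sample_chi_squared(k, rng) > threshold for _ in range(n_samples))
    return hits / n_samples


def exact_tail_two_degrees(threshold):
    """Exact P(Q_2 > threshold). Valid only for two degrees of freedom."""
    return math.exp(-threshold / 2.0)


if __name__ == "__main__":
    print("simulated P(Q2 > 2.41):", estimate_tail_probability(2, 2.41))
    print("exact     P(Q2 > 2.41):", exact_tail_two_degrees(2.41))
```

Both numbers should land close to the 0.30 read from the table row for two degrees of freedom. Calling estimate_tail_probability with k set to 1 or 3 gives a feel for how the tail grows as the degrees of freedom increase.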