Video transcript

I think now's as good a time as any to play around a little bit with the formula for variance and see where it goes. I think just by doing this we'll also get a little bit better intuition for manipulating sigma notation, or even what it means.

We've learned several times the formula for variance. Let's just do the variance of a population; it's almost the same thing as the variance of a sample, you just divide by n instead of n minus 1. The variance of a population is equal to this: you take each of the data points, x sub i, you subtract the mean from it, you square it, and then you take the average of all of these. So you add up the squared distance for each of these points, from i equals 1 to i equals n, and you divide by n.

So let's see what happens if we multiply out this squared term and see where it takes us. I think it'll take us someplace interesting. This is the same thing as the sum from i equals 1 to n of... let's just multiply it out. This is just a little algebra going on here. When you square it, we could write it as x sub i minus mu, times x sub i minus mu. So we have x sub i times x sub i, that's x sub i squared. Then you have x sub i times minus mu, and then you have minus mu times x sub i. You add those two together and you get minus 2 x sub i mu, right? Because you have it twice: x sub i times minus mu is one minus x sub i mu, and then you have another minus mu x sub i. When you add them together you get minus 2 x sub i mu. I know it's confusing with me saying "sub i" and all of that, but it's really no different than when you did a minus b, squared; the variables just look a little bit more complicated. And then the last term is minus mu times minus mu, which is plus mu squared. Fair enough.

Let me switch colors just to keep it interesting. Let me cordon that off. Okay, so how can we... well, the
sum of this is the same thing as... because think about it: for each of the numbers in our population, we're going to perform this thing and we're going to sum it up. If you're not familiar with sigma notation, this is a good kind of thing to know in general, just a little bit of intuition. This is the same thing as (I'll do it here to have space) the sum from i equals 1 to n of the first term, x sub i squared, minus... and actually we can bring out the constant terms. When you're summing, the only thing that matters is the thing that has the i-th term, so in this case x sub i: x sub 1, x sub 2, and so on. That's the thing you have to leave on the right-hand side of the sigma notation.

If you've done the calculus playlist already, sigma notation is kind of like a discrete integral on some level. In an integral you're summing up a bunch of things and multiplying them by dx, which is a really small interval, but here you're just taking a sum. We showed in the calculus playlist that the integral actually is kind of this infinite sum of infinitely small things. But I don't want to digress too much.

That was just a long way of saying that the sum from i equals 1 to n of the second term is the same thing as minus 2 times mu times the sum from i equals 1 to n of x sub i. And then finally you have plus... well, this is just a constant term, so you can take it out: mu squared times the sum from i equals 1 to n of... and what's going to be here? It's going to be a 1, right? We just took the mu squared out of the sigma sign, out of the sum, and we're just left with the 1 there. Actually, we could have just left the mu squared there, but either way, let's just keep
simplifying it. This first term we can't really do anything with; we don't know what the x sub i's are, so we just have to leave that the same.

Oh, sorry, this is just the numerator, right? This whole simplification is just simplifying the numerator, and later we're going to divide by n. So that is equal to that divided by n, which is equal to this thing divided by n. I'll divide by n at the end, because it's the numerator that's the confusing part. We just want to simplify this term up here.

So let's keep doing this. This is equal to the sum from i equals 1 to n of x sub i squared, minus 2 times mu (sorry, that mu doesn't look good; edit, undo), minus 2 times mu times the sum from i equals 1 to n of x sub i. And then what is another way to write this last sum? Essentially we're going to add 1 to itself n times. The sigma is kind of saying: whatever you have here, iterate through it n times. If you had an x sub i here, you would use the first x term, then the second x term, and so on. But you have a 1 here, so this is essentially saying add 1 to itself n times, which is the same thing as n. So this is going to be plus mu squared times n.

Let's see if there's anything else we can do here. Remember, this was just the numerator. So this looks fine; we add up each of those terms. We have minus 2 mu times the sum from i equals 1 to... oh, well, think about this. What is this thing right here? Actually, let's bring back that n. This simplified to that divided by n, which simplifies to this whole thing divided by n, which is the same thing as each of the terms divided by n. And now, how does this simplify? That's the interesting part. Well, this
first term, there's nothing much I can do here, so that just becomes the sum from i equals 1 to n of x sub i squared, divided by big N.

Now this is interesting. If I take each of the terms in my population, add them up, and then divide by n, what is that? If I sum up all the terms in my population and divide by the number of terms there are, that's the mean, right? That's the mean of my population. So this thing right here is also mu. So this term simplifies to what? Minus 2 times mu, times this whole thing, which is mu too. So that's minus 2 times mu squared, right? Mu times mu. So that was a nice simplification. And then plus, what do you have here? You have mu squared times n over n, and those cancel out, so you have plus mu squared. So that was a very nice simplification.

And then this simplifies to (can't do much on this side) the sum from i equals 1 to n of x sub i squared, over n, and then we have minus 2 mu squared plus mu squared. Well, that's the same thing as minus mu squared, minus the mean squared. So already we've come up with a kind of neat way of writing the variance. You can essentially take the average of the squares of all of the numbers in, in this case, a population, and then subtract from that the mean of your population, squared. Depending on how you're calculating, this could be a slightly faster way of calculating the variance.

So just playing with a little algebra, we started from this thing where, each time, you have to take each of your data points, subtract the mean from it, and then square it (and of course before you do anything you have to calculate the mean), then you sum the squares, and then you essentially take the average when you sum and divide by n. We've simplified it, just using a little bit of algebra, to this formula. And this is getting us to something called the raw score method. What we want to
do is write this right here just in terms of the x sub i's, and then we really have what you'd call the raw score method, which is oftentimes a faster way of calculating the variance.

So let's see, what is mu equal to? The mean is just equal to the sum from i equals 1 to n of each of the terms, divided by n. You just take the sum of each of the terms and divide by the number of terms there are. So if we look at this thing (let me draw a line here), it can be written as the sum from i equals 1 to n of x sub i squared, all of that over n, minus mu squared. Well, mu is this, so it's this thing squared. And this thing squared is what? You take the sum from i equals 1 to n of x sub i, you square that, and then you divide it by n squared.

Out of all of them, this one with mu squared actually seems like the simplest formula to me, if you know the mean of your population. You just say, okay, my mean is whatever, and I can just square that and put that aside for a second. Then I can just take each of the numbers, square them, sum them up, and divide by the number of numbers I have. I've erased the last set of numbers now, but we could show that you'll get the same variance. So to me this is almost the simplest formula. But this last one's even faster in a lot of ways, because you don't even have to calculate the mean ahead of time. You can just say, okay, for each x sub i I perform this operation, and then I divide by n squared or n accordingly, and I'll also get to the variance. So you don't have to do this calculation before you figure out the whole variance.

But anyway, I thought it would be instructive, and would hopefully give you a little bit more
intuition behind the algebra of dealing with sigma notation, if we worked out these other ways to write the variance. In fact, some books will just say, oh yeah, you know what, the variance of a population could be written like this, or it could be written like this, or maybe they'll even write it like this. It's good to know that you can do a little bit of simple algebraic manipulation and get from one to the other.

Anyway, I've run out of time. See you in the next video.
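
Since the algebra is only spoken in the transcript, it may help to see it written down. The block below is a reconstruction of the steps described, not a copy of what was drawn on screen. The symbols are assumptions taken from the narration: x_i for the data points, mu for the population mean, N for the population size, and sigma squared for the population variance.

```latex
\begin{align*}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 \\
         &= \frac{1}{N}\sum_{i=1}^{N} \left( x_i^2 - 2\mu x_i + \mu^2 \right) \\
         &= \frac{\sum_{i=1}^{N} x_i^2}{N}
            - 2\mu \cdot \frac{\sum_{i=1}^{N} x_i}{N}
            + \frac{\mu^2 N}{N} \\
         &= \frac{\sum_{i=1}^{N} x_i^2}{N} - 2\mu^2 + \mu^2 \\
         &= \frac{\sum_{i=1}^{N} x_i^2}{N} - \mu^2 \\
         &= \frac{\sum_{i=1}^{N} x_i^2}{N}
            - \frac{\left( \sum_{i=1}^{N} x_i \right)^2}{N^2}
\end{align*}
```

The third line uses the two facts from the narration: constants such as mu can be pulled outside the sum, and summing 1 from i = 1 to N gives N. The last line is the form the transcript calls the raw score method.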
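
A quick numerical check can make the equivalence of the three forms concrete. This is a minimal Python sketch, not something from the video: the function names and the sample population are hypothetical, chosen only so the arithmetic comes out evenly.

```python
# Hypothetical population (not from the video); its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]


def variance_definition(xs):
    """Average of the squared distances from the mean."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n


def variance_mean_of_squares(xs):
    """Average of the squares, minus the mean squared."""
    n = len(xs)
    mu = sum(xs) / n
    return sum(x * x for x in xs) / n - mu ** 2


def variance_raw_score(xs):
    """Raw score form: written only in terms of the x_i, no mean computed first."""
    n = len(xs)
    return sum(x * x for x in xs) / n - sum(xs) ** 2 / n ** 2


if __name__ == "__main__":
    print(variance_definition(data))       # 4.0
    print(variance_mean_of_squares(data))  # 4.0
    print(variance_raw_score(data))        # 4.0
```

For this data the sum of squares is 232 and the sum is 40, so the raw score form gives 232/8 - 1600/64 = 29 - 25 = 4, matching the definition. One caution when using floating-point numbers: the last two forms subtract two quantities that can be large and nearly equal, so on data with a large mean and a small spread they can lose precision compared with the definition.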