Video transcript

So let's say that I've got two different data sets. In the first data set, I have a 2, another 2, a 4, and a 4. And then in the other data set (I'm going to do this on the right side of the screen), I have a 1, a 1, a 6, and a 4.

Now, the first thing I want to think about is: is there a number that can give me a measure of center of each of these data sets? One of the ways that we know how to do that is by finding the mean. So let's figure out the mean of each of these data sets.

For this first data set, we just need to sum up all of the numbers, so it's going to be 2 plus 2 plus 4 plus 4, and we're going to divide by the number of numbers that we have. We have 1, 2, 3, 4 numbers, so that's that 4 right over there. And this is going to be 2 plus 2 is 4, plus 4 is 8, plus 4 is 12. So it's going to be 12 over 4, which is equal to 3.

Let's see if we can visualize this a little bit on a number line. I'll do a little bit of a dot plot here so we can see all of the values. So if this is 0, 1, 2, 3, 4, and 5, we have two 2s. I'll just do it in yellow: I have one 2, and then I'll have another 2. Then I have two 4s, so one 4 and another 4 right over there. And we calculated that the mean is 3. The mean, a measure of central tendency, is 3, so I'll just put 3 right over here and mark it with that dotted line. That's where the mean is. All right, we've visualized that a little bit, and that does look like it's the center. It makes sense.

So now let's look at this other data set right over here. The mean over here is going to be equal to 1 plus 1 plus 6 plus 4, all of that over 4, since we still have four data points. And this is 2, plus 6 is 8, plus 4 is 12. 12 divided by 4, this is also 3. So this also has the same mean. We have different numbers, but we have the
same mean. But there's something about this data set that feels a little bit different, so let's visualize it to see if we can see a difference. Now I have to go all the way up to 6, so let's say this is 0, 1, 2, 3, 4, 5, 6, and I'll go one more, 7. We have a 1, we have another 1, we have a 6, and then we have a 4. And we calculated that the mean is 3.

So when we measure it by the mean, the central point looks the same, but the data sets look different. How do they look different? Well, we've talked about notions of variability or variation, and it looks like this data set is more spread out. It looks like the data points are, on average, further away from the mean than these data points are.

And so that's an interesting question that we ask ourselves in statistics. We don't just want a measure of center like the mean; we might also want a measure of variability. One of the more straightforward ways to think about variability is: on average, how far is each of the data points from the mean? That might sound a little complicated, but we're going to figure out what that means (and I'm trying not to overuse the word "mean").

So we want to figure out, on average, how far each of these data points is from the mean. What we're about to calculate is called the mean absolute deviation, or, if you just use the acronym, MAD. All we're talking about is how much each of these points deviates from the mean, but the absolute value of it, that is, its distance. So each of these points at 2 is 1 away from the mean; it doesn't matter if they're less or more, they're 1 away from the mean. And then we find the mean of all of the deviations. So what does that mean?
I'm using the word "mean" a little bit too much. So let's figure out the mean absolute deviation of this first data set. We've been able to figure out what the mean is: the mean is 3. So we take each of the data points and we figure out its absolute deviation from the mean.

We take the first 2, so we say 2 minus the mean, and we take the absolute value. That's its absolute deviation. Then we have another 2, so we find its absolute deviation from 3. Remember, if we're taking 2 minus 3 and taking the absolute value, that's just saying how far it is from 3. It's really easy to calculate in this case. Then we have a 4 and another 4, so let me write that. We have the absolute deviation of 4 from 3, from the mean, and then plus we have this other 4 right up here: 4 minus 3, and we take the absolute value because we want its absolute deviation. And then we divide by the number of data points we have.

So what is this going to be? 2 minus 3 is negative 1, but we take the absolute value, so it's just going to be 1. 2 minus 3 is negative 1, we take the absolute value, and it's just going to be 1. And you see that here visually: this point is just 1 away from 3, and this point is just 1 away from 3. 4 minus 3 is 1, and the absolute value of that is 1; this point is just 1 away from 3. 4 minus 3, absolute value, that's another 1.

So you see in this case, every data point was exactly 1 away from the mean. We took the absolute value so that we don't have negative 1s; we just care how far each point is in absolute terms. So you have four data points, and each of their absolute deviations is 1. The mean of the absolute deviations is 1 plus 1 plus 1 plus 1, which is 4, over 4, so it's equal to 1. One way to think about it: it's saying that, on average, the mean of the distances of these points
away from the actual mean is 1. And that makes sense, because all of these are exactly 1 away from the mean.

Now let's see what results we get for this data set right over here. Let me get some space over here. At any point, if you get inspired, I encourage you to calculate the mean absolute deviation on your own.

So let's calculate it. The mean absolute deviation here, I'll write MAD, is going to be equal to, well, let's figure out the absolute deviation of each of these points from the mean. It's the absolute value of 1 minus 3, that's this first 1, plus the absolute value of 1 minus 3, that's the second 1, plus the absolute value of 6 minus 3, that's the 6, and then we have the 4, plus the absolute value of 4 minus 3. And then we have four points, so we divide by 4.

1 minus 3 is negative 2, and the absolute value is 2. We see that here: this is 2 away from 3. We just care about absolute deviation; we don't care if it's to the left or to the right. Then we have another 1 minus 3, which is negative 2, and its absolute value is 2. This is also 2 away from the mean. Then we have 6 minus 3, and the absolute value is just going to be 3. We see that this 6 is 3 to the right of the mean; we don't care whether it's to the right or the left. And then 4 minus 3 is 1, the absolute value is 1, and we see that it is 1 to the right of 3.

So what do we have? 2 plus 2 is 4, plus 3 is 7, plus 1 is 8, over 4, which is equal to 2. So for this data set, the mean absolute deviation is equal to 2, while for this data set, the mean absolute deviation is equal to 1. And that makes sense. They have the exact same mean, they both have a mean of 3, but the one on the right is more spread out, because on average each of these points is 2 away from 3, while on
average each of these is 1 away from 3. The mean of the absolute deviations on this one is 1; the mean of the absolute deviations on this one is 2. So the green one is more spread out from the mean.
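
The calculation walked through above can be sketched in a few lines of Python. This is only an illustration and is not part of the video: the function names `mean` and `mean_absolute_deviation` and the variable names `first` and `second` are assumptions chosen here. The two lists are the data sets from the transcript.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Mean of the absolute distances between each value and the mean."""
    center = mean(values)
    # abs() drops the sign, so a point 1 below the mean and a point
    # 1 above the mean both count as a distance of 1.
    return sum(abs(v - center) for v in values) / len(values)


# The two data sets from the transcript (hypothetical variable names).
first = [2, 2, 4, 4]
second = [1, 1, 6, 4]

for name, data in [("first", first), ("second", second)]:
    print(name, "mean:", mean(data), "MAD:", mean_absolute_deviation(data))
# first mean: 3.0 MAD: 1.0
# second mean: 3.0 MAD: 2.0
```

Both data sets have a mean of 3, but the second one has the larger mean absolute deviation, which matches the observation that it is more spread out.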