Statistics and probability

Lesson 4: Variance and standard deviation of a population

Mean and standard deviation versus median and IQR

Learn to choose the "preferred" measures of center and spread when outliers are present in a set of data.

• Take the data 1, 2, 3, 1000, 2000, 10000, 20000.
The median is 1000.
It just tries to stay in the middle.
The mean is like finding a point that balances all the values, but it gets pulled toward the extreme ones.
If the mean is a bad summary of a distribution, then so is the SD, obviously, since the SD is measured from the mean.
Standard deviation measures how far the points typically deviate from the mean.
For two datasets, the one with the bigger range is more likely to be the more dispersed one.
The IQR focuses on the middle portion of the sorted data, so it doesn't get skewed.

Why not use only the IQR?
Or compute the standard deviation around the median instead of the mean?
Or create levels expanding outward from the IQR: level 1, level 2, and so on?
Is that a good idea?
• When you perform an exploratory data analysis, you may be interested in the range.

There is no such thing as an "IQR range". IQR already stands for interquartile range, so it is itself a kind of range.

There are no standard "levels" of the IQR, but you can create a new feature like that if you feel it is necessary.
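As a rough check on the numbers at the top of this thread, here is a minimal Python sketch that computes the mean, median, population standard deviation, and IQR for 1, 2, 3, 1000, 2000, 10000, 20000, and then again after the largest value is made 100 times bigger. The helper names `iqr_median_of_halves` and `summarize` are made up for this illustration, and the IQR uses the "median of each half" convention (middle value left out when the count is odd). Other quartile conventions give different IQR values.

```python
import statistics


def iqr_median_of_halves(values):
    """IQR as (median of upper half) - (median of lower half).

    Hypothetical helper: the middle value is excluded when the count is odd.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the middle
    upper = ordered[-half:]   # values above the middle
    return statistics.median(upper) - statistics.median(lower)


def summarize(values):
    """Return the two center/spread pairs discussed in the thread."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.pstdev(values),  # population standard deviation
        "median": statistics.median(values),
        "iqr": iqr_median_of_halves(values),
    }


original = [1, 2, 3, 1000, 2000, 10000, 20000]
stretched = [1, 2, 3, 1000, 2000, 10000, 2000000]  # largest value x100

for name, data in [("original", original), ("stretched", stretched)]:
    print(name, summarize(data))
```

Under this convention the median (1000) and the IQR (10000 - 2 = 9998) are the same for both lists, while the mean and standard deviation change a lot when the largest value is stretched.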
• How about mode? Wouldn't that often be more reliable? Like when calculating the average salary in a large population - would the amount most people make not seem the most representative?
• Not necessarily, Powel. The example Carlos explained above is accurate.
• If median and IQR are preferred when there are outliers, doesn't that imply that they are more accurate when there is any variance at all?

The only case where mean and standard deviation are going to be as accurate as median and IQR is if there is no variance at all in the data.

With that being said, is there any situation where mean and standard deviation would be preferable?
• What does the standard deviation have to do with the IQR?
• They are both measures of how spread out the data are around the center. The standard deviation measures spread around the mean, and the IQR goes with the median.
• Why can't we mix and match? Since we figured out that the median captures central tendency better, why can't we use the median in the standard deviation formula? Wouldn't that better capture the total variance/spread in the data set?
• Interesting idea.

It would remedy some of the distortion caused by the biased mean.

But the skew, and thus the bias from an outlier, remains even if you use the median to calculate the standard deviation, because the outlier's own squared deviation is still included.

I think that's why we'd better rely on the IQR in that type of situation, as it can simply ignore the most extreme cases.
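To make the "mix and match" idea concrete, here is a small sketch, not an established recipe from the lesson. The function `rms_deviation` is a hypothetical helper that plugs any center into the population standard deviation formula, and `median_abs_deviation` is a swapped-in alternative (the median absolute deviation) that is not mentioned in the thread. The data values are invented for illustration.

```python
import math
import statistics


def rms_deviation(values, center):
    """Root-mean-square deviation of the values from an arbitrary center.

    With center = mean this is the population standard deviation.
    """
    return math.sqrt(sum((x - center) ** 2 for x in values) / len(values))


def median_abs_deviation(values):
    """Median of the absolute deviations from the median (MAD)."""
    med = statistics.median(values)
    return statistics.median(abs(x - med) for x in values)


# Invented data: a tight group plus one large outlier.
data = [50, 52, 55, 57, 60, 200]

mean = statistics.mean(data)
median = statistics.median(data)

print("SD about the mean:    ", rms_deviation(data, mean))
print("'SD' about the median:", rms_deviation(data, median))
print("MAD:                  ", median_abs_deviation(data))
```

Because the mean is the point that minimizes the sum of squared deviations, the "SD about the median" can never come out smaller than the ordinary SD, so swapping in the median does not shrink the outlier's influence. A measure that takes a median of the deviations, like the MAD here, or one that looks only at the middle of the data, like the IQR, is what actually ignores the extreme value.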
• I have two questions. The first is about variance: why did the previous video refer to it as the biased sample variance, and what does that mean? The second is about the term "skew": what does it mean here? Thank you.
• Would the mean be robust if there are outliers on both sides of the main group of data points?
• Still no, because you don't know how drastically the outliers differ from each other. For example, if most of the data were between 50 and 60, one outlier could be 30 while another is 200. So as a general rule, if there are any outliers, use the median.
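The answer above can be checked with a quick sketch. The outliers 30 and 200 and the 50 to 60 range come from the reply, but the specific values inside the main group, the second "balanced" list, and the helper name `center` are assumptions made up for illustration.

```python
import statistics

# Assumed main group: evenly spread between 50 and 60.
main_group = [50, 52, 54, 56, 58, 60]

# Outliers on both sides, as in the reply: 30 below, 200 above.
lopsided = [30] + main_group + [200]

# For contrast: outliers that happen to sit equally far from the center.
balanced = [30] + main_group + [80]


def center(values):
    """Return (mean, median) for a list of numbers."""
    return statistics.mean(values), statistics.median(values)


print("lopsided outliers:", center(lopsided))  # mean 70, median 55
print("balanced outliers:", center(balanced))  # mean 55, median 55
```

With 30 and 200 as the outliers, the mean lands at 70, outside the 50 to 60 group entirely, while the median stays at 55. The mean only matches the median in the second list because the two outliers happen to cancel, which you cannot count on in general.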
• If the mean is 80, how far away is 60, and in what direction?
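Reading the question as asking for the signed distance from the mean (an assumption, since no standard deviation is given), the arithmetic is:

```latex
\[
  60 - 80 = -20
\]
```

So 60 is 20 units below the mean. Expressing that distance in standard deviations would require knowing the standard deviation, which the question does not state.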