AP.STATS: UNC‑1 (EU), UNC‑1.H (LO), UNC‑1.H.1 (EK), UNC‑1.H.2 (EK), UNC‑1.H.5 (EK), UNC‑1.H.6 (EK)

Video transcript

In this video, I want to do some examples looking at distributions, in particular at different features of distributions like clusters, gaps, and peaks. So over here I have some examples.

"Which of the following are accurate descriptions of the distribution below? Select all that apply." So the first statement is "The distribution has an outlier." An outlier is a data point that's way off from where the other data points are. It's way larger or way smaller than where all of the other data points seem to be clustered. If we look over here, we have a lot of data points between 0 and 6. And let's just think about what they're measuring: this is shelf time for each apple at Gore Gorgeous Grocer. So, for example, we see there's 1, 2, 3, 4, 5, 6, 7 apples that have a shelf life of 0 days, so they're about to go bad. You have 1, 2, 3, 4, 5, 6, 7, 8 apples that are going to be good for another day. You have two apples that are going to be good for another 6 days, and you have one apple that's going to be good for 10 days. This is unusual. This is an outlier here: it has a way larger shelf life than all of the other data. So I would say this definitely does have an outlier. We just have this one data point sitting all the way to the right, with way more shelf life than everything else.

"The distribution has a cluster from 4 to 6 days." And we do indeed see a cluster from 4 to 6 days. A cluster, you can imagine, is a grouping of data that's sitting there: you have a grouping of apples that have a shelf life between 4 and 6 days. You definitely do see that cluster there. And since I already selected two things, I'm definitely not going to select "None of the above." So let me check my answer.

Let me do a few more of these. "Which of the following are accurate descriptions of the distribution below?" Once again, we're going to select all that apply. So, "The distribution has an outlier." So
let's see. In this distribution, I do have a data point here that's at the high end, and I have another data point here that's at the low end, but I don't have any data points that are sitting far above or far below the bulk of the data. If I had a data point that was out here, then yeah, I would say that was an outlier to the right, or a positive outlier. If I had a data point way to the left, off the screen over here, maybe that would be an outlier. But I don't really see any obvious outliers. All the data is pretty clustered together, so I would not say that the distribution has an outlier.

"The distribution has a peak at 22 degrees." And let's just look at what we're actually measuring: high temperature each day in Edge tin, Iowa, in July. It does indeed look like the most days in July had a high temperature of 22 degrees Celsius, so that is a peak. You can see it if you imagine this is kind of a mountain: this is a peak right here, a high point. At least locally, you have the most days at 22 degrees Celsius. So I would say it definitely has a peak there. Since I selected something, I'm not going to select "None of the above."

Let's do a couple more of these. "Which of the following are accurate descriptions of the distribution below?" The first one: "The distribution has an outlier." So let's see. This is number of guests by day at Seth's sandwich shop. They have no days where he had between 0 and 19 guests, and no days where he had between 20 and 39 guests. It looks like there's about nine days where he had between 40 and 59 guests, and it looks like 20 days where he had between 60 and 79 guests, all the way to what looks like maybe eight days where he had between 180 and 199 guests. But on the question of outliers, there doesn't seem to be any day where he had an unusual number of guests. There's not a
day that's, you know, way out here where he had like 500 guests. So I would say this distribution does not have an outlier.

"The distribution has a cluster from 0 to 39 guests." So 0 to 39 guests is right over here, and there are no days where he had between 0 and 39 guests, either 0 to 19 or 20 to 39. So there's definitely not a cluster there. I would say the cluster is where days had between 40 and 199 guests, definitely not between 0 and 39. So I would say "None of the above," very confidently.

Let's do one more of these. "Which of the following are accurate descriptions of the distribution below?" "The distribution has a peak from 12 to 13 points." Let me see what this is measuring, or what this data is about: test scores by student in Ms. Frine's class. So you had one student who got between a 0 and a 1 on the 20-point scale. I guess maybe out of 20 questions, they got between 0 and 1 points. Then you see that no students got between 2 and 3, or 4 and 5, or 6 and 7. Then we have another student who got between 8 and 9. It looks like 3 students got between 10 and 11, and then we keep increasing. It looks like about 12 students got either a 16 or 17, or something in between, maybe if you could get decimal points on that test. And then it looks like 10 students got from 18 to 19.

All right, so this says the distribution has a peak from 12 to 13 points. From 12 to 13 points there were five students, but this isn't a peak. If you just go to 14 to 15, you have more students. So this is definitely not a peak. If you were looking at this as a mountain of some kind, you definitely wouldn't describe this point as a peak. You would say this distribution has a peak where the most students got between 16 and 17 points. So that's the peak right there, not 12 to 13 points. I would not select that first choice.

"The distribution has an outlier." Well, yeah, look
at this. You have this outlier: most of the students scored between 8 and 19 points, and then you have this one student who got between 0 and 1. It's really an outlier. You even see it when you look at it visually: it's not even connected to the rest of the distribution. It's way to the left. If something is way to the left or way to the right, if it's kind of unusually low or unusually high, that's an outlier. So I would say this distribution definitely does have an outlier, and I'm not going to pick "None of the above," since I found a choice. And I think we're all done.
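
The video finds clusters, gaps, peaks, and outliers by eye. As a rough illustration only, here is a minimal Python sketch of how those same features might be picked out of a dot plot's counts. Everything in it is an assumption made for the example: the function names are hypothetical, the apple data only matches the transcript for 0, 1, 6, and 10 days (the counts for 2, 4, and 5 days are invented), and the outlier rule (a single point at least 3 empty values away from the rest) is an arbitrary rule of thumb, not the method used in the video.

```python
from collections import Counter

# Shelf life in days per apple. Counts at 0, 1, 6 and 10 days follow the
# transcript; the counts at 2, 4 and 5 days are made up for illustration.
shelf_days = [0] * 7 + [1] * 8 + [2] * 4 + [4] * 2 + [5] * 3 + [6] * 2 + [10]
counts = Counter(shelf_days)


def find_clusters(counts):
    """Group occupied values into runs of consecutive integers: [(start, end), ...]."""
    runs = []
    for v in sorted(v for v, c in counts.items() if c > 0):
        if runs and v == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], v)  # extend the current run
        else:
            runs.append((v, v))  # start a new run
    return runs


def find_gaps(runs):
    """Return the empty stretches between neighboring runs."""
    return [(a[1] + 1, b[0] - 1) for a, b in zip(runs, runs[1:])]


def find_outliers(counts, runs, min_gap=3, max_points=1):
    """Flag tiny runs that sit far from every other run (assumed rule of thumb)."""
    flagged = []
    for i, (start, end) in enumerate(runs):
        size = sum(counts[v] for v in range(start, end + 1))
        left = start - runs[i - 1][1] - 1 if i > 0 else None
        right = runs[i + 1][0] - end - 1 if i + 1 < len(runs) else None
        widths = [w for w in (left, right) if w is not None]
        if widths and size <= max_points and min(widths) >= min_gap:
            flagged.extend(range(start, end + 1))
    return flagged


def find_peaks(counts, skip=()):
    """Values whose count beats both neighbors (a missing neighbor counts as 0)."""
    return [
        v for v in sorted(counts)
        if v not in skip
        and counts[v] > counts.get(v - 1, 0)
        and counts[v] > counts.get(v + 1, 0)
    ]


clusters = find_clusters(counts)
gaps = find_gaps(clusters)
outliers = find_outliers(counts, clusters)
peaks = find_peaks(counts, skip=outliers)

print("clusters:", clusters)  # [(0, 2), (4, 6), (10, 10)]
print("gaps:", gaps)          # [(3, 3), (7, 9)]
print("outliers:", outliers)  # [10]
print("peaks:", peaks)        # [1, 5]
```

With this made-up data, the run from 4 to 6 days corresponds to the cluster described in the video, and the lone apple at 10 days is flagged as the outlier. The outlier is excluded before looking for peaks because a single isolated dot would otherwise count as a local high point. A different threshold for `min_gap` or `max_points` would change what gets flagged, which mirrors the fact that calling a point an outlier is a judgment call.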