AP.STATS: DAT‑2 (EU), DAT‑2.E (LO), DAT‑2.E.1 (EK), DAT‑2.E.2 (EK), DAT‑2.E.6 (EK), VAR‑1 (EU), VAR‑1.E (LO), VAR‑1.E.1 (EK)

Video transcript

We're told that David hosts a podcast, and he's curious how much his listeners like his show. He decides to start with an online poll. He asks his listeners to visit his website and participate in the poll. The poll shows that 89% of about 200 respondents love his show. What is the most concerning source of bias in this scenario? Like always, pause this video and see if you can figure it out on your own, and then we'll work through it together.

Let's think about what's going on. He has this population of listeners right over here. I'll assume that the number of listeners is more than 200. He can't ask all of his listeners. Who knows, maybe he has 10,000 listeners. They don't tell us that, but let's say there are 10,000 listeners here. He says, "I want to get an indication of what percent like my show, so I need a sample." But instead of taking a truly random sample, he asks them to volunteer: he asks his listeners to visit his website. So that's classic voluntary response sampling. This is non-random, because who decides to go to his website, who listens to what he just said, and maybe even who has access to a computer, that's not random. In fact, the 200 respondents out of the 10,000 who decide to do it are more likely to be the people who already like David, or who like to do what he tells them to do. The listeners who are not into David, or who don't want to do what he tells them to do, are unlikely to say, "I'm not really into David, and I don't like him telling me what to do, but I'm going to go to his website anyway and fill out that poll." That's less likely. Or you might get extremes: people who really don't like him might say, "I'm definitely going to go there." But in this case, I would say it's more likely that your fans are going to do what you ask them to do, go to your website, and spend time on your website. Because of that, the 89% is probably an overestimate of the percentage of listeners who really love his show, because you're more likely to get the ones who love the show to show up and fill out that survey.

Now, these other forms of bias. Response bias is when you're asking something that people don't necessarily want to answer truthfully, or when the way that it's phrased might make someone respond in a biased way. Classic examples of this are questions like "Have you lied to your parents in the past week?" or "Have you ever cheated on your spouse?" or "Have you ever smoked?" These are things that people might not want to answer completely truthfully, or that they might be hiding from the world, so they might not want to answer truthfully on a survey, and so you're going to have response bias. But that's not the case right over here.

And undercoverage is when, the way that you're sampling, you're definitely missing out on an important constituency. With voluntary response, we're likely missing out on some important constituencies, on some people who might not be into going to your website, but undercoverage is where it's a little bit more clear that that is happening.

Now let's do another case, maybe an alternate reality where David's trying to figure this out again. He's still hosting a podcast, and he's still curious how much his listeners like his show, but he tries to take a different sample. In this case, he decides to poll the next hundred listeners who send him fan emails. They don't all respond, but 94 out of the 97 listeners polled say they love his show. What is the most concerning source of bias in this scenario? Well, this is a classic "Hey, I have a sample sitting in front of me, it's in my email inbox, let me just go to them, isn't that convenient?" So this is a classic convenience sample.
And this isn't just "these are the first hundred people to walk through the door," where a lot of times you could argue why that might not be so random. It's the next hundred listeners who sent him fan emails. So this is convenience sampling, and the sample that you happen to use out of convenience is one that's going to be very skewed toward liking you. So once again, this is overestimating the percent that love his show.

Now, non-response is when you ask a certain number of people to fill out a survey or answer a questionnaire, and for some reason some percent do not fill it out. And you're like, "Well, who are those people? Maybe they would have said something important, and maybe their viewpoint is not properly represented in the overall number that actually did fill it out." And there is some non-response going on here. He asks a hundred people who sent fan emails to fill out the survey, to say whether they love the show or not, and 97 fill it out, so there were three people who did not fill out the survey. So there is some non-response going on, and that would be a source of bias, but it's not the most concerning. Right over here, they're asking us to figure out the most concerning source of bias, and the convenience sampling is definitely the biggest deal here. There were three people who didn't respond, but that's not as big of a deal.

Voluntary response sampling: well, he didn't ask people, like in the last example, "Hey, if you can, go here and fill it out." Actually, let me take that back. There is a little bit of voluntary response here, where he goes to these hundred people and asks them to respond, and so you have the 97 people who choose to respond. That could be a source of bias, but most of them, 97 out of 100, are responding, and once again the most concerning thing is the convenience sampling. Based on this sample that he's happening to use out of convenience, the result is going to be a very significant overestimate in terms of representing the entire population of his listeners.
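
To make the overestimate argument from the first scenario concrete, here is a minimal simulation sketch. It is not part of the original problem: the 10,000 listeners figure is the hypothetical used above, while the true share of listeners who love the show (60%) and the two response probabilities are invented assumptions, chosen only so that roughly 200 people respond. The function name `simulate_poll` and its parameters are likewise hypothetical.

```python
import random


def simulate_poll(n_listeners=10_000, true_love_rate=0.60,
                  p_respond_fan=0.03, p_respond_other=0.005, seed=0):
    """Compare a voluntary response poll with a simple random sample.

    All default values are illustrative assumptions, not data from the problem.
    """
    rng = random.Random(seed)

    # True/False for each listener: does this listener love the show?
    loves = [rng.random() < true_love_rate for _ in range(n_listeners)]

    # Voluntary response: each listener decides whether to visit the website.
    # Assumption: fans are more likely to respond than other listeners.
    volunteers = [
        love for love in loves
        if rng.random() < (p_respond_fan if love else p_respond_other)
    ]

    # Simple random sample of the same size, for comparison.
    srs = rng.sample(loves, len(volunteers))

    return {
        "respondents": len(volunteers),
        "population": sum(loves) / len(loves),
        "voluntary": sum(volunteers) / len(volunteers),
        "srs": sum(srs) / len(srs),
    }


if __name__ == "__main__":
    for name, value in simulate_poll().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the voluntary response estimate lands well above the true population share, while the simple random sample of the same size stays close to it. The gap comes entirely from who chooses to respond, which is the point made about David's website poll.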