Example of undercoverage introducing bias

Nonresponse, undercoverage, and voluntary responses can all introduce bias when we sample a population for a study. Given the description of a study, we can think about potential sources of bias, and how they may have impacted the results of the study.

Want to join the conversation?

  • Sreevathsa
    It's not made particularly clear how to tell whether the result is an overestimate or an underestimate in a typical case of undercoverage bias.

    However, I can think of an explanation suggesting that it could be an underestimate in this example. They sampled only landline users, and not all of them may be using the internet. The ones left out are mobile phone users, who are more likely to use the internet on their phones. A separate study asking whether internet usage differs between landline users and mobile phone users could give us a better picture. Assuming that mobile users are more likely to use the internet than landline users, and that a sample that includes potential non-internet users shows 42% concern about internet privacy, you could argue that the number would go up if you included mobile phone users. Your thoughts on this?
    (13 votes)
  • Dayvyd
    If undercoverage is a concern, is there also such a thing as overcoverage? Since undercoverage means leaving out those who should be included, would overcoverage mean including individuals in the population who don't belong there (e.g., counting men as part of the population for a pregnancy study)?
    (10 votes)
  • Mathmaster600
    What are all the types of biases?
    (6 votes)
    • daniella
      There are several types of biases that can affect research outcomes, including:

      Selection Bias: Occurs when individuals or groups are systematically excluded or included in a study, leading to a biased sample.
      Measurement Bias: Arises from errors or inaccuracies in the measurement process, such as flawed instruments or observer bias.
      Sampling Bias: Results from a non-random selection of participants, leading to a sample that does not accurately represent the population of interest.
      Response Bias: Occurs when participants provide inaccurate or misleading responses due to social desirability, leading questions, or other factors.
      Confirmation Bias: Involves seeking or interpreting information in a way that confirms one's pre-existing beliefs or hypotheses, while ignoring contradictory evidence.
      Reporting Bias: Arises when certain outcomes or results are selectively reported or emphasized, leading to an incomplete or biased portrayal of the findings.
      Observer Bias: Occurs when researchers' expectations or preferences influence their observations or interpretations of data, leading to biased conclusions.
      (1 vote)
  • Yash
    If the choices for the question presented here were 'Convenience sampling' and 'Under coverage', which one would have been the correct answer?
    (4 votes)
  • S M
    The videos on bias start off without an introduction on what types of biases exist and what they each mean. Perhaps I missed the video.
    I would like to know the kinds of biases that exist and explanations on them. Can someone point me in the right direction?
    (5 votes)
  • inwayi2
    If there is undercoverage bias, how do we know that 42% is an underestimate? What if the unsampled people (the unlisted people) were also 42% concerned? Or what if they brought the percentage up? Is that still an "underestimation"?
    (2 votes)
    • Rowan Belt
      Sal was saying that it is likely an underestimate because of the nature of the particular scenario. The people who were not included in the survey (mobile and unlisted phone numbers) may, according to Sal's logic, be more concerned about internet privacy than the people who are in the yellow pages. Therefore, if you took a broader survey including mobile and unlisted numbers, you would be likely to see that the percentage of people who are "very concerned about internet privacy" is more like 43% or 44%, if not higher. That is what is meant by the statement that 42% is an underestimate. I hope this has helped more than it has confused. :)
      (6 votes)
  • lucero
    A survey of high school students to measure teenage use of illegal drugs will be a biased sample because it does not include home-schooled students or dropouts. A sample is also biased if certain members are underrepresented or overrepresented relative to others in the population.
    (3 votes)
  • An_Awesome_Person
    Don't you also get pro-privacy bias from the repeated calling until you get a response? After all, you invaded people's privacy and made them mad by calling them a bunch of times, so they'll value privacy more than they usually do.
    (3 votes)
  • Peach1209
    I have a question about the exercise. Here is what is confusing: "People who listen to David's show probably like it in the first place, and those that choose to take the time to visit the website and respond to the poll probably feel even stronger than the typical listener. 89% is probably an overestimate of the percentage of all listeners that love the show." Why is it an overestimate and not an underestimate? If more people who like the show did not respond, then 89% should be an underestimate, not an overestimate. Does anyone know?
    (2 votes)
  • Jalyn Beecham
    What are the types of bias?
    (3 votes)
    • The Telepath
      There are many types of bias, but some of the ones mentioned in this video are non-response (some people not responding to the survey), undercoverage bias (not able to sample from part of the population), and voluntary response sampling (when people volunteer to be part of the survey).
      Hope this helps! :)
      (1 vote)

Video transcript

- [Instructor] A senator wanted to know how people in her state felt about internet privacy issues. She conducted a poll by calling 100 people whose names were randomly sampled from the phone book. Note that mobile phones and unlisted numbers are not in phone books. The senator's office called those numbers until they got a response from all 100 people chosen. The poll showed that 42% of respondents were very concerned about internet privacy. What is the most concerning source of bias in this scenario?

We should also think about what kind of bias that likely introduces. Is 42% likely to be an overestimate or an underestimate of the percentage of constituents who are very concerned? Maybe you think there's no bias here, but "no bias" is not one of the choices; it's going to be one of these three: nonresponse, undercoverage, or voluntary response sampling. So I encourage you to pause this video and think about it. We're a senator, and we're trying to figure out what percentage of our constituents are very concerned about internet privacy. We go to the phone book, we sample 100 people, we keep calling them until they answer, and we get that 42% are very concerned. So what's the source of bias?

Alright, now let's work through this together. Nonresponse would have been the case if we had selected these 100 people and, let's say, only 50 answered the phone and we didn't keep calling the rest. Then 50 of the people we sampled didn't respond. What was it about those 50 people? Maybe there was something about them that skewed the survey, and had we reached them, we would have gotten better data. But in this case, they tell us the senator's office called those numbers until they got a response from all 100 people chosen. So nonresponse is not going to be an issue here.

Alright, next choice: undercoverage. Undercoverage is where you're not able to sample from part of the population, and leaving that part out might introduce bias. Now let's think about what happened in this situation. We are a senator. We want to sample from all of our constituents, but instead we sample only from the constituents who happen to be listed in the phone book. So we're not sampling from people who have landlines but are unlisted, and we're not sampling from people who don't have landlines and only have mobile phones.

You might say, why is that important? Well, think about it: some of the people who decide not to list in the phone book, or who don't even have a landline, might be a little more concerned about privacy than everyone else. They explicitly chose not to be listed. So undercoverage is definitely a very concerning source of bias here. We are sampling from only a subset of the entire population we care about, and in particular, we're missing people who might care about privacy. So I would say that, because of undercoverage, 42% is likely to be an underestimate of the percentage of people very concerned about internet privacy. Probably a higher proportion of the people left out care about privacy, because they're unlisted or they don't even have a landline. So undercoverage probably introduced bias, and it implies that 42% is an underestimate of the percentage of the senator's constituents who care about internet privacy.

Now the last choice, voluntary response sampling. This would be the case where the senator, I don't know, put up a billboard or told a bunch of people, maybe on her website, "Hey, vote on this," or "Tell us how much you care about internet privacy." The source of bias there is: who shows up on that website? If you say, "Hey, come to my website and fill it out," you're only getting information from the subset of your population who choose to volunteer. That is not what she did here. She didn't ask 100 people to volunteer. Her team went out and got them from the phone book. So this was definitely a case of undercoverage.
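
The direction of the bias can be checked with a small simulation. The sketch below is only an illustration, not part of the video: the share of constituents listed in the phone book (60%) and the concern rate among unlisted or mobile-only constituents (55%) are made-up assumptions, and the function names are hypothetical. Only the 42% figure and the sample size of 100 come from the scenario. The code compares the concern rate in the whole population with the average result of polls that can only reach listed constituents.

```python
import random


def population_rate(listed_share, p_listed, p_unlisted):
    """Share of ALL constituents who are very concerned (a weighted average)."""
    return listed_share * p_listed + (1 - listed_share) * p_unlisted


def average_phone_book_poll(p_listed, sample_size=100, n_polls=2000, seed=0):
    """Average result of many polls that can only reach listed constituents."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_polls):
        # Every respondent comes from the phone book, so each one is
        # "very concerned" with probability p_listed, never p_unlisted.
        concerned = sum(rng.random() < p_listed for _ in range(sample_size))
        total += concerned / sample_size
    return total / n_polls


# Hypothetical inputs, chosen only for illustration.
LISTED_SHARE = 0.60  # assumed share of constituents in the phone book
P_LISTED = 0.42      # concern rate among listed constituents (poll figure)
P_UNLISTED = 0.55    # assumed higher concern rate among everyone else

truth = population_rate(LISTED_SHARE, P_LISTED, P_UNLISTED)
poll = average_phone_book_poll(P_LISTED)
print(f"true rate among all constituents: {truth:.3f}")
print(f"average phone-book poll result:   {poll:.3f}")
print(f"bias (poll - truth):              {poll - truth:+.3f}")
```

Under these assumptions the phone-book polls settle near the listed group's rate, below the rate for all constituents, which is what "42% is an underestimate" means. The direction depends entirely on the assumed `P_UNLISTED`: set it equal to `P_LISTED` and the bias disappears, set it lower and the poll becomes an overestimate.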