Identifying individuals, variables and categorical variables in a data set

VAR‑1 (EU)
VAR‑1.B (LO)
VAR‑1.B.1 (EK)
VAR‑1.C (LO)
VAR‑1.C.1 (EK)
VAR‑1.C.2 (EK)

Video transcript

We're told that millions of Americans rely on caffeine to get them up in the morning. Which is true, although I'm very sensitive: if I drank caffeine in the morning, I wouldn't be able to sleep at night. "Here's nutritional data on some popular drinks at Ben's Beans coffee shop." All right, so here we have the names of the different drinks, and then here we have the type of the drink, and it looks like they're either hot or cold. Here we have the calories for each of those drinks, here we have the sugar content in grams for each of those drinks, and here we have the caffeine in milligrams for each of those drinks.

Then we are asked, "The individuals in this data set are," and we have three choices: Ben's Beans customers, Ben's Beans drinks, or the caffeine contents. Now, we have to be careful. When someone says the individuals in a data set, they don't necessarily mean people; they could be things. Each of the rows in this data set refers to a certain type of drink at Ben's Beans coffee shop. So the different types of drinks that Ben's Beans offers, those are the individuals in this data set. They're Ben's Beans drinks.

Next they ask us, "The data set contains," and they ask how many variables, and how many of those variables are categorical. So let's look at the variables. This first column is essentially telling us which drink each row is. This wouldn't be a variable; it would be more of an identifier. But all of these other columns are representing variables. For example, type is a variable. It can be either hot or cold, and because it can only take on one of a limited number of buckets, it's going to fit into one category or another. You don't need exactly two categories, you could have more than two, but it isn't a numerical variable that can take on a bunch of different values. So this right over here is a categorical variable.

Calories is not a categorical variable. You could have something with 4.1 calories, you could have something with 178; things aren't fitting into nice buckets. Same thing for sugar and for caffeine: these are quantitative variables that don't just fit into a category. So here I would say that we have four variables, one, two, three, four, one of which is categorical. So that would be choice A over here.
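
The reasoning in the video can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transcript never reads out the table's values, so the drink names and numbers below are hypothetical placeholders, and the column names (`sugar_g`, `caffeine_mg`) and the helper `classify_variables` are made up for this sketch. The rule it uses, "all-numeric columns are quantitative", is a simplification that happens to work for this table.

```python
# Hypothetical rows: the real table's values are not in the transcript,
# so these names and numbers are placeholders for illustration only.
drinks = [
    {"drink": "Drink A", "type": "hot", "calories": 4.1, "sugar_g": 0.0, "caffeine_mg": 260.0},
    {"drink": "Drink B", "type": "cold", "calories": 178.0, "sugar_g": 30.0, "caffeine_mg": 75.0},
    {"drink": "Drink C", "type": "hot", "calories": 120.0, "sugar_g": 17.0, "caffeine_mg": 150.0},
]

# The first column names each individual; it is an identifier, not a variable.
IDENTIFIER = "drink"


def classify_variables(rows, identifier):
    """Split the non-identifier columns into categorical and quantitative."""
    categorical, quantitative = [], []
    for column in rows[0]:
        if column == identifier:
            continue
        values = [row[column] for row in rows]
        # Assumption: a column is quantitative when every value is a number.
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        (quantitative if is_numeric else categorical).append(column)
    return categorical, quantitative


categorical, quantitative = classify_variables(drinks, IDENTIFIER)
individuals = [row[IDENTIFIER] for row in drinks]  # one individual per row

print("individuals:", individuals)
print("number of variables:", len(categorical) + len(quantitative))
print("categorical:", categorical)
print("quantitative:", quantitative)
```

With these placeholder rows, the sketch counts four variables, one of which (`type`) is categorical, matching the answer in the video. Note that the numeric check is not a general rule: a column of numeric codes, such as ZIP codes, would still be categorical.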