## AP®︎/College Statistics

### Course: AP®︎/College Statistics > Unit 1

Lesson 1: The language of variation: Variables

# Identifying individuals, variables and categorical variables in a data set

AP.STATS: VAR‑1 (EU), VAR‑1.B (LO), VAR‑1.B.1 (EK), VAR‑1.C (LO), VAR‑1.C.1 (EK), VAR‑1.C.2 (EK)

Identifying individuals, variables and categorical variables in a data set.

## Want to join the conversation?

- What does categorical mean? (53 votes)
- It means the data in the set can be sorted into categories, in this case hot drinks and cold drinks. The sugar content, on the other hand, is not categorical, because it is a measured amount that can take many different numerical values.

Hope this helps! (96 votes)

- Why isn't the type of drink classified as a variable? (19 votes)
- The first column, which names each drink, is not a variable because it identifies the individual rather than describing or measuring it. "Type," on the other hand, is a categorical variable because it describes whether the drink is hot or cold. "Sugars" is a quantitative variable because it measures the amount of sugar in the drinks. (36 votes)

- What are the prerequisites for the Statistics and Probability course? (6 votes)
- Algebra is a must for any Statistics and Probability course.

Whether or not calculus is also required depends on how deeply the course goes into probability theory. If the course covers topics such as probability density functions of continuous random variables, cumulative distribution functions of continuous random variables, moment generating functions, and/or maximum likelihood estimators, then calculus would be required. (23 votes)

- What is standard deviation? (5 votes)
- Standard deviation is a measure of the spread of the data. A high standard deviation means the data points tend to lie far from the mean, while a low one means they tend to lie close to it.

Hope this helps! (14 votes)
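
As a rough illustration of the answer above, the sketch below compares two small lists of calorie counts that share the same mean. The numbers are made up for this example and are not taken from the video, and `statistics.pstdev` (the population standard deviation in Python's standard library) is just one reasonable choice of formula.

```python
import statistics

# Hypothetical calorie counts, invented for illustration (not from the video).
tight = [98, 100, 102]    # values close to the mean
spread = [50, 100, 150]   # values far from the mean

# Both lists have the same mean, so only the spread differs.
mean_tight = statistics.mean(tight)
mean_spread = statistics.mean(spread)

# Population standard deviation: the square root of the average
# squared distance from the mean.
sd_tight = statistics.pstdev(tight)
sd_spread = statistics.pstdev(spread)

print(mean_tight, mean_spread)
print(sd_tight < sd_spread)
```

The second list has the larger standard deviation because its values sit farther from the shared mean.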

- Where can you find the "individuals" of a data set in a given table? (3 votes)
- An individual is what the data is describing. In a table like this, each individual is represented by one row. So in this case, the individuals would be the drinks. An example individual is cappuccino, which is a hot coffee that has 60 calories, 8 grams of sugar, and 75 milligrams of caffeine. (8 votes)
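
One way to picture "one row per individual" is to write the cappuccino row as a single record. This is only a sketch: the values come from the answer above, while the key names (`drink`, `sugars_g`, and so on) are assumptions chosen for this example.

```python
# One row of the table = one individual.
# The values are from the answer above; the key names are hypothetical.
cappuccino = {
    "drink": "Cappuccino",   # identifier: names the individual
    "type": "Hot",
    "calories": 60,
    "sugars_g": 8,
    "caffeine_mg": 75,
}

# A data set is a collection of such rows; here it holds a single individual.
data_set = [cappuccino]

# The individuals are whatever the rows stand for, in this case drinks.
individuals = [row["drink"] for row in data_set]
print(individuals)
```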

- Why are there no missions for statistics and probability? If there are, can you tell me where? (3 votes)
- Khan Academy is getting rid of Missions in June 2020, so they have not been adding them to existing courses. If you are wondering about mastery challenges or practice problems, you should be able to access problems and quizzes either by assigning AP Statistics as one of your courses or by looking at the course overview here: https://www.khanacademy.org/math/ap-statistics/analyzing-categorical-ap

Hope this helps! (4 votes)

- How can I get this lesson as a PDF or in other formats? (2 votes)
- While some KA courses also have documentation, I don't see any documentation for the statistics course.

In this case, you can either download the video for offline use (if that's what you're looking for), or you can look for another textbook.

The introductory statistics book from OpenStax (https://openstax.org/details/books/introductory-statistics) is written by professors and experts and is free to use. It contains the same info discussed here and can be downloaded as a PDF.

Hope this helps and good luck in your learning adventures :) (2 votes)

- What does quantitative mean in the Practice lesson after this video? (4 votes)
- I have just figured this out after not realising that some of the questions asked for the number of categorical variables and others asked for the number of quantitative variables.

* Categorical refers to non-numerical variables

* Quantitative refers to numeric variables

So if the variables are numbers they are quantitative; if they are words they are categorical. (0 votes)

- I’m looking at this problem from a machine learning perspective as well. In that sense, wouldn’t the column ‘drink name’ also be counted as a categorical variable? Please help me out here.

Thanks! (2 votes)
- Similar to how Sal explained in the video, the drink name **would not** be a categorical variable, because the other variables are all **describing it**; therefore, it is an **individual**. (Variables describe the individual.)

Hope this helps! (1 vote)

- Okay, so categorical is more like something that describes it, like in this example, whether the coffee is hot or cold, and quantitative is more like the caloric intake, the actual amounts with numbers? (2 votes)

## Video transcript

- [Narrator] We're told that millions of Americans rely on caffeine to get them up in the morning. Which is true, although, if I drank caffeine in the morning, I'm very sensitive. I wouldn't be able to sleep at night. Here's nutritional data on some popular drinks at Ben's Beans coffee shop.

All right, so here we have the different names of the drinks. And then here we have the type of the drink, and it looks like they're either hot or cold. Here we have the calories for each of those drinks. Here we have the sugar content in grams for each of those drinks. And here we have the caffeine in milligrams for each of those drinks.

And then we are asked, the individuals in this data set are, and then we have three choices. Ben's Beans customers. Ben's Beans drinks. Or the caffeine contents. Now, we have to be careful. When someone says the individuals in a data set, they don't necessarily mean that they have to be people. They could be things. And the individuals in this data set, each of these rows, they're referring to a certain type of drink at Ben's Beans coffee shop. So the different types of drinks that Ben's Beans offers, those are the individuals in this data set. So they're Ben's Beans drinks.

Next, they ask us the data set contains, and they say how many variables and how many of those variables are categorical. So if we look up here, let's look at the variables. So this first column is essentially giving us the type of drink. This wouldn't be a variable, this would be more of an identifier. But all of these other columns are representing variables.

So, for example, type is a variable. It can either be hot or cold. And because it can only take on one of kind of a number of buckets, it's either going to be hot or cold. It's going to fit in one category or another. And you don't just have two categories, you could have more than two categories. But it isn't just some type of variable number that could take on a bunch of different values. So this right over here is a categorical variable.

Calories is not a categorical variable. You could have something with 4.1 calories. You could have something with 178. Things aren't fitting into nice buckets. Same thing for sugars and for the caffeine. These are quantitative variables that don't just fit into a category. And so here I would say that we have four variables, one, two, three, four. One of which is categorical. So that would be choice A over here.
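
The counting at the end of the transcript can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions rather than the method used in the video: the key names are invented, only the cappuccino values are quoted on this page, the second row is made up, and the rule "numbers are quantitative, text labels are categorical" is a simplification that fits this table.

```python
# Column names follow the transcript; the key spellings are assumptions.
# Only the cappuccino values are quoted on this page. The iced drink is made up.
drinks = [
    {"drink": "Cappuccino", "type": "Hot",
     "calories": 60, "sugars_g": 8, "caffeine_mg": 75},
    {"drink": "Hypothetical iced coffee", "type": "Cold",
     "calories": 4.1, "sugars_g": 0, "caffeine_mg": 120},
]

IDENTIFIER = "drink"  # names each individual, so it is not counted as a variable


def classify_variables(rows, identifier):
    """Label every non-identifier column as categorical or quantitative."""
    labels = {}
    for column in rows[0]:
        if column == identifier:
            continue
        values = [row[column] for row in rows]
        # Simplifying assumption: numeric values mean quantitative,
        # anything else (here, text labels) means categorical.
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        labels[column] = "quantitative" if is_numeric else "categorical"
    return labels


labels = classify_variables(drinks, IDENTIFIER)
categorical = [name for name, kind in labels.items() if kind == "categorical"]

print(len(labels), "variables,", len(categorical), "categorical")
```

Skipping the identifier column is what keeps the count at four variables, with `type` as the only categorical one.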