## Statistics and probability

Lesson 1: Statistical questions

# Statistical and non-statistical questions

A statistical question is one that can be answered by collecting data and where there will be variability in that data. For example, there will likely be variability in the data collected to answer the question, "How much do the animals at Fancy Farm weigh?" but not to answer, "What color hat is Sara wearing?". Created by Sal Khan.
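The definition above can be sketched in a few lines of Python. This is only an illustration: the weights and the hat color are made-up values, and `has_variability` is a hypothetical helper, not something from the lesson. It checks whether the collected answers differ from one another, which is the property that makes a question statistical.

```python
from statistics import mean

# Hypothetical weights (kg) of the animals at Fancy Farm; the numbers are made up.
farm_weights_kg = [410.0, 62.5, 3.2, 515.0, 78.0]

# "What color hat is Sara wearing?" has a single answer, so one observation is enough.
sara_hat_colors = ["red"]


def has_variability(values):
    """Return True when the collected answers are not all identical."""
    return len(set(values)) > 1


print(has_variability(farm_weights_kg))   # True: the weights differ, so we summarize them
print(has_variability(sara_hat_colors))   # False: one fixed answer, no statistics needed
print(round(mean(farm_weights_kg), 1))    # one possible summary of the varying weights
```

When the data varies, a summary such as the mean stands in for the whole data set; when it does not, the single observed value is already the answer.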

## Want to join the conversation?

- At 5:32, it is said that the question "Find the difference in rainfall in Seattle and Singapore in 2013" is not a statistical question. However, to find the amount of rainfall, you have to measure the rainfall on each day. It is data with variability, so shouldn't it be a statistical question? (16 votes)
- If the question asked to compare the total rainfall in 2013 to the average over the past five years, then that would be a statistical question. Look for key words for measures of central tendency and trends, rather than computation of exact amounts. (5 votes)

- Would a non-statistical question like, "What is 4x5?" become a statistical question if someone had the wrong answer? (4 votes)
- I believe it could be a statistical question if it were something like this:

"How many 3rd graders got the question, 'What is 4 x 5?' wrong on this year's state test compared to last year?"

These types of questions really depend a lot on the exact words used to express the question. (14 votes)

- Are categorical questions statistical questions? For example, "What are the eye colors of the 6th grade class?" While the answer is qualitative rather than quantitative, and would use a bar graph rather than a histogram, is it still a statistical question since it has variability? (8 votes)
- Statistical; it depends on who you ask. I have brown eyes, but I can bet that the next person has blue. But then again, can't all questions have variation in them? (1 vote)

- At 7:17, is it only a statistical question if you mention "in 2013" before you added it in the last example, or can it be statistical either way? (3 votes)
- Without the "in 2013", it is kind of a complicated case. You have to think about what you want to consider. If you say throughout the history of the schools or since 1800, then this does become a statistical question no matter how you look at it. However, considering other things (like in 2013), you have concrete numbers that you can check the difference of, making it not a statistical question. (2 votes)

- Question 8 asks: "In general, will I use less gas driving at 55 than at 70 mph?" The instructor states that variability in road conditions and in automobiles means this question is statistical; however, the question asks if, in general, ***I*** will use less gas. Because I can only drive one automobile at any given time, all of the conditions of any given trip are constant regardless of the speed at which I drive and the only variable is my speed. If in all cases I would use less gas driving at 55 than at 70 mph, does this not mean that there is no variability in the outcomes and therefore this cannot be a statistical question? To put it a different way, because there will never be a time when I get into a car and say, "Hey! With this car and these road conditions I can save more gas driving at 70 than I would driving at 55", there is nothing to analyze. Hmm. Or perhaps there is some rule of thumb which says that any time you see the words "In General" or "On Average", it is automatically a statistical question regardless of the particulars of the question? Such a rule should be stated explicitly. But then it would follow that the question "In general what base does the binary number system use?" would also be a statistical question (which it is not). For these reasons, question 8 can't be statistical. What do you think? Thanks. (3 votes)
- It is statistical because you drive the car multiple times, thus coming up with a set of numbers rather than just one. (2 votes)

- Hi. Do statistical questions have an origin? (3 votes)
- What sorts of follow-up questions about the statistics might you ask that person in order to obtain the data needed to make a decision about the validity of that statement? (2 votes)
- I am greatly pleased with how Khan works and how things are generated. I'll let you know the usernames. The filter you guys use is broken; I just noticed this today, 2/12/2021.

THIS is urgent. (1 vote)
- Really? What doesn't work? I've been here for years and I've seen many updates but I'm fairly certain everything works. (3 votes)

- In the last question, even with a clause like "in 2013", it might still be a statistical question, right? To find the highest paid professor, we need to know the salary of every professor in that department in that university. From that list we take the maximum. Isn't this similar to taking the mean from the list? And would it be a statistical question then? Please correct me if I am wrong. (2 votes)
- Mean does not equal the max. Also, there would only be one highest paid professor, even if there are other professors in the university. This would be like saying that "How old are you?" is a statistical question because you need to find every person in the world and take "you" from that list. So no, this is not a statistical question. (1 vote)

- My teacher had a game that had a question asking, "Are the students at the school or the staff at the school more diverse?" I said that it was a non-statistical question, but the game said that I was incorrect. Could I get clarification on why? (2 votes)

## Video transcript

What I want to do
in this video is think about the
types of questions that we need
statistics to address and the types of
questions that we don't need statistics
to address. We could call the ones
where we need statistics as statistical questions. And I'll circle the statistical
questions in yellow. I encourage you to
pause this video and try to figure this
out yourself first. Look at each of these
questions, and think about whether you think you
need statistics to answer this question or you
don't need statistics, whether these are
statistical questions or not. I'm assuming you've
given a pass at it. Now we can go through
this together. This first question
is, how old are you? So we're talking about how
old is a particular person. There is an answer
here, and we don't need any tools of
statistics to answer this. So this is not a
statistical question. How old are the people who have
watched this video in 2013? Now this is interesting. We're assuming that multiple
people will have watched this video in 2013, and
that they're not all going to be the same age. There's going to be some
variability in their age. One person might
be 10 years old. Another person might be 20. Another person might be 15. So what answer do you give here? Do you give all of the ages? But we want to get a
sense of in general. How old are the people? So this is where statistics
might be valuable. We might want to find some
type of central tendency, an average or a median
age for this. So this is absolutely
a statistical question. And you might already be
seeing kind of a pattern here. The first question,
we were asking about a particular person. There was only one answer here. There's no variability
in the answer. The second one, we're
asking about a trait of a bunch of people,
and there's variability in that trait. They're not all the same age. And so we'll need
statistics to come up with some features
of the data set to be able to make
some conclusions. We might say, on
average, the people who have watched
this video in 2013 are 18 years old,
or 22 years old, or the median is 24 years
old, whatever it might be. Do dogs run faster than cats? Once again, there are
many dogs and many cats, and they all run at
different speeds. Some dogs run faster
than some cats, and some cats run
faster than some dogs. So we would need some statistics
to get a sense of in general or on average how fast do dogs
run and then maybe on average how fast do cats run. Then we could compare
those averages, or we could compare the
medians in some way. So this is definitely
a statistical question. Once again, we're talking
about, in general, a whole population of dogs,
the whole species of dogs, versus cats. And there's variation
in how fast dogs run and how fast cats run. If we were talking
about a particular dog and a particular cat, well, then
there would just be an answer. Does dog A run
faster than cat B? Well, sure. That's not going to be
a statistical question. You don't have to use
the tools of statistics. And this next question
actually fits that pattern. Actually, no, this fits the
pattern of the previous one. Do wolves weigh more than dogs? Once again, there are some very
light dogs and some very heavy wolves. So those wolves definitely
weigh more than those dogs. But there's some very,
very, very heavy dogs. And so what you want to do here,
because we have variability in each of these,
is you might want to come up with some
central tendency. On average, what's the
median wolf weight? What's the average,
the mean wolf weight? Compare that to the
mean dog's weight. So once again,
since we're speaking in general about wolves,
not a particular wolf, and in general about dogs, and
there's variation in the data, and we're trying to glean some
numbers from that to compare, this is definitely a
statistical question. Does your dog weigh
more than that wolf? And we're assuming that we're
pointing at a particular wolf. Now this is the particular. We're comparing a particular
dog to a particular wolf. We can put each of them
on a weighing machine and come up with
an absolute answer. There's no variability in
this dog's weight, at least at the moment that we weigh it,
no variability in this wolf's weight at the moment
that we weigh it. This is not a
statistical question. I'll put an x next
to the ones that are not statistical questions. Does it rain more in
Seattle than Singapore? Once again, there
is variation here. And we would also
probably want to know, does it rain more in Seattle
than Singapore in a given year, over a decade, or whatever? But regardless of those
questions, however we ask it, in some years, it might
rain more in Seattle. In other years, it might
rain more in Singapore. Or if we just picked Seattle,
it rains a different amount from year to year. In Singapore, it rains a
different amount from year to year. So how do we compare? Well, that's where the
statistics could be valuable. There's variability in the data. So we can look at the
data set for Seattle and come up with some type
of an average, some type of a central
tendency, and compare that to the average, the mean,
the mode-- the mode probably wouldn't be that useful
here-- to Singapore. So this is definitely
a statistical question. What was the
difference in rainfall between Singapore
and Seattle in 2013? Well, these two
numbers are known. They can be measured. Both the rainfall in
Singapore can be measured. The rainfall in Seattle
can be measured. And assuming that this
has already happened and we can measure them, then
we can just find the difference. So you don't need
statistics here. You just have to have
both of these measurements and subtract one from the other. So not a statistical question. In general, will I use less
gas driving at 55 miles an hour than 70 miles per hour? This feels statistical,
because it probably depends on the circumstance. It might depend on the car. Or even for a given car, when
you drive at 55 miles per hour, there's some variation
in your gas mileage. It might be how recent an oil
change happened, what the wind conditions are like, what
the road conditions are like, exactly how
you're driving the car. Are you turning? Are you going in
a straight line? And same thing for
70 miles an hour. When we're saying
in general, there's variation in what the gas
mileage is at 55 miles an hour and at 70 miles an hour. What you'd probably
want to do is say, well, what's my average mileage when
I drive at 55 miles an hour and compare that to the average
mileage when I drive at 70. So because we have
this variability in each of those cases,
this is definitely a statistical question. Do English professors get paid
less than math professors? Once again, all
English professors don't get paid the same
amount, and all math professors don't get paid the same amount. Some English professors
might do quite well. Some might make very little. Same thing for math professors. So we'd probably want to
find some type of an average to represent the central
tendency for each of these. Once again, this is a
statistical question. Does the most highly paid
English professor at Harvard get paid more than the
most highly paid math professor at MIT? Well, now we're talking about
two particular individuals. You could go look
at their tax forms, see how much each
of them gets paid, especially if we assume that
this is in a particular year. Let's just make it that
way, say in 2013, just so that we can remove some
variability that they might make from year to year, make
it a little bit more concrete. If the question was, "Does the most
highly paid English professor at Harvard get paid more than
the most highly paid math professor at MIT
in 2013?" then you have an absolute number
for each of these people. And then you could just
compare them directly. So when we're talking
about a particular year, particular people, then it
isn't a statistical question.
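
The distinction drawn throughout the video, between comparing groups and comparing particular individuals, can be sketched in Python. This is a rough illustration only: the speeds are invented numbers, and `summarize` is a hypothetical helper, not part of the lesson. The group question needs a measure of central tendency, while the individual question is a direct comparison of two known values.

```python
from statistics import mean, median

# Hypothetical top speeds in km/h; the values are made up for illustration.
dog_speeds = [30.0, 45.0, 50.0, 38.0, 60.0]
cat_speeds = [35.0, 40.0, 48.0, 42.0, 30.0]


def summarize(values):
    """Return two measures of central tendency for a data set."""
    return {"mean": mean(values), "median": median(values)}


dogs = summarize(dog_speeds)
cats = summarize(cat_speeds)

# Statistical question: "Do dogs run faster than cats?"
# The speeds vary within each group, so we compare summaries.
dogs_faster_on_average = dogs["mean"] > cats["mean"]

# Non-statistical question: "Does dog A run faster than cat B?"
# Two particular animals, two known numbers, one direct comparison.
dog_a, cat_b = dog_speeds[0], cat_speeds[0]
dog_a_faster = dog_a > cat_b

print(dogs, cats)
print(dogs_faster_on_average)  # True for this made-up data
print(dog_a_faster)            # False: this particular dog is slower than this cat
```

With these invented numbers the two questions get different answers, which shows why the wording of a question (a whole group versus a particular individual) decides whether statistics is needed.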