© 2023 Khan Academy

# Random variables

Basic idea and definitions of random variables. Created by Sal Khan.

## Want to join the conversation?

- At 1:24, could you define the variables Heads and Tails using the numbers 1 and 2, and then state 0 as a value that cannot be an outcome? (14 votes)
- You can define them however you want. 0 and 1 are nice if you end up calculating sums; 1 and 2 would interfere with each other without special care. (17 votes)
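
To illustrate the point about sums, here is a minimal sketch. The sample flips and the names `encode_01`, `encode_12`, and `total` are made up for illustration. With the 0/1 encoding the sum of the values is directly the number of heads, while with the 1/2 encoding you need an extra correction step to recover it.

```python
# Hypothetical sequence of coin flips, used only for illustration.
flips = ["H", "T", "H", "H", "T"]

# Two possible encodings of the same outcomes.
encode_01 = {"H": 1, "T": 0}
encode_12 = {"H": 1, "T": 2}


def total(flips, encoding):
    """Sum the numeric values assigned to each flip."""
    return sum(encoding[f] for f in flips)


# With 0/1, the sum is already the number of heads.
heads_from_01 = total(flips, encode_01)

# With 1/2, sum = heads + 2 * tails = 2 * n - heads,
# so the head count needs the "special care" of a correction.
heads_from_12 = 2 * len(flips) - total(flips, encode_12)

print(heads_from_01, heads_from_12)
```

Both encodings are legitimate random variables; the 0/1 choice just keeps the arithmetic simpler.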

- Could you explain the difference between random and arbitrary? And also, how does it relate to probability theory? (4 votes)
- From the Oxford English Dictionary:

Random (Statistics): Governed by or involving equal chances for each of the actual or hypothetical members of a population; (also) produced or obtained by such a process, and therefore unpredictable in detail.

Arbitrary: Derived from mere opinion or preference; not based on the nature of things; hence, capricious, uncertain, varying.

When someone says "pick a random number", the following definition might apply:

"Having no definite aim or purpose; not sent or guided in a particular direction; made, done, occurring, etc., without method or conscious choice; haphazard." (4 votes)

- I'm a bit confused; how can we decide if something is a random variable or not? (4 votes)
- As the words suggest, "random" means the value cannot be predicted in advance, and "variable" means a quantity whose value can change (the idea is the same in computer science and in math). So a random variable is a variable whose value is determined by the outcome of a random process. To make it simpler, take the example of rolling a die: we cannot predict beforehand which face will be up, so the process is random. Say the first roll gives 2, and subsequent rolls give 4, then 5, then 6, and so on. The faces come up at random, and we cannot predict whether the next roll will be 3 or 5 or 6 or 2 or anything else. I hope this explains the concept of a random variable. There are two types of random variable, discrete and continuous. A discrete random variable takes separate, countable values, e.g. a number of people (we cannot have 2.5 or 3.5 persons); a continuous one can take any value in a range, e.g. the height of a person, time, etc. (4 votes)

- At 0:53, Sal defines a random variable X as 1 if heads and 0 if tails. If he had defined X as H if heads and T if tails, would X be a random variable? Why or why not? (3 votes)
- With H and T you would only be labeling the outcomes, not quantifying them. The reason you want to assign numbers to them is that it becomes easy to deal with the possible outcomes. Imagine I have two coins and I use the definition Sal gave for X. I flip the coins 100 times each and write down the results. Then I want to know which coin varied the most. For this problem, I could use the concept of standard deviation: compute it from my results and see which coin has the greater tendency for dispersion. Quantifying the events gives us that much power to better analyze them.

Hope this helped! (5 votes)
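
The two-coin comparison described above could be sketched as follows. This is only an illustration: the helper `flip_coin`, the head probabilities 0.5 and 0.9, and the fixed seed are assumptions, not anything from the video.

```python
import random
from statistics import pstdev


def flip_coin(p_heads, n, rng):
    """Return n flips encoded as 1 (heads) or 0 (tails)."""
    return [1 if rng.random() < p_heads else 0 for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable

coin_a = flip_coin(0.5, 100, rng)  # assumed fair coin
coin_b = flip_coin(0.9, 100, rng)  # assumed heavily biased coin

# Because heads/tails are numbers, the standard deviation is defined.
print("coin A std dev:", pstdev(coin_a))
print("coin B std dev:", pstdev(coin_b))
```

With the labels "H" and "T" instead of 1 and 0, `pstdev` would have nothing numeric to work with, which is the practical reason for quantifying the outcomes.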

- What does he mean by rolling 7 dice? Is it that the die is rolled 7 times? (2 votes)
- Yes, he means taking one die, rolling it seven times, and summing the results into a total. (You could achieve the same result by rolling 7 dice all at once.) For example, you roll a 5, then a 3, then a 2, then another 5, a 1, a 2, and a 4. The result is 5 + 3 + 2 + 5 + 1 + 2 + 4 = 22. That is the process. Repeat it many times and you get a sample set.

The probabilities he mentioned are, when doing that process: 1) what is the probability that the result is less than or equal to 30, and 2) what is the probability that the result is even. (4 votes)
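
The "repeat it many times" idea can be sketched as a small simulation. The function names `roll_sum` and `estimate`, the number of trials, and the seed are illustrative assumptions, and the dice are assumed to be fair and six-sided. The results are estimates that vary with the seed, not exact probabilities.

```python
import random


def roll_sum(rng, n_dice=7):
    """Roll n_dice fair six-sided dice and return the total."""
    return sum(rng.randint(1, 6) for _ in range(n_dice))


def estimate(event, trials=20_000, seed=0):
    """Estimate P(event(Y)) as the fraction of trials where the event holds."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if event(roll_sum(rng)))
    return hits / trials


print("P(Y <= 30) is roughly", estimate(lambda y: y <= 30))
print("P(Y is even) is roughly", estimate(lambda y: y % 2 == 0))
```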

- Is the difference between a variable 'x' and a random variable 'X' simply that x represents a single number, whilst X represents a set of numbers? So by quantifying the results, you mean that X contains a numerical value for each possible outcome of the random process. (2 votes)
- Well, variables don't have to be single numbers. Take the equation |x| - 4 = 0. If the *absolute value* of x minus four equals zero, then both negative four (-4) and positive four (4) are correct. However, they are fixed values which can be solved for.

The term "random" in random variable really says it all. You can't determine what the result is; rather, you can express the probabilities of certain outcomes. For instance, with ordinary variables, if I want to know what the variable x must be to make y = 0 in the function y = x - 7, I simply solve and find that x must equal 7.

But suppose you say X = the sum of two six-sided dice and put it in the same equation, so y = X - 7. You come to the same result that X must equal 7 for y to be 0; however, you're now incorporating elements (the two dice) which can no longer simply be substituted with a fixed number. So a more logical question involving the random variable becomes: what is the probability that X is equal to 7?

Realistically, the point of the random variable is to define the set of outcomes (the results of two six-sided dice summed, in this example) in the shortest way, to make the *notation* of the math as simple (and easy to write out) as possible. (2 votes)
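
As a rough sketch of that contrast, assuming two fair six-sided dice (the helper `prob` is a name made up for this example): the algebraic equation has a single solution, while for the random variable the meaningful quantity is a probability, obtained here by listing all 36 equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

# Algebra: y = x - 7 with y = 0 has exactly one solution.
x = 0 + 7

# Probability: X = sum of two fair six-sided dice (fairness is an assumption).
outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs


def prob(event):
    """Exact probability that the sum X satisfies the given condition."""
    favorable = sum(1 for a, b in outcomes if event(a + b))
    return Fraction(favorable, len(outcomes))


# P(X = 7), i.e. the probability that y = X - 7 equals 0.
p_seven = prob(lambda s: s == 7)

print(x, p_seven)
```

Six of the 36 pairs sum to 7, so `p_seven` is 1/6.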

- Is a random variable a function? If so, is it possible to plot it? (2 votes)
- Yes. Since each outcome is mapped to only one value, it is a function, and that is the definition of a random variable. It is also possible to plot outcome vs. number, although it isn't needed. We are more interested in each value's probability. (1 vote)
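
A small sketch of the "random variable as a function" view, using an assumed example of two flips of a fair coin. The names `sample_space`, `X`, `mapping`, and `pmf` are chosen for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two flips of a fair coin: HH, HT, TH, TT.
sample_space = ["".join(p) for p in product("HT", repeat=2)]


def X(outcome):
    """Random variable: the number of heads in the outcome."""
    return outcome.count("H")


# Each outcome maps to exactly one number, so X is a function.
mapping = {o: X(o) for o in sample_space}

# Several outcomes can share a value; counting them gives the probabilities.
counts = Counter(mapping.values())
pmf = {value: Fraction(c, len(sample_space)) for value, c in counts.items()}

print(mapping)
print(pmf)
```

The `mapping` dictionary is the outcome-vs-number relationship that could be plotted, while `pmf` holds the probabilities of each value, which is usually the more interesting object.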

- If X is the exact time it takes for a random computer to start up, is it discrete or continuous? (1 vote)
- Unless we are rounding off time, it is considered continuous. (3 votes)

- In the example of random variable Y, what are the outcomes, and what are the numbers assigned to those outcomes? How does the random variable Y work? (1 vote)
- In the video Sal defines 𝑌 as the sum of 7 dice, and I assume he means fair 6-sided dice.

When rolling 7 dice we could get

6, 3, 6, 2, 4, 4, 6

so one possible value for 𝑌 is

6 + 3 + 6 + 2 + 4 + 4 + 6 = 31

The lowest possible value is

1 + 1 + 1 + 1 + 1 + 1 + 1 = 7

and the highest possible value is

6 + 6 + 6 + 6 + 6 + 6 + 6 = 42 (2 votes)
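
For readers who want every value of 𝑌 and not just the extremes, here is one possible sketch that builds the exact distribution one die at a time, still assuming fair six-sided dice. The names `sum_distribution`, `pmf_Y`, and `prob` are hypothetical helpers, not notation from the video.

```python
from collections import defaultdict
from fractions import Fraction


def sum_distribution(n_dice=7, sides=6):
    """Exact distribution of the sum of n_dice fair dice, added one die at a time."""
    dist = {0: Fraction(1)}  # before any die is rolled, the sum is 0 for sure
    for _ in range(n_dice):
        new = defaultdict(Fraction)
        for total, p in dist.items():
            for face in range(1, sides + 1):
                new[total + face] += p / sides
        dist = dict(new)
    return dist


pmf_Y = sum_distribution()


def prob(event):
    """P(Y satisfies event), summed over the exact distribution."""
    return sum(p for y, p in pmf_Y.items() if event(y))


print(min(pmf_Y), max(pmf_Y))            # smallest and largest possible Y
print(float(prob(lambda y: y <= 30)))    # P(Y <= 30)
print(prob(lambda y: y % 2 == 0))        # P(Y is even)
```

The outcomes are the 6^7 possible sequences of seven faces, and 𝑌 assigns each sequence its sum, so many different sequences land on the same value.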

- I did not get the last bit of the first video. Could you tell me the difference between a general variable like 'x' and a random variable X = {all possible outcomes of an event}? Can I say that all the possible outcomes of an event can be denoted by different variables, and a set of these variables is a random variable? (2 votes)
- I'm not an expert, but I think there is a lot of flexibility with this. I disagree with Sal, because there is usually no probability of "A = tossing 7 dice" when no outcome is associated with it. He can use A that way, but I don't think it's a random variable. A random variable can be associated with any outcome, and for outcomes A, B, C, D, etc. you can have random variable X1 = (A or B or ... etc.) or X2 = (A and B and ... etc.), and other combinations of outcomes for which a probability can be calculated. You could certainly have a set of random variables, but that set itself is not a random variable, because there isn't enough information to determine a result. It's just there. (Correct me if I'm wrong.) (0 votes)

## Video transcript

What I want to discuss a little bit in this video is the idea of a random variable. And random variables at first can be a little bit confusing because we will want to think of them as traditional variables that you were first exposed to in algebra class. And that's not quite what random variables are. Random variables are really ways to map outcomes of random processes to numbers. So if you have a random process, like you're flipping a coin or you're rolling dice or you are measuring the rain that might fall tomorrow, you're really just mapping outcomes of that to numbers. You are quantifying the outcomes.

So what's an example of a random variable? Well, let's define one right over here. So I'm going to define random variable capital X. And they tend to be denoted by capital letters. So random variable capital X, I will define it as-- it is going to be equal to 1 if my fair coin lands heads-- let me write it this way-- if heads. And it's going to be equal to 0 if tails. I could have defined this any way I wanted to. This is actually a fairly typical way of defining a random variable, especially for a coin flip. But I could have defined this as 100. And I could have defined this as 703. And this would still be a legitimate random variable. It might not be as pure a way of thinking about it as defining 1 as heads and 0 as tails. But that would have been a random variable. Notice we have taken this random process, flipping a coin, and we've mapped the outcomes of that random process. And we've quantified them. 1 if heads, 0 if tails.

We can define another random variable capital Y as equal to, let's say, the sum of rolls of, let's say, 7 dice. And when we talk about the sum, we're talking about the sum of the 7-- let me write this-- the sum of the upward faces after rolling 7 dice. Once again, we are quantifying an outcome for a random process where the random process is rolling these 7 dice and seeing what sides show up on top. And then we are taking those and we're taking the sum and we are defining a random variable in that way.

So the natural question you might ask is, why are we doing this? What's so useful about defining random variables like this? It will become more apparent as we get a little bit deeper in probability. But the simple way of thinking about it is as soon as you quantify outcomes, you can start to do a little bit more math on the outcomes. And you can start to use a little bit more mathematical notation on the outcome.

So for example, if you cared about the probability that the sum of the upward faces after rolling seven dice-- if you cared about the probability that that sum is less than or equal to 30, the old way that you would have to have written it is the probability that the sum of-- and you would have to write all of what I just wrote here-- is less than or equal to 30. You would have had to write that big thing. And then you would try to figure it out somehow if you had some information. But now we can just write the probability that capital Y is less than or equal to 30. It's a little bit cleaner notation. And if someone else cares about the probability that this sum of the upward faces after rolling seven dice-- if they say, hey, what's the probability that that's even, instead of having to write all that over, they can say, well, what's the probability that Y is even?

Now the one thing that I do want to emphasize is how these are different from traditional variables, traditional variables that you see in your algebra class like x plus 5 is equal to 6, usually denoted by lowercase letters. y is equal to x plus 7. These variables, you can essentially assign values. You either can solve for them-- so in this case, x is an unknown. You could subtract 5 from both sides and solve for x. Say that x is going to be equal to 1. In this case, you could say, well, x is going to vary. We can assign a value to x and see how y varies as a function of x. You can either assign values to them, or you can solve for them. You could say, hey, x is going to be 1 in this case.

That's not going to be the case with a random variable. A random variable can take on many, many, many, many, many, many different values with different probabilities. And it makes much more sense to talk about the probability of a random variable equaling a value, or the probability that it is less than or greater than something, or the probability that it has some property. And you see that in either of these cases. In the next video, we'll continue this discussion and we'll talk a little bit about the types of random variables you can have.