

Lesson 2: Continuous random variables

# Probability density functions

Probability density functions for continuous random variables. Created by Sal Khan.

## Want to join the conversation?

• At one point in the video, Sal says that the two statements P(|Y-2|<.1) and P(1.9<Y<2.1) are the same. Why? •  |Y-2|<.1 is the same as 1.9<Y<2.1 because:
solving "|Y-2|<.1" for the case (Y-2) >= 0 gives "Y-2<.1", which gives "Y < 2.1";
solving "|Y-2|<.1" for the case (Y-2) < 0 gives "-Y+2<.1", which gives "-Y<-1.9", which gives "Y > 1.9".
Search for a lecture about absolute value for more explanation.
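The case analysis above can be spot-checked numerically. This is an illustrative sketch (the sample values are arbitrary choices, not from the video), not a proof:

```python
# Sketch: check that |y - 2| < 0.1 and 1.9 < y < 2.1 agree on a handful of
# arbitrary sample values, including points on and just past the boundary.
def in_abs_form(y):
    return abs(y - 2) < 0.1

def in_interval_form(y):
    return 1.9 < y < 2.1

samples = [1.85, 1.9, 1.95, 2.0, 2.05, 2.1, 2.15]
print(all(in_abs_form(y) == in_interval_form(y) for y in samples))  # True
```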
• I have a hard time wrapping my head around infinity (probably not the first one.)
I get the concept of continuity and that the probability on a specific point is zero. I am just curious... if the area under the curve is 1 but the curve goes on to +Inf (or in case of a normal distribution even to -Inf and +Inf) then it feels like you could add a little area to the right whenever you want to - so going on forever even with smaller and smaller probability that gets added to the area. So how can something be fixed to 1 when the area itself is not really fixed.
I guess it's the same paradox as the pointer of a spinning wheel that has 0 probability of stopping at any particular point but eventually does stop somewhere.... • In answer to your question about how the total area can be fixed at 1 even though the curve may continue to infinity: try thinking of it this way: start with 0.5 and keep adding half, and half again: that is, 0.5+0.25+0.125+0.0625+.... (keep going forever). However far you go, you will not get an infinite number; you will get a number that keeps approaching but never quite reaching 1, that is, 'tending to' 1. (You may prefer to think of this as 1/2 + 1/4 + 1/8 + 1/16 + 1/32 + ..... + 1/2^n where n tends to infinity.) So, contrary to our intuitive first impression, it is actually possible to add increasingly small amounts infinitely and yet never be in 'danger' of exceeding a certain finite total. In this case it's because you are only ever adding on half of what you would actually need to reach 1 - like being 1m away from a wall and walking half a metre, then a quarter of a metre, etc...... - but I'm sure there are other cases!
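The halving argument above can be sketched in a few lines of Python (stopping at 50 terms is an arbitrary cutoff for illustration):

```python
# Sketch: partial sums of 1/2 + 1/4 + 1/8 + ... climb toward 1 but never
# reach or exceed it - the gap to 1 halves with every term added.
def partial_sum(n_terms):
    return sum(0.5 ** k for k in range(1, n_terms + 1))

for n in (1, 2, 4, 10, 50):
    print(n, partial_sum(n))  # each value is strictly below 1
```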

• This might be stupid, but instead of asking for precisely P(Y = 2), couldn't we ask for the limit as Y approaches 2? • The question of "limit as Y approaches 2" is not at all stupid: it is exactly the point! You just have to be careful with placement of the word "precisely": Prob(Y = precisely 2) = limit of Prob(Y = 2 +/- a bit) as "a bit" approaches zero,... which = 0. On the other hand... limit of Prob(Y = 2 +/- a bit) = 1 as "a bit" approaches "whatever" (i.e. as "a bit" approaches infinity).
• Is there any continuation to this with multidimensional density functions? More continuous density functions, or expected values from continuous density functions? Because I could not find any video. • Nice question! Yes, there are joint probability density functions of more than one variable! If X_1, X_2, ... , X_n are continuous random variables, then their joint density function is denoted by f(x_1, x_2, ... , x_n).

The joint cumulative distribution function of X_1, X_2, ... , X_n is given by
F(x_1, x_2, ... , x_n) = P(X_1 <= x_1 and X_2 <= x_2 and ... and X_n <= x_n)
= integral -infinity to x_1 integral -infinity to x_2 ... integral -infinity to x_n of f(y_1, y_2, ... , y_n) dy_n ... dy_2 dy_1.

The joint probability density function, f(x_1, x_2, ... , x_n), can be obtained from the joint cumulative distribution function by the formula

f(x_1, x_2, ... , x_n) = n-fold mixed partial derivative of F(x_1, x_2, ... , x_n) with respect to x_1, x_2, ... , x_n.

If A is a subset of R^n (i.e. n-dimensional space), then the probability that (X_1, X_2, ... , X_n) is in A is given by

P((X_1, X_2, ... , X_n) is in A) =
n-fold integral over (X_1, X_2, ... , X_n) in A of f(x_1, x_2, ... , x_n) dV,

where dV is the n-dimensional infinitesimal volume element.

For a function g of these n random variables, the expectation of g is given by
E(g(X_1, X_2, ... , X_n)) = integral -infinity to infinity integral -infinity to infinity ... integral -infinity to infinity of f(x_1, x_2, ... , x_n) g(x_1, x_2, ... , x_n) dx_n ... dx_2 dx_1.

Have a blessed, wonderful day!
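As a rough illustration of the expectation formula in the answer above, here is a sketch that approximates E(g(X_1, X_2)) for n = 2 with a Riemann sum. The uniform density on the unit square and g(x, y) = x*y are assumptions chosen so the exact answer, E(XY) = E(X)E(Y) = 0.25, is easy to check by hand:

```python
# Sketch: approximate a 2-dimensional expectation E(g(X, Y)) by summing
# f(x, y) * g(x, y) * dA over a grid of small cells (midpoint rule).
def expectation(f, g, n=400):
    h = 1.0 / n  # cell width over the unit square
    total = 0.0
    for i in range(n):
        for j in range(n):
            x = (i + 0.5) * h  # midpoint of the cell
            y = (j + 0.5) * h
            total += f(x, y) * g(x, y) * h * h
    return total

# f(x, y) = 1 on [0,1] x [0,1]: two independent Uniform(0,1) variables.
approx = expectation(lambda x, y: 1.0, lambda x, y: x * y)
print(round(approx, 6))  # close to the exact value 0.25
```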
• I don't understand how you are supposed to draw a continuous probability graph properly, as it is just a probability of 0 along the whole graph.
• The probability of 2 inches of rain can't be zero, can it? I get that we can't be certain, but a probability of 0 would imply that we never ever get 2 inches of rain, and we couldn't be sure of that. I would really like to get this point cleared up. • The probability of exactly two inches of rain is zero. But we can think about the probability of getting between 1.9 and 2.1 inches of rain, and the probability of getting between 1.99 and 2.01 inches of rain, and so on, because all of those probabilities with actual intervals will be non-zero. So if you consider the ratio of those probabilities to the length of the intervals and take the limit of that ratio as the intervals become very very small, you will get, in some sense, the relative likelihood that you will get "around" two inches of rain, which is what the continuous density function is trying to measure.
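The shrinking-interval idea in the answer above can be sketched numerically. The density f(y) = (3/32)*y*(4 - y) on [0, 4] is a made-up example (it is a valid density that integrates to 1 and peaks at y = 2, where f(2) = 0.375), not one taken from the video:

```python
# Sketch: P(Y = exactly 2) is 0, but P(2 - eps < Y < 2 + eps) / (2*eps)
# approaches the density at 2 as eps shrinks.
def f(y):
    # assumed example density on [0, 4]; f(2) = 0.375
    return (3 / 32) * y * (4 - y) if 0 <= y <= 4 else 0.0

def prob_interval(a, b, n=10000):
    # numeric (midpoint-rule) integral of f from a to b
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

for eps in (0.1, 0.01, 0.001):
    p = prob_interval(2 - eps, 2 + eps)
    print(eps, round(p / (2 * eps), 5))  # ratio approaches f(2) = 0.375
```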
• Does Sal explain the area under the curve, and the fact that it's equal to 1, in any of his videos? • In the video on Discrete and Continuous random variables, Sal said you can have an infinite number of values for the discrete case, so long as they are countable and listable. But in this video, he says you can only have a finite number (around 20-30 seconds in). So which is it? • There's a pop-up text in the bottom-right where they mention that discrete RVs can take on a countably infinite number of values. And that's the correct answer - discrete variables can have an infinite number of values, as long as it's 'countably infinite'.

For example, there's the Poisson distribution, it's used to model things that have to do with a number of events per some unit, e.g. "How many texts do you receive per day?" This is how many events (texts) by some unit (per day). You could have 0 texts, 1 text, 2, 3, 4, etc etc. There's really no upper bound on the number of texts you can receive, so it can go up to infinity. But we can't have the in-betweens, there's no way to get 3.5 texts.

Since we can go up to infinity, but we're restricted to whole numbers, this is known as countably infinite. Not all discrete distributions go up to infinity, but some do.
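The Poisson example above can be sketched in a few lines: the variable is discrete yet takes countably infinitely many values 0, 1, 2, ..., and its probabilities still sum to 1. The rate of 4 texts per day is an illustrative assumption:

```python
import math

# Sketch: Poisson probabilities P(K = k) = e^(-lam) * lam^k / k!
# over k = 0, 1, 2, ... sum to 1 even though there are infinitely many terms.
def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 4.0  # assumed average of 4 texts per day
total = sum(poisson_pmf(k, lam) for k in range(100))
print(round(total, 12))  # the first 100 terms already sum to essentially 1
```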
• I am not able to fully comprehend the probability density function. So what does the probability density function itself measure? What exactly does the y-axis of this function represent, if it is not the probability? • The pdf and the y-value are talking about density.

It's fairly math-heavy to try and explain it, the intuitive idea is that with discrete variables, the height of the bars of the probability distribution function can be thought of as actual probability - and is equivalent to the density.

With continuous variables, we can't do this, and the reason is that there are SO MANY possible values that the variable can take on. Sal sort of explains this in the video. For instance, in the video, the density at x=2 is roughly 0.5, right? Well, if we don't move too far away from 2, then the height at all the points around 2 will also be a density of about 0.5. So 1.9 and 2.1 have, say, a density of 0.45. And then 1.95 and 2.05 might have a density of 0.48. Are you seeing the problem? We have 5 numbers with various densities; if we add them up, we get 0.5+0.45+0.45+0.48+0.48 = 2.36. So already with just a few numbers, this simply cannot be probability!

You might think that we could just make the graph shorter, but it wouldn't work, because a line is infinitely thin, there will always be just far too many possible outcomes, and we'll always wind up with the total probability going over 1.

So we change to thinking about the probability density. What we want is for the entire area beneath the line to be 1. Or in calculus terms, we want our pdf to integrate to 1. The density function allows us to do this. 
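To underline that the y-axis is density rather than probability, here is a sketch using a Normal(0, 0.1) density (an arbitrary example, not one from the video): its peak height is about 3.99, well above 1, yet its total area is still 1.

```python
import math

# Sketch: a pdf's height can exceed 1 as long as the area under it is 1.
def normal_pdf(x, mu=0.0, sigma=0.1):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

peak = normal_pdf(0.0)  # about 3.99 - far greater than 1

# Midpoint-rule integral over [-1, 1], which is ten standard deviations on
# each side of the mean, so it captures essentially all of the area.
n, lo, hi = 100000, -1.0, 1.0
h = (hi - lo) / n
area = sum(normal_pdf(lo + (i + 0.5) * h) for i in range(n)) * h
print(round(peak, 2), round(area, 4))  # tall peak, but total area ~1
```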