Directional derivative

Directional derivatives tell you how a multivariable function changes as you move along some vector in its input space.  Created by Grant Sanderson.

Want to join the conversation?

  • Marta Jablecka
    Shouldn't the vector v be changed to its unit vector first?
    (54 votes)
  • Chris Forsyth
    Why are a(df/dx) and b(df/dy) ADDED together? He didn't really explain this in the video, he just said it would be a "good idea." It just seems to me like they should be kept separate, perhaps as two different entries in a vector.

    Oh wait, now I get it. I'm gonna leave this explanation I thought of here for people who might have been just as confused as I was. Here's why they get added together...

    Think of f(x, y) as a graph: z = f(x, y). Think of the surface it creates. Now imagine you're trying to take the directional derivative along the vector v = [-1, 2]. If the nudge you made in the x direction (-1) changed the function by, say, -2 nudges, then the surface moves down by 2 nudges along the z-axis. Now imagine the nudges in the y direction (+2) pushed the surface of the function up more than it was dragged down, by, let's say, +1.5 nudges each for a total of +3 nudges. The surface of the function has moved back up along the z-axis to +1 nudge above where it started. That's why you add the nudges together; it's a net change based on how a combination of nudges, in the component directions of the vector, affects the function overall.
    (42 votes)
    • Mohith kankanala
      The above explanation is amazing, but I would like to add something. Remember that in this video we are talking about a multivariable function ℝ² → ℝ, which is the symbolic way of saying that the function takes in 2 independent variables, x and y, as input and outputs 1 variable, z. Small changes in x or y can cause changes in z. We add in the video because small changes in x and y both cause changes in the same single output, z. Therefore, to find the net change in z, we add the changes caused by x and y. Hope this helps!
      (9 votes)
  • ganesh8374387106
    What is the actual meaning of taking a directional derivative, i.e., a derivative along some other vector?
    (6 votes)
    • Maria Calisto
      One way to think of it intuitively is to think of the directional derivative as what the slope is going to be as we're moving through a multivariable function in a certain direction.

      Imagine that you're hiking on a mountain and you want to know the slope in the direction you're looking. If you think of the mountain as a function, taking the derivative of that function in the direction you're looking will give you the slope of the mountain as you go on in that same direction.
      Mathematically, you'd just think of the direction you're looking as a vector and then take its dot product with the gradient.

      (I know it's been a while since this was asked but I hope it helps someone)
      (18 votes)
  • Ajaykrishnan R
    In the process of computing directional derivatives, the vector itself seems to be more important than its direction. I mean, differentiation involves just a tiny nudge. Then why should vectors with the same direction but different magnitudes (scaled versions) give different values for it? After all, partial differentiation involves nudging the input in one direction while keeping the other constant. There, the length of the x or y axis does not matter. Then why should the directional derivative depend on the magnitude of the vector at all?
    (6 votes)
    • Albert Nakao
      If I get this correctly, this is the way I think of it.
      Let's say x and y are coordinates on a map, and f(x,y) is the elevation in some hilly region.

      Taking the directional derivative with a unit vector is akin to getting the slope of f() in the direction of that unit vector. So if you were standing on a hill at (x,y), this derivative would tell you how steep f() is at that point, in that direction.
      However, if you are moving on the hill and you want to know how fast your elevation is changing, then your rate of change depends on three things: your speed, your direction and your location. Your direction and location determine the slope of the hill (as mentioned in the above paragraph), but then your speed determines how fast you are going, so if you double your speed (i.e. double the magnitude of your vector) then your rate of change doubles as well.
      (4 votes)
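A small sketch of the scaling behaviour described in this reply. The function f(x, y) = x²y is a made-up example (the video does not name a specific function), and `grad_f` and `directional_derivative` are illustrative helper names, not anything from the video.

```python
import math

def grad_f(x, y):
    # Exact partial derivatives of the hypothetical f(x, y) = x**2 * y.
    return (2 * x * y, x * x)

def directional_derivative(point, v):
    # a * df/dx + b * df/dy, the formula given in the video.
    dfdx, dfdy = grad_f(*point)
    return v[0] * dfdx + v[1] * dfdy

point = (1.0, 2.0)                      # the point (1, 2) from the video
v = (-1.0, 2.0)                         # the vector [-1, 2] from the video
doubled = (2 * v[0], 2 * v[1])          # same direction, twice the length
length = math.hypot(*v)
unit = (v[0] / length, v[1] / length)   # same direction, length 1

print(directional_derivative(point, v))        # -2.0
print(directional_derivative(point, doubled))  # -4.0
print(directional_derivative(point, unit))     # the slope in the direction of v
```

Doubling the vector doubles the result, and only the unit-vector version reads as a slope, which is the "speed versus steepness" distinction made above.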
  • renebarrientos59
    The formula for the directional derivative used in this clip is good only provided the direction vector is a unit vector.
    (5 votes)
  • boron baruah
    I am wondering how adding the rates of change in the x component and the y component leads to the rate of change along the vector v. Please save me.
    (3 votes)
  • Jimmy
    I don't understand the use of the gradient here. Isn't the gradient of a function a vector? This is driving me nuts :(
    (3 votes)
  • Claireliz
    I don't think I'm quite following the idea of visualizing the output of the function on a real number line. Are the "nudges" on the number line representative of changes in the f(x,y) produced when the original value is "nudged"? How come when y is nudged up the value of f goes down? Or is it more of a conceptual idea and not taken quite so literally?
    (2 votes)
  • john.doe.13896
    I'm a bit confused by the dot-product interpretation of the directional derivative formula. The dot product measures how much two vectors point in the same direction, right? And it seems likely that the closer your direction vector is to the gradient vector, the larger the slope in that direction, but to my mind this is not at all necessary. And even more confusing: if the angle between the direction vector and the gradient vector is 90 degrees, the dot product between them is zero. So does the slope always have to be zero if you look in a direction perpendicular to your gradient vector?
    (2 votes)
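One way to look at this question is through the standard geometric form of the dot product, sketched here in symbols. The angle θ between the gradient and v is a symbol added for this note, not something used in the video.

```latex
\nabla_{\vec{v}} f \;=\; \nabla f \cdot \vec{v} \;=\; \lVert \nabla f \rVert \, \lVert \vec{v} \rVert \cos\theta
```

For a fixed length of v, this is largest when θ = 0 (v points along the gradient) and is zero when θ = 90°, since cos 90° = 0. So, under the formula from the video, a direction perpendicular to the gradient does give a directional derivative of zero: to first order, f does not change when you step that way.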
  • logoffske
    What is this "nudge"? I've tried googling it, but it gave me nothing. Is this some sort of offset/translocation of the function?
    (2 votes)
    • loumast17
      Nudge just means a small change.

      Maybe it will help if we look at the original derivative definition, the limit as h goes to 0 of (f(x+h) - f(x)) / ((x+h) - x).

      The x+h part means a small change in x, yes? Sometimes people just say "a nudge in the x direction." I hope that helped.
      (2 votes)

Video transcript

- [Voiceover] Hello everyone. So here I'm gonna talk about the directional derivative, and that's a way to extend the idea of a partial derivative. And partial derivatives, if you remember, have to do with functions with some kind of multivariable input, and I'll just use two inputs because that's the easiest to think about, and it could be some single-variable output. It could also deal with vector-valued outputs. We haven't gotten to that yet. So, we'll just think about a single-variable, ordinary real number output that's, you know, an expression of x and y.
And the partial derivative, one of the ways I said you could think about it is to take a look at the input space, your xy-plane. So, this would be the x-axis, this is y. And you know, vaguely, in your mind, you're thinking that somehow this outputs to a line. This outputs to just the real numbers. And maybe you're thinking about a transformation that takes it there, or maybe you're just thinking, "Okay, this is the input space. That's the output."
And when you take the partial derivative at some kind of point... So, I'll write it out like partial derivative of f with respect to x at a point like one, two. You think about that point. You know, x is equal to one, y is equal to two. And if you're taking it with respect to x, you think about just nudging it a little bit in that x-direction, and you see what the resulting nudge is in the output space. And the ratio between the size of that resulting nudge and the original one, the ratio between, you know, partial-f and partial-x, is the value that you want. And when you did it with respect to y, you know, you were thinking about traveling in a different direction, maybe you nudge it straight up. And you're wondering, "Okay, how does that influence the output?"
And the question here, with directional derivatives, is what if you have some vector, v, I'll give a little vector hat on top of it, that, you know, I don't know, let's say it's negative one, two, is the vector.
So you'd be thinking about that as a step of negative one in the x-direction, and then two more in the y-direction. So, it's gonna be something that ends up there. This is your vector, v. At least, if you're thinking of v as stemming from the original point. And you're wondering, "What does a nudge in that direction do to the function itself?"
And remember, with these original, you know, nudges in the x-direction, nudges in the y, you're not really thinking of it as, "You know, this is kind of a large step." You're really thinking of it as something itty-itty-bitty-bitty. You know it's not that, but it's really something very, very small, and formally you'd be thinking about the limit as this gets really, really, really small, approaching zero. And as this gets really, really small, approaching zero, what does the ratio of the two approach? And similarly with the y, you're not thinking of it as something, "This is pretty sizable," but it would be something really, really small.
And the directional derivative is similar. You're not thinking of actually taking a step along the vector itself, but you'd be thinking of taking a step along, say, h multiplied by that vector, and h might represent some really, really small number. You know, maybe this here is like 0.001. And when you're doing this formally, you'd just be thinking of the limit as h goes to zero. So, the directional derivative is saying, when you take a slight nudge in the direction of that vector, what is the resulting change to the output?
And one way to think about this is you say, "Well, that slight nudge of the vector..." If we actually expand things out and we look at the definition itself, it'll be negative h, negative one times that component, and then two-h here. So it's kind of like you took negative one nudge in the x-direction, and then two nudges in the y-direction. You know, so for whatever your nudge in the v-direction, there, you take a negative one step by x, and then two of them up by y.
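As a rough numerical sketch of that "step of h times the vector" idea (this is not from the video): the function f(x, y) = x²y below is a made-up example, and `directional_difference` is an illustrative helper name. It steps h·v away from the point (1, 2), measures the change in the output, and divides by h.

```python
def f(x, y):
    # Hypothetical example function; the video does not specify one.
    return x * x * y

def directional_difference(f, point, v, h):
    """Change in f after stepping h*v away from point, divided by h."""
    x, y = point
    a, b = v
    return (f(x + h * a, y + h * b) - f(x, y)) / h

point = (1.0, 2.0)   # the point (1, 2) used in the video
v = (-1.0, 2.0)      # the vector [-1, 2] used in the video

# Shrinking h plays the role of "the limit as h goes to zero".
for h in (0.1, 0.01, 0.001):
    print(h, directional_difference(f, point, v, h))
```

For this particular f the printed ratios settle toward -2 as h shrinks, which is the number the nudge picture is describing.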
So, when we actually write this out... The notation, by the way, is you take that same nabla from the gradient but then you put the vector down here. So, this is the directional derivative in the direction of v. And there's a whole bunch of other notations too. You know, I think there's like derivative of f with respect to that vector, is one way people think about it. Some people will just write like partial with a little subscript vector. There's a whole bunch of different notations, but this is the one I like. You think of that nabla with a little v down there for your vector, of f, and it's still a function of x and y. And the reason I like this is it's indicative of how you end up calculating it, which I'll talk about at the end of the video.
And for this particular example, a good guess that you might have is to say, "Well, we take a negative step in the x-direction." So you think of it as whatever the change that's caused by such a step in the x-direction, you do the negative of that, and then it's two steps in the y-direction. So, whatever the change caused by a tiny step in the y-direction, let's just take two of those. Two times partial-f, partial-y. And this is actually how you calculate it.
And if I was gonna be more general, you know, let's say we've got a vector, w. I'm going to keep it abstract and just call its components a and b, rather than the specific numbers. You would say that the directional derivative in the direction of w, whatever that is, of f is equal to a times the partial derivative of f with respect to x plus b times the partial derivative of f with respect to y. And this is it. This is the formula that you would use for the directional derivative. And again, the way that you're thinking about this is you're really saying, you know, you take a little nudge that's a in the x-direction and b in the y-direction. So, this should kind of make sense.
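Written out in symbols, the spoken formula above would look roughly like this. The typesetting is a reconstruction from the narration, not copied from the video's screen.

```latex
\nabla_{\vec{w}} f(x, y) = a \,\frac{\partial f}{\partial x}(x, y) + b \,\frac{\partial f}{\partial y}(x, y),
\qquad \vec{w} = \begin{bmatrix} a \\ b \end{bmatrix}

% The specific example from the video, with \vec{v} = [-1, 2]:
\nabla_{\vec{v}} f(x, y) = -\frac{\partial f}{\partial x}(x, y) + 2 \,\frac{\partial f}{\partial y}(x, y)
```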
And sometimes you see this written not with respect to the partial derivatives themselves and the actual components, a and b, but with respect to the gradient. And this is because it makes it much more compact, more general, if you're dealing with other dimensions. So, we'll just write it over here. If you look at this expression, it looks like a dot product. It's as if you took the dot product of the vector a, b and the one that has the partial derivatives in it. So, what's lined up with a is the partial derivative with respect to x, partial-f, partial-x, and what's lined up with b is the partial derivative with respect to y. And you look at this and you say, "Hey, a, b, I mean that's just the original vector. Right, that's w. That's the vector, w." And then you're dotting this with, well, partial derivative with respect to x in one component, the other partial derivative in the other component. That's just the gradient. That is the gradient of f. And here, you know, it's nabla without that little w at the bottom, and this is why we use this notation, because it's so suggestive of the way that you ultimately calculate it.
So, this is really what you'll see in a textbook, or see as the compact way of writing it. And you can see how this is more flexible for dimensions. So, if we were talking about something that has like a five-dimensional input, and the vector of the direction you move has five different components, this is flexible. When you expand it, the gradient would have five components, and the vector itself would have five components.
So, this is the directional derivative and how you calculate it. And the way you interpret it, you're thinking of moving along that vector by a tiny nudge, by a tiny, you know, little value multiplied by that vector and saying, "How does that change the output, and what's the ratio of the resulting change?" And in the next video, I'll clarify that with the formal definition of the directional derivative itself.
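To illustrate the "flexible for dimensions" point, here is a minimal sketch (not from the video) that computes the gradient-dot-vector form for an input with five components. The function `g`, the point, and the vector `w` are all made up for illustration, and the gradient is only approximated with central differences rather than computed symbolically.

```python
def numerical_gradient(f, point, eps=1e-6):
    """Approximate each partial derivative of f at point with a central difference."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += eps
        backward[i] -= eps
        grad.append((f(forward) - f(backward)) / (2 * eps))
    return grad

def directional_derivative(f, point, w):
    """Dot product of w with the (approximate) gradient of f at point."""
    grad = numerical_gradient(f, point)
    return sum(w_i * g_i for w_i, g_i in zip(w, grad))

def g(p):
    # Hypothetical function of five inputs: the sum of their squares.
    return sum(x * x for x in p)

point = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 0.0, -1.0, 0.0, 2.0]
print(directional_derivative(g, point, w))  # close to 16
```

The same two helper functions work unchanged for two inputs or five; only the lengths of `point` and `w` change, which mirrors why the dot-product form is the one that generalizes.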