## Gradient and directional derivatives

# Directional derivative

## Video transcript

- [Voiceover] Hello everyone. So here I'm gonna talk about
the directional derivative and that's a way to extend the idea of a partial derivative. And partial derivatives, if you remember, have to do with functions with some kind of multi-variable input, and I'll just use two inputs because that's the easiest to think about, and it could be some
single variable output. It could also deal with
vector variable outputs. We haven't gotten to that yet. So, we'll just think
about a single variable, ordinary real number output that's, you know, an expression of x and y. As for the partial derivative, one of the ways I said
you could think about it is to take a look at the input space, your xy-plane. So, this would be the x-axis, this is y. And you know, vaguely, in your mind, you're thinking that somehow
this outputs to a line. This outputs to just the real numbers. And maybe you're thinking
about a transformation that takes it there, or maybe you're just thinking, "Okay, this is the input space. That's the output." And when you take the partial derivative at some kind of point... So, I'll write it out like partial derivative of f with respect to x at a point like (1, 2). You think about that point. You know, x is equal to one, y is equal to two. And if you're taking it with respect to x, you think about just
nudging it a little bit in that x-direction, and you see what the resulting nudge
is in the output space. And the ratio between the size of that resulting nudge and the original one, the ratio between, you know, partial-f and partial-x, is the value that you want. And when you did it with respect to y, you know, you were thinking about traveling in a different direction,
maybe you nudge it straight up. And you're wondering, "Okay, how does that influence the output?" And the question here, with
directional derivatives, is what if you have some vector, v, I'll give a little
vector hat on top of it, that, you know, I don't
know, let's say it's negative one, two, is the vector. So you'd be thinking about that as a step of negative one in the x-direction, and then two more in the y-direction. So, it's gonna be something
that ends up there. This is your vector, v. At least, if you're
thinking of v as stemming from the original point. And you're wondering, "What does a nudge in that direction do
to the function itself?" And remember, with these
original, you know, nudges in the x-direction, nudges in the y, you're not really thinking of it as, "You know, this is kind of a large step." You're really thinking
of it as something itty-itty-bitty-bitty. You know, it's not that,
but it's really something very, very small, and
formally you'd be thinking about the limit: as this gets
really, really, really small, approaching zero, and this gets really, really
small, approaching zero, what does the ratio of the two approach? And similarly with the
y, you're not thinking of it as something,
"This is pretty sizable," but it would be something
really, really small. And the directional derivative is similar. You're not thinking of actually taking a step along the full vector, but you'd be thinking
of taking a step along, say, h multiplied by that vector, where h might represent some
really, really small number. You know, maybe this here is like 0.001. And when you're doing this formally, you'd just be thinking of the
limit as h goes to zero. So, the directional derivative is saying when you take a slight
nudge in the direction of that vector, what is the
resulting change to the output? And one way to think
about this is you say, "Well, that slight nudge of the vector..." If we actually expand things out and we look at the definition itself, it'll be negative h, that is, negative
one times h in the x-component, and then two-h here in the y-component. So it's kind of like you
took negative one nudge in the x-direction, and then
two nudges in the y-direction. You know, so for whatever
your nudge in the v-direction there, you take a negative one step by x, and then two of them up by y. So, when we actually write this out... The notation, by the way, is you take that same nabla from the gradient, but then you put the vector down here as a subscript. So, this is the directional derivative in the direction of v. And there's a whole bunch
of other notations too. You know, I think there's
like derivative of f with respect to that vector, is one way people think about it. Some people will just write like partial with a little subscript vector. There's a whole bunch
of different notations, but this is the one I like. You think of that nabla with
the little v down there for your vector, of f, and it's still a function of x and y. And the reason I like this is it's indicative of how
you end up calculating it, which I'll talk about
at the end of the video. And for this particular example, a good guess that you
might have is to say, "Well, we take a negative
step in the x-direction." So you think of it as whatever the change that's caused by such a
step in the x-direction, you do the negative of that, and then it's two steps
in the y-direction. So, whatever the change
caused by a tiny step in the y-direction, let's just take two of those. Two times partial-f, partial-y. And this is actually how you calculate it. And if I was gonna be more general, you know, let's say we've got a vector, w. I'm going to keep it abstract and just call it a and
b, as its components, rather than the specific numbers. You would say that the
directional derivative in the direction of w, whatever that is, of f is equal to a times the partial derivative
of f with respect to x plus b times the partial derivative of f, with respect to y. And this is it. This is the formula that you would use for the directional derivative. And again, the way that
you're thinking about this is you're really saying, you know, you take a little nudge
that's a in the x-direction and b in the y-direction. So, this should kind of make sense. And sometimes you see this written not in terms of the
partial derivatives themselves and the actual components, a and b, but in terms of the gradient. And this is because it
makes it much more compact, more general, if you're
dealing with other dimensions. So, we'll just write it over here. If you look at this expression,
it looks like a dot product, as if you would take the dot
product of the vector, a, b, and the one that has the
partial derivatives in it. So, what's lined up with a
is the partial derivative with respect to x, partial-f, partial-x, and what's lined up with b
is the partial derivative with respect to y. And you look at this and you say, "Hey, a, b, I mean that's
just the original vector. Right, that's w. That's the vector, w." And then you're dotting this with, well, the partial derivative
with respect to x, in one component, the
other partial derivative in the other component. That's just the gradient. That is the gradient of f. And here, you know, it's
nabla without that little w at the bottom, and this is
why we use this notation because it's so suggestive of the way that you ultimately calculate it. So, this is really what
you'll see in a textbook, or see as the compact way of writing it. And you can see how this is
more flexible for higher dimensions. So, if we were talking about something that has like a five-dimensional input, and the vector of the
direction you move has five different components, this is still flexible. When you expand it,
the gradient would have five components, and the vector itself would have five components. So, this is the directional derivative and how you calculate it. And the way you interpret it, you're thinking of
moving along that vector by a tiny nudge, by a tiny,
you know, little value multiplied by that vector and saying, "How does
that change the output, and what's the ratio of
the resulting change?" And in the next video, I'll clarify that with the formal definition of the directional derivative itself.
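
To make the formula from this video concrete, here is a minimal Python sketch that compares the two ways of thinking about the directional derivative: the gradient dotted with the vector, and the ratio you get from a tiny nudge of size h along the vector. The point (1, 2) and the vector (-1, 2) are the ones used above. The function f(x, y) = x²y is an assumption chosen only for illustration, since the video never specifies f, and the helper names `gradient`, `directional_derivative`, and `nudge_ratio` are likewise hypothetical.

```python
def f(x, y):
    # Hypothetical example function; the video leaves f unspecified.
    return x**2 * y


def gradient(x, y):
    # Partial derivatives of the example f, worked out by hand:
    # df/dx = 2xy and df/dy = x^2.
    return (2 * x * y, x**2)


def directional_derivative(point, v):
    # a * df/dx + b * df/dy, i.e. the gradient at `point` dotted with v = (a, b).
    gx, gy = gradient(*point)
    return v[0] * gx + v[1] * gy


def nudge_ratio(point, v, h=1e-6):
    # Step h*v away from `point` and divide the change in output by h.
    x, y = point
    return (f(x + h * v[0], y + h * v[1]) - f(x, y)) / h


point = (1.0, 2.0)  # the point used in the video
v = (-1.0, 2.0)     # the vector used in the video

print(directional_derivative(point, v))
print(nudge_ratio(point, v))
```

For this particular f, both printed values should come out close to -2: negative one times the partial with respect to x (which is 4 at this point) plus two times the partial with respect to y (which is 1). The second value is only an approximation, because h is small but not actually zero.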