Course: Multivariable calculus > Unit 2

Lesson 2: Gradient and directional derivatives

Directional derivatives and slope

Name: Directional derivatives and slope
Uploaded: 2016-05-12T01:27:10Z
Description: The directional derivative can be used to compute the slope of a slice of a graph, but you must be careful to use a unit vector.

Created by Grant Sanderson.

Gabe Rudansky
Posted 8 years ago.
Can someone explain further the reason for switching from [1 1] to [sqrt(2)/2 sqrt(2)/2]? Can you do that for any problem, and if so, how? In other words: why is the unit vector [sqrt(2)/2 sqrt(2)/2], and how does one determine the unit vector for a given problem?
(27 votes)
- Dino Rendulić
  Posted 8 years ago.
  You simply divide both components of the vector by its magnitude. If v = ai + bj, then the unit vector is (a / sqrt(a^2+b^2)) i + (b / sqrt(a^2+b^2)) j. In this case that gives 1/sqrt(2) i + 1/sqrt(2) j. What he doesn't mention is that he uses some algebra to rewrite 1/sqrt(2) as sqrt(2)/2. The two are equal (just multiply the numerator and denominator of the fraction by sqrt(2)).
  (32 votes)
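For anyone who wants to try the "divide by the magnitude" recipe on other vectors, here is a minimal Python sketch. The function name unit_vector is just a label chosen for this example, not something from the video.

```python
import math

def unit_vector(v):
    """Return the vector that points the same way as v but has length 1."""
    magnitude = math.sqrt(sum(c * c for c in v))
    if magnitude == 0:
        # The zero vector has no direction, so it cannot be normalized.
        raise ValueError("the zero vector has no direction")
    return [c / magnitude for c in v]

u = unit_vector([1, 1])
print(u)  # both components equal 1/sqrt(2), which is the same number as sqrt(2)/2
```

The same call works for any nonzero vector, for example unit_vector([3, 4]).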
Sreeram
Posted 7 years ago.
While the partial derivatives tell us how the value of the function changes when we change x or y alone, the directional derivative tells us how the function would change if we changed both x and y at the same time, that is, if we changed the input by a vector that has components in both the x and y directions.

So the directional derivative is like generalizing the concept of the partial derivative to all directions?

Am I correct?
(18 votes)
- Erwin
  Posted 7 years ago.
  Sure, you could look at it that way.
  
  I feel a better way to look at it is that partial derivatives actually tell us the "directional derivative" along the î vector (for the x derivative) and the ĵ vector (for the y derivative). And now we're generalizing this to any arbitrary vector v.
  (13 votes)
Sanam
Posted 7 years ago.
I don't get why the directional vector has to be a unit vector. A non-unit vector still shows direction, so why does it have to be a unit vector? Could someone clarify, please? Thanks.
(10 votes)
- timothy.sudanov
  Posted 7 years ago.
  First, imagine two non-unit vectors with the same direction, like [5,5] and [10,10]. Plug those values into the formula and it becomes obvious that the result for the [10,10] vector is larger, even though its direction is the same as that of [5,5]. What the directional derivative should measure is how much the function's output changes with respect to the DIRECTION you're going, NOT the MAGNITUDE of the vector. If it's still not clear, imagine that you have a function f(x,y) whose gradient is ∇f(x,y) = [a(x), g(y)], and you have a vector V equal to [5,5]. To find the directional derivative of f(x,y) in the direction of V, you would take the dot product of ∇f(x,y) with the unit vector of V (which is [1/sqrt(2), 1/sqrt(2)]). If you instead took the dot product of ∇f(x,y) with [5,5], your result would be larger just because of the magnitude. Another way to think of a unit vector is that it shows how much x and y change relative to the vector's length. If your vector is [1,0], then your direction is along the x axis only, so you would only consider a(x), because you are not moving in the y direction. The vector [9999,0] has the same direction, but if you plug it in instead of the unit vector, your result will be larger, just as if you had multiplied your derivative by 9999. What a unit vector does is show what fraction of a meter you move in x and in y as you go one meter forward along your vector, so it would be illogical for a component to be larger than 1.
  (24 votes)
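To see the effect described in this answer with actual numbers, here is a small Python sketch. It assumes a gradient value of (2, 1), borrowed from the video's example at the point (-1, -1); the helper names dot and normalize are made up for this illustration.

```python
import math

# Assumed gradient value for illustration: the video's gradient of x^2 * y at (-1, -1).
grad = (2.0, 1.0)

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a 2D vector to length 1 without changing its direction."""
    length = math.hypot(v[0], v[1])
    return tuple(c / length for c in v)

for v in [(5.0, 5.0), (10.0, 10.0)]:
    raw = dot(grad, v)                  # grows with the length of v
    as_slope = dot(grad, normalize(v))  # depends only on the direction of v
    print(v, raw, as_slope)
```

The raw dot products come out as 15 and 30, while the normalized value is the same for both vectors, which is the point the answer is making.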
Corsino Alexandre
Posted 8 years ago.
In my textbook the definition of the directional derivative is: ∇f(a,b) • u (for any unit vector u)
Why don't you like the normalized vector definition?
(4 votes)
- Laurens Sandt
  Posted 8 years ago.
  I think it's because it limits the way you perceive the directional derivative to just the slope of a graph, rather than a more abstract concept.
  (15 votes)
Thomas Qian
Posted 7 years ago.
At 2:43, I'm lost: why, instead of saying [1 1], is it changed to [sqrt(2)/2 sqrt(2)/2]?
(5 votes)
- garyybai
  Posted 7 years ago.
  He wants a vector with unit length, that is, length 1. The vector <sqrt(2)/2, sqrt(2)/2> has length one. You can imagine this vector on a 2D plane: it is sqrt(2)/2 long along the x axis and sqrt(2)/2 long along the y axis. We can then use Pythagoras to find the length of the vector: sqrt((sqrt(2)/2)^2 + (sqrt(2)/2)^2) = sqrt(2/4 + 2/4) = 1.
  (6 votes)
Sarah Lewis
Posted 8 years ago.
How did you get ((2)^0.5)/2?
(4 votes)
- Taras.Pokalchuk
  Posted 8 years ago.
  Those coordinates are the legs of a right triangle, and the hypotenuse (the length of the vector with those coordinates) equals 1, by Pythagoras.
  (5 votes)
fabian.bourdeaux
Posted 4 years ago.
How come when we find the gradient of a function, it returns a vector, but with the partial derivative, it only returns a value?
(4 votes)
- loumast17
  Posted 4 years ago.
  Just by definition, the gradient is the vector made up of the two partial derivatives, while each partial derivative is just the derivative that focuses on one variable.
  
  It might help to think of it this way: the partials each focus on one variable, while the gradient takes both variables into account, so to describe both we need one "thing" that holds both at once. A vector can have as many elements as we like, so it works out.
  
  More technically, a partial derivative gives the derivative with respect to one variable while holding the other constant. The gradient, meanwhile, describes which direction you want to face so that, at a point on the graphed surface, you move in the direction of steepest ascent. I don't know if this video comes after that particular detail was discussed. Basically, though, the gradient gives you a vector in the xy plane. Imagine a point on the surface f(x,y) that has a "forward" direction, then twist it so the forward direction is in line with that vector.
  
  I really hope this made sense. Let me know if not, though.
  (4 votes)
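Here is a rough Python sketch of the "one number versus a vector" distinction, using the video's function f(x, y) = x^2 * y. The partials are estimated with central differences, which is a choice made for this example only; the step size h and the function names are assumptions, not anything from the video.

```python
def f(x, y):
    return x**2 * y  # the function from the video

def partial_x(f, x, y, h=1e-6):
    """Estimate df/dx with y held constant. Returns a single number."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Estimate df/dy with x held constant. Returns a single number."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def gradient(f, x, y):
    """Bundle both partial derivatives into one vector (a tuple here)."""
    return (partial_x(f, x, y), partial_y(f, x, y))

print(partial_x(f, -1.0, -1.0))  # one number, close to 2
print(gradient(f, -1.0, -1.0))   # a pair of numbers, close to (2, 1)
```

Each partial answers "how fast does f change along one axis", and the gradient simply collects those answers into one object.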
Radek
Posted 7 years ago.
So... the directional vector doesn't have to be a unit vector? Doesn't that affect the result?
And doesn't the gradient have to be unit length too?
I am confused about how the length of a vector affects its direction...
(5 votes)
JanDavid93
Posted 8 years ago.
Thanks for a great video. In your example the result was sqrt2 + sqrt2/2 (at 6 min 18 sec). What exactly is this telling us?
(4 votes)
- Ron Waller
  Posted 7 years ago.
  This is the slope of the function f at (-1, -1) in the direction of the direction vector v. It is a scalar value. The dot product of two vectors is a scalar value.
  (2 votes)
GAURAV MANWANI
Posted 6 years ago.
How can a vector define the plane? There can be many planes through that vector
(4 votes)
- Heiz
  Posted 8 months ago.
  It seems that it is the plane that is spanned by vectors (0, 0, 1) and v.
  (1 vote)

Video transcript

- [Voiceover] Hello everyone, what I wanna talk about here is how to interpret the directional derivative in terms of graphs. I have here the graph of a function, a multivariable function, it's f of xy is equal to x squared times y. In the last couple videos I talked about what the directional derivative is, how you can formally define it, how you can compute it using the gradient. Generally the setup that you might have is, you have some kind of vector, and this is a vector in the input space so in this case it's gonna be in the xy plane. In this case I'll just take the vector 1 1. Okay? And the directional derivative, which we denote by kind of taking the gradient symbol, except you stick the name of that vector down in the lower part there, the directional derivative of your function, it'll still take the same input. This is kind of a measure of how the function changes when the input moves in that direction. So I'll show you what I mean, I mean you could imagine slicing this graph by some kind of plane but that plane doesn't necessarily have to be parallel to the x or y axes. That's what we did for the partial derivative, we took a plane that represented the constant x value or the constant y value, but this is gonna be a plane that kind of tells you what movement in the direction of your vector looks like, and like I have a number of other times I'm gonna go ahead and slice the graph along that plane, and just to make it clear, I'm gonna color in where the graph intersects that slice. This vector here, this little v, you'll be thinking of it as living on the xy plane and it's determining the direction of this plane that we're slicing things with. On the xy plane you've got this vector, it's 1 1, it kind of points to that diagonal direction, and then you take the whole plane and you slice your graph. 
And if we want to interpret the directional derivative here, I'm gonna go ahead and fill in an actual value, so let's say we wanted to do it at -1, -1, 'cause I guess I chose a plane that passes through the origin, so I've got to make sure that the point I'm evaluating actually lies on this plane, but you could imagine one that points in the same direction, but you kind of slide it back and forth. If we're doing this, we can interpret this as a slope, but you have to be very careful: if you're gonna interpret this as a slope, it has to be the case that you're dealing with a unit vector, that the magnitude of your vector is equal to 1. I mean, it doesn't have to be, you can kind of account for it later, but it makes it easier to think about if we're just thinking of a unit vector. When I go over here, instead of saying that it's 1 1, I'm gonna say it's whatever vector points in the same direction but has unit length, and in this case that happens to be square root of 2 over 2 for each of the components. You can kind of think about why that would be true by the Pythagorean theorem, but this is a vector with unit length, its magnitude is 1, and it points in that direction. If we're evaluating this at a point like -1, -1, we can draw that on the graph, see where it actually is, and in this case it'll be, oops, it moves things about when I add a point. It'll be this point, and if you kinda look from above, you see that's -1, -1, and if we want the slope at that point, you're kinda thinking of the tangent line here. Tangent line to that curve, and we're wondering what its slope is. So, the reason that the directional derivative is gonna give us this slope is because, another notation that might be kinda helpful for what this directional derivative is, some people will write partial f over partial v. You can think about that as taking a slight nudge in the direction of v, so this would be a little nudge, a little partial nudge in the direction of v.
And then you're asking what change in the value of the function results from that. The height of the graph tells you the value of the function. As this initial change approaches zero, and the resulting change approaches zero as well, that ratio, the ratio of partial f to partial v, is going to give the slope of this tangent line. Conceptually, that's kind of a nicer notation, but the reason we use this other notation, nabla sub v, is that it's very indicative of how you compute things once you need it computed. You take the gradient of f, just the vector-valued function gradient of f, and take the dot product with the vector. Let's actually do that, just to see what this would look like, and I'll go ahead and write it over here, use a different color. The gradient of f, first of all, is a vector full of partial derivatives: it'll be the partial derivative of f with respect to x and the partial derivative of f with respect to y. When we actually evaluate this, we take a look: for the partial derivative of f with respect to x, x looks like the variable and y is just a constant, so its partial derivative is 2 times x times y. But when we take the partial with respect to y, y now looks like a variable and x looks like a constant; the derivative of a constant times a variable is just that constant, x squared. And if we were to evaluate this at the point -1, -1, then you can plug that in: 2 times -1 times -1 would be 2, and then negative 1 squared would be 1. So that would be our gradient at that point, which means if we want to evaluate gradient of f dot v, we could go over here and say that's 2 1, 'cause we evaluate the gradient at the point we care about. And then we take the dot product with v itself, in this case root 2 over 2 and root 2 over 2.
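Written out in symbols, the computation described in this part of the video looks roughly like the following. The arrow notation and the bracket layout are choices made for this write-up, not something shown on screen.

```latex
f(x, y) = x^{2} y, \qquad
\nabla f(x, y) = \begin{bmatrix} 2xy \\ x^{2} \end{bmatrix}, \qquad
\nabla f(-1, -1) = \begin{bmatrix} 2 \\ 1 \end{bmatrix}

\nabla_{\vec{v}} f(-1, -1)
  = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \cdot
    \begin{bmatrix} \sqrt{2}/2 \\ \sqrt{2}/2 \end{bmatrix}
  = 2 \cdot \frac{\sqrt{2}}{2} + 1 \cdot \frac{\sqrt{2}}{2}
  = \sqrt{2} + \frac{\sqrt{2}}{2}
  = \frac{3\sqrt{2}}{2} \approx 2.12
```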
The answer that we get: we multiply the first two components together, 2 times root 2 over 2, that's the square root of 2, and then here we multiply the second components together, and that's gonna be 1 times root 2 over 2, so plus root 2 over 2, and that would be our answer, that would be our slope. But this only works if your vector is a unit vector, and I showed this in the last video where we talked about the formal definition of the directional derivative. If you scale v by 2, and I can do it here, if instead of v you're talking about 2 v, so I'll go ahead and make myself some room here. If you're taking the directional derivative along 2 v of f, the way that we're computing that, we're still taking the gradient of f, dot product with 2 times your vector, and with the dot product, you can pull out that 2. This is just gonna double the value of the entire thing. Compared with what we started with for v, it's gonna be twice the value, the derivative will become twice the value. But you don't necessarily want that, because you'd see this plane you sliced with: if instead of doing it in the direction of v, the unit vector, you did it in the direction of 2 times v, it's the same plane, it's the same slice you're taking, and you'd want that same slope, so that's gonna mess everything up. This is super important if you're thinking about things in the context of slope. One thing that you could say is your formula for the slope of a graph in the direction of v is: you take your directional derivative, that dot product between the gradient of f and v, and you just always make sure to divide it by the magnitude of v. That will always take care of what you want; that's basically a way of making sure that really, you're taking the directional derivative in the direction of a certain unit vector. Some people even go so far as to define the directional derivative to be this, to be something where you normalize out the length of that vector.
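The "divide by the magnitude of v" formula for slope can be sketched in a few lines of Python. The function name slope_in_direction is hypothetical, and the gradient value (2, 1) is the one computed in the video at the point (-1, -1).

```python
import math

def slope_in_direction(grad, v):
    """Slope of the graph along the direction of v: (grad . v) / |v|."""
    dot = sum(g * c for g, c in zip(grad, v))
    return dot / math.hypot(v[0], v[1])

# Gradient of f(x, y) = x^2 * y at (-1, -1), as computed in the video.
grad = (2.0, 1.0)

print(slope_in_direction(grad, (1.0, 1.0)))
print(slope_in_direction(grad, (2.0, 2.0)))  # same direction, so the same slope
```

Because the length of v is divided out, v and 2v give the same number, which matches the idea that both describe the same slice of the graph.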
I don't really like that, but I think that's because they're thinking of the slope context, they're thinking of rates of change as being the slope of a graph. One thing I'd like to emphasize as always, graphical intuition is good, and visual intuition is always great, you should always be trying to find a way to think about things visually, but with multivariable functions, the graph isn't the only way. You can kind of more generally think about just a nudge in the v direction, and in the context where v doesn't have a length 1, the nudge doesn't represent an actual size but it's a certain scaling constant times that vector, you can look at the video on the formal definition for the directional derivative, if you want more details on that. But I do think this is actually a good way to get a feel for what the directional derivative is all about.
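The "nudge" picture can also be checked numerically. The sketch below takes a small step h times v from the point (-1, -1) and divides the change in f by h; the step size h and the name nudge_ratio are assumptions made for this illustration, and the result is only an approximation of the limit.

```python
def f(x, y):
    return x**2 * y  # the function from the video

def nudge_ratio(f, point, v, h=1e-6):
    """(f(p + h*v) - f(p)) / h for a small h; approximates grad f . v."""
    x, y = point
    return (f(x + h * v[0], y + h * v[1]) - f(x, y)) / h

p = (-1.0, -1.0)
print(nudge_ratio(f, p, (1.0, 1.0)))  # close to 3, which is (2, 1) . (1, 1)
print(nudge_ratio(f, p, (2.0, 2.0)))  # close to 6, twice as large
```

When v does not have length 1, the ratio is scaled by the length of v, which is why it stops matching the slope of the slice unless you normalize.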