Optimizing multivariable functions
Warm up to the second partial derivative test
- So in single variable calculus, if you have a function f of x and you want to find its maximum or minimum, what you do is find its derivative and set that equal to zero. Graphically, this has the interpretation that, if you have the graph of f, setting its derivative equal to zero means you're looking for places where it's got a flat tangent line. So in the graph that I drew, it would be these two flat tangent lines. And then once you find these points, say here you have one solution that I'll call x1, and here you have another solution, x2, you can ask yourself: are these maxima, or are they minima? Because both of them can have flat tangent lines. So when you do find such a point and you want to understand whether it's a maximum or a minimum, if you're looking at the graph, you can tell. You can tell that this point here is a local maximum and this point here is a local minimum. But if you weren't looking at the graph, there's a nice test that will tell you the answer. You basically look at the second derivative, and in this case, because the concavity is down, that second derivative is going to be less than zero, and then over here, because the concavity is up, that second derivative is greater than zero. By getting this information about the concavity, you can conclude that when the concavity is down, you're at a local maximum, and when the concavity is up, you're at a local minimum. In the case where the second derivative is zero, it's undetermined; you'd have to do more tests to figure it out. So in the multi-variable world, the situation is very similar. 
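As a concrete sketch of this single-variable procedure, here is how the test can play out for an illustrative function of my own choosing, f(x) = x cubed minus 3x (not the curve drawn in the video), using the sympy computer algebra library:

```python
import sympy as sp

x = sp.symbols('x')
f = x**3 - 3*x                     # illustrative function, not the one sketched in the video

f1 = sp.diff(f, x)                 # f'(x) = 3*x**2 - 3
critical_points = sp.solve(f1, x)  # flat tangent lines at x = -1 and x = 1
f2 = sp.diff(f, x, 2)              # f''(x) = 6*x

for c in critical_points:
    concavity = f2.subs(x, c)
    if concavity < 0:
        label = 'local maximum'    # concave down
    elif concavity > 0:
        label = 'local minimum'    # concave up
    else:
        label = 'undetermined'     # second derivative test is inconclusive
    print(c, label)
```

Here x = -1 has f'' = -6 (a local maximum) and x = 1 has f'' = 6 (a local minimum), matching the concavity reasoning above.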
As I've talked about in previous videos, what you'd do is you'd have some kind of function, let's say a two-variable function, and instead of looking for where the derivative equals zero, you're gonna be looking for where the gradient of your function is equal to the zero vector, which we might make bold to emphasize that it's a vector. And that corresponds with finding flat tangent planes. If that seems unfamiliar, go back and take a look at the video where I introduce the idea of multi-variable maxima and minima. But the subject of this video is gonna be what's analogous to this second derivative test, where in the single variable world, you just find the second derivative and check if it's greater than or less than zero. How can we, in the multi-variable world, do something similar to figure out if you have a local minimum, a local maximum, or that new possibility of a saddle point that I talked about in the last video? So there is another test, and it's called the second partial derivative test. I'll get to the specifics of that at the very end of this video. To set the landscape, I want to actually talk through a specific example where we find when the gradient equals zero, just to see what that looks like and to have some concrete formulas to deal with. So, the function that you're looking at right now is f of x, y is equal to x to the fourth, minus four x squared, plus y squared. Okay, so that's the function we're dealing with. In order to find where its tangent plane is flat, we're looking for where the gradient equals zero. And remember, this is just really a way of unpacking the requirement that both partial derivatives are zero: we're looking for the x and y where the partial derivative of f with respect to x at the point x, y is zero, and where the partial derivative of f with respect to y at that same point x, y is also zero. 
So the idea is that this is gonna give us some kind of system of equations that we can solve for x and y. So let's go ahead and actually do that. In this case, for the partial derivative with respect to x, we look up here at the only places where x shows up: we have x to the fourth minus four x squared, so that x to the fourth turns into four times x cubed, that minus four x squared becomes minus eight x, and then y just looks like a constant, so we're adding a constant and nothing changes. So the first requirement is that this expression equals zero. Now for the second part, where we're looking for the partial derivative with respect to y, the only place where y shows up is this y squared term, so the partial derivative with respect to y is just two y, and we're setting that equal to zero. I chose a simple example where these partial derivative equations stay separate: this one nicely only includes x and this one nicely only includes y, but that's not always the case. You can imagine that if you intermingle the variables a little bit more, these will actually intermingle x's and y's and it'll be a harder thing to solve. But I just want something where we can actually start to find the solutions. So if we actually solve this system, this equation here, two y equals zero, just gives us the fact that y has to equal zero. So that's nice enough, right? And then this other equation, four x cubed minus eight x equals zero, let's go ahead and rewrite that where I factor out one of the x's and factor out a four, so this is four x multiplied by x squared minus two, and that has to equal zero. So there's two different ways this can equal zero, right? Either x itself is equal to zero, so that would be one solution, x equals zero, or x squared minus two is zero, which would mean x is plus or minus the square root of two. 
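A quick way to check those two partial derivatives, and the factored form of the x equation, is to let a computer algebra system do the differentiation; here is one possible sketch with sympy:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**4 - 4*x**2 + y**2     # the example function from the video

fx = sp.diff(f, x)           # partial with respect to x: 4*x**3 - 8*x
fy = sp.diff(f, y)           # partial with respect to y: 2*y

# Factoring makes the roots of the x equation visible: 4*x*(x**2 - 2) = 0
print(sp.factor(fx))
```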
So for the solution to the system of equations, we know that no matter what, y has to equal zero, and then one of three different things can happen: x equals zero, x equals positive square root of two, or x equals negative square root of two. So this gives us three separate solutions, and I'll go ahead and write them down. Our three solutions as ordered pairs are gonna be zero, zero, for when x is zero and y is zero; square root of two, zero; and negative square root of two, zero. These are the three different points, the three different values for x and y, that satisfy the two requirements that both partial derivatives are zero. What that should mean on the graph, then, is that when we look at those three different inputs, all of them have flat tangent planes. So the first one, zero, zero, if we kind of look above, I guess we're kind of inside the graph here, zero, zero is right at the origin. We can see, just looking at the graph, that that's actually a saddle point. This is neither a local maximum nor a local minimum; it doesn't look like a peak or like a valley. Then for the other two, where we kind of move along the x axis, it turns out that this point here is directly below x equals positive square root of two, and this other minimum is directly below x equals negative square root of two. I wouldn't have been able to guess that just looking at the graph, but we just figured it out. We can see visually that both of those are local minima. But the question is, how could we have figured that out once we find these solutions? If you didn't have the graph to look at, how could you have figured out that zero, zero corresponds to a saddle point, and that both of these other solutions correspond to local minima? Well, following the idea of the single variable second derivative test, what you might do is take the second partial derivatives of our function and see how they might influence concavity. 
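Solving the two partial-derivative equations simultaneously confirms these three critical points; one possible sketch uses sympy's `solve` on the system:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**4 - 4*x**2 + y**2

# Solve both partial-derivative equations at once for (x, y)
critical_points = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y])

# Three solutions: (0, 0), (sqrt(2), 0), and (-sqrt(2), 0)
print(critical_points)
```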
For example, if we take the second partial derivative with respect to x, and I'll try to squeeze it up here: the second partial derivative of the function with respect to x, doing that twice, means taking the derivative of this expression with respect to x. So we bring down that three, and the coefficient is gonna become 12, because three times four, times x squared: 12 times x squared, minus eight. What this means in terms of the graph is that if we move purely in the x direction, which means we kind of cut the graph with a plane representing a constant y value and look at the slice of the graph itself, this expression will tell us the concavity at every given point. So these bottom two points here correspond to x equals plus and minus the square root of two. So if we go over here and think about the case where x equals the square root of two, and we plug that into the expression, what are we gonna get? Well, if x equals the square root of two, then x squared is equal to two, so that's 12 times two, minus eight. So that's 24 minus eight; we're gonna get 16. Which is a positive number, which is why you have positive concavity at each of these points. So as far as the x direction is concerned, it feels like, oh yes, both of these have positive concavity, so they should look like local minima. Then if instead we went over here and said x equals zero, when you plug that in, you'd have 12 times zero, minus eight, and instead of 16 you would be getting negative eight. So because we have a negative amount, that gives you this negative concavity on the graph, which is why, as far as x is concerned, the origin looks like a local maximum. So let's actually write that down. We kind of go down here, and we're analyzing each one of these, and we think about what it looks like from the perspective of each variable. 
As far as x is concerned, that origin should look like a max, and then each of these two points should look like minima. This is kind of what the variable x thinks. And then for the variable y, if we do something similar and take the second partial derivative with respect to y, I'll go ahead and write that over here because this'll be pretty quick: the second partial derivative with respect to y means taking the derivative of this expression, two y, with respect to y, and that's just a constant. That's just two. And because it's positive, it's telling you that as far as y is concerned, there's positive concavity everywhere. On the graph, what that would mean, if you look at slices with a constant x value to see pure movement in the y direction, is that there's always going to be positive concavity. And here I've only drawn the plane where x is constantly equal to zero, but if you imagine kind of sliding that plane around left and right, you're always getting positive concavity. So as far as y is concerned, everything looks like a local minimum. So we kind of go down here, and you'd say everything looks like a local minimum: minimum, minimum, and minimum. So it might be tempting here to think that you're done, to think you've found all the information you need. Because you'd say, well, the x and y directions disagree about whether the origin should be a maximum or a minimum, which is why it looks like a saddle point, and then they agree on the other two points, that both of them should look like a minimum, which is why you might think both of these guys look like minima. However, that's actually not enough. There are examples I could draw where doing this kind of analysis would lead you to the wrong conclusion: you would conclude that certain points are a local minimum when in fact they're a saddle point. 
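The y-direction check is even shorter: the second partial with respect to y is the constant 2, so every critical point looks like a minimum from y's perspective. A small sketch summarizing what each variable "thinks" at the three critical points:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**4 - 4*x**2 + y**2

fxx = sp.diff(f, x, 2)   # 12*x**2 - 8: sign depends on x
fyy = sp.diff(f, y, 2)   # 2: positive everywhere, so y always sees a minimum

for px in [0, sp.sqrt(2), -sp.sqrt(2)]:
    x_view = 'max' if fxx.subs(x, px) < 0 else 'min'
    print((px, 0), 'x thinks:', x_view, '| y thinks: min')
```

As the video warns, agreement between these two one-variable views is not enough to conclude a minimum; the mixed partial derivative term of the full test is still missing.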
And the basic reason is that you need to take into account information given by that other second partial derivative. Because in the multi-variable world, you can take the partial derivative with respect to one variable, and then with respect to another, and you have to take into account this mixed partial derivative term in order to make full conclusions. I'm a little bit afraid that this video might be running long, so I'll cut it short here, and I will give you the second partial derivative test in its full glory, accounting for this mixed partial derivative term, in the next video. I'll also give intuition for where this term comes in and why the simple analysis we did in this case is close, and does give intuition, but isn't quite complete and won't always give you the right conclusion. All right, I will see you then.