# The Hessian matrix

## Video transcript

Hey guys. Before talking about the vector form for the quadratic approximation of multivariable functions, I've got to introduce this thing called the Hessian matrix. Essentially, it's just a way to package all the information of the second derivatives of a function.

Let's say you have some kind of multivariable function, like the example we had in the last video: f(x, y) = e^(x/2) sin(y). The Hessian matrix, which is often denoted with a bold-faced **H**, is a matrix, incidentally enough, that contains all the second partial derivatives of f. The first component, in the upper left, is the partial derivative of f with respect to x twice in a row. For everything in this first column, you first do it with respect to x, because the next entry down is the second derivative where first you do it with respect to x and then with respect to y. So that's the first column of the matrix. Then in the upper right, it's the partial derivative where first you do it with respect to y and then with respect to x, and in the lower right it's where you do it with respect to y both times in a row.

Let's go ahead and actually compute this and think about what it looks like for our specific function. To get all the second partial derivatives, we should first keep a record of the first partial derivatives. For the partial derivative of f with respect to x, the only place x shows up is in e^(x/2), so we bring down that 1/2, and sin(y) just looks like a constant as far as x is concerned. That gives (1/2) e^(x/2) sin(y). For the partial derivative of f with respect to y, now e^(x/2) looks like a constant, and it's being multiplied by something that has a y in it. The derivative of sin(y) with respect to y is cos(y), so we get e^(x/2) cos(y). These terms won't be included in the Hessian itself; we're just keeping a record of them.

Now we go in to fill in the matrix. For the upper left component, we're taking the second partial derivative where we do it with respect to x, then x again. We already did it with respect to x once; if we do it again, we bring down another half, so that becomes 1/4 times e^(x/2), and sin(y) still just looks like a constant: (1/4) e^(x/2) sin(y). Then there's the mixed partial derivative where we do it with respect to x, then y. When we differentiate (1/2) e^(x/2) sin(y) with respect to y, the (1/2) e^(x/2) just looks like a constant and the derivative of sin(y) ends up as cos(y), so we get (1/2) e^(x/2) cos(y). In the upper right it's going to be the same thing, but let's see how it goes when you do it in the other direction, first with respect to y, then x. If we take the derivative of e^(x/2) cos(y) with respect to x, the half comes down, so that's (1/2) e^(x/2) multiplied by cos(y), because cos(y) just looks like a constant when we're doing it with respect to x the second time.

It shouldn't feel like a surprise that both of these terms turn out to be the same. With most functions that's the case. Technically not all functions: you can come up with some crazy things where this won't be symmetric, where you'll have different terms off the diagonal. But for the most part, you can expect those to be the same.

Then there's the last term, where we do it with respect to y twice. We take the derivative of e^(x/2) cos(y) with respect to y. That e^(x/2) looks like a constant, and the derivative of cosine is negative sine, so we get -e^(x/2) sin(y).

So this whole thing, a matrix each of whose components is a multivariable function, is the Hessian of f. Sometimes people will write it as the Hessian of f, specifying what function it's of. You could think of it as a matrix-valued function, which feels kind of weird, but you plug in two values, x and y, and you get a matrix.

The nice thing about writing it like this is that you can extend it beyond functions that have just two variables. Let's say you had a function with three variables, or four variables, or any number. Say it was a function of x, y, and z. Then you can follow this pattern. Going down the first column, the next term you would get is the second partial derivative of f where first you do it with respect to x and then with respect to z. Next to it, at the bottom of the second column, is the second partial derivative where first you do it with respect to y and then with respect to z. Then you'd have another column: the second partial derivative where first you do it with respect to z and then with respect to x, then the one where first you do it with respect to z and then with respect to y, and then the very last component, where you do it with respect to z twice. So this whole three-by-three matrix would be the Hessian of a three-variable function. You can see how you could extend this pattern: if it was a four-variable function, you'd get a four-by-four matrix of all the possible second partial derivatives, and if it was a 100-variable function, you would have a 100-by-100 matrix.

The nice thing about having this is that we can talk about all of those second derivatives by just referencing this one symbol. We'll see in the next video how this makes it very nice to express, for example, the quadratic approximation of any kind of multivariable function, not just a two-variable function. The symbols don't get way out of hand, because you don't have to reference each one of these individual components; you can just reference the matrix as a whole and start doing matrix operations. I will see you in that next video.
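
Since the matrices drawn in the video don't survive in a transcript, here is a written-out version of what the narration describes. The notation is an assumption: the transcript only says the Hessian is written with a bold-faced H, and the layout below follows the spoken description (each column starts by differentiating with respect to one variable). In the fraction notation, the variable on the right of the denominator is the one differentiated first.

```latex
% Two-variable Hessian, followed by the entries computed for f(x, y) = e^{x/2} \sin y
\mathbf{H}f(x, y) =
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[1.5ex]
\dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2}
\end{bmatrix}
=
\begin{bmatrix}
\tfrac{1}{4} e^{x/2} \sin y & \tfrac{1}{2} e^{x/2} \cos y \\[1.5ex]
\tfrac{1}{2} e^{x/2} \cos y & -e^{x/2} \sin y
\end{bmatrix}

% Three-variable pattern described near the end of the transcript
\mathbf{H}f(x, y, z) =
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} & \dfrac{\partial^2 f}{\partial x\,\partial z} \\[1.5ex]
\dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2} & \dfrac{\partial^2 f}{\partial y\,\partial z} \\[1.5ex]
\dfrac{\partial^2 f}{\partial z\,\partial x} & \dfrac{\partial^2 f}{\partial z\,\partial y} & \dfrac{\partial^2 f}{\partial z^2}
\end{bmatrix}
```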
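
To sanity-check the four entries worked out in the transcript, here is a minimal Python sketch that is not part of the video. It compares those entries against a central finite-difference approximation. The helper names `analytic_hessian` and `numeric_hessian`, the step size `h`, and the evaluation point are all assumptions chosen for illustration.

```python
import math


def f(x, y):
    """Example function from the transcript: e^(x/2) * sin(y)."""
    return math.exp(x / 2) * math.sin(y)


def analytic_hessian(x, y):
    """Hessian entries as computed by hand in the transcript."""
    e = math.exp(x / 2)
    f_xx = 0.25 * e * math.sin(y)
    f_xy = 0.5 * e * math.cos(y)  # mixed partial; same in either order here
    f_yy = -e * math.sin(y)
    return [[f_xx, f_xy],
            [f_xy, f_yy]]


def numeric_hessian(func, point, h=1e-4):
    """Approximate the Hessian of func at point with central differences.

    Works for any number of variables: an n-variable function gives an
    n-by-n matrix, matching the pattern described in the transcript.
    """
    n = len(point)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate func with variable i moved by di*h and j by dj*h.
                p = list(point)
                p[i] += di * h
                p[j] += dj * h
                return func(*p)

            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return hess


if __name__ == "__main__":
    point = (1.0, 2.0)  # arbitrary evaluation point (assumption)
    for row_a, row_n in zip(analytic_hessian(*point), numeric_hessian(f, point)):
        print(row_a, row_n)
```

Because `numeric_hessian` takes the point as a tuple, passing a three-variable function and a three-element point gives the three-by-three case. The finite-difference values are only approximations, so expect agreement to several decimal places rather than exact equality.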