# Vector form of multivariable quadratic approximation

## Video transcript

Okay, so we are finally ready to express the quadratic approximation of a multivariable function in vector form. I have the whole thing written out here, where f is the function that we are trying to approximate, (x naught, y naught) is the constant point about which we are approximating, and this entire expression is the quadratic approximation, which I've talked about in past videos. If it seems very complicated or absurd, or you're unfamiliar with it, let's just dissect it real quick. This over here is the constant term; it's just going to evaluate to a constant. Everything over here is the linear term, because it just involves taking a variable multiplied by a constant. And then the remainder: every one of these components will have two variables multiplied into it, so x squared comes up, and x times y, and y squared comes up. So that's the quadratic term.

Now, to vectorize things, first of all let's write down the input variable (x, y) as a vector. Typically we'll do that with a bold-faced x to indicate that it's a vector, and its components are just going to be the single variables x and y, the non-bold-faced ones. So this is the vector representing the variable input. Correspondingly, a bold-faced x with a little subscript 0, x naught, is going to be the constant input, the single point in space near which we are approximating.

When we write things like that, this constant term, simply enough, is going to look like evaluating your function at that bold-faced x naught. So that's probably the easiest one to handle.

Now the linear term: this looks like a dot product. If we expand it out as a dot product, it looks like we're taking the partial derivative of f with respect to x and the partial derivative with respect to y, and we're evaluating both of those at that bold-faced x naught input. Now each one of those partial derivatives is multiplied by a variable minus a constant number, so this looks like taking
the dot product (here I'm going to erase the word "linear") with x minus x naught and y minus y naught. This is just expressing the same linear term, but as a dot product. The convenience here is that this is totally the same thing as saying the gradient of f, that's the vector that contains all the partial derivatives, evaluated at the special input x naught, and then we're taking the dot product between that and the variable vector bold-faced x minus x naught. When you do this component-wise, bold-faced x minus x naught will be x the variable minus x naught the constant, and y the variable minus y naught the constant, which is what we have up there. So this expression vectorizes the whole linear term.

And now the beef here, the hard part: how are we going to vectorize this quadratic term? That's what I was leading to in the last couple of videos, where I talked about how you express a quadratic form like this with a matrix. The way that you do it (I'm just going to scroll down to give us some room) is we'll have a matrix whose components are all of these constants. It'll be this one half times the second partial derivative evaluated there; for convenience's sake I'm going to just write one half times the second partial derivative with respect to x, and leave it as understood that we're evaluating it at this point. On the other diagonal entry you have one half times the other kind of partial derivative, with respect to y two times in a row. Then there's this constant here, the mixed partial derivative, but this term kind of gets broken apart into two different components. If you'll remember, in the quadratic form video it was always a, and then 2b, and c as your constants for the quadratic form. So if we're interpreting this as two times something, then it gets broken down: on one corner it shows up as one half f_xy, and on the other one as one half f_xy.
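
For reference, the pieces described so far can be written out as below. This is only a sketch in standard notation: the subscripts such as $f_{xx}$ are an assumed shorthand for second partial derivatives evaluated at $(x_0, y_0)$, and the labels $a$, $2b$, $c$ follow the quadratic-form convention mentioned in the transcript.

```latex
\begin{aligned}
\text{constant term:}\quad & f(\mathbf{x}_0) \\[4pt]
\text{linear term:}\quad & f_x(\mathbf{x}_0)\,(x - x_0) + f_y(\mathbf{x}_0)\,(y - y_0)
  = \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) \\[4pt]
\text{quadratic term:}\quad & \underbrace{\tfrac12 f_{xx}}_{a}\,(x - x_0)^2
  + \underbrace{f_{xy}}_{2b}\,(x - x_0)(y - y_0)
  + \underbrace{\tfrac12 f_{yy}}_{c}\,(y - y_0)^2 \\[4pt]
\text{matrix of constants:}\quad &
  \begin{bmatrix} a & b \\ b & c \end{bmatrix}
  =
  \begin{bmatrix}
    \tfrac12 f_{xx} & \tfrac12 f_{xy} \\
    \tfrac12 f_{xy} & \tfrac12 f_{yy}
  \end{bmatrix}
\end{aligned}
```

The mixed-partial coefficient $f_{xy}$ plays the role of $2b$, so each off-diagonal entry carries half of it.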
So both of these together are going to constitute the entire mixed partial derivative. Then the way that we express the quadratic form is we're going to multiply this by, well, by what? The first component is whatever the thing is that's squared here, so it's going to be x minus x naught, and the second component is whatever the other squared thing is, which in this case is y minus y naught. And of course we take that same vector and put it in on the other side too. Let me make a little bit of room, because this is going to be wide. We take that same vector and put it on its side, so it'll be x minus x naught as the first component and y minus y naught as the second component, but written horizontally. This, if you multiply out the entire matrix product, is going to give us the same expression that you have up here. If that seems unfamiliar, if you're wondering how you go from there to there, check out the video on quadratic forms, or you can check out the article where I talk about the quadratic approximation as a whole; I go through the computation there.

Now this matrix right here is almost the Hessian matrix, which is why I made a video about the Hessian matrix. It's not quite, because everything has a one half multiplied into it, so I'm just going to take that out, and we'll remember we have to multiply a one half in at some point. But otherwise it is the Hessian matrix, which we denote with a bold-faced H, and we emphasize that it's the Hessian of f; the Hessian is something you take of a function. And like I said, remember that each of these terms should be thought of as evaluated at the special input point, that bold-faced x naught. I was just too lazy to write in the (x naught, y naught) each time. But what we have then is we're multiplying it on the right by this whole vector, which is the variable
vector bold-faced x minus bold-faced x naught; that's what that entire vector is. Then we have the same thing on the left, bold-faced x minus x naught, except that we transpose it, we put it on its side, and the way you denote that is with a little T there for transpose. So this term captures all of the quadratic information that we need for the approximation.

So just to put it all together: if we go back up and put the constant term, the linear term, and this quadratic form that we just found all together, what we get is that the quadratic approximation of f, which we'll think of as a function of a vector input, bold-faced x, equals the function itself evaluated at whatever point we're approximating near, plus the gradient of f, which is its vector analog of a derivative, evaluated at that point (so this is a constant vector), dot product with the variable vector x minus the constant vector x naught, plus one half of, and we'll just copy down this whole quadratic term up there, the variable minus the constant, transposed, times the Hessian, which is like an extension of the second derivative to multivariable functions. Notice we're evaluating it at the constant x naught. And then on the right side we're multiplying it by the variable x minus x naught. This is the quadratic approximation in vector form.

The important part is that now it doesn't just have to be for a two-variable input. You could imagine plugging in a three-variable input or a four-variable input, and all of these terms make sense. You take the gradient of a four-variable function, you get a vector with four components; you take the Hessian of a four-variable function, you get a four-by-four matrix. I think it's also prettier to write it this way, because it looks a lot more like a Taylor expansion in the single-variable world, where
you have, you know, a constant term, plus the value of a derivative times x minus a constant, plus one half of what's kind of like the second derivative term, times what's kind of like an x squared. This is how it looks in the vector world. So in that way it's actually maybe a little more familiar than writing it out in the full component-by-component form, where it's easy to get lost in the weeds. So: the full vectorized form of the quadratic approximation of a scalar-valued multivariable function. Boy, is that a lot to say.
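
To make the vector form concrete, here is a minimal Python sketch of the formula described above, $Q_f(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0)\cdot(\mathbf{x}-\mathbf{x}_0) + \tfrac12(\mathbf{x}-\mathbf{x}_0)^T \mathbf{H}_f(\mathbf{x}_0)(\mathbf{x}-\mathbf{x}_0)$. None of this code comes from the video: the name $Q_f$, the helper `quadratic_approximation`, the example function sin(x)·cos(y), and its hand-derived `grad_f` and `hess_f` are all assumptions chosen for illustration. Because the sums run over however many components the input has, the same helper should work for three or four variables as well.

```python
import math


def quadratic_approximation(f, grad, hess, x0):
    """Build Q(x) = f(x0) + grad(x0).(x - x0) + 1/2 (x - x0)^T H(x0) (x - x0).

    f, grad, hess are caller-supplied callables (hypothetical interface):
    f returns a number, grad a list of n numbers, hess an n-by-n nested list.
    """
    f0 = f(x0)      # constant term
    g0 = grad(x0)   # gradient, evaluated once at the fixed point x0
    H0 = hess(x0)   # Hessian, evaluated once at the fixed point x0
    n = len(x0)

    def Q(x):
        d = [xi - x0i for xi, x0i in zip(x, x0)]  # the vector x - x0
        linear = sum(g0[i] * d[i] for i in range(n))
        quadratic = sum(d[i] * H0[i][j] * d[j]
                        for i in range(n) for j in range(n))
        return f0 + linear + 0.5 * quadratic

    return Q


# Illustrative two-variable example (an assumption, not from the video).
def f(p):
    x, y = p
    return math.sin(x) * math.cos(y)


def grad_f(p):
    x, y = p
    return [math.cos(x) * math.cos(y), -math.sin(x) * math.sin(y)]


def hess_f(p):
    x, y = p
    fxx = -math.sin(x) * math.cos(y)
    fxy = -math.cos(x) * math.sin(y)
    fyy = -math.sin(x) * math.cos(y)
    return [[fxx, fxy], [fxy, fyy]]


x0 = [0.5, 0.3]
Q = quadratic_approximation(f, grad_f, hess_f, x0)

x_near = [0.51, 0.28]
print(f(x_near), Q(x_near))  # the two values should be close near x0
```

The gradient and Hessian are evaluated only once, at the fixed point, which mirrors the point made in the transcript that they are constants in the formula while only x minus x naught varies.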