
# Quadratic approximation

Quadratic approximations extend the notion of a local linearization, giving an even closer approximation of a function.

## What we're building to

The goal, as with a local linearization, is to approximate a potentially complicated multivariable function $f$ near some input, which I'll write as the vector $\mathbf{x}_0$. A quadratic approximation does this more tightly than a local linearization, using the information given by second partial derivatives.

**Non-vector form**

In the specific case where the input of $f$ is two-dimensional, and you are approximating near a point $(x_0, y_0)$, you will see below that the quadratic approximation ends up looking like this:

$$
\begin{aligned}
Q_f(x, y) = f(x_0, y_0) &+ f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
&+ \tfrac{1}{2} f_{xx}(x_0, y_0)(x - x_0)^2 + f_{xy}(x_0, y_0)(x - x_0)(y - y_0) \\
&+ \tfrac{1}{2} f_{yy}(x_0, y_0)(y - y_0)^2
\end{aligned}
$$

**Vector form**

For a scalar-valued function $f$ with any kind of multidimensional input, here's what the general form of that approximation looks like:

$$
Q_f(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) + \tfrac{1}{2} (\mathbf{x} - \mathbf{x}_0)^T \mathbf{H}_f(\mathbf{x}_0) (\mathbf{x} - \mathbf{x}_0)
$$

I know it looks a bit complicated, but I'll step through it piece by piece later on. Here's a brief outline of each term.

- $f$ is a function with multidimensional input and a scalar output.
- $\nabla f(\mathbf{x}_0)$ is the gradient of $f$ evaluated at $\mathbf{x}_0$.
- $\mathbf{H}_f(\mathbf{x}_0)$ is the Hessian matrix of $f$ evaluated at $\mathbf{x}_0$.
- The vector $\mathbf{x}_0$ is a specific input, the one we are approximating near.
- The vector $\mathbf{x}$ represents the variable input.
- The approximation function $Q_f$ has the same value as $f$ at the point $\mathbf{x}_0$, all its partial derivatives have the same value as those of $f$ at this point, and all its *second* partial derivatives have the same value as those of $f$ at this point.

## Tighter and tighter approximations

Imagine you are given some function $f(x, y)$ with two inputs and one output, such as

$$f(x, y) = \sin(x)\cos(y)$$

The goal is to find a simpler function that approximates $f(x, y)$ near some particular point $(x_0, y_0)$. For example,

$$(x_0, y_0) = \left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right)$$

## Zero-order approximation

The most naive approximation would be a constant function which equals the value of $f$ at $(x_0, y_0)$ everywhere. We call this a "zero-order approximation".

**In the example**:

$$C(x, y) = \sin\left(\tfrac{\pi}{3}\right)\cos\left(\tfrac{\pi}{6}\right) = \tfrac{\sqrt{3}}{2} \cdot \tfrac{\sqrt{3}}{2} = \tfrac{3}{4}$$

**Written in the abstract**:

$$C(x, y) = f(x_0, y_0)$$

**Graphically**:

The graph of this approximation function $C(x, y)$ is a flat plane passing through the graph of our function at the point $(x_0, y_0, f(x_0, y_0))$. Below is a video showing how this approximation changes as we move the point $(x_0, y_0)$ around.

The graph of $f$ is pictured in blue, the graph of the approximation is white, and the point $(x_0, y_0, f(x_0, y_0))$ is pictured as a red dot.

## First-order approximation

The constant-function zero-order approximation is pretty lousy. Sure, it is guaranteed to equal $f(x, y)$ at the point $(x_0, y_0)$, but that's about it. One step better is to use a local linearization, also known as a "first-order approximation".

**In the example**:

$$L_f(x, y) = \tfrac{3}{4} + \tfrac{\sqrt{3}}{4}\left(x - \tfrac{\pi}{3}\right) - \tfrac{\sqrt{3}}{4}\left(y - \tfrac{\pi}{6}\right)$$

**Written in the abstract**:

$$L_f(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

Here, $f_x$ and $f_y$ denote the partial derivatives of $f$.

**Graphically**:

The graph of a local linearization is the plane **tangent** to the graph of $f$ at the point $(x_0, y_0, f(x_0, y_0))$. Here is a video showing how this approximation changes as we move around the point $(x_0, y_0)$:

## Second-order approximation

Better still is a **quadratic approximation**, also called a "second-order approximation". The remainder of this article is devoted to finding and understanding the analytic form of such an approximation, but before diving in, let's see what such approximations look like graphically. You can think of these approximations as nestling into the curves of the graph at the point $(x_0, y_0, f(x_0, y_0))$, giving it a sort of mathematical hug.

## "Quadratic" means product of two variables

In single-variable functions, the word "quadratic" refers to any situation where a variable is squared, as in the term $x^2$. With multiple variables, "quadratic" refers not only to square terms, like $x^2$ and $y^2$, but also to terms that involve the product of two separate variables, such as $xy$.
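This counting of variables can be checked mechanically. Here is a quick sketch using sympy (a real computer-algebra library; the specific expressions are just illustrations), where `Poly.total_degree` reports the total order of a term:

```python
# Quick sketch: sympy's Poly.total_degree counts the total number of
# variables multiplied into a term, matching the notion of "order" here.
import sympy as sp

x, y = sp.symbols("x y")

# Square terms and products of two distinct variables are both order 2,
# so both count as "quadratic":
print(sp.Poly(x**2, x, y).total_degree())             # 2
print(sp.Poly(x * y, x, y).total_degree())            # 2

# A term like 3*x**2*y**3 has order 5: two x's, three y's.
print(sp.Poly(3 * x**2 * y**3, x, y).total_degree())  # 5
```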

In general, the "order" of a term which is the product of several things, such as $3x^2y^3$, is the total number of *variables* multiplied into that term. In this case, the order would be $5$: two $x$'s, three $y$'s, and the constant doesn't matter.

## Graphs of quadratic functions

One way to think of quadratic functions is in terms of their **concavity**, which might depend on which direction you are moving in.

If the function has an upward concavity, as is the case, for example, with $f(x, y) = x^2 + y^2$, the graph will look something like this:

This shape, a sort of three-dimensional parabola, goes by the name **paraboloid**.

If the function is concave up in one direction and linear in another, the graph looks like a parabolic curve that has been dragged through space to trace out a surface. For example, this happens in the case of $f(x, y) = x^2 + y$:

Finally, if the graph is concave up when traveling in one direction, but concave down when traveling in another direction, as is the case for $f(x, y) = x^2 - y^2$, the graph looks a bit like a saddle. Here's what such a graph looks like:

## Reminder on the local linearization recipe

To actually write down a quadratic approximation of a function $f$ near the point $(x_0, y_0)$, we build up from the local linearization:

$$L_f(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

It's worth walking through the recipe for finding the local linearization one more time since the recipe for finding a quadratic approximation is very similar.

- Start with the constant term $f(x_0, y_0)$, so that our approximation at least matches $f$ at the point $(x_0, y_0)$.
- Add on the linear terms $f_x(x_0, y_0)(x - x_0)$ and $f_y(x_0, y_0)(y - y_0)$.
- Use the constants $f_x(x_0, y_0)$ and $f_y(x_0, y_0)$ to ensure that our approximation has the same partial derivatives as $f$ at the point $(x_0, y_0)$.
- Use the terms $(x - x_0)$ and $(y - y_0)$ instead of simply $x$ and $y$ so that we don't mess up the fact that our approximation equals $f(x_0, y_0)$ at the point $(x_0, y_0)$.
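The recipe above translates almost line for line into code. Here is a minimal sympy sketch (the helper name `local_linearization` is our own, not from the article):

```python
# A sketch of the local linearization recipe: constant term, plus linear
# terms whose coefficients are the partial derivatives at (x0, y0).
import sympy as sp

x, y = sp.symbols("x y")

def local_linearization(f, x0, y0):
    """L_f(x, y) = f(x0, y0) + f_x(x0, y0)(x - x0) + f_y(x0, y0)(y - y0)."""
    at = {x: x0, y: y0}
    return (f.subs(at)
            + sp.diff(f, x).subs(at) * (x - x0)
            + sp.diff(f, y).subs(at) * (y - y0))

# The example function from this article, about (pi/3, pi/6):
L = local_linearization(sp.sin(x) * sp.cos(y), sp.pi / 3, sp.pi / 6)
print(sp.expand(L))
```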

## Finding the quadratic approximation

For the quadratic approximation, we add on the quadratic terms $(x - x_0)^2$, $(x - x_0)(y - y_0)$, and $(y - y_0)^2$. For now, we write their coefficients as the constants $a$, $b$, and $c$, which we will solve for in a moment:

$$Q_f(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) + a(x - x_0)^2 + b(x - x_0)(y - y_0) + c(y - y_0)^2$$

In the same way that we made sure the local linearization has the same partial derivatives as $f$ at $(x_0, y_0)$, we want the quadratic approximation to have the same second partial derivatives as $f$ at this point.

The really nice thing about the way I wrote $Q_f$ above is that the second partial derivative $\frac{\partial^2 Q_f}{\partial x^2}$ depends *only* on the $a(x - x_0)^2$ term.

**Try it!** Take the second partial derivative with respect to $x$ of every term in the expression for $Q_f(x, y)$ above, and notice that they all go to zero except for the $a(x - x_0)^2$ term.

Did you really try it? I'm serious, take a moment to reason through it. It really helps in understanding why Q, start subscript, f, end subscript is expressed the way it is.

This fact is nice because rather than taking the second partial derivative of the entire monstrous expression, you can view it like this:

$$\frac{\partial^2 Q_f}{\partial x^2} = \frac{\partial^2}{\partial x^2}\left(a(x - x_0)^2\right) = 2a$$

Since the goal is for this to match $f_{xx}(x, y)$ at the point $(x_0, y_0)$, you can solve for $a$ like this:

$$2a = f_{xx}(x_0, y_0) \quad\Longrightarrow\quad a = \tfrac{1}{2} f_{xx}(x_0, y_0)$$

**Test yourself**: Use similar reasoning to figure out what the constants $b$ and $c$ should be.

We can now write our final quadratic approximation, with all six of its terms working in harmony to mimic the behavior of $f$ at $(x_0, y_0)$:

$$
\begin{aligned}
Q_f(x, y) = f(x_0, y_0) &+ f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
&+ \tfrac{1}{2} f_{xx}(x_0, y_0)(x - x_0)^2 + f_{xy}(x_0, y_0)(x - x_0)(y - y_0) \\
&+ \tfrac{1}{2} f_{yy}(x_0, y_0)(y - y_0)^2
\end{aligned}
$$
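The six-term formula can be implemented directly. Here is a short sympy sketch (the helper name `quadratic_approx` is our own), mirroring the formula term by term:

```python
# A sketch of the full quadratic approximation Q_f about (x0, y0):
# constant + linear terms + (1/2)f_xx, f_xy, (1/2)f_yy quadratic terms.
import sympy as sp

x, y = sp.symbols("x y")

def quadratic_approx(f, x0, y0):
    at = {x: x0, y: y0}
    dx, dy = x - x0, y - y0
    return (f.subs(at)
            + sp.diff(f, x).subs(at) * dx
            + sp.diff(f, y).subs(at) * dy
            + sp.Rational(1, 2) * sp.diff(f, x, 2).subs(at) * dx**2
            + sp.diff(f, x, y).subs(at) * dx * dy
            + sp.Rational(1, 2) * sp.diff(f, y, 2).subs(at) * dy**2)

f = sp.sin(x) * sp.cos(y)
Q = quadratic_approx(f, sp.pi / 3, sp.pi / 6)

# Q matches f's value and second partials at the expansion point:
print(Q.subs({x: sp.pi / 3, y: sp.pi / 6}))   # 3/4
print(sp.diff(Q, x, 2))                        # -3/4
```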

## Example: Approximating $\sin(x)\cos(y)$

To see this beast in action, let's try it out on the function from the introduction.

**Problem**: Find the quadratic approximation of

$$f(x, y) = \sin(x)\cos(y)$$

about the point $(x, y) = \left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right)$.

**Solution**:

To collect all the necessary information, you need to evaluate $f(x, y) = \sin(x)\cos(y)$, all of its partial derivatives, and all of its second partial derivatives at the point $\left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right)$:

$$
\begin{aligned}
f(x, y) &= \sin(x)\cos(y) & f\left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right) &= \tfrac{\sqrt{3}}{2} \cdot \tfrac{\sqrt{3}}{2} = \tfrac{3}{4} \\
f_x(x, y) &= \cos(x)\cos(y) & f_x\left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right) &= \tfrac{1}{2} \cdot \tfrac{\sqrt{3}}{2} = \tfrac{\sqrt{3}}{4} \\
f_y(x, y) &= -\sin(x)\sin(y) & f_y\left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right) &= -\tfrac{\sqrt{3}}{2} \cdot \tfrac{1}{2} = -\tfrac{\sqrt{3}}{4} \\
f_{xx}(x, y) &= -\sin(x)\cos(y) & f_{xx}\left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right) &= -\tfrac{3}{4} \\
f_{xy}(x, y) &= -\cos(x)\sin(y) & f_{xy}\left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right) &= -\tfrac{1}{2} \cdot \tfrac{1}{2} = -\tfrac{1}{4} \\
f_{yy}(x, y) &= -\sin(x)\cos(y) & f_{yy}\left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right) &= -\tfrac{3}{4}
\end{aligned}
$$

Almost there! As a final step, apply all these values to the formula for a quadratic approximation:

$$
\begin{aligned}
Q_f(x, y) = \tfrac{3}{4} &+ \tfrac{\sqrt{3}}{4}\left(x - \tfrac{\pi}{3}\right) - \tfrac{\sqrt{3}}{4}\left(y - \tfrac{\pi}{6}\right) \\
&- \tfrac{3}{8}\left(x - \tfrac{\pi}{3}\right)^2 - \tfrac{1}{4}\left(x - \tfrac{\pi}{3}\right)\left(y - \tfrac{\pi}{6}\right) - \tfrac{3}{8}\left(y - \tfrac{\pi}{6}\right)^2
\end{aligned}
$$

This, for example, is the formula I had to plug into the graphing software to generate the animation of quadratic approximations.
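As a quick numerical sanity check on the worked example (a sketch using only the standard library; not part of the original article), we can compare $Q_f$ against $f$ at points near $\left(\tfrac{\pi}{3}, \tfrac{\pi}{6}\right)$:

```python
# Compare the worked-example quadratic approximation against f = sin(x)cos(y)
# near the expansion point (pi/3, pi/6).
import math

x0, y0 = math.pi / 3, math.pi / 6

def f(x, y):
    return math.sin(x) * math.cos(y)

def Q(x, y):
    dx, dy = x - x0, y - y0
    s = math.sqrt(3)
    return (3/4 + (s/4)*dx - (s/4)*dy
            - (3/8)*dx**2 - (1/4)*dx*dy - (3/8)*dy**2)

# At the expansion point the two agree exactly; nearby, the error is tiny.
print(abs(f(x0, y0) - Q(x0, y0)))
print(abs(f(x0 + 0.1, y0 + 0.1) - Q(x0 + 0.1, y0 + 0.1)))
```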

## Vector notation using the Hessian

Perhaps it goes without saying that the expression for the quadratic approximation is long. Now imagine if $f$ had three inputs, $x$, $y$, and $z$. In principle, you can imagine how this might go, adding terms involving $f_z$, $f_{xz}$, $f_{zz}$, and so on, with all $3$ partial derivatives and all $9$ second partial derivatives. But this would be a total nightmare!

Now imagine you were writing a program to find the quadratic approximation of a function with 100 inputs. Madness!

It actually doesn't have to be that bad. When something is not that complicated in principle, it shouldn't be that complicated in notation. Quadratic approximations are a *little* complicated, sure, but they're not absurd. In vector notation, using the gradient and the Hessian, the approximation looks like this:

$$
Q_f(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) + \tfrac{1}{2} (\mathbf{x} - \mathbf{x}_0)^T \mathbf{H}_f(\mathbf{x}_0) (\mathbf{x} - \mathbf{x}_0)
$$

Let's break this down:

- The boldfaced $\mathbf{x}$ represents the input variable(s) as a vector. Moreover, $\mathbf{x}_0$ is a particular vector in the input space. If this vector has two components, this formula for $Q_f$ is just a different way to write the one we derived before, but it could also represent a vector with any other dimension.
- The dot product $\nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0)$ will expand into the sum of all terms of the form $f_x(\mathbf{x}_0)(x - x_0)$, $f_y(\mathbf{x}_0)(y - y_0)$, etc. If this is not familiar from the vector notation for local linearization, work it out for yourself in the two-dimensional case to see!
- The little superscript $T$ in the expression $(\mathbf{x} - \mathbf{x}_0)^T$ indicates "transpose". This means you take the initial column vector $(\mathbf{x} - \mathbf{x}_0)$, which looks something like this: $\begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix}$. Then you flip it, to get something like this: $\begin{bmatrix} x - x_0 & y - y_0 \end{bmatrix}$.
- $\mathbf{H}_f(\mathbf{x}_0)$ is the Hessian of $f$ evaluated at $\mathbf{x}_0$.
- The expression $(\mathbf{x} - \mathbf{x}_0)^T \mathbf{H}_f(\mathbf{x}_0) (\mathbf{x} - \mathbf{x}_0)$ might seem complicated if you have never come across something like it before. This way of expressing quadratic terms is actually quite common in vector calculus and vector algebra, so it's worth expanding an expression like this at least a few times in your life. For example, try working it out in the case where $\mathbf{x}$ is two-dimensional. You should find that it is exactly $2$ times the quadratic portion of the non-vectorized formula we derived above.
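In code, the vector form really is this compact. Here is a numpy sketch (the function names `grad`, `hess`, and `quadratic_approx` are our own, with the gradient and Hessian of $f(x, y) = \sin(x)\cos(y)$ written out by hand):

```python
# Vector form: Q_f(x) = f(x0) + grad f(x0) . (x - x0)
#                     + (1/2) (x - x0)^T H_f(x0) (x - x0)
import numpy as np

def f(v):
    x, y = v
    return np.sin(x) * np.cos(y)

def grad(v):
    x, y = v
    return np.array([np.cos(x) * np.cos(y),
                     -np.sin(x) * np.sin(y)])

def hess(v):
    x, y = v
    # Symmetric matrix of second partials: [[f_xx, f_xy], [f_yx, f_yy]]
    return np.array([[-np.sin(x) * np.cos(y), -np.cos(x) * np.sin(y)],
                     [-np.cos(x) * np.sin(y), -np.sin(x) * np.cos(y)]])

def quadratic_approx(v, v0):
    d = v - v0
    return f(v0) + grad(v0) @ d + 0.5 * d @ hess(v0) @ d

v0 = np.array([np.pi / 3, np.pi / 6])
v = v0 + 0.05
# Near v0 the approximation tracks f closely.
print(quadratic_approx(v, v0), f(v))
```

Note that the same code works unchanged for inputs of any dimension, which is the whole appeal of the vector notation.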

## What's the point?

In truth, it is a real pain to compute a quadratic approximation by hand, and it requires staying *very* organized to do so without making a little mistake. In practice, people rarely work through a quadratic approximation like the example above, but knowing how they work is useful for at least two broad reasons:

- **Computation**: Even if you never have to write out a quadratic approximation, you may one day need to program a computer to do it for a particular function. Or, even if you are relying on someone else's program, you may need to analyze how and why the approximation is failing in some circumstance.
- **Theory**: Being able to reference a second-order approximation helps us reason about the behavior of general functions near a point. This will be useful later in figuring out whether a point is a local maximum or minimum.
