If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked.

Main content

Interpretation of Lagrange multipliers

Lagrange multipliers are more than mere ghost variables that help to solve constrained optimization problems...

Lagrange multipliers technique, quick recap

Constrained optimization
Image credit: By Nexcis (Own work) [Public domain], via Wikimedia Commons
When you want to maximize (or minimize) a multivariable function start color #0c7f99, f, left parenthesis, x, comma, y, comma, dots, right parenthesis, end color #0c7f99 subject to the constraint that another multivariable function equals a constant, start color #bc2612, g, left parenthesis, x, comma, y, comma, dots, right parenthesis, equals, c, end color #bc2612, follow these steps:
  • Step 1: Introduce a new variable start color #0d923f, lambda, end color #0d923f, and define a new function L as follows:
    L, left parenthesis, x, comma, y, comma, dots, comma, start color #0d923f, lambda, end color #0d923f, right parenthesis, equals, start color #0c7f99, f, left parenthesis, x, comma, y, comma, dots, right parenthesis, end color #0c7f99, minus, start color #0d923f, lambda, end color #0d923f, left parenthesis, start color #bc2612, g, left parenthesis, x, comma, y, comma, dots, right parenthesis, minus, c, end color #bc2612, right parenthesis
    This function L is called the "Lagrangian", and the new variable start color #0d923f, lambda, end color #0d923f is referred to as a "Lagrange multiplier"
  • Step 2: Set the gradient of L equal to the zero vector.
    del, L, left parenthesis, x, comma, y, comma, dots, comma, start color #0d923f, lambda, end color #0d923f, right parenthesis, equals, start bold text, 0, end bold text, left arrow, start color gray, start text, Z, e, r, o, space, v, e, c, t, o, r, end text, end color gray
    In other words, find the critical points of L.
  • Step 3: Consider each solution, which will look something like left parenthesis, x, start subscript, 0, end subscript, comma, y, start subscript, 0, end subscript, comma, dots, comma, start color #0d923f, lambda, end color #0d923f, start subscript, 0, end subscript, right parenthesis. Plug each one into f. Or rather, first remove the start color #0d923f, lambda, end color #0d923f, start subscript, 0, end subscript component, then plug it into f, since f does not have start color #0d923f, lambda, end color #0d923f as an input. Whichever one gives the greatest (or smallest) value is the maximum (or minimum) point your are seeking.

Budgetary constraints, revisited

The last article covering examples of the Lagrange multiplier technique included the following problem.
  • Problem: Suppose you are running a factory, producing some sort of widget that requires steel as a raw material. Your costs are predominantly human labor, which is dollar sign, 20 per hour for your workers, and the steel itself, which runs for dollar sign, 170 per ton. Suppose your revenue R is loosely modeled by the equation
    R(h,s)=200h2/3s1/3\begin{aligned} \quad R(h, s) = 200 h^{2/3} s^{1/3} \end{aligned}
    Where
    • h represents hours of labor
    • s represents tons of steel
    If your budget is dollar sign, 20, comma, 000, what is the maximum possible revenue?
You can get a feel for this problem using the following interactive diagram, which let's you see which values of left parenthesis, h, comma, s, right parenthesis yield a given revenue (blue curve) and which values satisfy the constraint (red line).
The full details of the solution can be found in the last article. For our purposes here, you just need to know what happens in principle as we follow the steps of the Lagrange multiplier technique.
  • We start by writing the Lagrangian L, left parenthesis, h, comma, s, comma, lambda, right parenthesis based on the function start color #0c7f99, R, left parenthesis, h, comma, s, right parenthesis, end color #0c7f99 and the constraint start color #bc2612, 20, h, plus, 170, s, equals, 20, comma, 000, end color #bc2612.
    L(h,s,λ)=200h2/3s1/3λ(20h+170s20,000)\begin{aligned} \quad \mathcal{L}(h, s, \lambda) = \blueE{200h^{2/3}s^{1/3}}-\lambda(\redE{20h+170s-20{,}000}) \end{aligned}
  • Then we find the critical points of L, meaning the solutions to
    L(h,s,λ)=0\begin{aligned} \quad \nabla \mathcal{L}(h, s, \lambda) = 0 \end{aligned}
  • There might be several solutions left parenthesis, h, comma, s, comma, lambda, right parenthesis to this equation,
    (h0,s0,λ0)(h1,s1,λ1)(h2,s2,λ2)\begin{aligned} \quad (h_0, &s_0, \lambda_0) \\ (h_1, &s_1, \lambda_1) \\ (h_2, &s_2, \lambda_2) \\ &\vdots \end{aligned}
    so for each one you plug in the h and s components to the revenue function R, left parenthesis, h, comma, s, right parenthesis to see which one actually corresponds with the maximum.
It's common to write this maximizing critical point as left parenthesis, h, start superscript, times, end superscript, comma, s, start superscript, times, end superscript, comma, lambda, start superscript, times, end superscript, right parenthesis, using asterisk superscripts to indicate that this is a solution. This means h, start superscript, times, end superscript and s, start superscript, times, end superscript represent the hours of labor and tons of steel you should allocate to maximize revenue subject to your budget. But how can we interpret the Lagrange multiplier lambda, start superscript, times, end superscript that comes with these maximizing values? This is the core question of the article.
It turns out that lambda, start superscript, times, end superscript tells us how much more money we can make by changing our budget.
Let's get a feel for what it means to change the budget. The following tool is similar to the one above, but now the red line representing which points left parenthesis, h, comma, s, right parenthesis satisfy the budget constraint will shift as you let the budget vary around dollar sign, 20, comma, 000. This budget is represented with the variable b.
For each value of the budget b, try to maximize R while ensuring that the curves still touch each other. Notice that the maximum R-value you can achieve changes as b changes. We are interested in studying the specifics of that change.
Let M, start superscript, times, end superscript represent the maximum revenue you achieve. In the next interactive diagram, the only variable you can change is b, and you can see how the value of M, start superscript, times, end superscript depends on b.
In other words, this maximum revenue M, start superscript, times, end superscript is a function of the budget b, so we write it as
M(b)\begin{aligned} \quad M^*(b) \end{aligned}
We can now express a truly wonderful fact: The Lagrange multiplier lambda, start superscript, times, end superscript, left parenthesis, b, right parenthesis gives the derivative of M, start superscript, times, end superscript:
dMdb(b)=λ(b)\begin{aligned} \quad \dfrac{dM}{db}(b) = \lambda^*(b) \end{aligned}
In terms of the interactive diagram above, this means lambda, start superscript, times, end superscript, left parenthesis, b, right parenthesis tells you the rate of change of the black dot representing M, start superscript, times, end superscript as you move around the green dot representing b.
Showing why this is true is a bit tricky, but first, let's take a moment to interpret it. For example, if we found that lambda, start superscript, times, end superscript, left parenthesis, b, right parenthesis, equals, 2, point, 59, it would mean each additional dollar you spend over your budget would yield another dollar sign, 2, point, 59 in revenue. Conversely, decreasing your budget by a dollar will cost you that much in lost revenue.
This interpretation of lambda, start superscript, times, end superscript comes up commonly enough in economics to deserve a name: "Shadow price". It is the money gained by loosening the constraint by a single dollar, or conversely the price of strengthening the constraint by one dollar.

Generally speaking

Let's generalize what we just did with the budget example and see why it's true. Spelling out the full result is actually quite a mouthful, but it should be made clear by holding the following mantra in the back of your mind: "How does the solution change as the constraint changes?".
We start with the usual Lagrange multiplier setup. There is a function we want to maximize,
f(x,y)\begin{aligned} \quad f(x, y) \end{aligned}
and a constraint,
g(x,y)=c\begin{aligned} \quad g(x, y) = c \end{aligned}
We start by writing the Lagrangian,
L(x,y,λ)=f(x,y)λ(g(x,y)c).\begin{aligned} \quad {\mathcal{L}(x, y, \lambda) = f(x, y) - \lambda (g(x, y)-c)}. \end{aligned}
Let left parenthesis, x, start superscript, times, end superscript, comma, y, start superscript, times, end superscript, comma, lambda, start superscript, times, end superscript, right parenthesis be the critical point of L, which solves our constrained optimization problem. In other words,
del, L, left parenthesis, x, start superscript, times, end superscript, comma, y, start superscript, times, end superscript, comma, lambda, start superscript, times, end superscript, right parenthesis, equals, 0
And left parenthesis, x, start superscript, times, end superscript, comma, y, start superscript, times, end superscript, right parenthesis maximizes f (subject to the constraint).
When we start to think of c as a variable, we must account for the fact that the solution left parenthesis, x, start superscript, times, end superscript, comma, y, start superscript, times, end superscript, comma, lambda, start superscript, times, end superscript, right parenthesis changes as the constraint c changes. To do this, we start writing each component as a function of c:
x(c)y(c)λ(c)\begin{aligned} \quad x^*(c) \\ y^*(c) \\ \lambda^*(c) \\ \end{aligned}
In other words, when the constraint equals some value c, the solution triplet to the Lagrange multiplier problem is left parenthesis, x, start superscript, times, end superscript, left parenthesis, c, right parenthesis, comma, y, start superscript, times, end superscript, left parenthesis, c, right parenthesis, comma, lambda, start superscript, times, end superscript, left parenthesis, c, right parenthesis, right parenthesis.
We now let M, start superscript, times, end superscript, left parenthesis, c, right parenthesis represent the (constrained) maximum value of f as a function of c, which can be written in terms of f, x, start superscript, times, end superscript, left parenthesis, c, right parenthesis and y, start superscript, times, end superscript, left parenthesis, c, right parenthesis as follows:
M, start superscript, times, end superscript, left parenthesis, c, right parenthesis, equals, f, left parenthesis, x, start superscript, times, end superscript, left parenthesis, c, right parenthesis, comma, y, start superscript, times, end superscript, left parenthesis, c, right parenthesis, right parenthesis
The core result we wish to show is that
start fraction, d, M, start superscript, times, end superscript, divided by, d, c, end fraction, equals, lambda, start superscript, times, end superscript, left parenthesis, c, right parenthesis
This says that the Lagrange multiplier lambda, start superscript, times, end superscript gives the rate of change of the solution to the constrained maximization problem as the constraint varies.

Want to outsmart your teacher?

Proving this result could be an algebraic nightmare, since there is no explicit formula for the functions x, start superscript, times, end superscript, left parenthesis, c, right parenthesis, y, start superscript, times, end superscript, left parenthesis, c, right parenthesis, lambda, start superscript, times, end superscript, left parenthesis, c, right parenthesis or M, start superscript, times, end superscript, left parenthesis, c, right parenthesis. This means you would have to start with the defining property of x, start superscript, times, end superscript, y, start superscript, times, end superscript and lambda, start superscript, times, end superscript, namely that del, L, left parenthesis, x, start superscript, times, end superscript, comma, y, start superscript, times, end superscript, comma, lambda, start superscript, times, end superscript, right parenthesis, equals, 0, and reason your way towards start fraction, d, M, start superscript, times, end superscript, divided by, d, c, end fraction. This is not at all straight forward (try it!).
There is a fun story, in which a professor was asked what the harshest truth he ever learned from a student was. He recalled a class he taught when he went through a long and algebraically heavy proof, only to be shown by a student that there is a much simpler approach. The lesson, he said, was that he was not as smart as he thought he was.
The result he was talking about just so happens to be what we are now trying to prove. Although the student's approach is not quite so simple as the story makes it out to be, it is still a clean way to view the problem. More importantly, it is easier to remember than other proofs, so I'll spell it out in full here. As happens so often in math, a little insight can save us from excessive algebra.

The insight

The underlying insight is that evaluating the Lagrangian itself at a solution left parenthesis, x, start superscript, times, end superscript, comma, y, start superscript, times, end superscript, comma, lambda, start superscript, times, end superscript, right parenthesis will give the maximum value M, start superscript, times, end superscript. This is because the "g, left parenthesis, x, comma, y, right parenthesis, minus, c" term in the Lagrangian goes to zero (since a solution must satisfy the constraint), so we have
L(x,y,λ)=f(x,y)λ(g(x,y)c)=f(x,y)+0=M\begin{aligned} \quad \mathcal{L}(x^*, y^*, \lambda^*) &= f(x^*, y^*) - \lambda^*(g(x^*, y^*)-c) \\ &= f(x^*, y^*) + 0 \\ &= M^* \end{aligned}
Given that we want to find start fraction, d, M, start superscript, times, end superscript, divided by, d, start color #bc2612, c, end color #bc2612, end fraction, this suggests that we should find a way to treat L as a function of start color #bc2612, c, end color #bc2612. Then we might be able to relate the derivative we want to a derivative of L with respect to start color #bc2612, c, end color #bc2612.

The followthrough

Start by treating L as a function of four variable instead of three, since start color #bc2612, c, end color #bc2612 is now modeled as a changing value:
L(x,y,λ,c)=f(x,y)λ(g(x,y)c).\begin{aligned} \quad {\mathcal{L}(x, y, \lambda, \redE{c}) = f(x, y) - \lambda (g(x, y)-\redE{c})}. \end{aligned}
Reflection question: When L is written as a four-variable function like this, what is start fraction, \partial, L, divided by, \partial, start color #bc2612, c, end color #bc2612, end fraction?
Choose 1 answer:

This partial derivative is promising, since our goal is to show that start fraction, d, M, start superscript, times, end superscript, divided by, d, start color #bc2612, c, end color #bc2612, end fraction, equals, lambda, start superscript, times, end superscript, and we know that M, start superscript, times, end superscript, equals, L at solutions. However, we still have work to do.
To encode the fact that we only care about the value of L at a solutions left parenthesis, x, start superscript, times, end superscript, comma, y, start superscript, times, end superscript, comma, lambda, start superscript, times, end superscript, right parenthesis for a given value of start color #bc2612, start color #bc2612, c, end color #bc2612, end color #bc2612, we replace start color #0c7f99, x, end color #0c7f99, comma, start color #0d923f, y, end color #0d923f and start color #a75a05, lambda, end color #a75a05 with start color #0c7f99, x, start superscript, times, end superscript, left parenthesis, start color #bc2612, c, end color #bc2612, right parenthesis, end color #0c7f99, comma, start color #0d923f, y, start superscript, times, end superscript, left parenthesis, start color #bc2612, c, end color #bc2612, right parenthesis, end color #0d923f and start color #a75a05, lambda, start superscript, times, end superscript, left parenthesis, start color #bc2612, c, end color #bc2612, right parenthesis, end color #a75a05. These are functions of start color #bc2612, c, end color #bc2612 which correspond to the solution of the Lagrangian problem for a given choice of the "constant" start color #bc2612, c, end color #bc2612.
This lets us write M, start superscript, times, end superscript as a function of start color #bc2612, c, end color #bc2612 as follows:
M(c)=L(x(c),y(c),λ(c),c)\begin{aligned} \quad M^*(\redE{c}) = \mathcal{L}(\blueE{x^*(\redE{c})}, \greenE{y^*(\redE{c})}, \goldE{\lambda^*(\redE{c})}, {\redE{c}}) \end{aligned}
Even though this expression has only one variable, start color #bc2612, c, end color #bc2612, there is a four-variable function L as an intermediary. Therefore, to take its (ordinary) derivative with respect to c, we use the multivariable chain rule:
dMdc=ddcL(x(c),y(c),λ(c),c)=Lxdxdc+Lydydc+Lλdλdc+Lcdcdc\begin{aligned} \quad \dfrac{dM^*}{d\redE{c}} &= \dfrac{d}{d\redE{c}}\mathcal{L}(\blueE{x^*}(\redE{c}), \greenE{y^*}(\redE{c}), \goldE{\lambda^*}(\redE{c}), \redE{c}) \\ \\ &= \dfrac{\partial \mathcal{L}}{\blueE{\partial x}} \dfrac{d\blueE{x^*}}{d\redE{c}} + \dfrac{\partial \mathcal{L}}{\greenE{\partial y}} \dfrac{d\greenE{y^*}}{d\redE{c}} + \dfrac{\partial \mathcal{L}}{\goldE{\partial \lambda}} \dfrac{d\goldE{\lambda^*}}{d\redE{c}} + \dfrac{\partial \mathcal{L}}{\partial \redE{c}} \dfrac{d\redE{c}}{d\redE{c}} \end{aligned}
Note, each partial derivative in the expression above should be evaluated at left parenthesis, start color #0c7f99, x, start superscript, times, end superscript, end color #0c7f99, left parenthesis, start color #bc2612, c, end color #bc2612, right parenthesis, comma, start color #0d923f, y, start superscript, times, end superscript, end color #0d923f, left parenthesis, start color #bc2612, c, end color #bc2612, right parenthesis, comma, start color #a75a05, lambda, start superscript, times, end superscript, end color #a75a05, left parenthesis, start color #bc2612, c, end color #bc2612, right parenthesis, comma, start color #bc2612, c, end color #bc2612, right parenthesis, but writing that would make the expression more messy than it already is.
This might seem like a lot, but remember where the terms start color #0c7f99, x, start superscript, times, end superscript, end color #0c7f99, start color #0d923f, y, start superscript, times, end superscript, end color #0d923f and start color #a75a05, lambda, start superscript, times, end superscript, end color #a75a05 each came from. Each partial derivative start fraction, \partial, L, divided by, start color #0c7f99, \partial, x, end color #0c7f99, end fraction, start fraction, \partial, L, divided by, start color #0d923f, \partial, y, end color #0d923f, end fraction, and start fraction, \partial, L, divided by, start color #a75a05, \partial, lambda, end color #a75a05, end fraction is zero when evaluated at left parenthesis, start color #0c7f99, x, start superscript, times, end superscript, end color #0c7f99, comma, start color #0d923f, y, start superscript, times, end superscript, end color #0d923f, comma, start color #a75a05, lambda, start superscript, times, end superscript, end color #a75a05, right parenthesis. That's how a solution left parenthesis, start color #0c7f99, x, start superscript, times, end superscript, end color #0c7f99, comma, start color #0d923f, y, start superscript, times, end superscript, end color #0d923f, comma, start color #a75a05, lambda, start superscript, times, end superscript, end color #a75a05, right parenthesis is defined! This means the first three terms go to zero.
Lxdxdc+Lydydc+Lλdλdc+Lcdcdc\begin{aligned} \quad \cancel{\dfrac{\partial \mathcal{L}}{\blueE{\partial x}}} \dfrac{d\blueE{x^*}}{d\redE{c}} + \cancel{\dfrac{\partial \mathcal{L}}{\greenE{\partial y}}} \dfrac{d\greenE{y^*}}{d\redE{c}} + \cancel{\dfrac{\partial \mathcal{L}}{\goldE{\partial \lambda}}} \dfrac{d\goldE{\lambda^*}}{d\redE{c}} + \dfrac{\partial \mathcal{L}}{\partial \redE{c}} \dfrac{d\redE{c}}{d\redE{c}} \end{aligned}
Moreover, since start fraction, d, start color #bc2612, c, end color #bc2612, divided by, d, start color #bc2612, c, end color #bc2612, end fraction, equals, 1, the entire expression simplifies to
dMdc=Lc\begin{aligned} \quad \dfrac{dM^*}{d\redE{c}} = \dfrac{\partial \mathcal{L}}{\partial \redE{c}} \end{aligned}
It's important to notice that the reason for this simplification relies on the special properties of solution points left parenthesis, start color #0c7f99, x, start superscript, times, end superscript, end color #0c7f99, comma, start color #0d923f, y, start superscript, times, end superscript, end color #0d923f, comma, start color #a75a05, lambda, start superscript, times, end superscript, end color #a75a05, right parenthesis. Otherwise, working out the full derivative based on the multivariable chain rule could have been a nightmare!
For the sake of notational cleanliness, we left out the inputs to these derivatives, but let's write them in.
dMdc(c)=Lc(x(c),y(c),λ(c),c)\begin{aligned} \quad \dfrac{dM^*}{d\redE{c}}(\redE{c}) = \dfrac{\partial \mathcal{L}}{\partial \redE{c}}(\blueE{x^*}(\redE{c}), \greenE{y^*}(\redE{c}), \goldE{\lambda^*}(\redE{c}), \redE{c}) \end{aligned}
Since we saw in the reflection question above that start fraction, \partial, L, divided by, \partial, start color #bc2612, c, end color #bc2612, end fraction, equals, lambda, this means
dMdc(c)=λ(c)\begin{aligned} \quad \boxed{\dfrac{dM^*}{d\redE{c}}(\redE{c}) = \goldE{\lambda^*}(\redE{c})} \end{aligned}
Done!

Want to join the conversation?

  • female robot amelia style avatar for user Shubham
    While calculating dM*/dc why we take partial derivative with respect to x,y and λ and not x*,y* and λ*?
    (3 votes)
    Default Khan Academy avatar avatar for user
  • leaf red style avatar for user H Adnan
    In the previous article, there was an example with (lambda)=0, does this means that increasing the budget does not affect the revenue? and how are the constraints related to the budget now?
    (1 vote)
    Default Khan Academy avatar avatar for user
    • blobby green style avatar for user Cihan Baran
      Yes, this isn't explained all that clearly.

      We are implicitly assuming that you are constrained by the budget - and thus increasing your budget should give you further revenue.

      Mathematically, if you are constrained by your budget, then the optimal solution is at the boundary of the surface, meaning for optimal x*, and optimal y*, g(x*,y*) = c . In this case you have a positive lambda. Increasing c will lead to different, better x* and y*.

      If you are not constrained by your budget, in the optimal case, you have g(x*, y*) < c . Thus increasing c doesn't give you any extra juice, as x* and y* don't change. In this case, lambda is 0.

      In this article, it is implicitly assumed that you are constrained by your budget (or whatever your constraint is) so that increasing c will lead to different solutions. Otherwise, it becomes trivial.
      (1 vote)
  • blobby green style avatar for user iam_apocalypse
    very nice explanation! I'm sure why we are interested in how the solution changes with a change in c?
    (1 vote)
    Default Khan Academy avatar avatar for user
  • aqualine ultimate style avatar for user maycol.medina
    Yeah, i can kinda undertand and track the explanation, but i still have this feeling like i am leaving something out.
    I Just can't say i'm undertand the why of the Lagrange multipliers at all.
    (1 vote)
    Default Khan Academy avatar avatar for user
  • blobby green style avatar for user metavert
    Weird that so much time was spent on Lagrangians in this unit, but it doesn't appear on the unit test at all and there's not even a quiz. I'd have liked to test my understanding of it.
    (1 vote)
    Default Khan Academy avatar avatar for user
  • purple pi teal style avatar for user suhas
    A lot of textbooks interpret the Lagrange multiplier this way (see Strang, Gilbert). But there is an easier way without having to invent an auxiliary function with four variables.

    dM*/dc = df(x*,y*)/dc
    df(x*, y*)/dc = f_x(x*, y*) (dx/dc) + f_y(x*, y*) (dy/dc)
    , where the _x and _y are subscripts representing partial derivatives

    But, f_x(x*, y*) = λ* g_x(x*, y*)
    f_y(x*, y*) = λ* g_y(x*, y*)

    Hence,
    df(x*, y*)/dc = λ*[g_x(x*, y*)(dx/dc) + g_y(x*, y*)(dy/dc)] = λdg(x, y*)/dc


    Also,
    g(x*, y*) = c
    λdg(x*, y*)/dc = λ*dc/dc = λ*
    (1 vote)
    Default Khan Academy avatar avatar for user