© 2024 Khan Academy

# Meaning of the Lagrange multiplier

In the previous videos on Lagrange multipliers, the Lagrange multiplier itself has just been some proportionality constant that we didn't care about. Here, you can see what its real meaning is. Created by Grant Sanderson.

## Want to join the conversation?

- Will there always be only one value for lambda?
- Each solution will have its own lambda. That is, if you are trying to find extrema for f(x,y) under the constraint g(x,y) = b, you will get a set of points (x1,y1), (x2,y2), etc. that represent local minima and maxima. At each of these, there will be a single lambda. There is no guarantee that all the lambdas will be the same (it is quite likely they will differ from each other).
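
As a small illustration of "one lambda per solution", here is a sketch on a made-up problem (not the one from the video): find the extrema of f(x, y) = x + y on the circle x^2 + y^2 = b. The critical points are worked out by hand in the docstring and the code only evaluates them. The function name `critical_points` is hypothetical.

```python
import math

def critical_points(b):
    """Critical points of f(x, y) = x + y on the circle x^2 + y^2 = b.

    Hypothetical toy problem. grad f = (1, 1) and grad g = (2x, 2y), so
    grad f = lambda * grad g forces x = y = 1 / (2 * lambda), and the
    constraint then gives x = +sqrt(b / 2) or x = -sqrt(b / 2).
    """
    points = []
    for sign in (1.0, -1.0):
        x = y = sign * math.sqrt(b / 2)
        lam = 1 / (2 * x)  # from the first component: 1 = lambda * 2x
        points.append({"x": x, "y": y, "lambda": lam, "f": x + y})
    return points

solutions = critical_points(2.0)
for p in solutions:
    print(p)
```

With b = 2, the maximum at (1, 1) comes with lambda = 0.5 and the minimum at (-1, -1) comes with lambda = -0.5, so the two solutions do not share a lambda.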

- The insight at the end of the video is that the lambda at the maximum indicates the revenue per dollar. A lambda of 2.3 indicates that for every extra $1 of budget, the maximum revenue goes up by about $2.30. Does this hold only for small changes in the constraint, since we are dealing with differential calculus (i.e. changes of a few percent in the budget constraint), or does it also hold for very large changes in the constraint (using the example in the video, changing the budget from 10,000 to 100,000)?
- This interpretation of the Lagrange multiplier (where lambda is some constant, such as 2.3) strictly holds only for an infinitesimally small change in the constraint. It will probably be a very good estimate as you make small finite changes, and will likely be a poor estimate as you make large changes in the constraint.

That being said, if you could find lambda as a function of b (this is easier said than done), you could integrate over your increase in budget to find the increase in revenue. This would hold true for any increase in budget, no matter how large.
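
To make the small-change versus large-change point concrete, here is a minimal sketch on a made-up problem where lambda can be found as a function of b: maximize R(h, s) = h * s subject to h + s = b. This is an assumption chosen for easy algebra, not the revenue function from the video, and the helper names are hypothetical.

```python
def solve(b):
    """Maximize R(h, s) = h * s subject to h + s = b (hypothetical toy problem).

    grad R = (s, h) and grad B = (1, 1), so s = h = lambda, and the
    constraint h + s = b gives h* = s* = lambda* = b / 2.
    """
    h = s = lam = b / 2
    return h, s, lam

def max_revenue(b):
    """M*(b): the revenue at the constrained maximum for budget b."""
    h, s, _ = solve(b)
    return h * s

def integrate_lambda(b0, b1, steps=1000):
    """Midpoint-rule integral of lambda*(b) from b0 to b1."""
    width = (b1 - b0) / steps
    return sum(solve(b0 + (i + 0.5) * width)[2] for i in range(steps)) * width

b0 = 10.0
lam0 = solve(b0)[2]
for b1 in (10.1, 100.0):
    exact = max_revenue(b1) - max_revenue(b0)   # true change in M*
    linear = lam0 * (b1 - b0)                   # treat lambda as a constant
    integral = integrate_lambda(b0, b1)         # let lambda vary with b
    print(b1, exact, linear, integral)
```

For the small step from 10 to 10.1, the constant-lambda estimate (0.5) is close to the exact change (about 0.5025). For the jump from 10 to 100 it gives 450 against an exact change of 2475, while integrating lambda over the budget increase recovers the exact change.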

- So h*, for example, depends on b because when b (the constraint constant) changes, the critical values also change, correct?
- When b changes, it shifts the constraint curve. Remember the circular constraint in *Constrained optimization introduction*: when the constant changes, the circle grows or shrinks. It's then easy to see that the x and y coordinates of the point of tangency can change.

- At 1:54, is it possible that R(h, s) reaches its maximum at a smaller budget than B(h, s) = 10k? That is, R(h1, s1) > R(h2, s2), where B(h1, s1) < 10k and B(h2, s2) = 10k. We are still finding values for (h, s).
- In general, YES!

We would, however, have to formulate our problem slightly differently. Our problem statement said that we would spend $10,000. A better way to formulate the problem would be to say that we can spend *up to* $10,000.

Optimizing with this new constraint (an inequality constraint) tends to be more difficult, but can still be done (by combining the techniques in this module with those presented in the previous module). With this adjusted constraint, we may find out that the highest revenue will be reached when we spend *less than* $10,000. In this case, the budget would be what we call an *inactive constraint*.

If the Lagrange multiplier were negative, it would be a sure sign that a higher revenue could be attained by spending less.
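
Here is a rough sketch of what a negative multiplier looks like, using a made-up revenue function that peaks at a cheap input (an assumption for illustration, not the function from the video). The budget is B(h, s) = h + s, and the names `revenue` and `solve_on_budget` are hypothetical.

```python
def revenue(h, s):
    # Hypothetical revenue that peaks at (h, s) = (1, 1); not from the video.
    return 100 - (h - 1) ** 2 - (s - 1) ** 2

def solve_on_budget(b):
    """Critical point of revenue on the line h + s = b.

    grad R = (-2(h - 1), -2(s - 1)) and grad B = (1, 1), so h = s = b / 2
    and lambda = -2 * (b / 2 - 1) = 2 - b.
    """
    h = s = b / 2
    lam = 2 - b
    return h, s, lam

h, s, lam = solve_on_budget(10.0)
forced = revenue(h, s)      # revenue when exactly 10 units must be spent
best = revenue(1.0, 1.0)    # revenue at the unconstrained peak, which costs 2
print(lam, forced, best)
```

With a budget of 10, lambda comes out as -8, and the revenue when the whole budget is spent (68) is below the revenue at the cheaper point (1, 1), which is 100. Under a "spend up to 10" formulation the budget would be an inactive constraint here.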

- Since Grant specifies that this function is being maximized under a certain constrained function, I was wondering if there's a separate procedure for handling optimizations where we need to find a minimum value within a constraint. For instance, if we want to find the input points that would give the minimum value for an error function within a certain constraint of values, would the procedure still remain the same, or would it change to handle optimizations for finding minima?
- It is very common to use optimization to minimize a function. Imagine, for example, that you want to minimize a cost or an error.

The procedure Grant has shown here finds both the inputs which maximize R *and* the inputs which minimize R. This happens because this method finds *all* points at which the gradient of R is proportional to the gradient of the constraint, and the gradients are proportional at both maximum *and* minimum points.

If you recall from previous videos, Grant worked through the problem to find multiple possible solutions. He then plugged each solution back into his Revenue function to find which ones yielded the highest revenue. It turns out that (at least) one of the solutions he found (and discarded) minimizes the Revenue function.

As a side note, it turns out that *all* optimization procedures which *maximize* a function can be used to *minimize* a function simply by putting a negative in front of the function you wish to minimize.
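
The negation trick from the side note can be sketched as follows. Note that this uses a plain brute-force scan around a circle as the "maximizer", not the Lagrange multiplier method, and the objective f(x, y) = x + y and the name `argmax_on_circle` are made up for illustration.

```python
import math

def f(x, y):
    # Hypothetical objective we want to MINIMIZE on the unit circle.
    return x + y

def argmax_on_circle(func, radius=1.0, samples=3600):
    """Brute-force maximizer: try evenly spaced points on the circle."""
    best_angle = max(
        (2 * math.pi * k / samples for k in range(samples)),
        key=lambda t: func(radius * math.cos(t), radius * math.sin(t)),
    )
    return radius * math.cos(best_angle), radius * math.sin(best_angle)

# A maximizer applied to -f returns the point that minimizes f.
x_min, y_min = argmax_on_circle(lambda x, y: -f(x, y))
print(x_min, y_min, f(x_min, y_min))
```

The scan lands near (-0.707, -0.707), where x + y takes its smallest value on the unit circle, even though the routine only knows how to maximize.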

- My microeconomics textbook has positive signs for the Lagrange multipliers, like L = f(x1, x2, ..., xn) + lambda * g(x1, x2, ..., xn). Is the sign of the Lagrange multiplier irrelevant, so that we can use whichever we want?
- You are free to flip the sign on the Lagrange multiplier. However, you will have to interpret the meaning of lambda differently. Rather than showing how much your function will *increase* as you increase your constraint, it will show how much your function will *decrease* as you increase your constraint.
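
The two sign conventions can be put side by side. This is a sketch in the video's notation (R, B, b, M*), assuming the constraint is written as B(h, s) - b in both cases. The symbol mu is a hypothetical name for the textbook-style multiplier, introduced here only to keep the two apart.

```latex
\begin{align*}
\mathcal{L}_{-}(h,s,\lambda) &= R(h,s) - \lambda\,\bigl(B(h,s) - b\bigr)
  &&\Rightarrow\quad \nabla R = \lambda^{*}\,\nabla B,
  \quad \lambda^{*} = \frac{dM^{*}}{db} \\
\mathcal{L}_{+}(h,s,\mu) &= R(h,s) + \mu\,\bigl(B(h,s) - b\bigr)
  &&\Rightarrow\quad \nabla R = -\mu^{*}\,\nabla B,
  \quad \mu^{*} = -\lambda^{*} = -\frac{dM^{*}}{db}
\end{align*}
```

Both versions find the same points (h*, s*); only the sign of the multiplier changes. If a textbook instead writes the constraint term as b - B(h, s), then the plus-sign version is algebraically the same as the minus-sign version here, and the multiplier keeps the video's meaning.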

- This seems like an overly complicated way of doing this. Just create a new function F(h, s, lambda) = R(h, s) + lambda * g(h, s), take its partial derivatives, and solve.
- What is a Lagrange multiplier?

## Video transcript

- [Instructor] Hey folks. In this video I wanna show you
something pretty interesting about these Lagrange multipliers
that we've been studying. So the first portion I'm just
gonna kind of get this set up, which is a lot of review
from what we've seen already, but I think you're really
gonna like where this is going in the end. So one of the examples I
showed, and I think this is a pretty nice prototypical example for constrained optimization problems, is that you're running a company and you have some kind of revenue function that's dependent on various choices you make in running the company, and I think I said the
number of hours of labor you employ and the number
of tons of steel you use, you know, if you were
manufacturing something metallic. And, you know, this might be
modeled as some multivariable function of h and s, right
now we don't really care about the specifics. And you're trying to maximize this, right, that's kind of the
whole point of this unit that we've been doing, is
that you're trying to maximize some function, but you have a constraint. This is the real world, you
can't just spend infinite money, you have some kind of budget,
some sort of amount of money you spend as a function of
those same choices you make, the hours of labor you employ,
the tons of steel you use. And this, again, it's gonna equal some multivariable function
that tells you, you know, how much money you spend
for a given amount of hours and given number of tons of steel. And you set this equal to some constant, this tells you the amount of
money you're willing to spend. And our goal has been to
maximize some function, subject to a constraint like this. And the mental model you have in mind is that you're looking in the input space, inside the x-y plane, or I guess really, it's the
h-s plane in this case, right. Your inputs are h and s,
and points in this plane tell you possible choices you can make for hours of labor and tons of steel. And you think of this budget as some kind of curve
in that plane, right. All the sets of h and s that equal $10,000 is gonna give you some kind of curve. And the core value we care about is that, when you maximize this revenue, you know, when you set
it equal to a constant I'm gonna call M star, that's
like the maximum possible revenue, that's gonna give you a contour that's just barely tangent to the constraint curve. And if that seems unfamiliar, definitely take a look at the
videos preceding this one. But, just to kind of continue the review, this gave us the really nice property that you look at the gradient vector for the thing you're
trying to maximize, R, and that's gonna be proportional
to the gradient vector for the constraint function,
for this B, so gradient of B. And this is because
gradients are perpendicular to contour lines. Again, this should feel mostly
like review at this point. So the core idea was that
we take this gradient of R, and then make it proportional,
with some kind of proportionality constant lambda, to the gradient of B, to the gradient of the constraint function. And up till this point, this value lambda has been wholly uninteresting. It's just been a
proportionality constant, right, because we couldn't guarantee
that the gradient of R is equal to the gradient of B. All we care about is that they're pointing in the same direction. So we just had this constant sitting here, and all we really said is
make sure it's not zero. But here, we're gonna get
to where this little guy actually matters. So, if you'll remember, in the last video, I introduced this function
called the Lagrangian, the Lagrangian. And it takes in multiple inputs, they'll be the same inputs that you have for your budget function
and your revenue function, or more generally, the constraint and the thing you're trying to maximize. It takes in those same variables, but, also, as another one of its inputs, it takes in lambda. So, it is a higher-dimensional function than both of these two, because
we've got this extra lambda. And the way it's defined looks
a little strange at first, it just seems kind of like this random hodgepodge of functions
that we're putting together. But, last time, I kind of walked through why this makes sense. You take the thing you're
trying to maximize, and you subtract off this lambda, multiplied by the constraint function, which is B of those inputs, minus, and then whatever
this constant is here, right. I'm gonna give it a name,
I'm gonna call this constant lowercase b. So maybe we're thinking of it as $10,000, but it's whatever your actual budget is. So we think of that, and I'm just gonna emphasize here that
that's a constant, right, that this b is being treated
as a constant right now. You know, we're thinking
of h and s and lambda all as these variables, and this gives us some multivariable function. And if you'll remember
from the last video, the reason for defining this function is it gives us a really
nice compact way to solve the constrained optimization problem. You set the gradient of L equal to zero, or really the zero vector,
right, it'll be a vector with three components here. And when you do that, you'll
find some solution, right, you'll find some solution,
which I'll call h star, s star, and lambda, here
I'll give it that green lambda color, lambda star. You'll find some value
that, when you input this into the function, the
gradient will equal zero. And, of course, you might
find multiple of these, right, you might find multiple
solutions to this problem, but what you do is, for each
one of them you're gonna take a look at h star and s star,
then you're gonna plug those into the revenue function, or
the thing that you're trying to maximize. And, typically, you only get
a handful, you get a number, then you can actually
plug each one of them into the revenue function, and
you'll just check which one of them makes this function the highest. And whatever the highest value
this function can achieve, that is M star, that is the
maximum possible revenue, subject to this budget. But it's interesting that when
you solve this, you get some specific value of lambda,
right, there's a specific lambda star that will be
associated with the solution. And, like I said, this
turns out not to just be some dummy variable. It's gonna carry information
about how much we can increase the revenue if we increase that budget. And, here, let me show you what I mean. So we've got this M star,
and I'll just write it again, M star here. And what that equals, I'm
saying that's the maximum possible revenue. So that's gonna be the revenue
when you evaluate it at h star, h star and s star. And h star and s star, they
are whatever the solution to this gradient of the Lagrangian
equals zero equation is. You set the gradient of this multivariable
function equal to the zero vector, you solve when
each of its partial derivatives equal zero, and you'll
get some kind of solution. So when you plug that
solution in the revenue, that gives you the
maximum possible revenue. But what we could do is
consider this as a function of the budget. Now, this is the kind of thing
that looks a little bit wacky if you're just looking at the formulas. But if you actually
think about what it means in this context of kind
of a revenue and a budget, I think it's actually pretty sensible, where, really, if we consider
this b no longer to be constant but something that
you could change, right, you're wondering, well what
if I had a $20,000 budget, or what if I had a $15,000 budget? You wanna ask the question,
what happens as you change this b. Well, the maximizing
value, h star and s star, each one of those guys is
gonna depend on b, right. As you change what this
constant is, it's gonna change the values at which the
gradient of the Lagrangian equals zero. So, I'm gonna rewrite this function as the revenue evaluated
at h star and s star, but now I'm gonna consider
that h star and s star each as functions of b, right, because they depend on it in some way. As you change b, it changes
the solution to this problem. It's very implicit and it's
kind of hard to think about. It's hard to think, okay,
as I change this b, how much does that change h star. Well that depends on what the, you know, what the definition of R
was and everything there. But, in principle, in this
context, I think it's quite intuitive. You have a maximum possible revenue, and that depends on what your budget is. So, what turns out to be a
beautiful, absolutely beautiful magical fact is that this lambda star is equal to the derivative of M star, the derivative of this
maximum possible revenue with respect to b, with
respect to the budget. And let me just show you what
that actually means, right. So if, for example, let's say you did all of your calculations and
it turned out that lambda star was equal to 2.3. You know, previously that just
seemed like some dummy number that you ignore, and you
just look at whatever the associated values here are. But if you plug this in the
computer and you see lambda star equals 2.3, what that means is,
for a tiny change in budget, like let's say your budget
increases from 10,000 to 10,001, it goes up to $10,001, you increase your budget by
just a little bit, a little db. Then the ratio of the change
in the maximizing revenue to that db is about 2.3. So what that would mean
is, increasing your budget by a dollar is gonna increase M star, over here it would mean
that M star increases by about, you know, $2.30 for every dollar that
you increase your budget. And that's information
you'd wanna know, right? If you see that this
lambda star is a number bigger than one, you'd say,
hey, maybe we should increase our budget. We increase it from $10,000
to 10,001 and we're making more money. So, maybe, as long as lambda
star is greater than one, you should keep doing
whatever it takes to increase that budget. So this fact is quite surprising, I think, and it seems like it totally
comes out of nowhere. So what I'm gonna do in the
next video is prove this to you, is prove why this is true,
why this lambda star value happens to be the rate of
change for the maximum value of the thing we're trying to maximize with respect to this constant,
with respect to whatever constant you set your
constraint function equal to. For right now, though, I
just want you to kind of try to sit back and digest
what this means in the context of this specific economic example. And, even if you never
looked into the proof and never understood it there, I think this is an
interesting and even useful tidbit of knowledge to have
about Lagrange multipliers. So with that, I'll see
you in the next video.
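
For reference, the relationships described in the transcript can be written compactly as follows. The symbols follow the video (R for revenue, B for the budget function, b for the budget constant, M* for the maximum revenue); the typesetting is a reconstruction from the spoken description, not a copy of what was written on screen.

```latex
\begin{align*}
\mathcal{L}(h, s, \lambda) &= R(h, s) - \lambda\,\bigl(B(h, s) - b\bigr) \\
\nabla \mathcal{L}(h^{*}, s^{*}, \lambda^{*}) &= \mathbf{0}
  \quad\Longleftrightarrow\quad
  \nabla R = \lambda^{*}\,\nabla B \ \text{ and } \ B(h^{*}, s^{*}) = b \\
M^{*}(b) &= R\bigl(h^{*}(b),\, s^{*}(b)\bigr) \\
\lambda^{*}(b) &= \frac{dM^{*}}{db}
\end{align*}
```

In the transcript's example, lambda* = 2.3 at b = 10,000 means M*(10,001) - M*(10,000) is approximately 2.3.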