Examples of the Lagrangian and Lagrange multiplier technique in action.

Lagrange multiplier technique, quick recap

Constrained optimization
Image credit: By Nexcis (Own work) [Public domain], via Wikimedia Commons
When you want to maximize (or minimize) a multivariable function \blueE{f(x, y, \dots)} subject to the constraint that another multivariable function equals a constant, \redE{g(x, y, \dots) = c}, follow these steps:
  • Step 1: Introduce a new variable \greenE{\lambda}, and define a new function \mathcal{L} as follows:
    \mathcal{L}(x, y, \dots, \greenE{\lambda}) = \blueE{f(x, y, \dots)} - \greenE{\lambda} (\redE{g(x, y, \dots)-c})
    This function \mathcal{L} is called the "Lagrangian", and the new variable \greenE{\lambda} is referred to as a "Lagrange multiplier".
  • Step 2: Set the gradient of \mathcal{L} equal to the zero vector.
    \nabla \mathcal{L}(x, y, \dots, \greenE{\lambda}) = \textbf{0} \quad \leftarrow \small{\gray{\text{Zero vector}}}
    In other words, find the critical points of \mathcal{L}.
  • Step 3: Consider each solution, which will look something like (x_0, y_0, \dots, \greenE{\lambda}_0). Plug each one into \blueE{f}. Or rather, first remove the \greenE{\lambda}_0 component, then plug it into \blueE{f}, since \blueE{f} does not have \greenE{\lambda} as an input. Whichever one gives the greatest (or smallest) value is the maximum (or minimum) point you are seeking.
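To make the three steps concrete, here is a minimal numeric sketch in Python. It uses a toy problem that is not from the article: maximize f(x, y) = xy subject to x + y = 4, whose constrained maximum is x = y = 2 with multiplier λ = 2. The check confirms that the gradient of \mathcal{L} vanishes at that critical point.

```python
# Toy problem (not from the article, chosen only to illustrate the steps):
# maximize f(x, y) = x*y subject to g(x, y) = x + y = 4.
# The constrained maximum is x = y = 2, with multiplier lambda = 2.

def f(x, y):
    return x * y

def g(x, y):
    return x + y

def grad_L(x, y, lam, c=4.0, eps=1e-6):
    """Centered-difference gradient of the Lagrangian L = f - lam*(g - c)."""
    L = lambda a, b, m: f(a, b) - m * (g(a, b) - c)
    return (
        (L(x + eps, y, lam) - L(x - eps, y, lam)) / (2 * eps),
        (L(x, y + eps, lam) - L(x, y - eps, lam)) / (2 * eps),
        (L(x, y, lam + eps) - L(x, y, lam - eps)) / (2 * eps),
    )

# Step 2: the gradient vanishes at the critical point (2, 2, lambda=2)...
print(grad_L(2.0, 2.0, 2.0))   # all components ~ 0
# ...but not at an arbitrary feasible point like (1, 3).
print(grad_L(1.0, 3.0, 2.0))
```

The same pattern works for any f and g; only the two function definitions and the constant c change.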

Example 1: Budgetary constraints

Problem

Suppose you are running a factory, producing some sort of widget that requires steel as a raw material. Your costs are predominantly human labor, which is \$20 per hour for your workers, and the steel itself, which runs for \$170 per ton. Suppose your revenue R is loosely modeled by the following equation:
R(h, s) = 200 h^{2/3} s^{1/3}
  • h represents hours of labor
  • s represents tons of steel
If your budget is \$20{,}000, what is the maximum possible revenue?

Solution

The \$20 per hour labor costs and \$170 per ton steel costs tell us that the total cost of production, in terms of h and s, is
\begin{aligned} \quad 20h + 170s \end{aligned}
Therefore the budget of \$20{,}000 can be translated to the constraint
\begin{aligned} \quad \redE{20h + 170s = 20{,}000} \end{aligned}
Before we dive into the computation, you can get a feel for this problem using the following interactive diagram. You can see which values of (h, s) yield a given revenue (blue curve) and which values satisfy the constraint (red line).
Since we need to maximize a function \blueE{R(h, s)}, subject to a constraint, \redE{20h + 170s = 20{,}000}, we begin by writing the Lagrangian function for this setup:
\mathcal{L}(h, s, \lambda) = \blueE{200h^{2/3}s^{1/3}}-\lambda(\redE{20h+170s-20{,}000})
Next, set the gradient \nabla \mathcal{L} equal to the \textbf{0} vector. This is the same as setting each partial derivative equal to 0. First, we handle the partial derivative with respect to \blueE{h}.
\begin{aligned} \quad 0 &= \dfrac{\partial \mathcal{L}}{\partial \blueE{h}} \\ \\ 0 &= \dfrac{\partial}{\partial \blueE{h}}(200\blueE{h}^{2/3}s^{1/3}-\lambda(20\blueE{h}+170s-20{,}000)) \\ \\ 0 &= 200 \cdot \dfrac{2}{3}\blueE{h}^{-1/3}s^{1/3} - 20\lambda \\ \end{aligned}
Next, we handle the partial derivative with respect to \greenE{s}.
\begin{aligned} \quad 0 &= \dfrac{\partial \mathcal{L}}{\partial \greenE{s}} \\ \\ 0 &= \dfrac{\partial}{\partial \greenE{s}}(200h^{2/3}\greenE{s}^{1/3}-\lambda(20h+170\greenE{s}-20{,}000)) \\ \\ 0 &= 200 \cdot \dfrac{1}{3}h^{2/3}\greenE{s}^{-2/3} - 170\lambda \end{aligned}
Finally, we set the partial derivative with respect to \goldE{\lambda} equal to 0, which, as always, is just the same thing as the constraint. In practice, you can of course just write the constraint itself, but I'll write out the partial derivative here just to make things clear.
\begin{aligned} \quad 0 &= \dfrac{\partial \mathcal{L}}{\partial \goldE{\lambda}} \\ \\ 0 &= \dfrac{\partial}{\partial \goldE{\lambda}}(200h^{2/3}s^{1/3}-\goldE{\lambda}(20h+170s-20{,}000)) \\ \\ 0 &= -20h-170s+20{,}000 \\ \\ 20h &+ 170s = 20{,}000 \end{aligned}
Putting it together, the system of equations we need to solve is
\begin{aligned} \quad 0 &= 200 \cdot \dfrac{2}{3}{h}^{-1/3}s^{1/3} - 20\lambda \\ \\ 0 &= 200 \cdot \dfrac{1}{3}h^{2/3}{s}^{-2/3} - 170\lambda \\ \\ 20h &+ 170s = 20{,}000 \end{aligned}
In practice, you should almost always use a computer once you get to a system of equations like this, especially since the equations will likely be more complicated than these in real applications. Once you do, you'll find that the answer is
\begin{aligned} \quad h &= \dfrac{2{,}000}{3} \approx 666.667 \\ \\ s &= \dfrac{2{,}000}{51} \approx 39.2157 \\ \\ \lambda &= \sqrt[3]{\dfrac{8{,}000}{459}} \approx 2.593 \end{aligned}
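Even without a solver, you can confirm a candidate solution by plugging it back into the system. Here is a quick Python check of the three values above; the first two expressions should be essentially zero, and the last should equal the budget.

```python
# Verify that h = 2000/3, s = 2000/51, lambda = (8000/459)^(1/3)
# satisfy the system of equations from Example 1.
h = 2000 / 3
s = 2000 / 51
lam = (8000 / 459) ** (1 / 3)

eq1 = 200 * (2 / 3) * h ** (-1 / 3) * s ** (1 / 3) - 20 * lam    # should be ~0
eq2 = 200 * (1 / 3) * h ** (2 / 3) * s ** (-2 / 3) - 170 * lam   # should be ~0
budget = 20 * h + 170 * s                                        # should be 20,000

print(eq1, eq2, budget)
```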
This means you should employ about 667 hours of labor and purchase about 39 tons of steel, which will give a maximum revenue of
\begin{aligned} \quad R(667, 39) = 200(667)^{2/3}(39)^{1/3} \approx \boxed{\$51{,}777} \end{aligned}
The interpretation of this constant \lambda \approx 2.593 is left to the next article.

Example 2: Maximizing dot product

Problem: Let the three-dimensional vector \vec{\textbf{v}} be defined as follows.
\begin{aligned} \quad \vec{\textbf{v}} = \left[ \begin{array}{c} 2 \\ 3 \\ 1 \end{array} \right] \end{aligned}
Consider every possible unit vector \hat{\textbf{u}} in three-dimensional space. For which one is the dot product \hat{\textbf{u}} \cdot \vec{\textbf{v}} the greatest?
The diagram below is two-dimensional, but not much changes in the intuition as we move to three dimensions.
Unit vector dot product
Two-dimensional analogy to the three-dimensional problem we have. Which unit vector \hat{\textbf{u}} maximizes the dot product \hat{\textbf{u}} \cdot \vec{\textbf{v}}?
If you are fluent with dot products, you may already know the answer. It's one of those mathematical facts worth remembering. If you don't know the answer, all the better! Because we will now find and prove the result using the Lagrange multiplier method.
Solution:
First, we need to spell out how exactly this is a constrained optimization problem. Write the coordinates of our unit vectors as x, y and z:
\begin{aligned} \quad \hat{\textbf{u}} = \left[ \begin{array}{c} x \\ y \\ z \end{array} \right] \end{aligned}
The fact that \hat{\textbf{u}} is a unit vector means its magnitude is 1:
\begin{aligned} \quad ||\hat{\textbf{u}}|| = \sqrt{x^2 + y^2 + z^2} &= 1 \\ &\Downarrow \\ \redE{x^2 + y^2 + z^2} &\redE{= 1} \end{aligned}
This is our constraint.
Maximizing \hat{\textbf{u}} \cdot \vec{\textbf{v}} means maximizing the following quantity:
\left[ \begin{array}{c} x \\ y \\ z \end{array} \right] \cdot \left[ \begin{array}{c} 2 \\ 3 \\ 1 \end{array} \right] = \blueE{2x + 3y + z}
The Lagrangian, with respect to this function and the constraint above, is
\mathcal{L}(x, y, z, \lambda) = 2x + 3y + z - \lambda(x^2 + y^2 + z^2 - 1).
We now solve \nabla \mathcal{L} = \textbf{0} by setting each partial derivative of this expression equal to 0.
\begin{aligned} \dfrac{\partial}{\partial \greenE{x}}(2\greenE{x} + 3y + z - \lambda(\greenE{x}^2 + y^2 + z^2 - 1)) &= 2 - \lambda 2\greenE{x} = 0 \\ \dfrac{\partial}{\partial \goldE{y}}(2x + 3\goldE{y} + z - \lambda(x^2 + \goldE{y}^2 + z^2 - 1)) &= 3 - \lambda 2\goldE{y} = 0 \\ \dfrac{\partial}{\partial \maroonE{z}}(2x + 3y + \maroonE{z} - \lambda(x^2 + y^2 + \maroonE{z}^2 - 1)) &= 1 - \lambda 2\maroonE{z} = 0 \\ \end{aligned}
Remember, setting the partial derivative with respect to \lambda equal to 0 just restates the constraint.
\begin{aligned} \dfrac{\partial}{\partial \redE{\lambda}}(2x + 3y + z - \redE{\lambda}(x^2 + y^2 + z^2 - 1)) &= -x^2-y^2-z^2 + 1 = 0 \\ \end{aligned}
Solving for \greenE{x}, \goldE{y} and \maroonE{z} in the first three equations above, we get
\begin{aligned} \quad \greenE{x} &= 2\cdot \dfrac{1}{2\lambda} \\ \goldE{y} &= 3\cdot \dfrac{1}{2\lambda} \\ \maroonE{z} &= 1\cdot \dfrac{1}{2\lambda} \end{aligned}
Ah, what beautiful symmetry. Each of these expressions has the same \frac{1}{2\lambda} factor, and the coefficients 2, 3 and 1 match up with the coordinates of \vec{\textbf{v}}. Being good math students as we are, we won't let good symmetry go to waste. In this case, combining the three equations above into a single vector equation, we can relate \hat{\textbf{u}} and \vec{\textbf{v}} as follows:
\begin{aligned} \quad \hat{\textbf{u}} = \left[ \begin{array}{c} x \\ y \\ z \end{array} \right] = \dfrac{1}{2\lambda} \left[ \begin{array}{c} 2 \\ 3 \\ 1 \end{array} \right] = \dfrac{1}{2\lambda}\vec{\textbf{v}} \end{aligned}
Two-dimensional analogy showing the two unit vectors which maximize and minimize the quantity \hat{\textbf{u}} \cdot \vec{\textbf{v}}.
Therefore \hat{\textbf{u}} is proportional to \vec{\textbf{v}}! Geometrically, this means \hat{\textbf{u}} lies along the same line as \vec{\textbf{v}}. There are two unit vectors proportional to \vec{\textbf{v}}:
  • One which points in the same direction; this is the vector that \greenE{\text{maximizes}} \hat{\textbf{u}} \cdot \vec{\textbf{v}}.
  • One which points in the opposite direction. This one \redE{\text{minimizes}} \hat{\textbf{u}} \cdot \vec{\textbf{v}}.
We can write these two unit vectors by normalizing \vec{\textbf{v}}, which just means dividing \vec{\textbf{v}} by its magnitude:
\begin{aligned} \quad \greenE{\hat{\textbf{u}}_{\text{max}}} &= \dfrac{\vec{\textbf{v}}}{||\vec{\textbf{v}}||} \\ \\ \redE{\hat{\textbf{u}}_{\text{min}}} &= -\dfrac{\vec{\textbf{v}}}{||\vec{\textbf{v}}||} \end{aligned}
The magnitude ||\vec{\textbf{v}}|| is \sqrt{2^2 + 3^2 + 1^2} = \sqrt{14}, so we can write the maximizing unit vector \greenE{\hat{\textbf{u}}_{\text{max}}} explicitly like this:
\greenE{\hat{\textbf{u}}_{\text{max}}} = \left[ \begin{array}{c} 2 / \sqrt{14} \\ 3 / \sqrt{14} \\ 1 / \sqrt{14} \end{array} \right]
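As a sanity check (an illustration, not part of the original solution), here is a small Python experiment that compares this normalized vector against thousands of randomly sampled unit vectors. By the result we just proved, none of them should beat it.

```python
import math
import random

v = (2.0, 3.0, 1.0)
v_norm = math.sqrt(sum(c * c for c in v))        # ||v|| = sqrt(14)
u_max = tuple(c / v_norm for c in v)             # v normalized to a unit vector
best_dot = sum(a * b for a, b in zip(u_max, v))  # equals sqrt(14)

# No randomly chosen unit vector should achieve a larger dot product.
random.seed(42)
for _ in range(10_000):
    d = [random.gauss(0.0, 1.0) for _ in range(3)]   # random direction
    m = math.sqrt(sum(c * c for c in d))
    u = [c / m for c in d]                           # normalize it
    assert sum(a * b for a, b in zip(u, v)) <= best_dot + 1e-12

print(best_dot, math.sqrt(14.0))  # both ~ 3.7417
```

Note that best_dot is exactly ||\vec{\textbf{v}}||, which previews the general fact: the maximum of \hat{\textbf{u}} \cdot \vec{\textbf{v}} over all unit vectors \hat{\textbf{u}} is the magnitude of \vec{\textbf{v}}.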

Just skip the Lagrangian

If you read the last article, you'll recall that the whole point of the Lagrangian \mathcal{L} is that setting \nabla \mathcal{L} = \textbf{0} encodes the two properties a constrained maximum must satisfy:
  • Gradient alignment between the target function and the constraint function,
    \begin{aligned} \quad \nabla \blueE{f(x, y)} &= \lambda \nabla \redE{g(x, y)} \\ \end{aligned}
  • The constraint itself,
    \begin{aligned} \quad \redE{g(x, y) = c} \end{aligned}
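You can verify these two conditions directly for Example 1's solution. The sketch below uses centered-difference numerical gradients (an illustration, not how you would normally solve the problem) to check that \nabla R = \lambda \nabla g at the optimal (h, s), and that the budget constraint holds.

```python
# Check gradient alignment and the constraint at Example 1's solution.
def R(h, s):
    return 200 * h ** (2 / 3) * s ** (1 / 3)

def g(h, s):
    return 20 * h + 170 * s

def grad(f, h, s, eps=1e-6):
    """Centered-difference gradient of f at (h, s)."""
    return (
        (f(h + eps, s) - f(h - eps, s)) / (2 * eps),
        (f(h, s + eps) - f(h, s - eps)) / (2 * eps),
    )

h, s = 2000 / 3, 2000 / 51
gR, gg = grad(R, h, s), grad(g, h, s)

lam = gR[0] / gg[0]                       # lambda read off the first component
print(lam)                                # ~ 2.593, matching Example 1
assert abs(gR[1] - lam * gg[1]) < 1e-3    # second component aligns too
assert abs(g(h, s) - 20_000) < 1e-6       # the constraint holds
```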
When working through examples, you might wonder why we bother writing out the Lagrangian at all. Wouldn't it be easier to just start with these two equations rather than re-establishing them from \nabla \mathcal{L} = \textbf{0} every time? The short answer is yes, it would be easier. If you find yourself solving a constrained optimization problem by hand, and you remember the idea of gradient alignment, feel free to go for it without worrying about the Lagrangian.
In practice, it's often a computer solving these problems, not a human. Given that there are many highly optimized programs for finding when the gradient of a given function is \textbf{0}, it's both clean and useful to encapsulate our problem into the equation \nabla \mathcal{L} = \textbf{0}.
Furthermore, the Lagrangian itself, as well as several functions derived from it, arises frequently in the theoretical study of optimization. In this light, reasoning about the single object \mathcal{L} rather than multiple conditions makes it easier to see the connection between high-level ideas. Not to mention, it's quicker to write down on a blackboard.
In either case, whatever your future relationship with constrained optimization might be, it is good to be able to think about the Lagrangian itself and what it does. The examples above illustrate how it works, and hopefully help to drive home the point that \nabla \mathcal{L} = \textbf{0} encapsulates both \nabla f = \lambda \nabla g and g(x, y) = c in a single equation.