What is the partial derivative, how do you compute it, and what does it mean?

What we're building to

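  • For a multivariable function, like $f(x, y) = x^2y$, computing partial derivatives looks something like this: $\dfrac{\partial f}{\partial x} = 2xy$ (treating $y$ as a constant) and $\dfrac{\partial f}{\partial y} = x^2$ (treating $x$ as a constant).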
  • This swirly-d symbol, $\partial$, called "del", is used to distinguish partial derivatives from ordinary single-variable derivatives. Or, should I say ... to differentiate them.
  • The reason for a new type of derivative is that when the input of a function is made up of multiple variables, we want to see how the function changes as we let just one of those variables change while holding all the others constant.
  • With respect to three-dimensional graphs, you can picture the partial derivative $\dfrac{\partial f}{\partial x}$ by slicing the graph of $f$ with a plane representing a constant $y$-value and measuring the slope of the resulting curve along the cut.
Intersecting y=0 plane with the graph

What is a partial derivative?

We'll assume you are familiar with the ordinary derivative $\dfrac{df}{dx}$ from single-variable calculus. I actually quite like this notation for the derivative, because you can interpret it as follows:
  • Interpret $dx$ as "a very tiny change in $x$".
  • Interpret $df$ as "a very tiny change in the output of $f$", where it is understood that this tiny change is whatever results from the tiny change $dx$ to the input.
In fact, I think this intuitive feel for the symbol $\dfrac{df}{dx}$ is one of the most useful takeaways from single-variable calculus, and when you really start feeling it in your bones, most of the concepts around derivatives start to click.
For example, when you apply it to the graph of $f$, you can interpret this "ratio" $\dfrac{df}{dx}$ as the rise-over-run slope of the graph of $f$, which depends on the point where you started.
Interpretation of $\dfrac{df}{dx}$ in a single-variable function.

How does this work for multivariable functions?

Consider some function with a two-dimensional input and a one-dimensional output.
$$f(x, y) = x^2 - 2xy$$
There's nothing stopping us from writing the same expression, $\dfrac{df}{dx}$, and interpreting it the same way:
  • $dx$ can still represent a tiny change in the variable $x$, which is now just one component of our input.
  • $df$ can still represent the resulting change to the output of the function $f(x, y)$.
However, this ignores the fact that there is another input variable $y$. The input space now has multiple dimensions, so we can change the input in many directions other than the $x$-direction. For example, what about changing $y$ slightly by some small value $dy$? Now if we re-interpret $df$ to represent the tiny change to the function that this $dy$ shift brings about, we would have a different derivative $\dfrac{df}{dy}$.
Indication that the input of a multivariable function can change in many directions.
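To make the two directions concrete, here is a small numerical sketch. The starting point $(3, 1)$ and the nudge size $0.001$ are arbitrary choices made for illustration, not values from the discussion above. The code nudges one input at a time and reports the ratio of the change in output to the size of the nudge.

```python
def f(x, y):
    """The example function f(x, y) = x^2 - 2xy."""
    return x**2 - 2 * x * y

# Hypothetical starting point and nudge size, chosen only for illustration.
x0, y0 = 3.0, 1.0
nudge = 0.001

# Nudge x only, leaving y where it is.
ratio_x = (f(x0 + nudge, y0) - f(x0, y0)) / nudge

# Nudge y only, leaving x where it is.
ratio_y = (f(x0, y0 + nudge) - f(x0, y0)) / nudge

print("change in f per unit nudge in x:", ratio_x)  # close to 4
print("change in f per unit nudge in y:", ratio_y)  # close to -6
```

The two ratios come out quite different (roughly $4$ and $-6$ at this point), which is the sense in which neither one alone describes how $f$ responds to a change in its input.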
Neither one of these derivatives tells the full story of how our function $f(x, y)$ changes when its input changes slightly, so we call them partial derivatives. To emphasize the difference, we no longer use the letter $d$ to indicate tiny changes, but instead introduce a newfangled symbol $\partial$ to do the trick, writing each partial derivative as $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$, etc.
You read the symbol $\dfrac{\partial f}{\partial x}$ out loud by saying "the partial derivative of $f$ with respect to $x$".

Example: Computing a partial derivative

Consider this function:
$$f(x, y) = x^2 y^3$$
Suppose I asked you to evaluate $\dfrac{\partial f}{\partial x}$, the partial derivative with respect to $x$, at the input $(3, 2)$.
"What? But I haven't learned how yet!"
Don't worry, it's mostly just the same mechanics as an ordinary derivative.
From the introduction above, you should know that this is asking about the rate at which the output of $f$ changes as we nudge the $x$-component of the input slightly, perhaps moving from $(3, 2)$ to $(3.01, 2)$.
Since we only care about movement in the $x$-direction, we might as well treat the $y$-value as a constant. In fact, we can just plug in $y = 2$ ahead of time before computing any derivatives:
$$f(x, 2) = x^2 (2)^3 = 8x^2$$
Now, asking how $f$ changes in response to a small shift in $x$ is just an ordinary, single-variable derivative.
Concept check: What is the derivative of this function $f(x, 2) = 8x^2$ evaluated at $x = 3$?
$$\dfrac{d}{dx}f(x, 2) = \dfrac{d}{dx}(8x^2) = 16x$$
Plugging in $x = 3$, we see the answer must be $16(3) = 48$.
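As a rough check on this answer, the nudge from $(3, 2)$ to $(3.01, 2)$ mentioned above can be carried out numerically. This is only a sketch: the helper name `nudge_ratio_x` is made up for illustration, and a finite nudge gives an approximation to the derivative rather than its exact value.

```python
def f(x, y):
    """The example function f(x, y) = x^2 * y^3."""
    return x**2 * y**3

def nudge_ratio_x(x, y, dx):
    """Change in f divided by the nudge dx, with y held fixed."""
    return (f(x + dx, y) - f(x, y)) / dx

# Move from (3, 2) to (3.01, 2), as in the text.
ratio = nudge_ratio_x(3.0, 2.0, 0.01)
print(ratio)  # approximately 48.08, close to the exact answer 48
```

The small excess over $48$ comes from the nudge being $0.01$ rather than infinitesimally small.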

Without pre-evaluating $y$

Now suppose I asked you to find $\dfrac{\partial f}{\partial x}$, but I didn't ask you to evaluate it at a specific point. In other words, you should give me a new multivariable function which takes any point $(x, y)$ as its input and tells me what the rate of change of $f$ near that point is as we move purely in the $x$-direction.
You can start the same way, treating the $y$-value as a constant. However, this time, you cannot plug in an actual constant value, like $y = 2$. Instead, pretend that $y$ is constant and take the derivative:
$$\dfrac{d}{dx}f(x, y) = \underbrace{\dfrac{d}{dx}(x^2y^3)}_{\text{Pretend $y$ is constant}} = 2xy^3$$
Or rather, to emphasize that this is a multivariable function, we use the symbol $\partial$ instead of $d$:
$$\dfrac{\partial}{\partial x}f(x, y) = \dfrac{\partial}{\partial x}(x^2y^3) = 2xy^3$$
As a sanity check, you can plug in $(3, 2)$ to see that we get the same result as above: $2(3)(2)^3 = 48$.
"So, what's the difference between $\dfrac{d}{dx}$ and $\dfrac{\partial}{\partial x}$? They seem to be used the same way."
Honestly, as far as I'm concerned, there's not really a difference between these operations. You could be pedantic and say one is only defined for single variable functions. But as far as intuition and computation go, they are one and the same, and the difference is just meant to clarify what type of function is being differentiated.
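The formula $2xy^3$ for this section's function $f(x, y) = x^2y^3$ can also be spot-checked numerically. The sketch below compares the formula against a central-difference estimate that nudges only $x$. The test points other than $(3, 2)$ and the step size `h` are arbitrary choices made for illustration.

```python
def f(x, y):
    """The example function f(x, y) = x^2 * y^3."""
    return x**2 * y**3

def df_dx_formula(x, y):
    """The partial derivative found by treating y as a constant: 2 * x * y^3."""
    return 2 * x * y**3

def df_dx_numeric(x, y, h=1e-6):
    """Central-difference estimate of the x-partial; y is never changed."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

# (3, 2) is the point from the text; the other two are illustrative extras.
points = [(3.0, 2.0), (1.0, -1.0), (-2.0, 0.5)]
for x, y in points:
    print((x, y), df_dx_formula(x, y), df_dx_numeric(x, y))
```

At each point the two columns should agree to several decimal places, and the first row reproduces the value $48$.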

Interpreting partial derivatives with graphs

Consider this function:
$$f(x, y) = \frac{1}{5}(x^2 - 2xy) + 3$$
Here is a video showing its graph rotating, just to get a feel for the three-dimensional nature of it.
Think about the partial derivative of $f$ with respect to $x$, perhaps evaluated at the point $(2, 0)$.
$$\dfrac{\partial f}{\partial x}(2, 0)$$
In terms of the graph, what does the value of this expression tell us about the behavior of the function $f$ at the point $(2, 0)$?

Treat $y$ as constant $\rightarrow$ slice graph with plane

The first step when computing this value is to treat $y$ as a constant. Specifically, if we are limiting our view to what happens at the point $(2, 0)$, we should only look at the set of points where $y = 0$. In three-dimensional space, this set is a plane perpendicular to the $y$-axis, passing through the origin.
Intersecting y=0 plane with the graph
This plane $y = 0$, shown in white, slices into the graph of $f(x, y)$ along a parabolic curve, shown faintly in red. We can interpret $\dfrac{\partial f}{\partial x}$ as giving the slope of a tangent line to this curve. Why? Because $\partial x$ is a slight nudge in the $x$-direction, the run, and $\partial f$ is the resulting change in the $z$-direction, the rise.
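For readers who want the actual number, here is one way the slope at $(2, 0)$ works out, using the function $f(x, y) = \frac{1}{5}(x^2 - 2xy) + 3$ from this section. It is a worked sketch of the slicing idea, first restricting to the plane $y = 0$ and then differentiating the resulting single-variable curve.

```latex
\begin{aligned}
f(x, 0) &= \frac{1}{5}\left(x^2 - 2x(0)\right) + 3 = \frac{1}{5}x^2 + 3 \\
\dfrac{d}{dx} f(x, 0) &= \frac{2}{5}x \\
\dfrac{\partial f}{\partial x}(2, 0) &= \frac{2}{5}(2) = \frac{4}{5}
\end{aligned}
```

So the tangent line to the parabolic cut at $x = 2$ rises $\frac{4}{5}$ of a unit in the $z$-direction for each unit moved in the $x$-direction.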
What about $\dfrac{\partial f}{\partial y}$ at that same point $(2, 0)$? The points where $x = 2$ also make up a plane, but this time it's a plane perpendicular to the $x$-axis, crossing it at $x = 2$. This slices the graph along a new curve, and $\dfrac{\partial f}{\partial y}$ will give the slope of that new curve.
Intersecting x=2 plane with the graph.
Reflection Question: In the picture to the right, the "curve" where the graph of $f(x, y) = \frac{1}{5}(x^2 - 2xy) + 3$ intersects the plane defined by $x = 2$ looks like it might be a straight line. Is it really a line?
It is!
Intersecting the graph of $f$ with the plane $x = 2$ corresponds with treating $x$ as the constant $2$. That is, we look at $f(2, y)$ as a single-variable function of $y$:
$$\begin{aligned} f(2, y) &= \frac{1}{5}(2^2 - 2(2)y) + 3 \\ &= \frac{4}{5} - \frac{4}{5}y + 3 \\ &= -\frac{4}{5}y + \frac{19}{5} \end{aligned}$$
This function is linear, so its graph is a line with slope $-4/5$.
The partial derivative $\dfrac{\partial f}{\partial y}$ evaluated at any point $(2, y)$ gives the slope of this line, $-\dfrac{4}{5}$, no matter what $y$ is. In fact, the partial derivative of this function doesn't depend on $y$ at all:
$$\begin{aligned} \dfrac{\partial f}{\partial y} &= \dfrac{\partial}{\partial y} \left( \frac{1}{5}(x^2 - 2xy) + 3 \right) \\ &= \frac{1}{5}(0 - 2x) + 0 \\ &= -\frac{2}{5}x \end{aligned}$$
Graphically, this means as we choose different values of $x$, and the plane representing the constant $x$ value shifts left or right, it will always intersect the graph along a line, since its slope is constant with respect to $y$. Moreover, the slope of that line will always be $-\dfrac{2}{5}$ times the value of $x$.
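This claim about straight-line slices can be checked numerically. In the sketch below, the helper `slope_in_y` and the particular $x$- and $y$-values sampled are illustrative choices. For each constant-$x$ slice, it measures rise over run between pairs of points at different $y$-values and compares the result with $-\frac{2}{5}x$.

```python
def f(x, y):
    """The example function f(x, y) = (1/5)(x^2 - 2xy) + 3."""
    return (x**2 - 2 * x * y) / 5 + 3

def slope_in_y(x, y_start, y_end):
    """Rise over run between two points on the slice where x is held constant."""
    return (f(x, y_end) - f(x, y_start)) / (y_end - y_start)

# Sample a few slices; x = 2 is the one from the text, the others are extras.
for x in (2.0, -1.0, 5.0):
    slopes = [slope_in_y(x, y, y + 1.0) for y in (-3.0, 0.0, 4.0)]
    print("x =", x, "measured slopes:", slopes, "expected:", -2 * x / 5)
```

Within each slice the measured slopes match one another no matter where along $y$ they are taken, which is what it means for the cut to be a line.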

Phrasing and notation

Here are some of the phrases you might hear in reference to this $\dfrac{\partial f}{\partial x}$ operation:
  • "The partial derivative of $f$ with respect to $x$"
  • "Del f, del x"
  • "Partial f, partial x"
  • "The partial derivative (of $f$) in the $x$-direction"

Alternate notation

In the same way that people sometimes prefer to write $f'$ instead of $\dfrac{df}{dx}$, we have the following notation:
$$\begin{aligned} f_x &\leftrightarrow \dfrac{\partial f}{\partial x} \\ f_y &\leftrightarrow \dfrac{\partial f}{\partial y} \\ f_{\langle\text{Some variable}\rangle} &\leftrightarrow \dfrac{\partial f}{\partial \langle\text{That same variable}\rangle} \end{aligned}$$

A more formal definition

Although thinking of $dx$ or $\partial x$ as really tiny changes in the value of $x$ is a useful intuition, it is healthy to occasionally step back and remember that defining things precisely requires introducing limits. After all, what specific small value would $\partial x$ be? One one-hundredth? One one-millionth? $10^{-10^{10}}$?
The point of calculus is that we don't use any one tiny number, but instead consider all possible values and analyze what tends to happen as they approach a limiting value. The single-variable derivative, for example, is defined like this:
$$\dfrac{df}{dx}(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}$$
  • $h$ represents the "tiny value" that we intuitively think of as $dx$.
  • The $h \to 0$ under the limit indicates that we care about very small values of $h$, those approaching $0$.
  • $f(x_0 + h) - f(x_0)$ is the change in the output that results from adding $h$ to the input, which is what we think of as $df$.
Formally defining the partial derivative looks almost identical. If $f(x, y, \dots)$ is a function with multiple inputs, here's how that looks:
$$\dfrac{\partial f}{\partial x}(x_0, y_0, \dots) = \lim_{h \to 0} \dfrac{f(x_0 + h, y_0, \dots) - f(x_0, y_0, \dots)}{h}$$
Similarly, here's how the partial derivative with respect to $y$ looks:
$$\dfrac{\partial f}{\partial y}(x_0, y_0, \dots) = \lim_{h \to 0} \dfrac{f(x_0, y_0 + h, \dots) - f(x_0, y_0, \dots)}{h}$$
The point is that $h$, which represents a tiny tweak to the input, is added to different input variables depending on which partial derivative we are taking.
People will often refer to this as the limit definition of a partial derivative.
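To see the limit at work, the sketch below evaluates both difference quotients for the earlier example $f(x, y) = x^2y^3$ at $(3, 2)$ with progressively smaller $h$. The particular list of $h$ values is an arbitrary choice, and the target value for the $y$-direction, $3x^2y^2 = 108$ at this point, comes from applying the same treat-the-other-variable-as-constant rule rather than from the text above.

```python
def f(x, y):
    """The earlier example function f(x, y) = x^2 * y^3."""
    return x**2 * y**3

def difference_quotient_x(x0, y0, h):
    """The quotient inside the limit when h is added to the x-input only."""
    return (f(x0 + h, y0) - f(x0, y0)) / h

def difference_quotient_y(x0, y0, h):
    """The quotient inside the limit when h is added to the y-input only."""
    return (f(x0, y0 + h) - f(x0, y0)) / h

# Shrink h and watch both quotients settle toward fixed values.
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    qx = difference_quotient_x(3.0, 2.0, h)
    qy = difference_quotient_y(3.0, 2.0, h)
    print(h, qx, qy)
```

As $h$ shrinks, the first column of quotients approaches $48$ and the second approaches $108$. No single $h$ gives the exact derivative; the partial derivative is the value the quotients tend toward.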
Reflection question: How can we think about this limit definition in the context of the graphical interpretation above? What is $h$? What does it look like for $h \to 0$?

Summary

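  • For a multivariable function, like $f(x, y) = x^2y$, computing partial derivatives looks something like this: $\dfrac{\partial f}{\partial x} = 2xy$ (treating $y$ as a constant) and $\dfrac{\partial f}{\partial y} = x^2$ (treating $x$ as a constant).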
  • This swirly-d symbol, $\partial$, called "del", is used to distinguish partial derivatives from ordinary single-variable derivatives.
  • The reason for a new type of derivative is that when the input of a function is made up of multiple variables, we want to see how the function changes as we let just one of those variables change while holding all the others constant.
  • With respect to three-dimensional graphs, you can picture the partial derivative $\dfrac{\partial f}{\partial x}$ by slicing the graph of $f$ with a plane representing a constant $y$-value, and measuring the slope of the resulting cut.
Intersecting y=0 plane with the graph