# Multivariable chain rule intuition

Get a feel for what the multivariable chain rule is really saying, and how thinking about various "nudges" in space makes it intuitive. Created by Grant Sanderson.

## Want to join the conversation?

• But what if df = (df/dx)(dx/dt) dt multiplied by (df/dy)(dy/dt) dt? Here both x and y affect df, so how do you know to add them?
• Even in this video, x and y both affect f. But notice that f has a single-variable output, so the output space is a number "line" and not a plane. A change in x produces a certain change along that number line, and a change in y produces another change along it. The total change in the output is therefore the sum of the individual changes caused by x and y: the rate of change ∂f/∂x times the actual change in x, plus the rate of change ∂f/∂y times the actual change in y.
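The additive picture in the reply above can be checked numerically. The following is only a minimal sketch: it assumes f(x, y) = x²y (the function mentioned elsewhere in this thread), and the base point, the nudges, and the names `linear_estimate` and `approximation_error` are illustrative choices, not taken from the video.

```python
def f(x, y):
    # Assumed example function: f(x, y) = x^2 * y
    return x**2 * y

def linear_estimate(x, y, dx, dy):
    # Sum of the two individual contributions, (∂f/∂x) dx + (∂f/∂y) dy,
    # using ∂f/∂x = 2xy and ∂f/∂y = x^2 for the assumed f.
    return 2 * x * y * dx + x**2 * dy

def approximation_error(h, x0=1.5, y0=2.0):
    # Nudge x by h and y by -2h (arbitrary choices), then compare the
    # actual change in f with the sum of the two contributions.
    dx, dy = h, -2 * h
    actual = f(x0 + dx, y0 + dy) - f(x0, y0)
    return abs(actual - linear_estimate(x0, y0, dx, dy))

for h in (1e-1, 1e-2, 1e-3):
    print(h, approximation_error(h))
```

The gap between the actual change and the sum shrinks much faster than the nudges themselves, which is what one would expect if the sum is the right first-order description. A product of the two contributions would not track the actual change in this way.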
• How wrong is it to view dy/dx as a fraction? Does dy = 2 dx mean the same as dy/dx = 2?
• dy/dx is a fraction, but there is also some information you might lose if you treat it like one. Take d(x^2)/dx = 2x. Here dx is defined as dx = lim h->0 (h), and the definition of d(x^2) comes from the definition of the derivative. But if you write d(x^4)/d(x^2), you mean that d(x^2) = h, and the top differential is a function of the bottom one. Mixing these two examples:
d(x^2) (from the second) / dx (from the first) = 1.

The bottom differential is always h, and the top differential is a function of h and whatever else is involved.

Now a new example: d(x^2)/dx = 2x, so (d(x^2)/dx) * dx = d(x^2). But now that the bottom differential is gone and you are left with d(x^2), you don't know whether d(x^2) = h or whether it is
(x+h)^2 - x^2 = 2x dx
unless you look at your previous step.
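One way to make the point about "which differential is on the bottom" concrete is to compute ratios of small changes directly. This is a minimal sketch; the helper `diff_ratio` and the sample point are hypothetical choices for illustration.

```python
def diff_ratio(top, bottom, x, h):
    # Ratio of the change in `top` to the change in `bottom`
    # when x is nudged by a small amount h.
    return (top(x + h) - top(x)) / (bottom(x + h) - bottom(x))

def identity(x):
    return x

def square(x):
    return x**2

def fourth(x):
    return x**4

x0, h = 3.0, 1e-6
# d(x^2)/dx: the bottom change is h itself, so the ratio is close to 2*x0.
print(diff_ratio(square, identity, x0, h))
# d(x^4)/d(x^2): the bottom change is (x+h)^2 - x^2, not h,
# so the ratio is close to 2*x0**2.
print(diff_ratio(fourth, square, x0, h))
```

The same top function gives a different ratio depending on what the bottom change is measured against, which is the information that gets lost once the bottom differential is multiplied away.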
• Why is the change in z given by 'adding' the change in x and the change in y? Yes, the changes in x and y are both responsible for a change in z, but why is this expressed as a simple addition?
• It is a vectorial notation. Since the direction is already implied by the i and j notation, we can simply "add" both the magnitude and the direction to get z. It's really just like saying that 3 steps to the right and 4 steps ahead are the same thing as 5 steps diagonally (along the hypotenuse of the 3-4-5 right triangle), assuming our directions "right" and "ahead" are perpendicular and therefore do not interfere with each other.
• Let r(t) = x(t)i + y(t)j. Then can we say that d/dt[f(r(t))] is the directional derivative of f at r(t) in the direction of r’(t)?
• How about finding the multivariable chain rule without parametrizing with the parameter t, since z = x²y is itself a function?
• Why does he say that f is a one-dimensional number line? Isn't f ultimately taking t as an input, hence making it a 2D graph with 't' on the x-axis and 'f' on the y-axis?
• I believe the graph you mention is the graph of the outputs of f(t) against the input values of t. A two-dimensional graph needs a set of ordered pairs (x, y): the inputs are used as x coordinates and the corresponding outputs as y coordinates, so (x, y) = (input, output).
So for a two-dimensional plot you need to consider both the inputs and the outputs of f(t). If you ignored the inputs and wrote out only the outputs of f(t), you would have a list of single numbers, not pairs of numbers.
It seems that in the video he considers only the possible outputs of f(t). f(t) outputs a list of single numbers (not ordered pairs), so if you graph only the output you get a number line.
Admittedly, at first it doesn't seem to make much sense to focus on just the outputs of a function (and not the inputs too), but if you think of functions as transformations it makes more sense. We started with a single number line t, then we got ordered pairs of points (x(t), y(t)), which is a two-dimensional plot, and finally we had f(x(t), y(t)), which collapsed the two-dimensional inputs x(t), y(t) back to a single list of numbers, which is a number line when plotted.
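As a rough illustration of the "functions as transformations" view described above, here is a minimal Python sketch. The parametrization x(t) = cos t, y(t) = sin t and the function f(x, y) = x²y are assumptions chosen for illustration, and the names `position` and `composite` are hypothetical.

```python
import math

def position(t):
    # Assumed parametrization: one number t becomes a point (x, y) in the plane.
    return math.cos(t), math.sin(t)

def f(x, y):
    # Assumed example function: a point in the plane becomes one number.
    return x**2 * y

def composite(t):
    # The full chain: number line -> plane -> number line.
    x, y = position(t)
    return f(x, y)

ts = [0.0, 0.5, 1.0, 1.5]
outputs = [composite(t) for t in ts]    # single numbers: points on the output number line
graph_points = list(zip(ts, outputs))   # (input, output) pairs: points on the 2D graph

print(outputs)
print(graph_points)
```

The list `outputs` is what the number-line picture shows; pairing each output with its input, as in `graph_points`, is what produces the two-dimensional graph.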
• There is still one thing which intuitively doesn't make sense to me.

This splitting into a sum of two components dx and dy makes sense if we are talking about vectors, but in our case the thing we want to get is a scalar, the magnitude of the change. If we think of it as the length of a vector, we should take into account the triangle inequality, which states that the sum of the lengths of any two sides of a triangle is always greater than the length of the third side. Therefore our estimate of the change should always be greater than the actual change.

I guess I am missing something, but I'm not sure what.
• Cancelling the dx's and dy's also adds an intuitive feel. After cancelling, df/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) reads as
df/dt = ∂f/dt + ∂f/dt
Both of the partials would add up to the full derivative.
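The "cancelling" picture is a mnemonic for the chain rule df/dt = (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt). A quick numerical sanity check is sketched below, assuming f(x, y) = x²y with x(t) = cos t and y(t) = sin t; these choices and the function names are illustrative assumptions.

```python
import math

def df_dt_chain(t):
    # Chain rule: (∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) for the assumed
    # f(x, y) = x^2 * y, x(t) = cos t, y(t) = sin t.
    x, y = math.cos(t), math.sin(t)
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    df_dx = 2 * x * y
    df_dy = x**2
    return df_dx * dx_dt + df_dy * dy_dt

def df_dt_numeric(t, h=1e-6):
    # Central difference applied directly to the composite t -> f(x(t), y(t)).
    def g(s):
        return math.cos(s)**2 * math.sin(s)
    return (g(t + h) - g(t - h)) / (2 * h)

t0 = 0.7
print(df_dt_chain(t0), df_dt_numeric(t0))
```

The two values agree closely, which supports the idea that each term is the part of df/dt contributed through one intermediate variable.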
• The intuition behind df being affected by a combination of the change due to dx and the change due to dy is very nice. But what is the intuition for deciding that df is simply the sum of these two components? How do we know that these two factors do not combine in some other way to affect df? How do we know, for example, that df is not the product of the two, or the change caused by dx plus twice the change caused by dy, or some other more complicated function involving both changes?