## AP®︎ Calculus BC (2017 edition)

### Unit 3: Lesson 9

© 2023 Khan Academy

# Chain rule

AP.CALC: FUN‑3 (EU), FUN‑3.C (LO), FUN‑3.C.1 (EK)

The chain rule states that the derivative of f(g(x)) is f'(g(x))⋅g'(x). In other words, it helps us differentiate *composite functions*. For example, sin(x²) is a composite function because it can be constructed as f(g(x)) for f(x)=sin(x) and g(x)=x². Using the chain rule and the derivatives of sin(x) and x², we can then find the derivative of sin(x²). Created by Sal Khan.
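As a rough numerical illustration of the statement above, the following sketch compares the chain rule result for sin(x²), namely cos(x²)⋅2x, with a finite-difference estimate of the slope. The helper name `central_difference`, the step size, and the sample point `x0` are assumptions chosen only for this example.

```python
import math

def central_difference(func, x, step=1e-6):
    """Estimate func'(x) with a symmetric difference quotient (assumed step size)."""
    return (func(x + step) - func(x - step)) / (2 * step)

def composite(x):
    # f(g(x)) with outer f = sin and inner g(x) = x**2
    return math.sin(x ** 2)

def chain_rule_derivative(x):
    # f'(g(x)) * g'(x) = cos(x**2) * 2x
    return math.cos(x ** 2) * 2 * x

x0 = 1.3  # arbitrary sample point
numeric = central_difference(composite, x0)
exact = chain_rule_derivative(x0)
print(numeric, exact)  # the two values should agree to several decimal places
```

The agreement at one point is not a proof, but it is a quick way to catch a forgotten g'(x) factor.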

## Want to join the conversation?

- What is the standard formula for the chain rule? (51 votes)
- Sal says it at the start of the next video. (2 votes)

- Can anyone please help me understand how there are two functions in y=sin x^2 but one in y=2x^2+3x?....really getting confused..

thank you (18 votes)
- The function in the video is y = (sin x)² [more commonly written as y = sin² x]. I think of this as square(sin(x)), that is, a square function of a sine function of x.

Think of y = 2x² + 3x as y = f(x) + g(x) where f(x) is 2x² and g(x) is 3x. The functions of x are not being composed/chained as above (so the chain rule doesn't apply), and they are not being multiplied (so the product rule doesn't apply). They're simply being added. In this situation, the derivative of a sum is the sum of the derivatives, and each function of x is so simple that we can apply the power rule to each term.

* thanks to John Kollar for pointing out how my answer could be clarified.(26 votes)

- At 4:23, Sal says that we can't cancel d(sin x). Why is that so? (26 votes)
- Treating differentials as regular numbers is in some cases useful, but it is not entirely correct. In some cases, it can simply confuse you.

If it helps you to remember the chain rule, you can think of the canceling of the differentials. However, do not make the mistake of thinking that is actually what is happening.(2 votes)

- Why is it called the chain rule? (11 votes)
- When functions are composed, they operate as if they were in a chain: the input goes into the first function, which spits out an output that becomes the input to the next function, which spits out an output that may become the input of a third function, and so on. We don't necessarily see this immediately from the way the function is written, but that's the way a composite function operates. So if our function is sin^2(x + pi), we feed the input (x) into the first function, which increases it by pi and hands it over to the next function, which finds the sine and hands that result over to the next function, which squares it. (27 votes)
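The "chain" picture in this answer can be sketched in a few lines of Python. The helper names `chain`, `add_pi`, and `square` are hypothetical, introduced only to mirror the three links of sin^2(x + pi).

```python
import math

def add_pi(x):
    # first link: shift the input by pi
    return x + math.pi

def square(u):
    # last link: square whatever the previous link produced
    return u * u

def chain(*links):
    """Compose functions so that the first link listed runs first."""
    def composed(x):
        value = x
        for link in links:
            value = link(value)  # each output becomes the next input
        return value
    return composed

# sin^2(x + pi): add pi, then take the sine, then square
h = chain(add_pi, math.sin, square)
print(h(0.5))
```

Reading the `links` from left to right gives the order in which the input is processed, which is the reverse of the order in which the function names appear in the written formula.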

- Near 3:41, 2sin(x)cos(x) is equal to sin(2x). Is there any deeper meaning to this? (16 votes)
- A well known trig identity that you must learn at some point is that sin(A+B) = sinAcosB + cosAsinB. The video showing a geometric proof for this fact is here: https://www.khanacademy.org/math/trigonometry/less-basic-trigonometry/trig_iden_tutorial/v/proof--sin-a-b------cos-a--sin-b-----sin-a--cos-b

So if A and B are the same, we can write sin(A + A) or simply sin(2A), and using the identity we've just learned, we know that is sinAcosA + cosAsinA which can be rewritten as 2sinAcosA. As for deeper meaning, I don't know.(12 votes)
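For readers who want to sanity-check the identities mentioned here, a small numerical sketch follows. The sample angles are arbitrary assumptions, and agreement at a few points is only a spot check, not a proof.

```python
import math

def angle_sum(a, b):
    # right-hand side of sin(A + B) = sin A cos B + cos A sin B
    return math.sin(a) * math.cos(b) + math.cos(a) * math.sin(b)

samples = [0.0, 0.4, 1.0, 2.5, -1.7]  # arbitrary test angles in radians

# largest mismatch for the angle-sum identity, with B fixed at 0.3
max_gap_sum = max(abs(math.sin(a + 0.3) - angle_sum(a, 0.3)) for a in samples)

# largest mismatch for the double-angle case A = B
max_gap_double = max(
    abs(math.sin(2 * a) - 2 * math.sin(a) * math.cos(a)) for a in samples
)
print(max_gap_sum, max_gap_double)  # both should be at rounding-error level
```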

- Can you give some sort of logic behind the chain rule on a graph? We learnt about basic differentiation using graphs, so why is the chain rule stated here and in my textbook as a fact? (9 votes)
- Hmm... very interesting question...

When you apply the chain rule, you're taking into account how the slope of the function is influenced by the inner function... for example...

Say f(x) = (2x+1)^2

then, f'(x) = 2(2x+1)(2) = 4(2x+1) = 8x+4

If you graph (2x+1)^2 you will see that it is a parabola... then, if you graph 8x+4 on the same sheet of graphing paper... you will see something very interesting...

Towards the top of the parabola on the left side, it almost looks like a straight line... and we know the function is decreasing there... at the same x-value on the graph of the derivative... we see that the x-value produces a very negative y-value, which is the slope of the function f(x) at the x-value of interest (sometimes called a)... And where the slope of f(x) is 0, at x = -1/2, we see the graph of the derivative crosses the x-axis at x = -1/2...

A lot of this has to do with looking at the graph of a function and the graph of its derivative on the same graphing sheet. That's the only time you will make sense of it all.

In sum, basically, the chain rule takes into consideration how the functions within a function determine the function's slope at some input.

Hope that helps... You may want to review some of Sal's videos on derivatives - especially the ones where he graphs the derivatives intuitively. They seem to get the point across very efficiently.

Happy learning =)(17 votes)
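The comparison described in this answer can also be done numerically instead of on graph paper. The sketch below, with an assumed step size and arbitrarily chosen sample points, estimates the slope of f(x) = (2x+1)^2 and compares it with the chain rule result 8x+4, which is zero at the vertex x = -1/2.

```python
def f(x):
    return (2 * x + 1) ** 2

def f_prime(x):
    # chain rule: 2 * (2x + 1) * 2 = 8x + 4
    return 8 * x + 4

def slope(func, x, step=1e-6):
    """Finite-difference estimate of the slope at x (assumed step size)."""
    return (func(x + step) - func(x - step)) / (2 * step)

# arbitrary sample points, including the vertex of the parabola at x = -0.5
for x in (-3.0, -0.5, 0.0, 2.0):
    print(x, slope(f, x), f_prime(x))
```

Each printed row shows the estimated slope next to 8x+4; far to the left the slope is strongly negative, and at the vertex both columns are zero up to rounding.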

- Okay, so I sort of understand how we have to go through the function by layers. It's like we unzip it. But I don't understand why we multiply the different layers once we've taken it apart. In the video's example, at 3:22 we get h'(x)=2sinx * cosx. Why isn't it h'(x)= 2cosx? When we solve for the inner function, why doesn't our answer replace the sinx in the final answer? (7 votes)
- Consider what is the derivative of 2sinx? The answer is 2cosx, and if that's the derivative of 2sinx we shouldn't expect it to also be the derivative of (sinx)^2.

Why do we multiply when applying the chain rule? Derivatives are rates of change, which means they're essentially multipliers. For example, the derivative of x^2 is 2x, which means at any point on the curve, y is growing at a rate of two times x. If we apply another function to that function, we have another multiplier applied to the first one. That's the essence of why the chain rule works the way it does.(14 votes)
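The "multipliers" idea can be made concrete with a small nudge to the input. In this sketch (the point `x0` and nudge `dx` are assumptions picked for illustration), a change `dx` is first stretched by the inner function sin and the result is then stretched again by the outer squaring function, so the overall rate is the product of the two rates.

```python
import math

def inner(x):
    return math.sin(x)

def outer(u):
    return u ** 2

x0, dx = 0.8, 1e-6                 # arbitrary point and a small nudge

u0 = inner(x0)
du = inner(x0 + dx) - u0           # change in sin(x) caused by dx
dy = outer(u0 + du) - outer(u0)    # change in u**2 caused by du

inner_rate = du / dx               # close to cos(x0)
outer_rate = dy / du               # close to 2*sin(x0)
total_rate = dy / dx               # the same as inner_rate * outer_rate

print(total_rate, 2 * math.sin(x0) * math.cos(x0), 2 * math.cos(x0))
```

The first two printed numbers nearly match, while the third (the tempting answer 2cos x) does not.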

- What is the derivative of a complex number? (4 votes)
- Since a complex number in itself is a constant, its derivative is zero. Did you mean to ask about the differentiation of complex-valued functions defined on subsets of the complex plane? Such functions may (sometimes) be differentiated. Let **C** denote the set of complex numbers, and suppose `U` is some subset of **C**. Suppose further that `ƒ: U →` **C** is a complex-valued function defined on `U`, and suppose `w` is an interior point of `U`. If the limit `lim (z → w) [ƒ(z) - ƒ(w)] / [z - w]` exists, we say that `ƒ` is *(complex) differentiable* at `w`, and we denote the value of this limit by `ƒ'(w)`. If `U` is open, and if `ƒ` is differentiable at every point of `U`, we say that `ƒ` is differentiable on `U`. If `ƒ` is differentiable on an open set `U`, one also says that `ƒ` is *holomorphic* on `U`, or sometimes that `ƒ` is *analytic* on `U`. Holomorphic functions are central in the theory of complex functions.

More specifically, to say that `ƒ: U →` **C** is differentiable at an interior point `w` in `U` means the following: there exists some complex number `L` such that for every real number `ε > 0` there exists a real number `δ > 0` with the property that for all complex numbers `z` in `U` with `0 < |z - w| < δ`, we have `|[ƒ(z) - ƒ(w)]/[z - w] - L| < ε`. If such a number `L` exists, we usually denote it by `ƒ'(w)`. This property may also be cast in terms of convergent sequences in `U`.

The process of differentiation of complex-valued functions defined on subsets of the complex plane shares many properties with differentiation of real-valued functions defined on subsets of the real numbers. For instance, the differentiation operator is linear. Furthermore, the product rule, the quotient rule, and the chain rule all hold for such complex functions.

As an example, consider the function `ƒ:` **C** → **C** defined by `ƒ(z) = (1 - 3𝑖)z - 2`. It can be shown that `ƒ` is holomorphic, and that `ƒ'(z) = 1 - 3𝑖` for every complex number `z`. (8 votes)
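A quick numerical sketch of the closing example: the difference quotient of ƒ(z) = (1 - 3𝑖)z - 2 is evaluated while approaching a point `w` from several directions in the complex plane. The point `w`, the step size, and the list of directions are assumptions made only for this illustration; Python writes the imaginary unit as `j`.

```python
def f(z):
    return (1 - 3j) * z - 2

def difference_quotient(func, w, h):
    """[f(w + h) - f(w)] / h for a small complex increment h."""
    return (func(w + h) - func(w)) / h

w = 0.5 + 2j                        # arbitrary point in the complex plane
directions = [1, 1j, -1, 1 + 1j]    # approach w along different directions

quotients = [difference_quotient(f, w, 1e-6 * d) for d in directions]
for q in quotients:
    print(q)   # each should be close to 1 - 3j
```

That the quotient comes out the same regardless of direction is what the limit in the definition requires; for this linear function it holds exactly, up to rounding.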

- Hello.... I have a question that I am unable to solve.....

the question is to differentiate cos x^3 . sin^2*(x^5) w.r.t x

could you please guide me on how to go about this problem? (3 votes)
- Did you mean d/dx{cos(x³) * sin²(x⁵)}?

If so, this is a bit of a tricky one. Here's how to do it:

Step 1: Use the product rule.

d/dx{cos(x³) * sin²(x⁵)}

= cos(x³)d/dx{sin²(x⁵)} + sin²(x⁵)d/dx{cos(x³)}

Step 2: Now we have a sum of two terms, each containing a derivative. So, we will find d/dx{sin²(x⁵)} and d/dx{cos(x³)} separately and then plug the results into cos(x³)d/dx{sin²(x⁵)} + sin²(x⁵)d/dx{cos(x³)}

Step 2a: First, let us do d/dx{sin²(x⁵)}

We need to use the chain rule twice:

d/dx{sin²(x⁵)}

= 2sin(x⁵) d/dx(sin(x⁵))

= 2 sin(x⁵)cos(x⁵) d/dx(x⁵)

= 2 cos(x⁵) sin(x⁵)[5x⁴]

Simplify:

= 10 x⁴ cos(x⁵) sin(x⁵)

Step 2b: Now let us do d/dx{cos(x³)}

We use the chain rule:

d/dx{cos(x³)}

= -sin(x³) d/dx(x³)

= -sin(x³)[3x²]

= -3x²sin(x³)

Step 3: Now let us plug the derivatives we found in steps 2a and 2b into

cos(x³)d/dx{sin²(x⁵)} + sin²(x⁵)d/dx{cos(x³)}

= cos(x³)[10 x⁴ cos(x⁵) sin(x⁵)] + sin²(x⁵)d/dx{cos(x³)}

= cos(x³)[10 x⁴ cos(x⁵) sin(x⁵)] + sin²(x⁵)[-3x²sin(x³)]

Simplify.

= 10 x⁴ cos(x³) cos(x⁵) sin(x⁵) - 3x² sin²(x⁵) sin(x³) (7 votes)
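A long derivation like this one is easy to get wrong by a sign or a factor, so a numerical spot check can be reassuring. The sketch below, with assumed sample points and step size, compares the final expression with a finite-difference estimate of the slope of cos(x³)·sin²(x⁵).

```python
import math

def product(x):
    # the original function: cos(x^3) * sin^2(x^5)
    return math.cos(x ** 3) * math.sin(x ** 5) ** 2

def claimed(x):
    # final answer: 10x^4 cos(x^3) cos(x^5) sin(x^5) - 3x^2 sin^2(x^5) sin(x^3)
    return (10 * x ** 4 * math.cos(x ** 3) * math.cos(x ** 5) * math.sin(x ** 5)
            - 3 * x ** 2 * math.sin(x ** 5) ** 2 * math.sin(x ** 3))

def slope(func, x, step=1e-6):
    """Finite-difference estimate of the derivative (assumed step size)."""
    return (func(x + step) - func(x - step)) / (2 * step)

points = [0.3, 0.9, 1.2]   # arbitrary sample points
gaps = [abs(slope(product, x) - claimed(x)) for x in points]
print(gaps)                # each gap should be tiny
```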

- With "inner" and "outer" functions, Sal is talking about composite functions, right? Like f(x) = g(h(x)) = (g ∘ h)(x), right? (4 votes)
- Yes, that is correct. The chain rule allows you to take the derivative of composite functions.(3 votes)

## Video transcript

- [Instructor] What we're going to go over in this video is one of the core principles in calculus, and you're going to use it any time you take the derivative of anything even reasonably complex. And it's called the chain rule. And when you're first exposed to it, it can seem a little daunting and a little bit convoluted. But as you see more and more examples, it'll start to make sense, and hopefully it'll even start to seem a little bit simple and intuitive over time.

So let's say that I had a function. Let's say I have a function h of x, and it is equal to, just for example, let's say it's equal to sine of x, let's say it's equal to sine of x squared [that is, (sin x)²]. Now, I could've written that, I could've written it like this, sine squared of x, but it'll be a little bit clearer using that type of notation. So let me make it so I have h of x. And what I'm curious about is what is h prime of x? So I want to know h prime of x, which another way of writing it is the derivative of h with respect to x. These are just different notations.

And to do this, I'm going to use the chain rule. I'm going to use the chain rule, and the chain rule comes into play every time, any time your function can be viewed as a composition of more than one function. And that might not seem obvious right now, but it will hopefully, maybe by the end of this video or the next one.

Now, what I want to do is a little bit of a thought experiment, a little bit of a thought experiment. If I were to ask you what is the derivative with respect to x, if I were to just apply the derivative operator to x squared with respect to x, what do I get? Well, this gives me two x. We've seen that many, many, many, many times. Now, what if I were to take the derivative with respect to a of a squared? Well, it's the exact same thing. I just swapped an a for the x's. This is still going to be equal to two a.

Now I will do something that might be a little bit more bizarre. What if I were to take the derivative with respect to sine of x, with respect to sine of x of, of sine of x, sine of x squared? Well, wherever I had the x's up here, the a's over here, I just replace it with a sine of x. So this is just going to be two times the thing that I had, so whatever I'm taking the derivative with respect to. Here it was with respect to x. Here with respect to a. Here it's with respect to sine of x. So it's going to be two times sine of x.

Now, so the chain rule tells us that this derivative is going to be the derivative of our whole function with respect, or the derivative of this outer function, x squared, the derivative of x squared, the derivative of this outer function with respect to sine of x. So that's going to be two sine of x, two sine of x. So we could view it as the derivative of the outer function with respect to the inner, two sine of x. We could just treat sine of x like it's kind of an x. And it would've been just two x, but instead it's a sine of x. We say two sine of x times, times the derivative, do this in green, times the derivative of sine of x with respect to x.

Times the derivative of sine of x with respect to x, well, that's more straightforward, a little bit more intuitive. The derivative of sine of x with respect to x, we've seen multiple times, is cosine of x, so times cosine of x. And so there we've applied the chain rule. It was the derivative of the outer function with respect to the inner. So derivative of sine of x squared with respect to sine of x is two sine of x, and then we multiply that times the derivative of sine of x with respect to x.

So let me make it clear. This right over here is the derivative. We're taking the derivative of, we're taking the derivative of sine of x squared. So let me make it clear. That's what we were taking the derivative of with respect to sine of x, with respect to sine of x. And then we're multiplying that times the derivative of sine of x, the derivative of sine of x with respect to, with respect to x.

And this is where it might start making a little bit of intuitive sense. You can't really treat these differentials, this d whatever, this dx, this d sine of x, as a number. And you really can't, this notation makes it look like a fraction because intuitively that's what we're doing. But if you were to treat 'em like fractions, then you could think about canceling that and that. And once again, this isn't a rigorous thing to do, but it can help with the intuition. And then what you're left with is the derivative of this whole sine of x squared with respect to x. So you're left with, you're left with the derivative of essentially our original function, sine of x squared with respect to x, with respect to x, which is exactly what dh/dx is. This right over here, this right over here is our original function h. That's our original function h.

So it might seem a little bit daunting now. What I'll do in the next video is several more examples, and then we'll try to abstract this a little bit.
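The recipe from the video (derivative of the outer function with respect to the inner, times the derivative of the inner with respect to x) can be written as a small helper. This is only a sketch: the name `chain_rule` and its three-argument form are hypothetical, and the caller is assumed to supply the two derivatives by hand.

```python
import math

def chain_rule(outer_prime, inner, inner_prime):
    """Return the function x -> outer'(inner(x)) * inner'(x)."""
    def derivative(x):
        return outer_prime(inner(x)) * inner_prime(x)
    return derivative

# h(x) = (sin x)^2: outer is u -> u^2 (derivative 2u), inner is sin (derivative cos)
h_prime = chain_rule(lambda u: 2 * u, math.sin, math.cos)

def h(x):
    return math.sin(x) ** 2

x0, step = 0.7, 1e-6   # arbitrary point and assumed step size
estimate = (h(x0 + step) - h(x0 - step)) / (2 * step)
print(h_prime(x0), estimate)   # both approximate 2*sin(0.7)*cos(0.7)
```

Passing the derivatives in explicitly keeps the example close to the spoken explanation: the first argument plays the role of "two times the thing", and the last is the cosine factor.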