Taylor polynomial remainder (part 1)

The more terms we have in a Taylor polynomial approximation of a function, the closer we get to the function. But HOW close? Let's embark on a journey to find a bound for the error of a Taylor polynomial approximation. Created by Sal Khan.

Want to join the conversation?

  • SoldOut4Him85
    Why wouldn't the n+1'th derivative of the function be 0 as well?
    (59 votes)
    • RagnarG
      Remember also that the function f(x) isn't necessarily a polynomial. p(x) is its polynomial representation, but f(x) can be any function - sin(x), e^x, sqrt(x), etc. Those are examples of functions that will never go to 0 no matter how many times you take the derivative.
      (124 votes)
  • lazy111
    when he derives p(a), and says it is equal to f'(a), how is that possible when the term that has f'(x) is multiplied by (x-a), so at (a) this term should also be zero?
    (11 votes)
    • jacdegfor
      I know this was posted four years ago, but I may as well write this in case it helps anyone. The reason p'(a) = f'(a) (and p''(a) = f''(a), etc) is because of the following:

      We are given:
      p(x)=f(a)+f'(a)(x-a)+f''(a)((x-a)^2)/2!+...

      To find p'(x), we take the derivative of each term in p(x). Since f(a) is a constant (a is just the number the polynomial is centered around), its derivative is 0. The second term is f'(a)(x-a). Here f'(a) is also a constant, so the product rule (u'v + uv') gives (0)(x-a) + f'(a)(1) = f'(a), with no (x-a) factor left. Every term after that still carries at least one factor of (x-a) after differentiating (for example, f''(a)(x-a)^2/2! becomes f''(a)(x-a)), so those terms are all 0 when evaluated at a. You can verify this last bit by differentiating them yourself. The same idea applies to every derivative of the polynomial up through the degree of the polynomial. Hope this helps someone!
      (24 votes)
  • donthate35
    What do you mean by bound toward the end of the video?
    (7 votes)
    • Brett Coryell
      Being "bound" means that you know that a value is definitely between two limits. For instance, you might be interested in knowing that your approximation is good to 1% or to 0.01% or to one part in a million.

      With calculators, it's often easy enough to add another term and get your error to be bound to any level you want. However, when math meets the physical world, you could get a scenario where you have 3000 cables supporting a load on a bridge. The math of exactly what's happening on each cable is too hard so you turn to approximations. If you can show your approximation is valid (or bound) to within 2% and you know that all materials have a 25% safety tolerance, you don't need to spend any more time or money on figuring out the details of your cable loads more precisely.
      (22 votes)
  • SteveSargentJr
    Is this related to Big O Notation?

    And, by the way, what is "Big O Notation"?
    (4 votes)
    • Sunny Shah
      Big O notation is a computer science term. It is used to describe the processing complexity of an algorithm.

      For example, if you have n random numbers and you want to find the minimum, then the Big O for that task is n, which means we have to check every number once to find the minimum.
      (10 votes)
  • Marco Merlini
    Well, there's the obvious problem that the degree of functions such as the sine or e^x is infinite. Does this mess up the principle of bounding the error function? Why or why not?
    (6 votes)
  • Dorian T Lazare
    At , When we want to find the N+1 derivative of the error (E(x)), why is the N+1 derivative of the Taylor polynomial (P(x)) equal to zero, and not the N+1 derivative of the function? Why is the latter equal to the N+1 derivative of the error?
    (4 votes)
    • Abdo Reda
      Remember that P(x) is an nth-degree polynomial. If you try to find the 3rd derivative of x^2 you will get zero. In fact, if you have a polynomial of highest degree n and you take its (n+1)th derivative, you get zero. That is because every time you take the derivative you apply the power rule, which decreases the power by one until it reaches 0, at which point you have a constant function (e.g. f(x) = 5x^0 = 5), and the derivative of a constant function is zero. The point is: take the derivative of a polynomial enough times and you will have a constant function; take the derivative one more time and you will get 0.
      One more thing: we are assuming we don't know what kind of function f(x) is. It could be a polynomial of degree n or less, in which case f^(n+1)(x) will equal zero, or it could be an exponential function like 2e^x, in which case f^(n+1)(x) = 2e^x, which is 2e^a at x = a. Since we don't know which it is, we leave it as it is and write f^(n+1)(x).
      I hope I was helpful to you.
      Good luck and never give up :)
      (5 votes)
  • Enya Hsiao
    since this approximation completely matched f(x), why is the (n+1)th derivative of P(x) zero since it goes on and on to infinity? Which means, why is P(x) confined to the nth term?
    (3 votes)
    • Brett Coryell
      Or another way of saying it is that you assume you've created an approximation of order n. When you take the (n+1)th derivative you get 0 and I assume the video explains that well enough.

      Why, then, would you not do an infinite number of terms? Practical considerations. Calculators used to use Taylor series expansions to calculate sin, cos, and e^x. (I think most use lookup tables and interpolation now.) However, your calculator can't do an infinite number of terms. Instead, they know they're going to show the answer to, say, 8 decimal places. How many terms do you need for your sin(x) approximation in order to know that your Taylor's series expansion is good to at least 8 decimal places? That's an example of bounding your error AND of why you'd stop at some point.
      (3 votes)
  • Tushar Pal
    At , is there any proof that the derivative of E is also equal to the difference of derivatives of f and P? I mean, we only defined E(x) as the difference of a function's value and the value of its taylor polynomial at some x. But how can we be sure about the derivatives too?
    (3 votes)
  • earl kraft
    at Sal says, "if we assume that this is higher than degree one, we know that these derivatives are going to be the same at a". Is the "this" that he refers to the degree of the function? If so, wouldn't the equivalence hold even if the function (or the polynomial for that matter) were of zero degree? Why does he qualify the equivalence?
    (2 votes)
    • robshowsides
      Good question! At first, I thought you were 100% correct, but now I think I know what he was trying to say. The "this" is P(x), and he should really have said "if we assume P(x) has degree of AT LEAST one", or "if we assume P(x) is higher than degree ZERO". He is writing down the fact that f'(a) and P'(a) are equal (i.e., f'(a) - P'(a) =0), but that will only be true if our Taylor polynomial is of degree at least one (i.e., we are at least making a tangent line, so that the slope matches the slope of f(x) ). It would be pretty silly to work with a zeroth degree Taylor polynomial (i.e. a constant horizontal line at y = f(a) ), but it would not be technically forbidden.

      One other quick note is to remember that the FUNCTION, f(x), very likely DOES NOT HAVE ANY DEGREE AT ALL. f(x) is often something like sin(x), e^x, ln(x), which is not a polynomial, and that is generally the reason we are trying to find a Taylor polynomial approximation. Of course, we can (and sometimes do) find Taylor polynomials to approximate other (higher-degree) polynomials, but that is fairly rare.
      (2 votes)
  • Yames Yamb
    Mr. Sal says that Taylor series is useful because the derivative at a is equal to the derivative of the function at a... But I don't understand why we need to use Taylor series in the first place. Why not just use the function?

    In other words... what's the point of approximating a function when you actually have the "real" function?
    (2 votes)
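
One of the replies above asks how many terms a sin(x) approximation needs in order to be good to 8 decimal places. The following is a minimal sketch of that question, not a description of how any real calculator works. It assumes the polynomial is centered at 0, uses x = 1 as an arbitrary test point, and measures the error against Python's own `math.sin` instead of using an error bound (the bound itself is the subject of the next video). The names `sin_maclaurin` and `smallest_degree` are made up for this illustration.

```python
import math

def sin_maclaurin(x, degree):
    """Taylor polynomial of sin centered at 0, keeping terms up to `degree`."""
    total = 0.0
    for k in range(1, degree + 1, 2):          # only odd powers are nonzero
        sign = -1.0 if (k // 2) % 2 else 1.0    # +x, -x^3/3!, +x^5/5!, ...
        total += sign * x ** k / math.factorial(k)
    return total

def smallest_degree(x, tol):
    """Smallest degree whose polynomial is within `tol` of math.sin(x)."""
    degree = 0
    while abs(math.sin(x) - sin_maclaurin(x, degree)) > tol:
        degree += 1
    return degree

# Assumption: "good to 8 decimal places" is read as an error below 0.5e-8.
needed = smallest_degree(1.0, 0.5e-8)
print("degree needed at x = 1:", needed)
```

Comparing against `math.sin` is circular if the goal is to compute sin in the first place, which is why a bound that does not need the true value is the more useful tool.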

Video transcript

- [Voiceover] Let's say that we have some function f of x right over here. And let me graph an arbitrary f of x. So, that's my y-axis, that is my x-axis and maybe f of x looks something like that. And what I wanna do is I wanna approximate f of x with a Taylor polynomial centered around x is equal to a. So this is the x-axis, this is the y-axis. So I want a Taylor polynomial centered around there. And we've seen how this works. The Taylor polynomial comes out of the idea that for all of the derivatives up to and including the degree of the polynomial, those derivatives of that polynomial evaluated at a should be equal to the derivatives of our function evaluated at a. And that polynomial evaluated at a should also be equal to that function evaluated at a. So our polynomial, our Taylor polynomial approximation would look something like this. So, I'll call it P of x. And sometimes you might see a subscript, a big N there to say it's an Nth degree approximation and sometimes you'll see something like this. Sometimes you'll see something like N comma a to say it's an Nth degree approximation centered at a. Actually, I'll write that right now. Maybe we might lose it if we have to keep writing it over and over but you should assume that it is an Nth degree polynomial centered at a. And it's going to look like this. It is going to be f of a, plus f prime of a, times x minus a, plus f prime prime of a, times x minus a squared over-- Either you could write two or two factorial, they're the same value. I'll write two factorial. You could write a divided by one factorial over here, if you like. And then plus, you go to the third derivative of f at a times x minus a to the third power, I think you see where this is going, over three factorial. And you keep going, I'll go to this line right here, all the way to your Nth degree term which is the Nth derivative of f evaluated at a times x minus a to the N over N factorial. 
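
Written out in symbols, the polynomial described aloud above is the following. The subscript notation P_{N,a} is the one mentioned in the video for an Nth degree polynomial centered at a.

```latex
P_{N,a}(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2
           + \frac{f'''(a)}{3!}(x-a)^3 + \cdots
           + \frac{f^{(N)}(a)}{N!}(x-a)^N
```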
And this polynomial right over here, this Nth degree polynomial centered at a, f or P of a is going to be the same thing as f of a. And you can verify that because all of these other terms have an x minus a here. So if you put an a in the polynomial, all of these other terms are going to be zero. And you'll have P of a is equal to f of a. Let me write that down. P of a is equal to f of a. And so it might look something like this. And it's going to fit the curve better the more of these terms that we actually have. So it might look something like this. I'll try my best to show what it might look like. So this is all review, I have this polynomial that's approximating this function. The more terms I have, the higher degree of this polynomial, the better that it will fit this curve the further that I get away from a. But what I wanna do in this video is think about if we can bound how good it's fitting this function as we move away from a. So what I wanna do is define a remainder function. Or sometimes, I've seen some text books call it an error function. And I'm going to call this-- I'll just call it an error-- Just so you're consistent with all the different notations you might see in a book, some people will call this a remainder function and sometimes they'll write a remainder function for an Nth degree polynomial centered at a. Sometimes you'll see this as an error function. The error function is sometimes avoided because it looks like expected value from probability. But you'll see this often, this is E for error. E for error, R for remainder. And sometimes they'll also have the subscripts over there like that. And what we'll do is, we'll just define this function to be the difference between f of x and our approximation of f of x for any given x. So it's really just going to be, I'll do it in the same colors, it's going to be f of x minus P of x. Where this is an Nth degree polynomial centered at a. 
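
As a rough numerical illustration of the error function just defined, here is a small sketch. It assumes f(x) = e^x centered at a = 0 with N = 3, chosen only because every derivative of e^x is e^x; the helper name `taylor_poly` is hypothetical and not from the video.

```python
import math

def taylor_poly(derivs_at_a, a):
    """Return P(x) = sum over k of derivs_at_a[k] * (x - a)**k / k!."""
    def P(x):
        return sum(d * (x - a) ** k / math.factorial(k)
                   for k, d in enumerate(derivs_at_a))
    return P

a, N = 0.0, 3
# For f(x) = e^x, every derivative at a equals e^a.
P = taylor_poly([math.exp(a)] * (N + 1), a)

def E(x):
    """Error (remainder): the function minus its Taylor polynomial."""
    return math.exp(x) - P(x)

for x in (0.0, 0.1, 1.0):
    print(x, E(x))
```

The error is zero at the center and grows as x moves away from a, which is the behavior the rest of the video tries to bound.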
So for example, if someone were to ask you, or if you wanted to visualize. What are they talking about if they're saying the error of this Nth degree polynomial centered at a when we are at x is equal to b? What is this thing equal to, or how should you think about this? Well, if b is right over here. So the error of b is going to be f of b minus the polynomial at b. So f of b there, the polynomial's right over there. So it'll be this distance right over here. So if you measure the error at a, it would actually be zero. Because the polynomial and the function are the same there. F of a is equal to P of a, so the error at a is equal to zero. And let me actually write that down because that's an interesting property. It'll help us bound it eventually so let me write that. The error function at a. And for the rest of this video you can assume that I could write a subscript. This is for the Nth degree polynomial centered at a. I'm just gonna not write that every time just to save ourselves a little bit of time in writing, to keep my hand fresh. So the error at a is equal to f of a minus P of a. And once again, I won't write the sub-N, sub-a. You can assume it, this is an Nth degree polynomial centered at a. And these two things are equal to each other. So this is going to be equal to zero. And we see that right over here. The distance between the two functions is zero there. Now let's think about something else. Let's think about what the derivative of the error function evaluated at a is. Well that's going to be the derivative of our function at a minus the first derivative of our polynomial at a. And if we assume that this is higher than degree one, we know that these derivatives are going to be the same at a. You can try to take the first derivative here.
If you take the first derivative of this whole mess-- And this is actually why Taylor polynomials are so useful, is that up to and including the degree of the polynomial when you evaluate the derivatives of your polynomial at a they're going to be the same as the derivatives of the function at a. And that's what starts to make it a good approximation. But if you took a derivative here, this term right here will disappear, it'll go to zero. I'll cross it out for now. This term right over here will just be f prime of a and then all of these other terms are going to be left with some type of an x minus a in them. And so when you evaluate it at a, all the terms with an x minus a disappear, because you have an a minus a on them. This one already disappeared and you're literally just left with P prime of a will equal f prime of a. And we've seen that before. So let me write that. So because we know that P prime of a is equal to f prime of a, when you evaluate the error function, the derivative of the error function at a, that also is going to be equal to zero. And this general property right over here, is true up to and including N. So let me write this down. So we already know that P of a is equal to f of a. We already know that P prime of a is equal to f prime of a. This really comes straight out of the definition of the Taylor polynomials. And this is going to be true all the way until the Nth derivative of our polynomial is going, evaluated at a, not everywhere, just evaluated at a, is going to be equal to the Nth derivative of our function evaluated at a. So what that tells us is that we can keep doing this with the error function all the way to the Nth derivative of the error function evaluated at a is going to be equal to, well that's just going to be the Nth derivative of f evaluated at a, minus the Nth derivative of our polynomial evaluated at a.
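
The differentiation step described here can be written out term by term. This is a sketch that assumes N is at least 1, so that the first-derivative line makes sense.

```latex
\begin{aligned}
P(x)  &= f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(N)}(a)}{N!}(x-a)^N \\
P'(x) &= f'(a) + f''(a)(x-a) + \cdots + \frac{f^{(N)}(a)}{(N-1)!}(x-a)^{N-1} \\
P'(a) &= f'(a) \quad\Longrightarrow\quad E'(a) = f'(a) - P'(a) = 0 \\
P^{(k)}(a) &= f^{(k)}(a) \quad\Longrightarrow\quad E^{(k)}(a) = 0 \qquad (0 \le k \le N)
\end{aligned}
```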
And we already said that these are going to be equal to each other up to the Nth derivative when we evaluate them at a. So these are all going to be equal to zero. So this is an interesting property and it's also going to be useful when we start to try to bound this error function. And that's the whole point of where I'm going with this video and probably the next video, is we're gonna try to bound it so we know how good of an estimate we have. Especially as we go further and further from where we are centered. From where our approximation is centered. Now let's think about when we take a derivative beyond that. So let's think about what happens when we take the N plus oneth derivative. What's a good place to write? Well I have some screen real estate right over here. What is the N plus oneth derivative of our error function? And not even if I'm just evaluating at a. If I just say generally, the error function E of x, what's the N plus oneth derivative of it? Well it's going to be the N plus oneth derivative of our function minus the N plus oneth derivative of our-- We're not just evaluating at a here either. Let me write an x there. I'm literally just taking the N plus oneth derivative of both sides of this equation right over here. So it's literally the N plus oneth derivative of our function minus the N plus oneth derivative of our Nth degree polynomial. The N plus oneth derivative of our Nth degree polynomial. I could write an N here, I could write an a here to show it's an Nth degree centered at a. Now, what is the N plus oneth derivative of an Nth degree polynomial? And if you want some hints, take the second derivative of y is equal to x. It's a first degree polynomial, take the second derivative, you're gonna get zero. Take the third derivative of y is equal to x squared. The first derivative is 2x, the second derivative is 2, the third derivative is zero.
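
The two hints can be checked mechanically. The sketch below assumes a polynomial is stored as a list of coefficients, lowest power first, with an empty list standing for the zero polynomial; `differentiate` and `nth_derivative` are hypothetical helper names.

```python
def differentiate(coeffs):
    """Derivative of c0 + c1*x + c2*x^2 + ..., as a coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def nth_derivative(coeffs, n):
    """Apply `differentiate` n times."""
    for _ in range(n):
        coeffs = differentiate(coeffs)
    return coeffs

x_squared = [0, 0, 1]                   # coefficients of x^2
first = nth_derivative(x_squared, 1)    # [0, 2] -> 2x
second = nth_derivative(x_squared, 2)   # [2]    -> 2
third = nth_derivative(x_squared, 3)    # []     -> the zero polynomial

x_only = [0, 1]                         # coefficients of x
x_second = nth_derivative(x_only, 2)    # []     -> the zero polynomial
```

Each differentiation shortens the list by one entry, so a list of length N + 1 (degree N) is empty after N + 1 steps.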
In general, if you take an N plus oneth derivative of an Nth degree polynomial, and you could prove it for yourself, you could even prove it generally but I think it might make a little sense to you, it's going to be equal to zero. It is going to be equal to zero. So this thing right here, this is an N plus oneth derivative of an Nth degree polynomial. This is going to be equal to zero. Let me write this over here. The N plus oneth derivative of our error function or our remainder function, we could call it, is equal to the N plus oneth derivative of our function. And so, what we could do now and we'll probably have to continue this in the next video, is figure out, at least can we bound this? Can we bound this and if we are able to bound this, if we're able to figure out an upper bound on its magnitude-- So actually, what we want to do is, we wanna bound its overall magnitude. We wanna bound its absolute value. If we can determine that it is less than or equal to some value M, so if we can actually bound it, maybe we can do a little bit of calculus, we could keep integrating it and maybe we can go back to the original function and bound that in some way. If we do know some type of bound like this over here. So I'll take that up in the next video.
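
To see both facts from this video side by side (the error's derivatives at a vanish up to order N, and its N plus oneth derivative is just that of f), here is a small numerical sketch. It assumes f(x) = sin(x), a = 0.5 and N = 3, chosen only because the derivatives of sin are known in closed form; all helper names are hypothetical.

```python
import math

def sin_derivative(k, x):
    """k-th derivative of sin at x; the derivatives cycle with period 4."""
    cycle = (math.sin(x), math.cos(x), -math.sin(x), -math.cos(x))
    return cycle[k % 4]

def differentiate(coeffs):
    """Derivative of sum_k coeffs[k] * (x - a)**k, as a coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, a, x):
    """Evaluate sum_k coeffs[k] * (x - a)**k (an empty list is the zero polynomial)."""
    return sum(c * (x - a) ** k for k, c in enumerate(coeffs))

a, N = 0.5, 3
# Taylor coefficients f^(k)(a) / k! for k = 0..N
p = [sin_derivative(k, a) / math.factorial(k) for k in range(N + 1)]

error_derivs_at_a = []
for k in range(N + 2):
    # E^(k)(a) = f^(k)(a) - P^(k)(a)
    error_derivs_at_a.append(sin_derivative(k, a) - evaluate(p, a, a))
    p = differentiate(p)   # after N + 1 steps this list is empty

print(error_derivs_at_a)
```

The first N + 1 entries come out as zero (up to floating-point rounding), and the last entry equals the fourth derivative of sin at a, because by then the polynomial has been differentiated away entirely.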