Asymptotic notation

So far, we analyzed linear search and binary search by counting the maximum number of guesses we need to make. But what we really want to know is how long these algorithms take. We're interested in time, not just guesses. The running times of linear search and binary search include the time needed to make and check guesses, but there's more to these algorithms.
The running time of an algorithm depends on how long it takes a computer to run the lines of code of the algorithm—and that depends on the speed of the computer, the programming language, and the compiler that translates the program from the programming language into code that runs directly on the computer, among other factors.
Let's think about the running time of an algorithm more carefully. We can use a combination of two ideas. First, we need to determine how long the algorithm takes, in terms of the size of its input. This idea makes intuitive sense, doesn't it? We've already seen that the maximum number of guesses in linear search and binary search increases as the length of the array increases. Or think about a GPS. If it knew about only the interstate highway system, and not about every little road, it should be able to find routes more quickly, right? So we think about the running time of the algorithm as a function of the size of its input.
The second idea is that we must focus on how fast a function grows with the input size. We call this the rate of growth of the running time. To keep things manageable, we need to simplify the function to distill the most important part and cast aside the less important parts. For example, suppose that an algorithm, running on an input of size n, takes 6n^2 + 100n + 300 machine instructions. The 6n^2 term becomes larger than the remaining terms, 100n + 300, once n becomes large enough, 20 in this case.
[Chart: values of 6n^2 and 100n + 300 for n from 0 to 100]
We would say that the running time of this algorithm grows as n^2, dropping the coefficient 6 and the remaining terms 100n + 300. It doesn't really matter what coefficients we use; as long as the running time is an^2 + bn + c, for some numbers a > 0, b, and c, there will always be a value of n for which an^2 is greater than bn + c, and this difference increases as n increases. For example, suppose we reduce the coefficient of n^2 by a factor of 10 and increase the other two constants by a factor of 10, giving 0.6n^2 and 1000n + 3000.
[Chart: values of 0.6n^2 and 1000n + 3000]
The value of n at which 0.6n^2 becomes greater than 1000n + 3000 has increased, but there will always be such a crossover point, no matter what the constants.
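The two crossover points can be found with a short search. This is a minimal sketch, not part of the original lesson: `first_crossover` is a hypothetical helper name, and it assumes integer coefficients with a > 0 and b, c >= 0. The second pair of functions is scaled by 10 (6n^2 against 10000n + 30000) so that the arithmetic stays exact.

```python
def first_crossover(a, b, c):
    """Return the smallest integer n >= 0 with a*n^2 > b*n + c.

    Assumes a > 0 and b, c >= 0, so once the quadratic term pulls
    ahead it stays ahead for every larger n.
    """
    n = 0
    while a * n * n <= b * n + c:
        n += 1
    return n


# 6n^2 versus 100n + 300
small = first_crossover(6, 100, 300)

# 0.6n^2 versus 1000n + 3000, multiplied through by 10 to avoid floats
large = first_crossover(6, 10000, 30000)

print(small, large)  # prints: 20 1670
```

Shrinking the leading coefficient and inflating the others moved the crossover from 20 to 1670, but it did not remove it.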
By dropping the less significant terms and the constant coefficients, we can focus on the important part of an algorithm's running time, its rate of growth, without getting mired in details that complicate our understanding. When we drop the constant coefficients and the less significant terms, we use asymptotic notation. We'll see three forms of it: big-Θ notation, big-O notation, and big-Ω notation.
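One way to see why the lower-order terms can be dropped is to divide the full instruction count by n^2 and watch the result settle toward the leading coefficient. The sketch below uses the example count 6n^2 + 100n + 300 from the text; the function name `running_time` is an assumption made for illustration.

```python
def running_time(n):
    """Example instruction count from the text: 6n^2 + 100n + 300."""
    return 6 * n * n + 100 * n + 300


# As n grows, running_time(n) / n^2 approaches the constant 6,
# so the 100n + 300 part contributes less and less.
for n in (10, 100, 1000, 10000):
    ratio = running_time(n) / (n * n)
    print(n, ratio)
```

For n = 10 the ratio is 19, but by n = 10000 it is within about 0.01 of 6: the function behaves like a constant multiple of n^2.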

This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.

Want to join the conversation?

  • Siayan God
    Please explain how the result "f(x) = 4x^2 - 5x + 3 is O(x^2)" is derived.

    |f(x)| = |4x^2 - 5x + 3|
    <= |4x^2|+ |- 5x| + |3|
    <= 4x^2 + 5x + 3, for all x > 0
    <= 4x^2 + 5x^2 + 3x^2, for all x > 1
    <= 12x^2, for all x > 1

    Hence we conclude that f(x) is O(x^2)

    Can someone explain the above proof step by step?
    i. Why do we take absolute value?
    ii. Why and how were all the terms replaced by x^2 terms?
    (22 votes)
    • Cameron
      By definition, f(n) is O( g(n) ) if:
      There exist constants k, N where k > 0, such that for all n > N:
      f(n) <= k * g(n)

      So to prove that f(x) = 4x^2 - 5x + 3 is O(x^2) we need to show that:
      There exist constants k, N where k > 0, such that for all x > N:
      f(x) <= k * g(x)
      4x^2 - 5x + 3 <= k * x^2

      The way we show that is by finding some k and some N that will work.

      The basic strategy is:
      - break up f(x) into terms
      - for each term find some term with a coefficient * x^2 that is clearly equal to or larger than it
      - this will show that f(x) <= the sum of the larger x^2 terms
      - the coefficient for the sum of the x^2 terms will be our k

      Explanation of provided proof:
      f(x) = 4x^2 - 5x + 3
      a number is always <= its absolute value e.g. -1 <= | -1 | and 2 <= | 2 |
      so we can say that:
      f(x) <= | f(x) |
      f(x) <= |f(x)| = |4x^2 - 5x + 3|
      4x^2 + 3 will always be positive, but -5x will be negative for x > 0
      so we know that -5x is <= | - 5 x |, so we can say that:
      f(x) <= |4x^2|+ |- 5x| + |3|
      For x > 0 |4x^2|+ |- 5x| + |3| = 4x^2 + 5x + 3, so we can say that:
      f(x) <= 4x^2 + 5x + 3, for all x > 0
      Suppose x > 1. Multiplying both sides by x shows that x^2 > x.
      So we can say x <= x^2, and since x^2 > 1 we can also say 3 <= 3x^2.
      This lets us replace the 5x term with 5x^2 and the constant 3 with 3x^2, so we can say that:
      f(x) <= 4x^2 + 5x^2 + 3x^2, for all x > 1
      4x^2 + 5x^2 + 3x^2 = 12x^2 so we can say that:
      f(x) <= 12x^2, for all x > 1

      So our k= 12 and since we had to assume x > 1 we pick N = 1


      A slightly different approach that I would use would be:
      Suppose N = 1, i.e. x > 1 (this usually has all the nice properties you want for simple big-O proofs)
      f(x) = 4x^2 - 5x + 3
      f(x) <= 4x^2 + 5x + 3 ( for x > 1 a positive number * x is larger than a negative number * x)
      Since x > 1, we can say x^2 > x (by multiplying both sides of inequality by x)
      f(x) <= 4x^2 + 5x^2 + 3x^2
      f(x) <= 12x^2
      So our k = 12 and N=1 (which we assumed at the beginning)

      Hope this makes sense
      (65 votes)
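The constants k = 12 and N = 1 from the proof above can be spot-checked numerically. This is only a sanity check over a finite set of sample points, not a substitute for the proof; the names `f`, `K`, `N`, and `bound_holds` are chosen here for illustration.

```python
def f(x):
    return 4 * x**2 - 5 * x + 3


K, N = 12, 1  # the constants chosen in the proof


def bound_holds(x):
    """True when f(x) <= K * x^2 at this particular x."""
    return f(x) <= K * x**2


# Sample x = 1.1, 1.2, ..., just past 1000 (all greater than N).
checked = [N + i / 10 for i in range(1, 10000)]
all_ok = all(bound_holds(x) for x in checked)
print(all_ok)  # prints: True

# The bound is allowed to fail for small x, which is why N is needed.
print(bound_holds(0.1))  # prints: False
```

At x = 0.1 the constant term 3 dominates and 12x^2 is only 0.12, so the inequality fails there; the definition only requires it to hold for x > N.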
  • Amanda Geib
    Having worked with limits, dropping the n and constant terms makes sense to me. As n becomes really huge, 1000n's dinky contribution doesn't matter. But why is the coefficient on n^2 non-essential? Isn't 6n^2 fundamentally different from n^2 in terms of curve shape, especially if you're looking for functions to bound it like in the next section? Reducing 100,000,000n^2 and .0000001n^2 both to n^2 seems weird to me, if we were trying to compare the two algorithms that contained those terms.
    (15 votes)
    • Cameron
      Here's why we drop the leading coefficient:
      - Typically, with algorithms, the coefficient isn't that important. Often, we just want to compare the running times of algorithms that perform the same task. Often, these algorithms will have different dominant terms (i.e. they are in different complexity classes), which allows us to easily identify which algorithms have faster running times for large values of n.
      - Calculating the coefficients for the running time of algorithms is often time consuming, error prone and inexact. However, identifying the most significant term for the running time is often straightforward.
      - Often, for algorithms in the same complexity class that perform the same task, we would expect the coefficients to be similar (we would expect small differences and improvements between algorithms in the same complexity class).

      However, sometimes, the coefficients do matter.
      e.g.
      We often use n^2 sorting algorithms (like insertion sort) for small values of n
      We often use n log n sorting algorithms (like merge sort) for large values of n
      Why don't we always use the n log n sorting algorithm?
      The n^2 algorithms have small coefficients, and the n log n algorithms have large coefficients. Only when the value of n starts to get large do we see these n^2 algorithms running slower than the n log n algorithms.

      So, while asymptotic notation can be a really useful way to talk about and compare algorithms, it is definitely not without its limitations.

      Hope this makes sense
      (30 votes)
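The trade-off described above can be sketched with two made-up cost models. The coefficients 1 and 20 below are invented purely for illustration and are not measurements of insertion sort or merge sort; the function names are likewise hypothetical.

```python
import math


def quadratic_cost(n):
    """Hypothetical n^2 algorithm with a small coefficient (1)."""
    return 1 * n * n


def nlogn_cost(n):
    """Hypothetical n log n algorithm with a large coefficient (20)."""
    return 20 * n * math.log2(n)


def first_n_where_quadratic_is_slower(start=2, limit=10**6):
    """Smallest n in [start, limit) where the n^2 model costs more."""
    for n in range(start, limit):
        if quadratic_cost(n) > nlogn_cost(n):
            return n
    return None


crossover = first_n_where_quadratic_is_slower()
print(crossover)
```

Under these assumed coefficients the quadratic model is cheaper for inputs smaller than the crossover value and more expensive from there on, which is the pattern the reply describes for small versus large n.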
  • totoro
    So, this article is suggesting that when we calculate how much time is needed to run the code and have a polynomial like this:

    (a * n^2) + (b * n) + c,

    we just keep the n^2, dropping the coefficients and less significant terms, right?
    (5 votes)
  • shamathmika76
    What would happen when a is negative in the equation (a * n^2 + b * n + c) ? What would the rate of growth be?
    (1 vote)
    • seanbcampbell
      Negative. If you have an x^2 graph, that's a parabola that is concave up, so it looks like a U. If you make it negative, it's flipped upside down so it looks like an n. More specifically, negative growth in this type of graph is called decay. You wouldn't really see it in an algorithm, I don't think, since there can't be negative choices.
      (5 votes)
  • Timspagnola
    What does "Asymptotic Notation" mean? It sounds like it is the measuring of algorithm speeds.
    (3 votes)
  • TheRealJax
    Okay, I kinda get it but don't get it.
    (3 votes)
  • parikshit.navgire
    The article mentions a combination of two ideas to determine the running time of an algorithm. I understood the first idea, that the time taken increases as the input size increases, but I didn't understand the second idea. What does 'how fast a function grows' mean? It sounds similar to the first idea.
    (3 votes)
  • swaradsable
    What is asymptotic notation?
    (1 vote)
    • Martin
      Asymptotic notation is explained in this and the following articles.
      The basic idea is that you analyze functions by how they behave as their input grows toward infinity: instead of worrying about what they do at a specific point, you look at the overall behavior.
      (5 votes)
  • sagardevkota0106
    How can we find the best and worst cases by looking at the algorithm?
    (3 votes)
  • and.jim
    So, exponential always grows faster than linear?
    (1 vote)
    • profjpbaugh
      Exponential, which in computer science usually means 2^n, is one of the worst cases possible. That's because adding just 1 more to the input size (say, solving a size 4 problem versus a size 3 problem) will DOUBLE the time it takes to solve the problem (theoretically).

      So, if you have a size 4 problem (n = 4), then linear algorithm will take 4 time units to solve it. Exponential will take 2^4 = 16 time units to solve it.

      That doesn't seem so bad, but if you have a size 10 input problem (n = 10), linear takes 10 time units. Exponential takes 2^10 = 1,024 time units.

      If n = 20, linear takes 20 time units. Exponential takes 2^20 = 1,048,576 time units. BIG jumps for small increases of input sizes.
      (5 votes)
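The figures in the reply above are easy to reproduce. This minimal sketch treats "time units" as abstract counts, with `linear_time` and `exponential_time` as illustrative names rather than measurements of any real algorithm.

```python
def linear_time(n):
    """Abstract time units for a linear algorithm: n."""
    return n


def exponential_time(n):
    """Abstract time units for an exponential algorithm: 2^n."""
    return 2 ** n


for n in (4, 10, 20):
    print(n, linear_time(n), exponential_time(n))
# prints:
# 4 4 16
# 10 10 1024
# 20 20 1048576
```

Increasing n by 1 adds one unit to the linear count but doubles the exponential one, since 2^(n+1) = 2 * 2^n.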