If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked.

Quick sort

# Analysis of quicksort

How is it that quicksort's worst-case and average-case running times differ? Let's start by looking at the worst-case running time. Suppose that we're really unlucky and the partition sizes are really unbalanced. In particular, suppose that the pivot chosen by the partition function is always either the smallest or the largest element in the n-element subarray. Then one of the partitions will contain no elements and the other partition will contain n, minus, 1 elements—all but the pivot. So the recursive calls will be on subarrays of sizes 0 and n, minus, 1.
As in merge sort, the time for a given recursive call on an n-element subarray is \Theta, left parenthesis, n, right parenthesis. In merge sort, that was the time for merging, but in quicksort it's the time for partitioning.

### Worst-case running time

When quicksort always has the most unbalanced partitions possible, then the original call takes c, n time for some constant c, the recursive call on n, minus, 1 elements takes c, left parenthesis, n, minus, 1, right parenthesis time, the recursive call on n, minus, 2 elements takes c, left parenthesis, n, minus, 2, right parenthesis time, and so on. Here's a tree of the subproblem sizes with their partitioning times:
Diagram of worst case performance for Quick Sort, with a tree on the left and partition times on the right. The tree is labeled "Subproblem sizes" and the right is labeled "Total partitioning time for all subproblems of this size." The first level of the tree shows a single node n and corresponding partitioning time of c times n. The second level of the tree shows two nodes, 0 and n minus 1, and a partitioning time of c times n minus 1. The third level of the tree shows two nodes, 0 and n minus 2, and a partitioning time of c times n minus 2. The fourth level of the tree shows two nodes, 0 and n minus 3, and a partitioning time of c times n minus 3. Underneath that level, dots indicate that the tree continues like that. The second to last level in the tree has a single node 2 with a partitioning time of 2 times c and the last level has two nodes of 0 and 1, with a partitioning time of 0.
When we total up the partitioning times for each level, we get
\begin{aligned} cn + c(n-1) + c(n-2) + \cdots + 2c &= c(n + (n-1) + (n-2) + \cdots + 2) \\ &= c((n+1)(n/2) - 1) \ . \end{aligned}
The last line is because 1, plus, 2, plus, 3, plus, \@cdots, plus, n is the arithmetic series, as we saw when we analyzed selection sort. (We subtract 1 because for quicksort, the summation starts at 2, not 1.) We have some low-order terms and constant coefficients, but when we use big-Θ notation, we ignore them. In big-Θ notation, quicksort's worst-case running time is \Theta, left parenthesis, n, squared, right parenthesis.

### Best-case running time

Quicksort's best case occurs when the partitions are as evenly balanced as possible: their sizes either are equal or are within 1 of each other. The former case occurs if the subarray has an odd number of elements and the pivot is right in the middle after partitioning, and each partition has left parenthesis, n, minus, 1, right parenthesis, slash, 2 elements. The latter case occurs if the subarray has an even number n of elements and one partition has n, slash, 2 elements with the other having n, slash, 2, minus, 1. In either of these cases, each partition has at most n, slash, 2 elements, and the tree of subproblem sizes looks a lot like the tree of subproblem sizes for merge sort, with the partitioning times looking like the merging times:
Diagram of best case performance for Quick Sort, with a tree on the left and partitioning times on the right. The tree is labeled "Subproblem size" and the right is labeled "Total partitioning time for all subproblems of this size." The first level of the tree shows a single node n and corresponding partitioning time of c times n. The second level of the tree shows two nodes, each of less than or equal to 1/2 n, and a partitioning time less than or equal to 2 times c times 1/2 n, the same as c times n. The third level of the tree shows four nodes, each of less than or equal to 1/4 n, and a partitioning time less than or equal to 4 times c times 1/4 n, the same as c times n. The fourth level of the tree shows eight nodes, each of less than ot equal to 1/8 n, and a partitioning time less than or equal to 8 times c times 1/8 n, the same as c times n. Underneath that level, dots are shown to indicate the tree continues like that. A final level is shown with n nodes of 1, and a partitioning time of less than or equal to n times c, the same as c times n.
Using big-Θ notation, we get the same result as for merge sort: \Theta, left parenthesis, n, log, start base, 2, end base, n, right parenthesis.

### Average-case running time

Showing that the average-case running time is also \Theta, left parenthesis, n, log, start base, 2, end base, n, right parenthesis takes some pretty involved mathematics, and so we won't go there. But we can gain some intuition by looking at a couple of other cases to understand why it might be O, left parenthesis, n, log, start base, 2, end base, n, right parenthesis. (Once we have O, left parenthesis, n, log, start base, 2, end base, n, right parenthesis, the \Theta, left parenthesis, n, log, start base, 2, end base, n, right parenthesis bound follows because the average-case running time cannot be better than the best-case running time.) First, let's imagine that we don't always get evenly balanced partitions, but that we always get at worst a 3-to-1 split. That is, imagine that each time we partition, one side gets 3, n, slash, 4 elements and the other side gets n, slash, 4. (To keep the math clean, let's not worry about the pivot.) Then the tree of subproblem sizes and partitioning times would look like this:
Diagram of average case performance for Quick Sort
The left child of each node represents a subproblem size 1/4 as large, and the right child represents a subproblem size 3/4 as large. Since the smaller subproblems are on the left, by following a path of left children, we get from the root down to a subproblem size of 1 faster than along any other path. As the figure shows, after log, start base, 4, end base, n levels, we get down to a subproblem size of 1. Why log, start base, 4, end base, n levels? It might be easiest to think in terms of starting with a subproblem size of 1 and multiplying it by 4 until we reach n. In other words, we're asking for what value of x is 4, start superscript, x, end superscript, equals, n? The answer is log, start base, 4, end base, n. How about going down a path of right children? The figure shows that it takes log, start base, 4, slash, 3, end base, n levels to get down to a subproblem of size 1. Why log, start base, 4, slash, 3, end base, n levels? Since each right child is 3/4 of the size of the node above it (its parent node), each parent is 4/3 times the size of its right child. Let's again think of starting with a subproblem of size 1 and multiplying the size by 4/3 until we reach n. For what value of x is left parenthesis, 4, slash, 3, right parenthesis, start superscript, x, end superscript, equals, n? The answer is log, start base, 4, slash, 3, end base, n.
In each of the first log, start base, 4, end base, n levels, there are n elements (again, including pivots that in reality are no longer being partitioned), and so the total partitioning time for each of these levels is c, n. But what about the rest of the levels? Each has fewer than n nodes, and so the partitioning time for every level is at most c, n. Altogether, there are log, start base, 4, slash, 3, end base, n levels, and so the total partitioning time is O, left parenthesis, n, log, start base, 4, slash, 3, end base, n, right parenthesis. Now, there's a mathematical fact that
log, start base, a, end base, n, equals, start fraction, log, start base, b, end base, n, divided by, log, start base, b, end base, a, end fraction
for all positive numbers a, b, and n. Letting a, equals, 4, slash, 3 and b, equals, 2, we get that
log, start base, 4, slash, 3, end base, n, equals, start fraction, log, start base, 2, end base, n, divided by, log, start base, 2, end base, left parenthesis, 4, slash, 3, right parenthesis, end fraction, space, comma
and so log, start base, 4, slash, 3, end base, n and log, start base, 2, end base, n differ by only a factor of log, start base, 2, end base, left parenthesis, 4, slash, 3, right parenthesis, which is a constant. Since constant factors don't matter when we use big-O notation, we can say that if all the splits are 3-to-1, then quicksort's running time is O, left parenthesis, n, log, start base, 2, end base, n, right parenthesis, albeit with a larger hidden constant factor than the best-case running time.
How often should we expect to see a split that's 3-to-1 or better? It depends on how we choose the pivot. Let's imagine that the pivot is equally likely to end up anywhere in an n-element subarray after partitioning. Then to get a split that is 3-to-1 or better, the pivot would have to be somewhere in the "middle half":
So, if the pivot is equally likely to end up anywhere in the subarray after partitioning, there's a 50% chance of getting at worst a 3-to-1 split. In other words, we expect a split of 3-to-1 or better about half the time.
The other case we'll look at to understand why quicksort's average-case running time is O, left parenthesis, n, log, start base, 2, end base, n, right parenthesis is what would happen if the half of the time that we don't get a 3-to-1 split, we got the worst-case split. Let's suppose that the 3-to-1 and worst-case splits alternate, and think of a node in the tree with k elements in its subarray. Then we'd see a part of the tree that looks like this:
Therefore, even if we got the worst-case split half the time and a split that's 3-to-1 or better half the time, the running time would be about twice the running time of getting a 3-to-1 split every time. Again, that's just a constant factor, and it gets absorbed into the big-O notation, and so in this case, where we alternate between worst-case and 3-to-1 splits, the running time is O, left parenthesis, n, log, start base, 2, end base, n, right parenthesis.
Bear in mind that this analysis is not mathematically rigorous, but it gives you an intuitive idea of why the average-case running time might be O, left parenthesis, n, log, start base, 2, end base, n, right parenthesis.

### Randomized quicksort

Suppose that your worst enemy has given you an array to sort with quicksort, knowing that you always choose the rightmost element in each subarray as the pivot, and has arranged the array so that you always get the worst-case split. How can you foil your enemy?
You could not necessarily choose the rightmost element in each subarray as the pivot. Instead, you could randomly choose an element in the subarray, and use that element as the pivot. But wait—the partition function assumes that the pivot is in the rightmost position of the subarray. No problem—just swap the element that you chose as the pivot with the rightmost element, and then partition as before. Unless your enemy knows how you choose random locations in the subarray, you win!
In fact, with a little more effort, you can improve your chance of getting a split that's at worst 3-to-1. Randomly choose not one, but three elements from the subarray, and take median of the three as the pivot (swapping it with the rightmost element). By the median, we mean the element of the three whose value is in the middle. We won't show why, but if you choose the median of three randomly chosen elements as the pivot, you have a 68.75% chance (11/16) of getting a 3-to-1 split or better. You can go even further. If you choose five elements at random and take the median as the pivot, your chance of at worst a 3-to-1 split improves to about 79.3% (203/256). If you take the median of seven randomly chosen elements, it goes up to about 85.9% (1759/2048). Median of nine? About 90.2% (59123/65536). Median of 11? About 93.1% (488293/524288). You get the picture. Of course, it doesn't necessarily pay to choose a large number of elements at random and take their median, for the time spent doing so could counteract the benefit of getting good splits almost all the time.

This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom, plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.

## Want to join the conversation?

• Can some explain the math behind this...
c(n+(n−1)+(n−2)+⋯+2) = c((n+1)(n/2)−1) • The formula for the sum of the arithmetic sequence:
1 + 2 + 3 + ... + (n-2) + (n-1) + n = (n+1) * (n/2)

How do we derive it ?
Call the sum S.S =      1 +     2 +     3 + ... + (n-2) + (n-1) + nWe could also write it like this:S =      n + (n-1) + (n-2) + ... +     3 +     2 + 1If we add the two equations together we get:2S = (n+1) + (n+1) + (n+1) + ... + (n+1) + (n+1) + (n+1)2S = n groups of (n+1)2S = (n+1) * nS = (n+1) * (n/2)

But what if we have ? :
c( n + (n−1) + (n−2) + ⋯ + 2 )
The stuff inside the brackets is just:
1 + 2 + 3 + ... + (n-2) + (n-1) + n
with the 1 missing.
So the stuff in the brackets is just: S-1
S - 1 = (n+1) * (n/2) - 1
And finally we multiply that by c to get:
c * ( (n+1) * (n/2) - 1)
• I have an array of N numbers which are same. I am applying Quick sort on it. What should be the time complexity of the sorting in this case? • Given the described implementation it will be O(N^2)
Each time the partition is performed, it will put all the elements on the same side.
The problem size will only be reduced by 1 (the size of the pivot), because the recursive calls to quick sort are made on the elements on either side of the partition (which will have N-1 elements on 1 side and 0 on the other).

So, since partitioning is O(N) and the problem size reduces by 1 each partition it will be:
O( N + (N-1) + (N-2) + ... + 3 + 2 + 1) = O(N^2)

There are some implementations of quick sort that do handle equal valued elements which would require only 1 partition. Those implementations would only be O(N) in this case (the time for 1 partition).
• Can anyone explain me about "Average-case running time" in easy English ? • The Big Idea :
To get an idea of the average case running time let's look at:
- the best case running time
- the running time of a case which is clearly worse than average

The average case running time will be between those two cases

Best Case

For quick sort, look at the best case:
- you always choose the pivot so that half of the elements are on either side
- each side of the pivot is the new subproblem, which is half the size of the previous problem
- so, it takes log_2(n) levels until the size of the problem is 1
(Why ?
n * (1/2)^x = 1 tells us how many times, x, that we have to halve n to get 1
Solving for x gives us x = log_2(n)
)

The average case can't be better than this best case.

What should we use as a worse than average case ?

In the average case all elements are equally likely to be chosen as the pivot
After partitioning, we would expect half the time the pivot to end up in the middle two quarters and half the time for it to end up in the outer two quarters.

We can make a worse than average case that:
-chooses the worst possible pivot half the time (instead of ending up in the outer two quarters)
-chooses a pivot with 3/4 of the elements on one side (instead of choosing a pivot in the middle two quarters)

Worse than Average Case

For quick sort, we could imagine a worse than average case where we get unlucky and:
- for odd levels we choose the worst possible pivot i.e. all elements are to the left or right of the pivot
- for even levels we choose a pivots where 3/4 of the elements are on one side and 1/4 on the other side
- so roughly, after every 2 levels our biggest sub problem is 3/4 of the size it was 2 levels before
- so now it takes 2 * log_(4/3) (n) levels until the size of the problem is 1
(Why ?
n * (3/4)^x = 1, tells us how many times, x, that we have to multiply n by 3/4 to get 1
Solving for x gives us x - log_(4/3) (n) , but we have to double it since we only multiply
by 3/4 on every second level)

The average case is better than this worse than average case.

So, the average case is between the best case of log_2(n) levels and the worst than average case of 2 * log_(4/3) (n) levels

Conclusion

But in big-O notation we ignore constants and bases on logs so they are both O(log(n)) levels
So, if the average is between O(log(n)) and O(log(n)) levels, it must also have O(log(n)) levels
So, with O(n) work done each level, the average case running time is: O(n log(n))

Hope this makes sense
• You have given the best case of quicksort as a binary tree with complexity O(nlogn). But the best case could be an array of already sort sequence.? • I understand that Quicksort takes at most Θ(n^2) time. Does this only happen when the pivot is the greatest or smallest item in each subarray? Or can it happen in any other case.

Thanks! • Another possible case, depending on the implementation, is having a large number of duplicate items.

In general, quick sort degrades to O(n^2) any time that the pivot is consistently chosen poorly i.e. where the pivot is chosen so that the vast majority of the elements are located to one side of the pivot.
e.g. if the pivot was somehow chosen that you always had 95% of the elements on one side, you would expect O(n^2) degradation.

I can't quite recall the percentage where it starts to degrade, but it is >70% (since use of a median of medians pivot guarantees < 70% and O(n log n) performance).
• What is the rigorous mathematical proof behin quicksort being O(n log n)? I imagine that it involved an integral, but I have no idea how to set it up. • • why we get different worst case running times for Quick- sort, and Mergesort? plz explain in brief
(1 vote) • Merge sort always does the same work O(n log n) regardless of the contents of the array, while quicksort depends on choosing good pivots. On random data we expect the pivots to split the subarrays near the middle most of the time. In that case we have log_2(n) steps before we reach subarrays of size 1. However, if we choose the pivots poorly, such that each time we partition the subarrays into 1 chunk with 1 element and the other chunk with all the other elements, then our problem size is only reduce by 1 each time we partition. That means it will take n steps before we reach subarrays of size 1. The total work to partition at each of these steps is O(n), so in our average case it costs O(n log n), but in our worst case it costs us O(n^2).

Hope this makes sense
(1 vote)
• In the average case why is it easier to assume it is log to the base 4 ? Shouldn't it be log to the base 2 ?
(1 vote) • can someone give an example of each of the best case scenarios? I have trouble picturing how the best case scenario works with quicksort.
(1 vote) 