Main content

# Analysis of quicksort

How is it that quicksort's worst-case and average-case running times differ? Let's start by looking at the worst-case running time. Suppose that we're really unlucky and the partition sizes are really unbalanced. In particular, suppose that the pivot chosen by the $n$ -element subarray. Then one of the partitions will contain no elements and the other partition will contain $n-1$ elements—all but the pivot. So the recursive calls will be on subarrays of sizes 0 and $n-1$ .

`partition`

function is always either the smallest or the largest element in the As in merge sort, the time for a given recursive call on an $n$ -element subarray is $\mathrm{\Theta}(n)$ . In merge sort, that was the time for merging, but in quicksort it's the time for partitioning.

### Worst-case running time

When quicksort always has the most unbalanced partitions possible, then the original call takes $cn$ time for some constant $c$ , the recursive call on $n-1$ elements takes $c(n-1)$ time, the recursive call on $n-2$ elements takes $c(n-2)$ time, and so on. Here's a tree of the subproblem sizes with their partitioning times:

When we total up the partitioning times for each level, we get

The last line is because $1+2+3+\cdots +n$ is the arithmetic series, as we saw when we analyzed selection sort. (We subtract 1 because for quicksort, the summation starts at 2, not 1.) We have some low-order terms and constant coefficients, but when we use big-Θ notation, we ignore them. In big-Θ notation, quicksort's worst-case running time is $\mathrm{\Theta}({n}^{2})$ .

### Best-case running time

Quicksort's best case occurs when the partitions are as evenly balanced as possible: their sizes either are equal or are within 1 of each other. The former case occurs if the subarray has an odd number of elements and the pivot is right in the middle after partitioning, and each partition has $(n-1)/2$ elements. The latter case occurs if the subarray has an even number $n$ of elements and one partition has $n/2$ elements with the other having $n/2-1$ . In either of these cases, each partition has at most $n/2$ elements, and the tree of subproblem sizes looks a lot like the tree of subproblem sizes for merge sort, with the partitioning times looking like the merging times:

Using big-Θ notation, we get the same result as for merge sort: $\mathrm{\Theta}(n{\mathrm{log}}_{2}n)$ .

### Average-case running time

Showing that the average-case running time is also $\mathrm{\Theta}(n{\mathrm{log}}_{2}n)$ takes some pretty involved mathematics, and so we won't go there. But we can gain some intuition by looking at a couple of other cases to understand why it might be $O(n{\mathrm{log}}_{2}n)$ . (Once we have $O(n{\mathrm{log}}_{2}n)$ , the $\mathrm{\Theta}(n{\mathrm{log}}_{2}n)$ bound follows because the average-case running time cannot be better than the best-case running time.) First, let's imagine that we don't always get evenly balanced partitions, but that we always get at worst a 3-to-1 split. That is, imagine that each time we partition, one side gets $3n/4$ elements and the other side gets $n/4$ . (To keep the math clean, let's not worry about the pivot.) Then the tree of subproblem sizes and partitioning times would look like this:

The left child of each node represents a subproblem size 1/4 as large, and the right child represents a subproblem size 3/4 as large. Since the smaller subproblems are on the left, by following a path of left children, we get from the root down to a subproblem size of 1 faster than along any other path. As the figure shows, after ${\mathrm{log}}_{4}n$ levels, we get down to a subproblem size of 1. Why ${\mathrm{log}}_{4}n$ levels? It might be easiest to think in terms of starting with a subproblem size of 1 and multiplying it by 4 until we reach $n$ . In other words, we're asking for what value of $x$ is ${4}^{x}=n$ ? The answer is ${\mathrm{log}}_{4}n$ . How about going down a path of right children? The figure shows that it takes ${\mathrm{log}}_{4/3}n$ levels to get down to a subproblem of size 1. Why ${\mathrm{log}}_{4/3}n$ levels? Since each right child is 3/4 of the size of the node above it (its $n$ . For what value of $x$ is $(4/3{)}^{x}=n$ ? The answer is ${\mathrm{log}}_{4/3}n$ .

**parent**node), each parent is 4/3 times the size of its right child. Let's again think of starting with a subproblem of size 1 and multiplying the size by 4/3 until we reachIn each of the first ${\mathrm{log}}_{4}n$ levels, there are $n$ elements (again, including pivots that in reality are no longer being partitioned), and so the total partitioning time for each of these levels is $cn$ . But what about the rest of the levels? Each has fewer than $cn$ . Altogether, there are ${\mathrm{log}}_{4/3}n$ levels, and so the total partitioning time is $O(n{\mathrm{log}}_{4/3}n)$ . Now, there's a mathematical fact that

`n`

nodes, and so the partitioning time for *every*level is*at most*for all positive numbers $a$ , $b$ , and $n$ . Letting $a=4/3$ and $b=2$ , we get that

and so ${\mathrm{log}}_{4/3}n$ and ${\mathrm{log}}_{2}n$ differ by only a factor of ${\mathrm{log}}_{2}(4/3)$ , which is a constant. Since constant factors don't matter when we use big-O notation, we can say that if all the splits are 3-to-1, then quicksort's running time is $O(n{\mathrm{log}}_{2}n)$ , albeit with a larger hidden constant factor than the best-case running time.

How often should we expect to see a split that's 3-to-1 or better? It depends on how we choose the pivot. Let's imagine that the pivot is equally likely to end up anywhere in an $n$ -element subarray after partitioning. Then to get a split that is 3-to-1 or better, the pivot would have to be somewhere in the "middle half":

So, if the pivot is equally likely to end up anywhere in the subarray after partitioning, there's a 50% chance of getting at worst a 3-to-1 split. In other words, we expect a split of 3-to-1 or better about half the time.

The other case we'll look at to understand why quicksort's average-case running time is $O(n{\mathrm{log}}_{2}n)$ is what would happen if the half of the time that we don't get a 3-to-1 split, we got the worst-case split. Let's suppose that the 3-to-1 and worst-case splits alternate, and think of a node in the tree with $k$ elements in its subarray. Then we'd see a part of the tree that looks like this:

instead of like this:

Therefore, even if we got the worst-case split half the time and a split that's 3-to-1 or better half the time, the running time would be about twice the running time of getting a 3-to-1 split every time. Again, that's just a constant factor, and it gets absorbed into the big-O notation, and so in this case, where we alternate between worst-case and 3-to-1 splits, the running time is $O(n{\mathrm{log}}_{2}n)$ .

Bear in mind that this analysis is $O(n{\mathrm{log}}_{2}n)$ .

*not*mathematically rigorous, but it gives you an intuitive idea of why the average-case running time might be### Randomized quicksort

Suppose that your worst enemy has given you an array to sort with quicksort, knowing that you always choose the rightmost element in each subarray as the pivot, and has arranged the array so that you

*always*get the worst-case split. How can you foil your enemy?You could not necessarily choose the rightmost element in each subarray as the pivot. Instead, you could randomly choose an element in the subarray, and use that element as the pivot. But wait—the

`partition`

function assumes that the pivot is in the rightmost position of the subarray. No problem—just swap the element that you chose as the pivot with the rightmost element, and then partition as before. Unless your enemy knows how you choose random locations in the subarray, you win!In fact, with a little more effort, you can improve your chance of getting a split that's at worst 3-to-1. Randomly choose not one, but three elements from the subarray, and take median of the three as the pivot (swapping it with the rightmost element). By the

**median**, we mean the element of the three whose value is in the middle. We won't show why, but if you choose the median of three randomly chosen elements as the pivot, you have a 68.75% chance (11/16) of getting a 3-to-1 split or better. You can go even further. If you choose five elements at random and take the median as the pivot, your chance of at worst a 3-to-1 split improves to about 79.3% (203/256). If you take the median of seven randomly chosen elements, it goes up to about 85.9% (1759/2048). Median of nine? About 90.2% (59123/65536). Median of 11? About 93.1% (488293/524288). You get the picture. Of course, it doesn't necessarily pay to choose a large number of elements at random and take their median, for the time spent doing so could counteract the benefit of getting good splits almost all the time.This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom, plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.

## Want to join the conversation?

- Can some explain the math behind this...

c(n+(n−1)+(n−2)+⋯+2) = c((n+1)(n/2)−1)(7 votes)- The formula for the sum of the arithmetic sequence:
`1 + 2 + 3 + ... + (n-2) + (n-1) + n = (n+1) * (n/2)`

How do we derive it ?`Call the sum S.`

S = 1 + 2 + 3 + ... + (n-2) + (n-1) + n

We could also write it like this:

S = n + (n-1) + (n-2) + ... + 3 + 2 + 1

If we add the two equations together we get:

2S = (n+1) + (n+1) + (n+1) + ... + (n+1) + (n+1) + (n+1)

2S = n groups of (n+1)

2S = (n+1) * n

S = (n+1) * (n/2)

But what if we have ? :`c( n + (n−1) + (n−2) + ⋯ + 2 )`

The stuff inside the brackets is just:`1 + 2 + 3 + ... + (n-2) + (n-1) + n`

with the 1 missing.

So the stuff in the brackets is just:`S-1`

`S - 1 = (n+1) * (n/2) - 1`

And finally we multiply that by c to get:`c * ( (n+1) * (n/2) - 1)`

(24 votes)

- I have an array of N numbers which are same. I am applying Quick sort on it. What should be the time complexity of the sorting in this case?(5 votes)
- Given the described implementation it will be O(N^2)

Each time the partition is performed, it will put all the elements on the same side.

The problem size will only be reduced by 1 (the size of the pivot), because the recursive calls to quick sort are made on the elements on either side of the partition (which will have N-1 elements on 1 side and 0 on the other).

So, since partitioning is O(N) and the problem size reduces by 1 each partition it will be:

O( N + (N-1) + (N-2) + ... + 3 + 2 + 1) = O(N^2)

There are some implementations of quick sort that do handle equal valued elements which would require only 1 partition. Those implementations would only be O(N) in this case (the time for 1 partition).(12 votes)

- Can anyone explain me about "Average-case running time" in easy English ?

Thanks in advance.(2 votes)**The Big Idea**:

To get an idea of the average case running time let's look at:

- the best case running time

- the running time of a case which is clearly worse than average

The average case running time will be between those two cases**Best Case**

For quick sort, look at the best case:

- you always choose the pivot so that half of the elements are on either side

- each side of the pivot is the new subproblem, which is half the size of the previous problem

- so, it takes log_2(n) levels until the size of the problem is 1

(Why ?

n * (1/2)^x = 1 tells us how many times, x, that we have to halve n to get 1

Solving for x gives us x = log_2(n)

)

The average case can't be better than this best case.**What should we use as a worse than average case**?

In the average case all elements are equally likely to be chosen as the pivot

After partitioning, we would expect half the time the pivot to end up in the middle two quarters and half the time for it to end up in the outer two quarters.

We can make a worse than average case that:

-chooses the worst possible pivot half the time (instead of ending up in the outer two quarters)

-chooses a pivot with 3/4 of the elements on one side (instead of choosing a pivot in the middle two quarters)**Worse than Average Case**

For quick sort, we could imagine a worse than average case where we get unlucky and:

- for odd levels we choose the worst possible pivot i.e. all elements are to the left or right of the pivot

- for even levels we choose a pivots where 3/4 of the elements are on one side and 1/4 on the other side

- so roughly, after every 2 levels our biggest sub problem is 3/4 of the size it was 2 levels before

- so now it takes 2 * log_(4/3) (n) levels until the size of the problem is 1

(Why ?

n * (3/4)^x = 1, tells us how many times, x, that we have to multiply n by 3/4 to get 1

Solving for x gives us x - log_(4/3) (n) , but we have to double it since we only multiply

by 3/4 on every second level)

The average case is better than this worse than average case.

So, the average case is between the best case of log_2(n) levels and the worst than average case of 2 * log_(4/3) (n) levels**Conclusion**

But in big-O notation we ignore constants and bases on logs so they are both O(log(n)) levels

So, if the average is between O(log(n)) and O(log(n)) levels, it must also have O(log(n)) levels

So, with O(n) work done each level, the average case running time is: O(n log(n))

Hope this makes sense(8 votes)

- im drowning in confusion, help me(4 votes)
- You have given the best case of quicksort as a binary tree with complexity O(nlogn). But the best case could be an array of already sort sequence.?(2 votes)
- No, because in quicksort you do the partition anyway, you don't check if the elements are sorted or not.

Indeed choosing the left element as the pivot in a array completly sorted is the worst case that you can have because every time the partition is n-1.(4 votes)

- I understand that Quicksort takes at most Θ(n^2) time. Does this
*only*happen when the pivot is the greatest or smallest item in each subarray? Or can it happen in any other case.

Thanks!(2 votes)- Another possible case, depending on the implementation, is having a large number of duplicate items.

In general, quick sort degrades to O(n^2) any time that the pivot is consistently chosen poorly i.e. where the pivot is chosen so that the vast majority of the elements are located to one side of the pivot.

e.g. if the pivot was somehow chosen that you always had 95% of the elements on one side, you would expect O(n^2) degradation.

I can't quite recall the percentage where it starts to degrade, but it is >70% (since use of a median of medians pivot guarantees < 70% and O(n log n) performance).(2 votes)

- What is the rigorous mathematical proof behin quicksort being
*O(n log n)*? I imagine that it involved an integral, but I have no idea how to set it up.(2 votes) - I made the implement partition, my code gives the correct output. I can see no error and the website is not giving me any yellow warning or error otherwise but it won't let me complete the exercise...

I don't really know where to ask for help. And if it's fine to post my code here.

Edit: I noticed the report form has a section just for that.

Edit 2: resolved, thanks. I just had to re-read the comment in the code to figure out my mistake.(1 vote)- If you post your code here, I'll have a look at it.(2 votes)

- What is the run time of the quick sort algorithm?(1 vote)
- On Average Θ(nlog 2n)

Not always thought because it is dependent on how close the pivot is to the arrays median.(1 vote)

- why we get different worst case running times for Quick- sort, and Mergesort? plz explain in brief(1 vote)
- Merge sort always does the same work O(n log n) regardless of the contents of the array, while quicksort depends on choosing good pivots. On random data we expect the pivots to split the subarrays near the middle most of the time. In that case we have log_2(n) steps before we reach subarrays of size 1. However, if we choose the pivots poorly, such that each time we partition the subarrays into 1 chunk with 1 element and the other chunk with all the other elements, then our problem size is only reduce by 1 each time we partition. That means it will take n steps before we reach subarrays of size 1. The total work to partition at each of these steps is O(n), so in our average case it costs O(n log n), but in our worst case it costs us O(n^2).

Hope this makes sense(1 vote)