Main content

# Linear-time partitioning

The real work of quicksort happens during the divide step, which partitions subarray

`array[p..r]`

around a pivot drawn from the subarray. Although we can choose any element in the subarray as the pivot, it's easy to implement partitioning if we choose the rightmost element of the subarray, `A[r]`

, as the pivot.Having chosen a pivot, we partition the subarray by going through it, left to right, comparing each element with the pivot. We maintain two indices

`q`

and `j`

into the subarray that divide it up into four groups. We use the variable name `q`

because that index will eventually point at our pivot. We use `j`

because it's a common counter variable name, and the variable will be discarded once we're done.- The elements in
`array[p..q-1]`

are "group L," consisting of elements known to be__l__ess than or equal to the pivot. - The elements in
`array[q..j-1]`

are "group G," consisting of elements known to be__g__reater than the pivot. - The elements in
`array[j..r-1]`

are "group U," consisting of elements whose relationship to the pivot is__u__nknown, because they have not yet been compared. - Finally,
`array[r]`

is "group P," the__p__ivot.

Initially, both

`q`

and `j`

equal `p`

. At each step, we compare `A[j]`

, the leftmost element in group U, with the pivot. If `A[j]`

is greater than the pivot, then we just increment `j`

, so that the line dividing groups G and U slides over one position to the right:If instead

`A[j]`

is less than or equal to the pivot, then we swap `A[j]`

with `A[q]`

(the leftmost element in group G), increment `j`

, and increment `q`

, thereby sliding the line dividing groups L and G and the line dividing groups G and U over one position to the right:Once we get to the pivot, group U is empty. We swap the pivot with the leftmost element in group G: swap

`A[r]`

with `A[q]`

. This swap puts the pivot between groups L and G, and it does the right thing even if group L or group G is empty. The

`partition`

function that implements this idea also returns the index `q`

where it ended up putting the pivot, so that the `quicksort`

function, which calls it, knows where the partitions are. Here are the steps in partitioning the subarray [12, 7, 14, 9, 10, 11] in

`array[4..9]`

:Partitioning a subarray with $n$ elements takes $\mathrm{\Theta}(n)$ time. Each element $n$ elements, the time to partition is $\mathrm{\Theta}(n)$ : linear-time partitioning.

`A[j]`

is compared once with the pivot. `A[j]`

may or may not be swapped with `A[q]`

, `q`

may or may not be incremented, and `j`

is always incremented. The total number of lines executed per element of the subarray is a constant. Since the subarray has This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom, plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.

## Want to join the conversation?

- Hi. I feel like my code is correct. I've stepped through it mentally many times. It returns the correct pivot value when I run it on my own. But for some reason, it doesn't seems to change the array except to insert an undefined value! Please help me figure this out.

var q = p;

for (var j = p; j < r; j++) {

if (array[j] <= array[r]) {

swap(array, array[j], array[q]);

q++;

}

}

swap(array, array[r], array[q]);

return q;(5 votes)- The 2nd and 3rd arguments you supply to
`swap`

should be array indexes, not values(16 votes)

- In the partitioning step, why are the values 12 and 14 switched three times in a row?(3 votes)
- Thanks! This explanation makes more sense than the original.(0 votes)

- Why are values in the last picture not in order (14 and 12) ?(0 votes)
- They aren't in order, because we are
**partitioning**the array here, not sorting it.

The purpose of partitioning is to put all the elements <= the value of the pivot to the left of the pivot and all the elements > the value of the pivot to the right of the pivot.

In the picture, the pivot has a value of 11.

The elements > the value of the pivot (14 and 12) are to the right of the pivot.

The elements <= the value of the pivot (7,9 and 10) are to the left of the pivot.

In this case, the elements to the left of the pivot are sorted, but this is just a coincidence i.e. they don't have to be sorted.

Partitioning is a "cheaper" operation than sorting (it is only O(n) ). It is such a cheap operation that we can use it as a part of a sorting algorithm e.g. we use partitioning as part of the quicksort algorithm.

Hope this makes sense(12 votes)

- Can someone explain why we can just implement quickSort like this:

function quickSort(array, start, end) {

if (start < end) {

var pivot = array[end];

for (var i = start; i < end; i++) {

if (array[i] <= array[end]) continue;

var elem = array.splice(i, 1).pop();

array.splice(end + 1, 0, elem);

}

var pivotIndex = array.indexOf(pivot);

quickSort(array, start, pivotIndex - 1);

quickSort(array, pivotIndex + 1, end);

}

}

is the concept with splice() in some way less efficient then the concept with swap()?(2 votes)- Here are some issues with the code above:

1) using splice:

splice is O(n),

(if you remove an element from the middle of the array it needs to shift all the elements above it down one spot, otherwise the array indexes won't line up. Inserting the element in the middle of the array causes it to shift all the elements above it one spot...)

swap, on the other hand, is O(1)

2) using indexOf:

indexOf is O(n) as well, (it just performs a linear search on the array), so this will increase the time compared to just keeping track of where the pivot is ( it will degrade the algorithm to worse than O(n^2) )

indexOf is not guaranteed to give you a result between start and end, An element with the same value of the pivot found before start would break this code

Hope this makes sense(7 votes)

- "Each element A[j] is compared once with the pivot. A[j] may or may not be swapped with A[q], q may or may not be incremented, and j is always incremented. The total number of lines executed per element of the subarray is a constant".

How are line executed constant if A[j] is swapped in some cases and not in others?(2 votes)- "is a constant" does not mean "is always the same number"

constant means it does not depend on the size of the problem n

e.g.

If A[j] is swapped suppose it requires 9 lines (9 is a constant)

if A[j] is not swapped suppose it requires 5 lines (5 is a constant)

in either case the number of lines required is a constant (9 and 5 are both constants)

For further clarification:

If for some reason a swap required 5 * n lines, then the number of lines required would not be a constant.

Hope this makes sense(4 votes)

- In the first paragraph, the array is named
`array`

.

Everywhere else, the array is named`A`

.

Why the discrepancy?(2 votes)- It appears that array[x..y] is being used to identify the sub array, while A[x] is being used to identify specific elements. (It's not a naming convention that I have seen before, but it's possible that it is used in some circles)(3 votes)

- I ran into an interesting quirk in this exercise. At first, I did the following and was confused why the grader wouldn't let me pass:
`var partition = function(array, p, r) {`

var q = p;

var pivot = array[r];

for(var j = p; j <= r ; j++){

if(array[j] <= pivot){

swap(array,q,j);

q++;

}

}

return q-1; // the q index was shifted one extra time after the partition completed

};

It turns out, I had to change`j <= r`

to`j < r`

and do the last step "manually" like so:`var partition = function(array, p, r) {`

var q = p;

var pivot = array[r];

for(var j = p; j < r ; j++){

if(array[j] <= pivot){

swap(array,q,j);

q++;

}

}

swap(array,q,j);

return q;

};

What is the common programming practice here? Is it good convention to not let the index increment one time too many and correct for it, or is it better convention to iterate until the index is in the correct place and do the last computation manually outside of the loop?(1 vote)- The grader doesn't recognize all possible solutions (since there are infinitely many), so once in a while it may reject a valid solution it doesn't recognize.

That being said, I would suggest that the 2nd partition function is easier to follow than the 1st partition function.

Letting a loop increment a counter and then adjusting the counter outside the loop almost always looks like some hocus-pocus is going on. You can tell that it looks like hocus-pocus, because, whenever it happens, programmers feel compelled to write comments in the code explaining what the hocus-pocus is all about. For the sake of readability and maintainability of the code, it's preferable to try to avoid anything that feels like hocus-pocus. When the edge case is dealt with explicitly, usually it is pretty obvious what is going on.

Something else you may want to consider is:

In the 2nd partition function, for the final swap consider using`swap(array,q,r)`

instead of`swap(array,q,j)`

:

- In most languages (block scoped languages), using a loop counter that was declared inside a for loop, outside of that loop, would be illegal.

- Using an index of`j`

requires other programmers to read the code to figure out that`j`

will have a value of`r`

. If you just use`r`

, then it's obvious what is being swapped.(3 votes)

- I just wanted to comment in a more constructive way that it would help students if the variables were named what they are, rather than single char for understanding these concepts.(2 votes)
- This would have been better written as: "the variable j represents the number that we are comparing the pivot to in each loop".

Also, showing the first set of graphics before the second set of graphics is confusing. It would be easier to understand if the graphics were in order of how the algorithm sorted the numbers. The first set of graphics is showing mid-sort.(2 votes) - I passed the default test case, but for some reason in my custom case the final swap is messing up the pivot
`var partition = function(array, p, r) {`

var q = p;

for (var j = p + 1; j < r; j++) {

printPartition(array, q, j);

if (array[j] <= array[r]) {

swap(array, j, q);

q++;

}

}

swap(array, r, q);

printPartition(array, j, q);

return q;

};

And the end result looks like this ( i | denotes the q position, [i] shows the j position)`1| [9] 2, 3, 7,`

1| 9, [2] 3, 7,

2, 9| 1, [3] 7,

2, 3, [7] 9, 1|

Pivot 2 (7)

Array after partitioning: 2,3,7,9,1

What's going on?(1 vote)- This is what is actually happening :
`INSIDE FOR LOOP`

| 1 [9] 2 3 7

Notice that the | barrier between the lower half and upper half is before the first element (not after it).

But something is strange here.

Everything before your [] should be on the correct side of the barrier.

But 1 is on the upper half. Did the function even look at that first element to make sure it was on the correct side ?

| 1 9 [2] 3 7

2 | 9 1 [3] 7

AFTER FOR LOOP

Before the swap:

2 3 | 1 9 7

After the swap:

2 3 | 7 9 1(3 votes)