If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked.

Main content

Overview of merge sort

Because we're using divide-and-conquer to sort, we need to decide what our subproblems are going to look like. The full problem is to sort an entire array. Let's say that a subproblem is to sort a subarray. In particular, we'll think of a subproblem as sorting the subarray starting at index p and going through index r. It will be convenient to have a notation for a subarray, so let's say that array[p..r] denotes this subarray of array. Note that this "two-dot" notation is not legal JavaScript; we're using it just to describe the algorithm, rather than a particular implementation of the algorithm in code. In terms of our notation, for an array of n elements, we can say that the original problem is to sort array[0..n-1].
Here's how merge sort uses divide-and-conquer:
  1. Divide by finding the number q of the position midway between p and r. Do this step the same way we found the midpoint in binary search: add p and r, divide by 2, and round down.
  2. Conquer by recursively sorting the subarrays in each of the two subproblems created by the divide step. That is, recursively sort the subarray array[p..q] and recursively sort the subarray array[q+1..r].
  3. Combine by merging the two sorted subarrays back into the single sorted subarray array[p..r].
We need a base case. The base case is a subarray containing fewer than two elements, that is, when pr, since a subarray with no elements or just one element is already sorted. So we'll divide-conquer-combine only when p<r.
Let's see an example. Let's start with array holding [14, 7, 3, 12, 9, 11, 6, 2], so that the first subarray is actually the full array, array[0..7] (p=0 and r=7). This subarray has at least two elements, and so it's not a base case.
  • In the divide step, we compute q=3.
  • The conquer step has us sort the two subarrays array[0..3], which contains [14, 7, 3, 12], and array[4..7], which contains [9, 11, 6, 2]. When we come back from the conquer step, each of the two subarrays is sorted: array[0..3] contains [3, 7, 12, 14] and array[4..7] contains [2, 6, 9, 11], so that the full array is [3, 7, 12, 14, 2, 6, 9, 11].
  • Finally, the combine step merges the two sorted subarrays in the first half and the second half, producing the final sorted array [2, 3, 6, 7, 9, 11, 12, 14].
How did the subarray array[0..3] become sorted? The same way. It has more than two elements, and so it's not a base case. With p=0 and r=3, compute q=1, recursively sort array[0..1] ([14, 7]) and array[2..3] ([3, 12]), resulting in array[0..3] containing [7, 14, 3, 12], and merge the first half with the second half, producing [3, 7, 12, 14].
How did the subarray array[0..1] become sorted? With p=0 and r=1, compute q=0, recursively sort array[0..0] ([14]) and array[1..1] ([7]), resulting in array[0..1] still containing [14, 7], and merge the first half with the second half, producing [7, 14].
The subarrays array[0..0] and array[1..1] are base cases, since each contains fewer than two elements. Here is how the entire merge sort algorithm unfolds:
Most of the steps in merge sort are simple. You can check for the base case easily. Finding the midpoint q in the divide step is also really easy. You have to make two recursive calls in the conquer step. It's the combine step, where you have to merge two sorted subarrays, where the real work happens.
In the next challenge, you'll focus on implementing the overall merge sort algorithm, to make sure you understand how to divide and conquer recursively. After you've done that, we'll dive deeper into how to merge two sorted subarrays efficiently and you'll implement that in the later challenge.

This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom, plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.

Want to join the conversation?

  • hopper cool style avatar for user Agustin G.
    What about... array.prototype.sort()? Isn't that waaay easier? And numerical sort is just... array.prototype.sort(function(a,b) {return a-b} .
    (0 votes)
    Default Khan Academy avatar avatar for user
  • piceratops seedling style avatar for user Hung Duc Nguyen
    Based on pseudocode
    MergeSort(array A, int p, int r) {
    if (p < r) { // we have at least 2 items
    q = (p + r)/2
    MergeSort(A, p, q) // sort A[p..q]
    MergeSort(A, q+1, r) // sort A[q+1..r]
    Merge(A, p, q, r) // merge everything together
    }
    }
    I dont understand one thing, after the mergesort left right are done, the merge will be run. It is supposed to end there, but it it said that the control run back to the mergesort right to keep going on. Why is that happen while I dont see any code to do that after the merge? Please explain to me clearly, I am not so smart :(
    (4 votes)
    Default Khan Academy avatar avatar for user
    • male robot hal style avatar for user Cameron
      When you use recursion, there may be several copies of a function, all at different stages in their execution. When one function returns the function that called it continues to execute.

      So, if the currently executing copy of the MergeSort function was called by another copy of the MergeSort function to sort the calling functions left subarray, then after the currently executing version finishes, the version that called it will then move on to its next step, which is to call MergeSort to sort its right subarray.
      It is probably best illustrated with an example.

      In the below example:
      - For MergeSort on arrays with <2 elements the function is: Do Nothing
      - For MergeSort on array with >= 2 elements the function is:
        MergSort(left subarray)
      MergeSort(right subarray)
      Merge Above Together

      - I'll denote each step that each version of the function is executing with a >.



      Suppose we call MergeSort on [4,3,2]

      The 1st copy of the function will be made which looks like this:
      >  MergeSort([4,3])
      MergeSort([2])
      Merge Above Together


      The call to MergeSort([4,3]) will then generate a 2nd copy of the function
      (shown below the 1st copy)

      >  MergeSort([4,3])
      MergeSort([2])
      Merge Above Together

      > MergeSort([4])
      MergeSort([3])
      Merge Above Together


      The call to MergeSort([4]) will then generate a 3rd copy of the function
      (shown below the 2nd copy)

      >  MergeSort([4,3])
      MergeSort([2])
      Merge Above Together

      > MergeSort([4])
      MergeSort([3])
      Merge Above Together

      > Do Nothing


      The Do Nothing step will finish. The 3rd copy of the function will return (and vanish).
      The 2nd copy of the function will move on to the next line.

      >  MergeSort([4,3])
      MergeSort([2])
      Merge Above Together

      MergeSort([4])
      > MergeSort([3])
      Merge Above Together


      The call to MergeSort([3]) will then generate a 3rd copy of the function
      (shown below the 2nd copy)

      >  MergeSort([4,3])
      MergeSort([2])
      Merge Above Together

      MergeSort([4])
      > MergeSort([3])
      Merge Above Together

      > Do Nothing


      The Do Nothing step will finish. The 3rd copy of the function will return (and vanish).
      The 2nd copy of the function will move on to the next line.
      >  MergeSort([4,3])
      MergeSort([2])
      Merge Above Together

      MergeSort([4])
      MergeSort([3])
      > Merge Above Together


      The Merge Above Together step will finish. The 2nd copy of the function will return (and vanish).
      The 1st copy of the function will move on to the next line.
         MergeSort([4,3])
      > MergeSort([2])
      Merge Above Together


      The call to MergeSort([2]) will then generate a 2nd copy of the function
      (shown below the 1st copy)
         MergeSort([4,3])
      > MergeSort([2])
      Merge Above Together

      > Do Nothing


      The Do Nothing step will finish. The 2nd copy of the function will return (and vanish).
      The 1st copy of the function will move on to the next line.

         MergeSort([4,3])
      MergeSort([2])
      > Merge Above Together

      The Merge Above Together step will finish. The 1st copy of the function will return (and vanish).
      The array should now be sorted.


      Hope this makes sense
      (20 votes)
  • orange juice squid orange style avatar for user Rick Mac Gillis
    I spent hours trying to figure out the challenge while I kept getting overflow issues. Hours later I found out that the above tutorial does not properly state the "Divide" portion.

    Incorrect: "Divide by finding the number qq of the position midway between pp and rr. Do this step the same way we found the midpoint in binary search: add pp and rr, divide by 2, and round down."

    I found the correct way to perform the "Divide" step by researching the code online. I came across a post with the code written in C, and the way to find "q" is completely different. Pay attention to the top portion of the code where it talks about the "mid" variable.

    http://stackoverflow.com/questions/12030683/implementing-merge-sort-in-c#answer-12030723

    I hope this post saves someone from the same headaches as I had to endure.
    (4 votes)
    Default Khan Academy avatar avatar for user
    • male robot hal style avatar for user Cameron
      It's unfortunate that you had problems with the challenge, but the technique describe in the article is not incorrect. In fact, it is a fairly standard technique. I just checked it and it works for me. I suspect you made an error when you tried to implement the technique described.

      'mid = low + floor((high-low)/2)' works just fine too (they are mathematically equivalent), but the technique described in the article tends to be more intuitive for most people.

      As far as overflow issues go, you would exceed the largest possible array size (4294967295) long, long before the calculation described in the article would lose precision or have a numeric overflow (the largest int you can represent in javaScript is 9007199254740991 ). However, an incorrectly implemented midpoint calculation might cause the function to infinitely recurse and result in a stack overflow...

      If you would like, you could post a copy of your code, that was causing overflows, as a comment to this answer, and I could have a look at it for you.
      (7 votes)
  • leaf green style avatar for user ravisankaranr
    Hi,

    My program runs fine and the sorted array looks good. However I am getting an instruction in the beginning with respect to the code for calculating the midpoint.

    var q = floor((p + r - 1) / 2);

    Any idea what is the issue here?
    (2 votes)
    Default Khan Academy avatar avatar for user
    • male robot hal style avatar for user Cameron
      p is the index of the 1st element of the subarray.
      r is the index of the last element of the subarray
      Suppose p=2 and r=4
      q= floor( (p+r-1)/2)
      q= floor((2+4-1)/2)
      q= floor( 5/2)
      q= floor(2.5)
      q= 2
      The above calculation gives a result of 2, but the midpoint of 2 and 4 should be 3 not 2.
      (9 votes)
  • leaf blue style avatar for user Dave de Heer
    I don't understand why you need all the divide steps. Can't you just start by merging the individual members of the array in pairs - i.e. just go directly to the first merge step? What value is there in doing all the iterative divide computations?
    (4 votes)
    Default Khan Academy avatar avatar for user
    • leaf green style avatar for user Patricia Daoust
      Because you're not starting with "individual members", you're starting with an array, and you need to break that array into it's individual members. That "divide" step might seem trivial to most humans, but it's an important detail to the "divide"-and-conquer logic. And a very important detail to remember to write, for your code to run properly! As the lesson says, the "real" work is mostly done in the merge step.

      The later lesson "analysis of merge sort" goes deep into the details, but the short answer is that merge sort ends up being faster.
      (1 vote)
  • stelly blue style avatar for user Amjed O
    quoting "The base case is a subarray containing fewer than two elements, that is, when p >= r"

    How is "p >= r" possible? How is it possible for p (the start index) to be larger than r (the end index)?
    (3 votes)
    Default Khan Academy avatar avatar for user
  • piceratops ultimate style avatar for user CleanCutBloons
    I used the correct code but the thing says "Maximum call stack exceeded."
    (2 votes)
    Default Khan Academy avatar avatar for user
  • blobby green style avatar for user halleyisanimeh
    I'm confused as to how the merge step sorts anything. Doesn't it need a rule to know how to sort the numbers (the rule being sorting them in ascending order)?
    (3 votes)
    Default Khan Academy avatar avatar for user
    • male robot hal style avatar for user Cameron
      The merge step takes two sorted subarrays and produces one big sorted subarray with all those elements. It just repeatedly looks at the front of the two subarrays and takes the smallest element, until it runs out of elements. It only works because the two subarrays were already sorted.

      In the example above (last merge) we have:
      [3,7,12,14] and [2,6,9,11] being merged.
      So, let's repeatedly take the smallest element from the front of the two arrays and remove them.
      1st array, 2nd array, result
      [3,7,12,14], [2,6,9,11], []
      [3,7,12,14], [6,9,11], [2]
      [7,12,14], [6,9,11], [2,3]
      [7,12,14], [9,11], [2,3,6]
      [12,14], [9,11], [2,3,6,7]
      [12,14], [11], [2,3,6,7,9]
      [12,14], [], [2,3,6,7,9,11]
      [14], [], [2,3,6,7,9,11]
      [], [], [2,3,6,7,9,11,14]

      That's how it sorts
      (3 votes)
  • aqualine ultimate style avatar for user SD
    The example given shows subarrays of array[0..0] and array[1..1] as the base case. When is there the case that p > r instead of p === r?
    (3 votes)
    Default Khan Academy avatar avatar for user
  • winston baby style avatar for user Fandy Akhmad
    I still confused how "merge the first half with the second half" works?
    (2 votes)
    Default Khan Academy avatar avatar for user