Running time of binary search

We know that linear search on an array of n n elements might have to make as many as n n guesses. You probably already have an intuitive idea that binary search makes fewer guesses than linear search. You even might have perceived that the difference between the worst-case number of guesses for linear search and binary search becomes more striking as the array length increases. Let's see how to analyze the maximum number of guesses that binary search makes.
The key idea is that when binary search makes an incorrect guess, the portion of the array that contains reasonable guesses is reduced by at least half. If the reasonable portion had 32 elements, then an incorrect guess cuts it down to have at most 16. Binary search halves the size of the reasonable portion upon every incorrect guess.
If we start with an array of length 8, then incorrect guesses reduce the size of the reasonable portion to 4, then 2, and then 1. Once the reasonable portion contains just one element, no further guesses occur; the guess for the 1-element portion is either correct or incorrect, and we're done. So with an array of length 8, binary search needs at most four guesses.
What do you think would happen with an array of 16 elements? If you said that the first guess would eliminate at least 8 elements, so that at most 8 remain, you're getting the picture. So with 16 elements, we need at most five guesses.
By now, you're probably seeing the pattern. Every time we double the size of the array, we need at most one more guess. Suppose we need at most m m guesses for an array of length n n . Then, for an array of length 2n 2n , the first guess cuts the reasonable portion of the array down to size n n , and at most m m guesses finish up, giving us a total of at most m+1 m+1 guesses.
Let's look at the general case of an array of length n n . We can express the number of guesses, in the worst case, as "the number of times we can repeatedly halve, starting at n n , until we get the value 1, plus one." But that's inconvenient to write out.
Fortunately, there's a mathematical function that means the same thing as the number of times we repeatedly halve, starting at n n , until we get the value 1: the base-2 logarithm of n n . That's most often written as log2n\log_2 n, but you may also see it written as lgn\lg n in computer science writings. (Want to review logarithms? Learn more here.)
Here's a table showing the base-2 logarithms of various values of n n :
n n log2n \log_2 n
We can view this same table as a graph:
Zooming in on smaller values of n:
The logarithm function grows very slowly. Logarithms are the inverse of exponentials, which grow very rapidly, so that if log2n=x \log_2 n = x , then n=2x n = 2^x . For example, because log2128=7 \log_2 128 = 7 , we know that 27=128 2^7 = 128 .
That makes it easy to calculate the runtime of a binary search algorithm on an nn that's exactly a power of 2. If nn is 128, binary search will require at most 8 (log2128+1\log_2 128 + 1) guesses.
What if nn isn't a power of 2? In that case, we can look at the closest lower power of 2. For an array whose length is 1000, the closest lower power of 2 is 512, which equals 29 2^{9} . We can thus estimate that log21000\log_2 1000 is a number greater than 9 and less than 10, or use a calculator to see that its about 9.97. Adding one to that yields about 10.97. In the case of a decimal number, we round down to find the actual number of guesses. Therefore, for a 1000-element array, binary search would require at most 10 guesses.
For the Tycho-2 star catalog with 2,539,913 stars, the closest lower power of 2 is 221 2^{21} (which is 2,097,152), so we would need at most 22 guesses. Much better than linear search!
Compare nn vs log2n\log _{2} n below:
In the next tutorial, we'll see how computer scientists characterize the running times of linear search and binary search, using a notation that distills the most important part of the running time and discards the less important parts.

This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom, plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.