Main content
Computer science
Running time of binary search
We know that linear search on an array of n elements might have to make as many as n guesses. You probably already have an intuitive idea that binary search makes fewer guesses than linear search. You even might have perceived that the difference between the worst-case number of guesses for linear search and binary search becomes more striking as the array length increases. Let's see how to analyze the maximum number of guesses that binary search makes.
The key idea is that when binary search makes an incorrect guess, the portion of the array that contains reasonable guesses is reduced by at least half. If the reasonable portion had 32 elements, then an incorrect guess cuts it down to have at most 16. Binary search halves the size of the reasonable portion upon every incorrect guess.
If we start with an array of length 8, then incorrect guesses reduce the size of the reasonable portion to 4, then 2, and then 1. Once the reasonable portion contains just one element, no further guesses occur; the guess for the 1-element portion is either correct or incorrect, and we're done. So with an array of length 8, binary search needs at most four guesses.
What do you think would happen with an array of 16 elements? If you said that the first guess would eliminate at least 8 elements, so that at most 8 remain, you're getting the picture. So with 16 elements, we need at most five guesses.
By now, you're probably seeing the pattern. Every time we double the size of the array, we need at most one more guess. Suppose we need at most m guesses for an array of length n. Then, for an array of length 2, n, the first guess cuts the reasonable portion of the array down to size n, and at most m guesses finish up, giving us a total of at most m, plus, 1 guesses.
Let's look at the general case of an array of length n. We can express the number of guesses, in the worst case, as "the number of times we can repeatedly halve, starting at n, until we get the value 1, plus one." But that's inconvenient to write out.
Fortunately, there's a mathematical function that means the same thing as the number of times we repeatedly halve, starting at n, until we get the value 1: the base-2 logarithm of n. That's most often written as log, start base, 2, end base, n, but you may also see it written as \lg, n in computer science writings. (Want to review logarithms? Learn more here.)
Here's a table showing the base-2 logarithms of various values of n:
n | log, start base, 2, end base, n |
---|---|
1 | 0 |
2 | 1 |
4 | 2 |
8 | 3 |
16 | 4 |
32 | 5 |
64 | 6 |
128 | 7 |
256 | 8 |
512 | 9 |
1024 | 10 |
1,048,576 | 20 |
2,097,152 | 21 |
We can view this same table as a graph:
Zooming in on smaller values of n:
The logarithm function grows very slowly. Logarithms are the inverse of exponentials, which grow very rapidly, so that if log, start base, 2, end base, n, equals, x, then n, equals, 2, start superscript, x, end superscript. For example, because log, start base, 2, end base, 128, equals, 7, we know that 2, start superscript, 7, end superscript, equals, 128.
That makes it easy to calculate the runtime of a binary search algorithm on an n that's exactly a power of 2. If n is 128, binary search will require at most 8 (log, start base, 2, end base, 128, plus, 1) guesses.
What if n isn't a power of 2? In that case, we can look at the closest lower power of 2. For an array whose length is 1000, the closest lower power of 2 is 512, which equals 2, start superscript, 9, end superscript. We can thus estimate that log, start base, 2, end base, 1000 is a number greater than 9 and less than 10, or use a calculator to see that its about 9.97. Adding one to that yields about 10.97. In the case of a decimal number, we round down to find the actual number of guesses. Therefore, for a 1000-element array, binary search would require at most 10 guesses.
For the Tycho-2 star catalog with 2,539,913 stars, the closest lower power of 2 is 2, start superscript, 21, end superscript (which is 2,097,152), so we would need at most 22 guesses. Much better than linear search!
Compare n vs log, start base, 2, end base, n below:
In the next tutorial, we'll see how computer scientists characterize the running times of linear search and binary search, using a notation that distills the most important part of the running time and discards the less important parts.
This content is a collaboration of Dartmouth Computer Science professors Thomas Cormen and Devin Balkcom, plus the Khan Academy computing curriculum team. The content is licensed CC-BY-NC-SA.
Want to join the conversation?
- I just don't get the +1 part. If lg(1000) = 9.96 so you add 1 I could understand but lg(1024) equals 10 right away, why + 1 to make it 11?(30 votes)
- Let me start out by saying that, I really wouldn't worry too much about the +1 too much. It's not important. What's important is that the number of guesses is on the order of log_2(n). If you still want to know where the +1 comes from, then read on.
For the implementation of the binary search specified:
max. # guesses = floor(log_2(n))+1
Which means that:
n=512 to 1023 require max. of 10 guesses
n=1024 to 2047 requires max. of 11 guesses
So, where does the +1 come from ?
Don't forget that, even when you have only 1 element, you still have to guess it, because it is possible that the target is not in the array. So we need to add on the +1 for the last guess when we are down to the final remaining element.
It may be clearer if you think about when n=1. We still need 1 guess, but log_2(1) = 0. So we need to add a +1 to fix up our formula.
(Alternatively, if you are familiar with binary trees https://en.wikipedia.org/wiki/Binary_tree you can make a "complete" binary tree where each level is a choice and the nodes are the guesses. The first level is level 0. The number of guesses required to reach an element is the level it is sitting on +1. )
If you really want to understand what is going on, then you could try to confirm the following by working through the algorithm by hand:n max. # of guesses
1 1
2 2
3 2
4 3
5 3
6 3
7 3
8 4
9 4
...
Example:
Suppose we have this array: [2, 3, 5, 7, 11, 13, 17, 19] (8 elements)
and we search for: 19
Here's the values of min, max and guess on each iteration:
min: 0 max: 7 guess: 3
min: 4 max: 7 guess: 5
min: 6 max: 7 guess: 6
min: 7 max: 7 guess: 7
# of guesses: 4
Hope this makes sense(134 votes)
- Why is it called binary search if there are three options?
1. Higher
2. Lower
3. Correct(13 votes)- "Binary" doesn't refer to the number of possible outcomes from the guess, but to the method of dividing the pool of possibilities into two parts and treating each of them as only one "thing."(51 votes)
- I know this is not related to the article. Still...
In the previous challenge, how do I
the total guesses? (Step 3)println()
(0 votes)- You need to create a variable, increment the variable each time a guess is made, and finally print it out when the target is found.(14 votes)
- I want to make sure +1 thing.
If n=3, the maximum guesses are 3?
since 2^2=4, 2+1 =3(as the article said)
But I still don't get it.
When I work through the algorithm by hand, it goes like this.
n max. # of guesses
1 1
2 2
3 2
4 3
5 3
6 3
7 3
8 4
9 4
...
Then when n=1000, it seems like the maximum guesses are 10.(3 votes)- The error in your calculations is that you did
log₂(4)+1
instead oflog₂(2)+1
. You need to take the logarithm of the closest lower power of 2 (2), not closest higher (4). You can see thatlog₂(2)+1 = 1+1 = 2
.(3 votes)
- At the end of the 3rd paragraph it states that...with an array of length 8, binary search needs at most four guesses. But how can this be true as if it starts with 8 the most guesses possible would be 3 and on the 3rd guess you would end up with the answer. I am not sure if this is what would happen cause I don't get it...(4 votes)
- In an array of length 8, how is it that we need four guesses? Please make me clear.(1 vote)
- Three to narrow down the position, plus one more to tell if your final guess is right or not.
Let's suppose that you're looking for an element that happens to be at position 3 (where the first element is position 1). Before you make any guesses, your element could be anywhere in the range 1-8. After the first guess, you can rule out half the elements; you now know that the element you're looking for is in elements 1-4. On your next guess, you can rule out another half, so you're down to elements 3-4. On your third guess, you can tell that it's either at element 3, or doesn't exist. But you need to make one more guess to make sure that element 3 is the one you're after; it might be different, after all!(6 votes)
- for the challenge; I have created the below code with Python language, appreciate your commends if you have any suggestions to make it better: ( it seems that the indented lines look like normal lines here, sorry for that )
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
mn = 0
mx = len(primes)
target = 100
for i in range(10):
guess = int((mn+mx)/2)
if primes[guess] == target:
print ("target found in index number : "+ str(primes.index(primes[guess])))
break
elif mx == 0 or mn == len(primes)-1 or mx == mn + 1:
print ("number not found")
break
elif primes[guess] > target:
mx = guess
elif primes[guess] < target:
mn = guess(2 votes) - From this, are we then concluding that the running time of a binary search is ((log n) + 1) to base 2 where n is the length of the array?(3 votes)
- The worst case of binary search is O(log n)
The best case (right in the middle) is O(1)
The average is O(log n)
We can get this from cutting the array into two.
We continue this until the target is found.
Thus, the time complexity would be O(log n).
Note: The bases of the logarithms above are all two.(0 votes)
- In the second to last paragraph, it says "When n is not a power of 2, we can just go up to the next higher power of 2. For an array whose length is 1000, the next higher power of 2 is 1024, which equals 2
10. Therefore, for a 1000-element array, binary search would require at most 11 (10 + 1) guesses." I am confused about that "the next higher power of 2" means and how it comes to 1024. Can someone please explain this more clearly? Thanks!(3 votes)- 2 to the 9th power is 512, which is less than 1000, but the 10th power is 1024, which is greater than 1000. So in the worst condition, binary search would need 10 steps to separate the remaining numbers and 1 final step to check whether the only one number left is what you want, or it's not in the array.(0 votes)
- Previous challenge: Binary Search
We make a variable guess that is the average of of Max/Min...
Then we use the guess variable as in index?? I dont understand this can someone explain why we use array[guess] instead of guess to filter targetValue? if(guess < targetValue)
Guess = 47.
array[guess] is not an answer, array[47] should return nothing since there are not 47 indices in the array there are only 25. Why are we not using the value of guess itself?(2 votes)- An example with letters in the array instead of numbers it might make it clearer.
Suppose, instead of numbers, we were searching for letters in:
array = [ 'a', 'e', 'i', 'o', 'u', 'y' ].
And I want to see where 'u' is located in it.
Again, the purpose of Binary Search is to find where the target is located (or to return -1 if it doesn't exist in the array).
So min = 0, max =5
guess = 2guess
is the index where I am guessing that the target is located.
I guess it is located half way between min and max
I check array[2], which has the value of 'i'
'u' > 'i', so I can discard all indexes <= 2
so I set min to 3
So min = 3, max =5
guess = 4guess
is the index where I am guessing that the target is located.
I guess it is located half way between min and max
I check array[4], which has the value of 'u'
'u' = 'u', so I have found u located at index 4
I return 4, which is the location of the target 'u'
Hope this makes sense(1 vote)