Binary Search
Binary Search¶
Binary Search is a different approach to searching a list. Unlike Linear Search, it only works on sorted lists. If you have a sorted list, Binary Search will be faster than linear search.
Let us return to our list of numbers from the Linear Search example.
Array
haystack
contains [2,4,6,8]The number of elements
size
is 4The target value
needle
is 7
We know the list is ordered. We can just look at it and tell. If we know there is an ordering to our search space, we can improve efficiency.
We can ask “Is the number between 2 and 4?”. If the answer is yes, we eliminated 6 and 8. If the answer is no, we eliminated 2 and 4. By asking the right questions in the right order, we can drastically cut down the number of questions we need to ask.
There is one minor flaw in this plan. We need to track more information. In the Linear Search, we just needed to know where in the list the current value was. With this new search, we will be eliminating ranges of values. We need to values to denote the range, a start
and a stop
. This is similar to our recursive linear search implementation.
This search is called Binary Search. We ask a question that eliminates half the values in the list. By dividing the list is half, we cut down the search space by powers of two.
Initially, the range is obvious. It should be every value. The array starts at \(0\) and ends at \(\text{size}-1\).
Once we have a range, we can find the middle element. Three things can happen:
Target is equal to the middle element
The target is bigger than the middle element, eliminate everything smaller
The target is smaller than the middle element, eliminate everything larger
We can repeat this pattern. Eventually, one of two things will happen.
We find what we want
We eliminate all possible values
The algorithm is given below. We again use a trivial helper function to map the more standard of size
into the initial start
and stop
values.
Function binaryRecursive(integer n, array pointer H, integer size)
Return binaryHelper(n, H, 0, size-1)
End Function
func binaryHelper(integer n, array pointer H, integer start, integer stop)
//No elements to search
If start > stop then
Return false
End If
//Determine the middle
Let middle = start + (stop-start)/2
//Check for target
If n == H[middle] then
Return true
End If
//Search either left or right side
If n < H[middle] then
Return binaryHelper(n, H, start, middle-1)
End If
Return binaryHelper(n, H, middle+1, stop)
End Function
Why do we compute the middle like this?
We could also compute the middle as floor((stop_range+start_range)/2)
. This is a shorter expression. There is a good reason not to use this one.
Let us assume we are working on an 8-bit system. Then the largest number we could handle would be 255. What happens if our range is from 16 to 250? We know that \(16+250=265\) will cause an overflow. Alternatively, \(250-16=234\). By using subtraction first, we can ensure the ranges never overflow. As long as the indexes fit in memory, we can find the middle.
We need a longer list to make for an interesting walk-through of this algorithm
Let L=[2, 4, 6, 8, 10, 12, 14]
We can search for 7, which will not be found.
The middle element is
L[3]=8
. We know that7 < 8
, so we eliminate 8, 10, 12, and 14.The middle of the remaining elements is
L[1]=4
. We know that7 > 4
, so we can eliminate 2 and 4.We are left with just one element
L[2]=6
. We know that7 > 6
, therefore the number seven is not in our list.Return False
It only took us three compares to determine 7 was not in the list.
What about a number in the list? We next search for 12.
The middle element is
L[3]=8
. We know that12 > 8
. We eliminate 2, 4, 6, and 8.The middle remaining element is
L[5]=12
. We found 12!Return True
This method requires fewer questions than the linear search.
We can also implement the algorithm iteratively. Again, the key is to track the start
and stop
values. This allows us to narrow the range the result must reside in.
Function binaryIterative(integer n, array pointer H, integer size)
Let start = 0
Let stop = size - 1
While start <= stop do
Let middle = start + (stop-start)/2
If n == H[middle] then
Return true
End If
If n < H[middle] then
//Eliminate Upper Section
stop = middle - 1
Else
//Eliminate Lower Section
start = middle + 1
End If
End While
//Never Found
Return false
End Function
In both cases, the iterative and recursive have the same running time. They both make the same number of comparisons and define the same algorithm. There might be minor practical differences between them for implementation purposes. The recursive version will might need to use more memory depending on how the program handles recursions. The return value is just returned, no calculations are done on it. This means you don’t need to store the previous recursive calls, but a compiler would need to optimize that. The linear version would use less memory in most implementations.