Binary Search¶

Binary Search is a different approach to searching a list. Unlike Linear Search, it only works on sorted lists. If you have a sorted list, Binary Search will be faster than linear search.

Let us return to our list of numbers from the Linear Search example.

Array haystack contains [2,4,6,8]
The number of elements size is 4
The target value needle is 7

We know the list is ordered. We can just look at it and tell. If we know there is an ordering to our search space, we can improve efficiency.

We can ask “Is the number between 2 and 4?”. If the answer is yes, we eliminated 6 and 8. If the answer is no, we eliminated 2 and 4. By asking the right questions in the right order, we can drastically cut down the number of questions we need to ask.

There is one minor flaw in this plan. We need to track more information. In the Linear Search, we just needed to know where in the list the current value was. With this new search, we will be eliminating ranges of values. We need to values to denote the range, a start and a stop. This is similar to our recursive linear search implementation.

This search is called Binary Search. We ask a question that eliminates half the values in the list. By dividing the list is half, we cut down the search space by powers of two.

Initially, the range is obvious. It should be every value. The array starts at \(0\) and ends at \(\text{size}-1\).

Once we have a range, we can find the middle element. Three things can happen:

Target is equal to the middle element
The target is bigger than the middle element, eliminate everything smaller
The target is smaller than the middle element, eliminate everything larger

We can repeat this pattern. Eventually, one of two things will happen.

We find what we want
We eliminate all possible values

The algorithm is given below. We again use a trivial helper function to map the more standard of size into the initial start and stop values.

Function binaryRecursive(integer n, array pointer H, integer size)
    Return binaryHelper(n, H, 0, size-1)
End Function
func binaryHelper(integer n, array pointer H, integer start, integer stop)
    //No elements to search
    If start > stop then
        Return false
    End If
    //Determine the middle
    Let middle = start + (stop-start)/2
    //Check for target
    If n == H[middle] then
        Return true
    End If
    //Search either left or right side
    If n < H[middle] then
        Return binaryHelper(n, H, start, middle-1)
    End If
    Return binaryHelper(n, H, middle+1, stop)
End Function

Why do we compute the middle like this?

We could also compute the middle as floor((stop_range+start_range)/2). This is a shorter expression. There is a good reason not to use this one.

Let us assume we are working on an 8-bit system. Then the largest number we could handle would be 255. What happens if our range is from 16 to 250? We know that \(16+250=265\) will cause an overflow. Alternatively, \(250-16=234\). By using subtraction first, we can ensure the ranges never overflow. As long as the indexes fit in memory, we can find the middle.

We need a longer list to make for an interesting walk-through of this algorithm

Let L=[2, 4, 6, 8, 10, 12, 14]

We can search for 7, which will not be found.

The middle element is L[3]=8. We know that 7 < 8, so we eliminate 8, 10, 12, and 14.
The middle of the remaining elements is L[1]=4. We know that 7 > 4, so we can eliminate 2 and 4.
We are left with just one element L[2]=6. We know that 7 > 6, therefore the number seven is not in our list.
Return False

It only took us three compares to determine 7 was not in the list.

What about a number in the list? We next search for 12.

The middle element is L[3]=8. We know that 12 > 8. We eliminate 2, 4, 6, and 8.
The middle remaining element is L[5]=12. We found 12!
Return True

This method requires fewer questions than the linear search.

We can also implement the algorithm iteratively. Again, the key is to track the start and stop values. This allows us to narrow the range the result must reside in.

Function binaryIterative(integer n, array pointer H, integer size)
    Let start = 0
    Let stop = size - 1
    While start <= stop do
        Let middle = start + (stop-start)/2
        If n == H[middle] then
            Return true
        End If
        If n < H[middle] then
            //Eliminate Upper Section
            stop = middle - 1
        Else
            //Eliminate Lower Section
            start = middle + 1
        End If
    End While
    //Never Found
    Return false
End Function

In both cases, the iterative and recursive have the same running time. They both make the same number of comparisons and define the same algorithm. There might be minor practical differences between them for implementation purposes. The recursive version will might need to use more memory depending on how the program handles recursions. The return value is just returned, no calculations are done on it. This means you don’t need to store the previous recursive calls, but a compiler would need to optimize that. The linear version would use less memory in most implementations.

Algorithmic Foundations of Computer Science

Binary Search

Binary Search¶