15-100 Lecture 22 (Wednesday, March 19, 2008)

Binary Search

Last class, when we implemented contains(), we did it with what is often known as a "brute force" or "linear" search. It is known as a linear search because we walk straight down the line doing the search. The name "brute force" comes from the fact that we are persistently considering all possibilities, rather than using any smarts, in order to search for the item.
In the case of an unordered list, there was nothing else we could do. The number could be anywhere. And, there was no telling where. But, this time through, the list is ordered. Does that change anything?
Well, I think we'd all agree that we're more likely to find "zebra" near the end of a dictionary -- and "aardvark" near the beginning. So, using intuition, we see that it does.
If we search from beginning to end, each item we consider is either the correct item or eliminates exactly one possibility -- leaving the rest. But, what happens if, instead of starting at the beginning or end, we start in the middle? Then, if it isn't right, we know whether the item we want comes before or after the middle, right? Consider the dictionary -- if the word we are reading isn't the one we want, the one we want comes either before it or after it in alphabetical order. So, right away, we can throw away the other half of the words, those that live in the wrong half.
If we perform this "divide and conquer" strategy over and over, we've got a winning approach. Eventually, we'll either find it, or we'll be looking at a one-word list -- and have nowhere to go.
This technique is known as a "binary search", because it cuts the list in half each time. Its power is easy to see, especially in early iterations, when it throws away a huge number of possibilities per comparison.
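
To get a feel for how quickly the halving pays off, here is a small sketch. It is not part of the class code: the class name HalvingCount and the method worstCaseProbes are made up for illustration. It counts how many items we might have to look at, at most, before a sorted list of n items is whittled down to nothing, assuming each look leaves no more than half of the items still in play.

```java
class HalvingCount {

    // Worst-case number of items a binary search examines
    // in a sorted list of n items (hypothetical helper).
    static int worstCaseProbes(int n) {
        int probes = 0;
        while (n > 0) {
            probes++;     // look at the middle item
            n = n / 2;    // at most half of the items remain afterwards
        }
        return probes;
    }

    public static void main(String[] args) {
        int[] sizes = {9, 1000, 1000000};
        for (int n : sizes) {
            System.out.println(n + " items: at most " + worstCaseProbes(n)
                + " looks (linear search: up to " + n + ")");
        }
    }
}
```

For a list of 9 items this gives 4 looks; for a million items it gives 20, against up to a million looks for the linear search.
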
Let's consider an example:
Looking for: "K" 
 ---
| B | beginning of list
 ---
| C |
 ---
| F |
 ---
| K |
 ---
| L | <- index
 ---
| M |
 ---
| O |
 ---
| P |
 ---
| V | end of list
 ---
Check the index: K < L


 ---
| B | beginning of list
 ---
| C | <- index
 ---
| F |
 ---
| K | end of list
 ---
| L | 
 ---
| M |
 ---
| O |
 ---
| P |
 ---
| V |
 ---
Check the index: K > C


 ---
| B |
 ---
| C |
 ---
| F | <- index   beginning of list
 ---
| K | end of list
 ---
| L | 
 ---
| M |
 ---
| O |
 ---
| P |
 ---
| V |
 ---
Check the index: K > F

 ---
| B |
 ---
| C |
 ---
| F |
 ---
| K | <- index,  beginning of list and end of list
 ---
| L | 
 ---
| M |
 ---
| O |
 ---
| P |
 ---
| V |
 ---
Check the index: K = K

Done
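
The walk-through above can be replayed in code. This is only a sketch for checking the pictures: the class name BinarySearchTrace and the method probeOrder are hypothetical, and it uses a plain char array rather than the list class we've been building. It records each letter that lands under the index, in order.

```java
class BinarySearchTrace {

    // Returns the letters examined, in order, while searching
    // the sorted array for target (hypothetical helper).
    static String probeOrder(char[] sorted, char target) {
        StringBuilder probed = new StringBuilder();
        int left = 0;
        int right = sorted.length - 1;

        while (left <= right) {
            int pivot = left + (right - left) / 2;
            char probe = sorted[pivot];
            probed.append(probe);

            if (probe == target) break;    // found it
            if (target < probe)
                right = pivot - 1;         // look left
            else
                left = pivot + 1;          // look right
        }
        return probed.toString();
    }

    public static void main(String[] args) {
        char[] letters = {'B', 'C', 'F', 'K', 'L', 'M', 'O', 'P', 'V'};
        System.out.println("Probed: " + probeOrder(letters, 'K'));
    }
}
```

Run on the list above looking for 'K', it prints "Probed: LCFK", which matches the four pictures: L, then C, then F, then K.
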
Please consider the following implementation in the context of the contains() method:
 public boolean contains (Comparable item) {

    /*
     * Here we select the left and right bounds of the list to
     * search. Initially, we start searching the whole thing, so
     * [0...nextSlot-1]. From this, we find the distance between the
     * bounds (right - left). Half of that distance, (right-left)/2, is
     * the distance from the left side to the middle. So, the middle,
     * pivot value, is   left + (right-left)/2
     */
    int left = 0;
    int right = nextSlot-1;
    int pivot = left + (right - left)/2;

    /*
     * Now we loop until the partition is empty, signified
     * by a right value less than a left value. If they are equal,
     * it is a one item partition.
     *
     * Inside, we return true, if we find it. At the end, false, if we
     * didn't
     */
    while (right >= left) {

      // Do the comparison
      int direction = list[pivot].compareTo(item);

      // Found it!
      if (direction == 0) return true;

      // Didn't find it. Do we look left or right?
      if (direction > 0)
        // Look left
        right = pivot - 1;
      else  
        // Look right
        left = pivot + 1;

      // Find the new middle
      pivot = left + (right-left)/2;
    } // go back to top and repeat

    // We stopped because we ran out of numbers. Not here.
    return false;
  }
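
To try the method out on its own, it has to live in some class that has a list array and a nextSlot count. The sketch below assumes a stripped-down stand-in, SortedListSketch (a made-up name, not the class from lecture), whose constructor simply trusts that the items arrive already in order. The search logic is the same as above, with the pivot calculation moved to the top of the loop.

```java
@SuppressWarnings({"rawtypes", "unchecked"})
class SortedListSketch {

    private final Comparable[] list;   // items, assumed already sorted
    private final int nextSlot;        // number of slots in use

    // Assumption: the caller passes the items in sorted order.
    SortedListSketch(Comparable... sortedItems) {
        list = sortedItems.clone();
        nextSlot = list.length;
    }

    boolean contains(Comparable item) {
        int left = 0;
        int right = nextSlot - 1;

        // Loop until the partition is empty (right < left)
        while (right >= left) {
            int pivot = left + (right - left) / 2;
            int direction = list[pivot].compareTo(item);

            if (direction == 0) return true;   // found it
            if (direction > 0)
                right = pivot - 1;             // look left
            else
                left = pivot + 1;              // look right
        }
        return false;                          // ran out of items
    }

    public static void main(String[] args) {
        SortedListSketch letters =
            new SortedListSketch("B", "C", "F", "K", "L", "M", "O", "P", "V");
        System.out.println(letters.contains("K"));   // in the list
        System.out.println(letters.contains("G"));   // not in the list
    }
}
```

The first call prints true and the second prints false. Note that an empty list works too: right starts at -1, so the loop never runs and we fall straight through to the return false.
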