15-111 Lecture 25 (Wednesday, March 25, 2009)

Heap Sort

So, let me propose this: we use the array or ArrayList implementation of a heap, as discussed last class, to sort a list of unsorted numbers. As with our other sorts, we'll view things in terms of two lists, an unsorted list and a sorted list. Initially, the unsorted list will be full and the sorted list will be empty. Also, as before, each iteration will remove one element from the unsorted list and add it to the right place within the sorted list.

For the moment, just to keep the math easy, we'll use elements 1...n, leaving element 0 empty. This way, each parent's left child, if it exists, will be at index 2*p and its right child, if it exists, will be at index 2*p + 1. Similarly, each node's parent will be at index (int) (c / 2).
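The index arithmetic for the 1...n layout can be written down as a few one-line helpers. This is only a sketch; the class and method names are made up for illustration and are not part of the course code.

```java
// Index arithmetic for a heap stored in slots 1..n of an array
// (slot 0 is left unused). All names here are illustrative.
class HeapIndexOneBased {
    // Integer division truncates, so both children map to the same parent.
    static int parent(int c) { return c / 2; }

    static int leftChild(int p) { return 2 * p; }

    static int rightChild(int p) { return 2 * p + 1; }

    // A node exists only if its index falls within the n occupied slots.
    static boolean exists(int index, int n) { return index >= 1 && index <= n; }
}
```

Notice that parent(1) comes out as 0, the unused slot, which is a handy signal that the root has no parent.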

So, the initial phase is to build the heap. We do this just as described above. Each iteration, we take the item with the lowest index (left-most item) from the unsorted portion of the list and add it to the heap. We do this by swapping it with its parent, and its parent's parent, and so on, as necessary, until the heap-order property is restored.

This phase is O(n*log n). The "n" portion of this derives from the fact that we must add each node, one at a time, to the heap. The "log n" portion represents the fact that, in the worst case, we might need to swap the newly inserted node with each node between the leaf level and the root of the tree. Since the tree is fully balanced, the maximum path length is "log n".
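Here is a minimal sketch of the build phase for a min-heap, assuming an int array with slot 0 unused. The names swapUp, buildMinHeap and isMinHeap are hypothetical. The outer loop is the "n" part and the inner while loop is the "log n" part.

```java
// Build a min-heap in place in slots 1..n of an int array (slot 0 unused).
// A sketch under the assumptions stated above; all names are illustrative.
class MinHeapBuild {
    // Move heap[child] up while it is smaller than its parent.
    static void swapUp(int[] heap, int child) {
        while (child > 1 && heap[child] < heap[child / 2]) {
            int tmp = heap[child];
            heap[child] = heap[child / 2];
            heap[child / 2] = tmp;
            child = child / 2;
        }
    }

    // Slot 1 alone is already a heap; add the remaining items one at a time.
    static void buildMinHeap(int[] heap) {
        int n = heap.length - 1;
        for (int next = 2; next <= n; next++) {
            swapUp(heap, next);
        }
    }

    // Check the heap-order property: no child is smaller than its parent.
    static boolean isMinHeap(int[] heap) {
        for (int c = 2; c < heap.length; c++) {
            if (heap[c / 2] > heap[c]) return false;
        }
        return true;
    }
}
```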

Now, we observe that the minimum value is at the top of the heap. So, let's perform a removeMin(). At this point, we have the second lowest value at the top of the heap, and an empty slot in the last position of the array. We use this empty slot to store the old minimum value.

We now view the situation as this. We have a heap in positions 1...n-1 of the array and a sorted list within positions n...n.

We repeat the process by removing the minimum element of the heap using removeMin(), which again creates an empty slot, this time at position n-1. We drop the value we just removed into this slot. We now have a heap in positions 1...n-2 and a sorted list in positions n-1...n.

Each time we repeat this process, the heap will shrink by one item and the sorted list will grow by one item. So, to sort the whole list will take n iterations. Each iteration involves the removeMin() operation, which is, as you know, O(log n). So, this phase of things is O(n*log n).
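Both phases can be sketched together for the 1...n min-heap layout. This assumes removeMin() is done the usual way, by moving the last heap item to the root and swapping it down; that may differ in detail from the version built last class. All names are hypothetical.

```java
// Two-phase heap sort with a min-heap in slots 1..n (slot 0 unused).
// A sketch, not the course's reference code; all names are illustrative.
class MinHeapSortSketch {
    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Move a[c] up while it is smaller than its parent.
    static void swapUp(int[] a, int c) {
        while (c > 1 && a[c] < a[c / 2]) {
            swap(a, c, c / 2);
            c = c / 2;
        }
    }

    // Move a[p] down within slots 1..size while a child is smaller.
    static void swapDown(int[] a, int p, int size) {
        while (2 * p <= size) {
            int child = 2 * p;
            if (child + 1 <= size && a[child + 1] < a[child]) child++;
            if (a[p] <= a[child]) break;
            swap(a, p, child);
            p = child;
        }
    }

    static void heapSortDescending(int[] a) {
        int n = a.length - 1;
        // Phase 1: build the heap, one insertion per item.
        for (int next = 2; next <= n; next++) swapUp(a, next);
        // Phase 2: each removeMin frees the last heap slot; store the min there.
        for (int size = n; size > 1; size--) {
            swap(a, 1, size);          // old minimum lands in the freed slot
            swapDown(a, 1, size - 1);  // restore heap order in slots 1..size-1
        }
    }
}
```

With a min-heap, the smallest value ends up in slot n and the largest in slot 1, so slots 1...n read from highest to lowest when the loop finishes.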

If we add both phases together, we end up with O(n*log n). Remember, we throw away the coefficients. "2*n*log n" is still O(n*log n).

The only detail is that our list is, depending on what you were expecting, sorted backwards -- from highest to lowest. If, instead, you want the list sorted, as we are accustomed, from lowest to highest, this can be easily accomplished. We can just build a max-Heap instead of a min-Heap. The items will then be removed from greatest to smallest. Since each one is dropped into the slot at the end of the shrinking heap, the array fills from the back, resulting in a list from smallest to greatest.
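To see the ordering effect on its own, here is a small sketch that uses the library's java.util.PriorityQueue with a reversed comparator as a stand-in max-heap. It is not in-place, so it is not heap sort as described here. It only shows that removing greatest-first and filling from the back gives ascending order. The class and method names are made up.

```java
import java.util.Collections;
import java.util.PriorityQueue;

// Illustration only: a library max-heap, drained into an array from the back.
class MaxHeapOrderDemo {
    static int[] sortAscending(int[] input) {
        // Collections.reverseOrder() turns the default min-heap into a max-heap.
        PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Collections.reverseOrder());
        for (int x : input) {
            maxHeap.add(x);
        }
        int[] out = new int[input.length];
        // The greatest remaining item goes into the last free slot.
        for (int slot = out.length - 1; slot >= 0; slot--) {
            out[slot] = maxHeap.poll();
        }
        return out;
    }
}
```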

Also, if you prefer to avoid wasting the first slot, the parent and child index formulae can be changed slightly to accommodate this off-by-one situation.
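One common way to write the shifted formulae for a heap in slots 0...n-1 is sketched below; the names are illustrative. The implementation in the next section writes the same arithmetic as (i+1)/2-1 for the parent and 2*(i+1)-1 and 2*(i+1) for the children.

```java
// Index arithmetic for a heap stored in slots 0..n-1 (no wasted slot).
// All names are illustrative.
class HeapIndexZeroBased {
    // Only meaningful for c >= 1; the root (index 0) has no parent.
    static int parent(int c) { return (c - 1) / 2; }

    static int leftChild(int p) { return 2 * p + 1; }

    static int rightChild(int p) { return 2 * p + 2; }
}
```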

Heap Sort Implementation

```java
class HeapSortExample extends SortExample
{
  public HeapSortExample(int how_many, int min_val, int max_val,
                         int sorted)
  {
    super (how_many, min_val, max_val, sorted);
  }

  /* Swap the value at new_val_index up toward the root until its
   * parent is at least as large (max-heap order). Indices are
   * 0-based, so the parent of i is (i+1)/2-1.
   */
  private void swapUp(int new_val_index)
  {
    int parent_index = (new_val_index+1)/2-1;

    while (new_val_index > 0)
    {
      if (numbers[new_val_index] > numbers[parent_index])
      {
        swapNumbers (new_val_index, parent_index);
        new_val_index = parent_index;
        parent_index = (parent_index+1)/2-1;
      }
      else
        break;
    }
  }

  private void buildMaxHeap()
  {
    /* Start at 1 since 0 is right position w/respect to
     * a list of 1 item
     */
    for (int insert_me=1; insert_me < numbers.length; insert_me++)
      swapUp (insert_me);
  }

  private void sortMaxHeap()
  {
    int hole;
    int temp;

    for (int remove=0, last_heap=numbers.length-1;
         remove < numbers.length;
         remove++)
    {
      /* Save the last heap item, then move the max (the root) into
       * the slot it vacates. The heap shrinks by one.
       */
      temp = numbers[last_heap];
      numbers[last_heap--] = numbers[0];

      /* Walk the hole left at the root down to a leaf, pulling the
       * larger child up at each level.
       */
      for (hole=0; hole <= last_heap; )
      {
        int left = 2*(hole+1)-1;
        int right = 2*(hole+1);

        /* No children (left < right, so checking left is enough):
         * drop the saved item here and swap it up into place.
         */
        if (left > last_heap)
        {
          numbers[hole] = temp;
          swapUp(hole);
          break;
        }

        /* Only a left child, or the left child is the larger one */
        if ((right > last_heap) || (numbers[left] > numbers[right]))
        {
          numbers[hole] = numbers[left];
          hole = left;
        }
        else
        {
          numbers[hole] = numbers[right];
          hole = right;
        }
      }
    }
  }
}
```