15-111 Lecture 24 (Monday, March 23, 2009)

Heaps

A heap is a tree structure where the relationship is between the parent and its children. In a min-heap, the parent always has a smaller value than its children (there is no ordering among the children themselves), so the minimum value in the tree will be at the root. In a max-heap, the parent always has a greater value than its children, so the maximum value in the tree will always be at the root. This property allows us to efficiently implement a priority queue, which is a queue where the item with the highest priority is dequeued first. (A normal FIFO queue is like a priority queue where items that were entered first have higher priority.)

In addition to the relationship between the parent and its children, we will impose another property on heaps. Heaps will always have to be full trees -- that is, there must be no holes at any depth except for the greatest depth, and all of the holes at the greatest depth are to the right. (That is, we will fill the tree from top to bottom, and from left to right.) This restriction on the shape of the tree will make our implementation easier and more efficient, as we will see later.

Inserting Into A Heap

When we insert into a heap we need to make sure that the resulting structure is still a heap. So the resulting structure needs to maintain the relationship between each node and its children, and the resulting tree cannot have any holes in it anywhere except for the bottom right of the tree. We will always initially use the first available spot (the leftmost open spot at the bottom of the tree) to initially insert the new item into the heap, so the tree will continue to be full after we insert. But will it still be a heap?

Let's look at an example of constructing a min-heap:

We will begin by inserting 70 into the heap. Since this is the first item in the heap, it will have to be the root.

Next, we will insert 150 into the heap. Since the tree is full at depth 1 (the root), we will insert it in the leftmost spot at depth 2, which in this case is the root's left child. Since 70 < 150, this is still a heap.

Now we will insert 110 into the heap. The next available spot to add an item to the heap is roots right child, so we will add 110 there. This is also still a heap because 70 < 110. (Remember, it does not matter how the siblings relate to each other.)

Next, we will insert 80 into the heap. Since the tree is now full at depth 2, we will add 80 at the leftmost spot at depth 3 (150's left child). This presents a problem, though, because 150 > 80, so we no longer have a heap.

How do we make this a heap again? Whenever we insert, we have to look at the new node's parent and check the relationship. If the new node is smaller than its parent (in a min-heap), then we will swap the two values. After we have done this, we then have to check if the new one is smaller than its new parent. We continue this process of checking and swapping until we either hit the root or reach a spot where the new value is greater than its parent, because in either case we will have a heap again.

So, when we insert 80, we will have to swap it with the 150 since 150 > 80. 80's parent will now be 70, so we will stop there since 70 < 80.

Now let's add 30 to the heap. This will start in the second spot at depth 3 (80's right child), and then will have to swap with 80 because 80 > 30, and then will also swap with 70 because 70 > 30, so here 30 will become the new root of the heap. This is consistent with our idea of a min-heap, since 30 is now the smallest value in the heap.

Finally, we will add 10 to the heap. It will start in the third spot at depth 3 (110's left child), and we will have to swap it with 110 because 110 > 10, and then we will also have to swap it with 30 because 30 > 10, so now 10 becomes the root of the heap.

So, when we insert something into the tree, we initially place the item in the bottom, leftmost spot in the tree, and then swap the new value with its parent until we once again satisfy the properties of the heap.

Removing From A Heap

When we remove an item from the heap, we will always take the value at the root. Why? The reason we use a heap is to have easy access to the item that has the highest or lowest value in the current set of items. Like the stack and the queue, limiting the possible behavior of the heap ensures that it will behave consistently.

When we remove the item at the top of the heap, we leave an empty spot. Since the heap cannot have any holes in it except at the bottom-right, we no longer have a heap, so what can we do to reestablish a heap?

Now the idea is to recreate a heap. The easiest way is to move the last item in the heap to the top. In this case, that means that 110 will be moved to the root of the heap.

Now this is clearly not a heap, so we will percholate 110 down through the heap to fix it. We have to check which of the children is smaller and swap them. Clearly 30 is the smaller value, so 110 and 30 will swap.

And now it is a heap again. This is a successful removal.

Implementing A Heap With A ArrayLists

Up until now, we have been talking about heaps as trees, but how would we implement them? Using a tree to implement the heap leaves us with some questions that have difficult answers. How do the children know who their parent is? How do you keep track of where the next item should be added? In order for children to interact with their parent, we would have to use recursion to go down a path, and then as the recursive calls unfold swap back up the tree to restructure the heap. To keep track of where the next item should go, we could keep a count of the number of items in the heap, and use that number to construct the path down the tree to insert.

Both questions have reasonable answers, but in practice they require complicated code in order to work correctly.

So if we are not going to store the heap as a tree, how should we store it? Before we decide how to store it, let's number the spots in the heap as follows:

We have numbered the items in the heap from top to bottom and from left to right with no gaps in the numbering. These should resemble indexes for the spots in the Heap. The indexes themselves are not what makes this numbering system so valuable. If you look closely, you will notice the following two relationships: the left child's index is exactly twice the parent's index, and the right child's index is twice the parent's index + 1. This means that we can easily get from the parent to one of its children, and we can also easily get from the children to the parent by simply dividing by two. This relationship between the indexes is what makes it possible for us to implement our heap using an ArrayList.

When we numbered the items in the heap, we started with the root at 1 rather than 0. This simplifies our math considerably, but if we are using a ArrayList indexes start at 0. What should go in index 0 if the root is at index 1? The easiest solution is to just throw a "null" into index 0 and effectively throw that part of the ArrayList away. We could start the root at 0, but that complicates the math, and throwing away one index in a ArrayList is not really a significant waste of memory.