Homework Assignment 1

Amortized Dictionary

Due date: May 22 by 23:59

Overview

One of the most important structures in computer science are the dictionary data structures that support fast insert and lookup operations. Dozens of different data structures have been proposed for implementing dictionaries including hash tables, skip lists, search trees and others.

In this lab assignment you will implement a dictionary based on linked lists and sorted arrays. This structure combines a fast lookup on sorted arrays with ease of linked list-insertion. Note that a sorted array is good for lookups (think of a binary search) but bad for inserts. A linked list is good for inserts but bad for lookups (they can take linear time).

The idea of this data structure is as follows. We will have a linked list of arrays, where array k has size 2k, and each array is in sorted order. However, there will be no relationship between the items in different arrays. The issue of which arrays to be used is based on the binary representation of the number of items we are storing. For example, if we have 11 items to store, then the arrays of sizes 1, 2, and 8 will be used, since 11 = 1 + 2 + 8, and our dictionary might look like this:

This data structure has interesting amortized and average-case performance, hence the name "Amortized Array-Based Dictionary" (AAD), which we will be using thought the writeup.

Objectives

Theory Questions: 20 points

In the file theory.pdf you will find short theory questions. Please answer the questions in this file and turn it in class on Friday May 20.

Starter Code

Download the starter code for assignment (zip file). Do not change any of the starter code! Once you finish implementation, submit java files to the Blackboard digital dropbox.

What to Submit

You FTP the following java files

Do not submit anything else like zip files, folders, desktops, class files. Make sure you do not use packages in your java files as well. Failure to do so will result in a 20 point penalty.

Instructions

You are going to implement the array-based dictionary data structure

public class Dictionary<E extends Comparable<E>> implements DictionaryInterface<E>
{
   private int size;
   private Node head;
   ...
}

using a linked list of sorted arrays. Specifically, you'll maintain a linked list of the following Nodes:

private static class Node
{
   private int power;
   private Comparable[] array;
   private Node next;
   ...
}

These nodes are organized in the ascending order of array sizes - there are no two nodes containing arrays of the same size.

Since this is a pretty advanced assignment, you have been given a lot of starter code to help you out. Don't be intimidated by the starter code! If you understand the theory behind the AAD data structure and follow the directions in the starter code, the code you write will be rather short.

Your main goal is to implement the Dictionary interface:

add(AnyType e):

You create an array of size one, containing a specified element e, and insert the array into the linked list. Next, you must traverse the linked list and merge sorted arrays of the same size until at most one array remains of each size.

remove(AnyType e):

This is the most cumbersome operation which can be implemented in more than one way. There are three cases to consider.

  1. e is not in the dictionary
  2. e is in the dictionary, and it is in the smallest array
  3. e is in the dictionary, and it is not in the smallest array
The first case is trivial. In the second case, remove e from the smallest array, and then split it up into a bunch of smaller arrays. In general terms, removing an item from an array of size 2k results in splitting an array into k smaller arrays of sizes 1, 2, 4, and so on until 2k-1. Next, put those arrays in the dictionary. No any merging is required. In the third case,

contains(AnyType e):

Returns true if the dictionary contains a specified element e, otherwise - false. Since each array in the list is in sorted order, you run Java's binary search on each of them. .

frequency(AnyType e):

Since the AAD allows duplicates, this method returns the number of elements in the dictionary equal to e.

In addiiton you will need to implement the following helper methods. Their specifications are detailed in the starter code.

If you get stuck, or don't understand something, re-read the notes. Ask for help if you need it!

Point Breakdown: 70 points

Here is the point breakdown for the implementation part:

Style will be graded very strictly for this lab assignment. Make sure you listen to the comments in the starter code (they are there to help you!) and your methods satisfy their specifications. Especially make sure you understand the AAD data structure as explained in lecture so that your implementation is as efficient as it should be.

Coding Style: 10 points

Your assignment will be graded first by compiling and testing it. After that, we will read your code to determine whether appropriate data structures and algorithms were used, as well as judge the overall style of your code. Programs will be graded both on correctness as well as their ability to deal with so-called edge cases, like empty lists, lists with only one element, and so on. Points will be deducted for poor design decisions and un-commented, or un-readable code.

Testing

Testing should be straightforward. Given a specific sequence of operations on an AAD, the resulting AAD is unique, so you can test by simply seeing if it is correct (the already-implemented toString() method should be really helpful for doing this).

That being said, here is a TestingDriver.java a class to help test your implementation. It will run some simple tests on your code and provide feedback.


Victor S. Adamchik, CMU, 2010