Lab 2 - Hash Table Spell Checking

Due: Wednesday, September 20th at 11:59PM

Overview

One straight-forward way to implement a fast spell checker is to use a hash table. If a word is found, it is assumed to be spelled correctly. Otherwise, it is assumed to be spelled incorrectly.


This assignment asks you to build a spell checker, using a Hash table, that is, capable of identifying correctly spelled words. This assignment should also use the java 1.5 Scanner class Java api to read in the files.

The Hash Table

You should build a generic open chain hash table, constructed as an array or ArrayList of LinkedLists. It should have the following methods:

Objects should be inserted and found based on their hashCode(). This class should work for any type of Object not just Strings.

The Spell Checker, Itself

You should design and implement a SpellChecker class, which contains the hash tables needed to perform spell checking and which provides the appropriate functionality. This class should contain at least one constructor:

Additionally, the SpellChecker must have a method as follows:

This method should output each potentially misspelled word. The output should be readable and nicely formatted.

The Dictionary Hash Table

You have been provided with a dictionary that contains root words. These should be inserted into the dictionary hash table based on the hashCode() of their String representation.


Program Main

The SpellChecker class should also have a main method. This method should acept all of the files as command-line arguments, should instantiate a new SpellChecker, and should spellCheck() the requested document file. You need to provide the document file. Only the dictionary file has been provided for you.


Chain Length Plot

Experiment making the hash table different sizes, holding the dictionary size as a constant. When you find the interesting range of sizes, please produce a plot of the minimum, maximum, and average chain sizes as a function of the table size.

You'll probably want to write a separate main program to collect and output this data. Then, please use Excel,  or the tool of your choice to produce a plot of the interesting size range. The interesting size range is the minimum range that shows the table, in practice, being too large through the table being too small.

Useful Files