Lab 6 - BST-based Database

Lab 6 - BST-based Database
Due: Monday, June 14, 2004^th at 11:59PM

Overview

This lab asks you to implement a simple BST-based database. Like a real database¹, it will be composed of several different trees -- one for each key.
Each record has associate with it two different types of keys: One primary key and more than one secondary key. The primary key is a unique identifier for the record -- there are no duplicates. The seocndary key is an attribute that might apply to multiple records.
A search for a primary key will result in not more than one match. A search for a secondary key may result in multiple matches.
Additionally the database allows for search based on multiple secondary keys using the logical operators AND and OR. For example "Give me a list of all cars that are red and have 4 doors" or "Give me a list of all cars that have 2 doors or a sunroof"

Overall Architecture

This assignment will consist of (at least) several classes and one interface, each of which you are asked to implement.

A SuperComparable interface, which extends the Comparable interface and includes the cability to compare based any of several keys.
A SuperComparable Record class, which represents the data stored within the database, e.g., recipies, cars, dolls, or products
A BST class, which can organize SuperComparable objects.
A DataBase class, which contains at least a few BSTs, one for each primary and secondary key, and provides the ability to store Records within the trees and to query the trees based on the primary key, secondary key, or secondary keys joined with AND and/or OR.

The SuperComparable interface

This interface is used to specify the classes that can be stored in the BSTs. It extends Comparable and, therefore, should implement compareTo(). The compareTo() method should compare SuperComparables based on only the primary (unique) key.
In addition to the inherited compareTo() definition SuperComparables should implement the following method:

int compareTo (String attributeName, Object o)

This overloaded compareTo() method should have semantics similar to the inherited compareTo() method, except that the name of the key field is specified, instead of assumed to be the primary key. In other words, it will compare Records based on whatever field, primary or secondary, is named by the parameter. This method will allow the database to implement BSTs that organize secondary keys.

The Record class

The Record class can actually be named whatever you like. In fact, it should probably be named something contextually meaningful within your application, for example Receipe, Car, BaseballCard, &c.
It should contain the attributes of the data type, for example, the uniqueReceipeName, preparationInstructions, ingredientsList, and cookingInstructions.
It should also contain methods that make the class useful for the application. For example, a version of toString() that provides for a pretty display of the record's content is almost certainly required.
Additionally, it should implement the SuperComparable interface so that it can be organized by your BST.

The BST class

The BST class should be your own implementation of a binary search tree. It should organize SuperComparable objects. Minimally, it should implement insert() and retrieve() methods. It need not support delete. The constructor of the tree should specify the key to be used for the comparison as below:

public BST (String fieldNameForComparison)

Additionally, it should have a default constructor, which will automatically compare nodes based on the primary key. In other words, if the default constructor is used, the comparison should be made based on the new, overloaded compareTo() method, instead of the inherited compareTo() method:

public BST ()

It is important to note that this BST must be able to handle duplicates. since secondary keys are not unique. You may handle this any way you choose. Three ideas, which just ideas, are the following:

Include an interator method that can be used to get additional matching keys. For example, have retrieve return a reference to the first matching node, before returning. Then, have the iterator move to the next matching node and return the value with each call, until the next item in the tree doesn't match or a subsequent retrieve() resets it or an insert() invalidates it. You might, for example, call such a method findNextMatch(). It could return null or throw an Exception when there are no more matches.

Punt. Have your Record class extend a container class, such as a Vector or LinkedList, as well as implement the SuperComparable interface. Then, store only one list for each match. On return, return the whole matching record. Depending on your choice of containers, the Enumeration or Iterator interface will likely be implemented for you.
This isn't as much of a punt as it is moving the problem -- the Record becomes more complex.

Implement a SuperComparable container to hold your simple Records. This container should implement the methods of SuperComparable, based on the first item in the list, which will suffice in context.
This has the advantage of keeping the Record class simple for the application programmer and of avoiding the duplication of the code among different types of Records.

The Database, itself

The Database, itself, should contain one BST for each of the keys. You are required to support one primary key, and at least two secondary keys. So, at a minimum, the Database will contain three BSTs. You are welcome and encouraged to support more than two secondary keys -- but this is not required.
The database should provide the folloing functionality:

Allow the user to create and store a new Record. You might want to include a method to do this within the Record, itself, and to simply call it here. Normally we wouldn't allow this -- but, in this case, we'll even let you add this method to the SuperComparable specification, if you'd like.
Storing a record involves adding it to each of the trees, the tree for each of the primary and secondary keys.
Duplicate primary keys should be rejected. You can build this into your BST, if you'd like, or handle it elsewhere.

Allow the user to find a record by primary key. Basically the user should be able to specify the primary key and see the requested record or an error message.

Allow the user to find a collection of records by a secondary key. All matchng records should be printed, or an error message should be printed.

Allow the user to find a collection of records by specifying two secondary keys combined with either OR or AND. Remember, OR returns items in either list. AND returns items that are present in both lists, but not just one.
You are only required to ask the user to select two attributed and either AND or OR -- and you can do this however you'd like. For example, you could parse a syntax as shown above -- or simply prompt with a menu.
It would be nice to be able to contruct more sophisticated multi-attribute queries, such as "FIND (MAKE=Ford AND DOORS=4) OR (TYPE=SUV AND 4WD=No)". But, this is certainly not necessary. You are only required to support two keys combined with either AND or OR.

One thing we want to stress: Your database's user interface needs to be functional, not pretty or super-friendly. You are welcome to use a simple menu and prompt system -- or, only if you'd like, something more sophisticated.

An Important Caution -- And Some Piece of Mind

This assignment is significantly more complex than those of the past. We expect that you'll need some help. We want to provide it. Please get started early and dig in. We'll be there to help the whole way. And, yes, we'll roll up our sleeves and get dirty along the way. We're here to help.

Footnotes

¹ Real data bases are actually based on derivatives of the B-Tree data structure, not BSTs. You'll learn about B-Trees in 15-211. They are balanced trees, so, unlike BSTs, they don't have degenerate cases. The nodes also contain many keys and child pointers, unlike BSTs, where each node contians one key and two pointers.