Return to labs index
Assignment #5 - Dictionary Array
Due Thursday, October 14, 2010 at 11:59PM


This lab asks you to implement a dictionary library via a dynamically allocated array of char pointers. Each time another word is read from the input file, that pointer is copied into the array of pointers. You will apply all that you have learned so far about arrays, pointers, passing pointers, malloc and free, as well as file I/O.

The Dictionaries

The "dictionary" that serves as this lab's main data structure is conceptually a growable array of strings. Since strings are, in C, implemented as arrays of chars, this data structure is really an array of character pointers. The INITIAL_DICT_SIZE should be 10.

The array needs to be a dynamic array because it needs to be growable. In order to make the array grow, we allocate a new array twice as large and set each of its elements to alias the elements of the original array. By aliasing the original elements, I mean initializing the new array by copying the pointer values from the old array into the new one so that the new array references the same data as the old one via the same indexes. Once this is accomplished, the old array can be free()'d and the original pointer can be made to reference the new array. In this way, we have, in effect "grown" the original array -- the original pointer now references a bigger array containing the same data in the same positions.

Please use the following type for the dictionary:

  typedef struct {
    char **wordList;
    unsigned long wordListSize;
    unsigned long count;
  } dictionary;

The dictionary should be implemented as a library and, as such, should make use of appropriately named and configured .c and .h files.

The dictionary should support the following operations:

Main Program

Your main program should serve as a test driver. It should accept its input as command-line arguments. The first arguments are the names of dictionary files. The last argument is the name of a query file. At least one dictionary file is required, but multiple dictioanry files are allowed. Exactly one query file is required.

  lab5 dictionaryFile [dictionaryFile ...] queryFile

For each supplied dictionary file name, create a dictionary. As a hint, look at argv to determine the number of dictionaries and create an array -- remember, all but one of the arguments (the query file) is the name of a dictionary file. Once you have created separate dictionary files, process the query file.

Your main program should look up each word in the query file in the corresponding dictionary. It should print out the word and "found" or "not found", as shown in the example, as appropriate.

  0	run	not found
  1	walk	found

The Dictionary Files

The dictionary files are plain text files that contain one word per line. Below is a short example:


The Query File

The query file contains three columns. The first column is a dictionary number. The second column is a word. The third, optional, column is a comment that can safely be ignored by the program. Dictionary #0 is the first dictionary passed in at the command line, #1 is the 2nd, &c.

The first column is separated from the second column by whitespace. Everything after the second column is the third column and can safely be ignored by the program. Please note that third column might be absent and the use of the #-pounds signs in the example below is purely cosmetic.

Please consider the following example:

  0	run 	# not found
  1	walk	# found

Consolidated Example

Please consider the following example run with the files as shown:

  lab5 nounsDict.txt verbsDict.txt adverbsDict.txt queries.txt








0	desk		# found
1	desk		# not found
1	run		# found
2	quickly		# found
2	desk		# not found

It should produce the following results:

0	desk		found
1	desk		not found
1	run		found
2	quickly		found
2	desk		not found

The Test Driver

The test driver should operate as follows:

  testlab5 dictdir resultsdir querylistfile

dictdir specifies a directory that contains all of the dictionary files. resultsdir specifies a diretory into which all result files are written. querylistfile specifies a plain text file that contains a list of arguments to be fed to lab 5.

For example, consider the following two invocations of lab 5:

  lab5 nounsDict.txt verbsDict.txt adverbsDict.txt languagequeries.txt
  lab5 colorsDict.txt namesDict.txt animalsDict.txt toythemes.txt

We can create a querylist file to represent these two queries executed in this order as follows:

  nounsDict.txt verbsDict.txt adverbsDict.txt languagequeries.txt
  colorsDict.txt namesDict.txt animalsDict.txt toythemes.txt

For each line within the query list file, the test driver should run the query and sore the results into a new file to be created within the results directory. The results files should be named test1.txt, test2.txt, test3.txt, etc., etc., etc. The number in the file name should represent the line number from the query list file. In other words the results from the first line of the query list file should be stored within test1.txt. The results from the second line, test2.txt, etc.

Plan of Attack

The following plan of attack is purely a suggestion. Please feel free to charge forward at your pleasure:

  1. Implement the dictionary library's create, addWord, and print methods.
  2. Implement a simple test driver -- and test
  3. Implement verify -- and another test set to your test driver -- and test
  4. Implement destroy -- and verify using a debugger
  5. Implement grow -- add another set to your test driver and test
  6. Retest destroy -- and verify using a debugger
  7. Implent in-order addWord -- and test it again
  8. Implement binary search within verify -- and test it again
  9. Implement a test driver that uses a dictionary file, if yours doesn't already
  10. Implement a test driver that can maintain multiple dictionaries supplied on the command line and test
  11. Implement the use of the query file -- and test
  12. Clean up
  13. Celebrate

We're Here To Help!

As always -- remember, we're here to help!