Back to lab index

Lab 7 - Histogram

Due: Monday, June 22, 2008


We thank Prof. Pattis for this lab. This lab is an adaptation of the first part of his Lab #7 from 15-111 Fall 2005.


This programming assignment is designed to ensure that you know how to use combinations of collection classes to model and solve a wide variety of different programming problems. The kind of abstract thinking that goes into modeling solutions to these programming problems with collection classes is critical to your development as computer scientists. Collection classes are important and useful themselves, and they also rely on almost everything powerful that we have learned about Java: interfaces, classes in an inheritence hierarchy, casting, and general pupose iterators.

For this problem, you must be familiar with the Java interfaces for Map, List, Iterator, and Comparator. In addition, you must understand the use of the classes HashMap, TreeMap, ArrayList, and LinkedList, which implement these collections. You must also understand the use of the static sort method defined in the Collections class, which use the Comparable and/or Comparator interfaces. You should also be familiar with the StringTokenizer class.

Structure of the Solution

This assignment is designed to quickly give you a feel for the collection classes, &c. It is not intended to be a "design of software" exercise. You can solve the problem within the static context of the main method, if you'd like.

The Problem: Word Frequency

Read a file of words, building a map (Map[String] -> Integer) from each word to its frequency (the number of times that it occurs in the file). When you are done, what you have, in effect, is a histogram of word usage: Each word is mapped to the count of the number of times it occured. Recall that the Integer class is a "wrapper" class that lets us treat ints as Objects. Once you have built this map, print it out.

Once you are done building the Map, get a list of Entrys from it -- this is easy to do as there is a Map method to do it. The Enttry class is nothing more than a pair. In this case, <(String) word, (Integer) count>. The Entrys are created for you, automatically, as the Map adds pairs.

Next, print this list twice: first with the word/frequency pairs ordered alphabetically; second with the word/frequency pairs ordered from highest to lowest frequency (words with equal frequency can appear in any order; the built-in sort method will take care of these details). This is a perfect opportunity to make use of Comparators.

Your solution might look something like this:

Sample Input:

  w x y z
  w x y
  w x w a c
  a b

Sample Output:

  Enter name of file of words: input1.txt

  {b=1, x=3, a=2, w=4, z=1, c=2, y=2}

  List (sorted alphabetically):
  [a=2, b=1, c=2, w=4, x=3, y=2, z=1]

  List (sorted by frequency):
  [w=4, x=3, a=2, c=2, y=2, b=1, z=1]