
15-200 Lecture 30 (Wednesday, April 28, 2006)

On Wednesday, you'll be taking Part II of the Departmental Exam. This exam will cover the Java Collections classes, including HashSet, HashMap, and ArrayList. An overview of this part of the exam can be found here.

ExamIO

For this test, you'll use a special package for reading input from files. All you need to know is the ExamIO.readFile(String prompt) method. The string passed in as an argument is just the prompt that will be displayed on the screen. The user will then input the name of a test file, and ExamIO.readFile will return an Iterator that contains each line of the file, stored as a String.

Once you have a String representing a line of text, you will have to use a StringTokenizer to extract the individual words. A StringTokenizer works much like an Iterator, except that it is used specifically to step through the words of a String, which are separated by a delimiter. In a normal piece of text, the delimiter will usually be a space, " ". In other cases, your input file may be formatted as a series of words separated by semicolons, in which case you would use ";" as your delimiter. Much like an Iterator's hasNext() and next(), a StringTokenizer has a hasMoreTokens() method and a nextToken() method. Each call to nextToken() returns the next word in the String. There is also a countTokens() method, which returns how many more times nextToken() can be called before it throws an exception.
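The three StringTokenizer methods described above can be sketched as follows. The class name TokenizerDemo and the helper method tokens are made up for illustration and are not part of the exam package; only StringTokenizer itself comes from java.util.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names here are illustrative only.
class TokenizerDemo {
    // Splits a line into its tokens, using the given delimiter characters.
    static List<String> tokens(String line, String delimiters) {
        StringTokenizer st = new StringTokenizer(line, delimiters);
        // countTokens() reports how many tokens are still left to read.
        List<String> result = new ArrayList<String>(st.countTokens());
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("apple;banana;cherry", ";")); // [apple, banana, cherry]
        System.out.println(tokens("the quick brown fox", " ")); // [the, quick, brown, fox]
    }
}
```

One detail worth knowing: StringTokenizer skips empty tokens, so "a;;b" split on ";" yields only "a" and "b".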

So, to read in and parse your input file on the test, you'll have something like this...

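// Requires: import java.util.Iterator; import java.util.StringTokenizer;
Iterator allLines = ExamIO.readFile("Enter a file name");

while(allLines.hasNext())
{
   // A raw Iterator's next() returns Object, so cast each line to String
   StringTokenizer st = new StringTokenizer((String) allLines.next(), ";");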
   
   while(st.hasMoreTokens())
   {
      String word = st.nextToken();

      // Process word
   }
}

The Set Interface and the HashSet Class

One of the new data structures you'll need for the exam is the Set. Set is a subinterface of the Collection interface, as is List. (Map belongs to the same Collections framework, but it is not itself a Collection.) A Set is very similar to a List, except for a few details. First, a Set cannot contain duplicate items. If you add an element that is already in the set, the set will still contain that element only once. With sets, we are only concerned with membership: either an object is a member of a set or it isn't.
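As a small illustration of the no-duplicates rule, here is a sketch that adds six words to a HashSet and ends up with four. The class name SetMembershipDemo and the method distinctWords are invented for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; the names here are illustrative only.
class SetMembershipDemo {
    // Collects the distinct words from an array.
    static Set<String> distinctWords(String[] words) {
        Set<String> seen = new HashSet<String>();
        for (String w : words) {
            // add() returns false and leaves the set unchanged if w is already present.
            seen.add(w);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] words = {"to", "be", "or", "not", "to", "be"};
        Set<String> set = distinctWords(words);
        System.out.println(set.size());         // 4
        System.out.println(set.contains("or")); // true
        System.out.println(set.contains("is")); // false
    }
}
```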

Much like a List, a Set has methods such as add, remove, isEmpty, and contains. These all work much as they do for a List, except that, as mentioned above, a Set does not contain duplicate elements. There are also addAll, removeAll, and containsAll methods, which work similarly to their singular counterparts, except that they take other Collections as arguments. In this way, you can add multiple objects to your Set at once. Note that you can use any type of Collection for these methods, including Lists and Sets. A Map is not a Collection, so it cannot be passed directly, but its keySet() and values() views can.
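The bulk methods can be sketched like this, using a List and a Set as the arguments. The class BulkOpsDemo and its union and difference helpers are names chosen for this example, not library methods.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; the names here are illustrative only.
class BulkOpsDemo {
    // Returns a new set holding every element of a and b (duplicates collapse).
    static Set<Integer> union(Collection<Integer> a, Collection<Integer> b) {
        Set<Integer> result = new HashSet<Integer>(a);
        result.addAll(b);
        return result;
    }

    // Returns a new set holding the elements of a that are not in b.
    static Set<Integer> difference(Collection<Integer> a, Collection<Integer> b) {
        Set<Integer> result = new HashSet<Integer>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 2, 3);
        Set<Integer> set = new HashSet<Integer>(Arrays.asList(3, 4));

        Set<Integer> all = union(list, set);
        System.out.println(all.size());                        // 4
        System.out.println(all.containsAll(list));             // true
        System.out.println(difference(list, set).contains(3)); // false
    }
}
```

Both helpers copy their first argument into a fresh HashSet, so the collections passed in are never modified.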

Also, you should note that by definition, a Set has no particular ordering. Because of this, you cannot sort a Set directly. If you need the elements in order, you must add them to another data structure, like a List, and then sort that. Be careful: there is a special type of set called a TreeSet, where the elements appear in order because they are stored in a binary search tree. However, you are NOT allowed to use TreeSets on the exam. If you use a TreeSet, you will receive a 0. What you will be using is a HashSet. The HashSet is built around a hash table, which we studied earlier in the semester. This means its add, remove, and contains methods run in constant time on average, but the elements are stored in a more or less random order. So keep in mind that when you create an Iterator for a HashSet, you will get the elements in no particular order.
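One way to get the elements of a HashSet in order without a TreeSet is to copy them into an ArrayList and sort the copy. This is only a sketch; the class SortedSetDemo and the method sortedCopy are made-up names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; the names here are illustrative only.
class SortedSetDemo {
    // Copies the set's elements into a list and sorts the list.
    // The set itself is left untouched.
    static List<String> sortedCopy(Set<String> set) {
        List<String> list = new ArrayList<String>(set);
        Collections.sort(list);
        return list;
    }

    public static void main(String[] args) {
        Set<String> fruits = new HashSet<String>();
        fruits.add("pear");
        fruits.add("apple");
        fruits.add("fig");
        System.out.println(sortedCopy(fruits)); // [apple, fig, pear]
    }
}
```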

The Map Interface and the HashMap Class

The other new data structure you will be using is the Map. Often, when storing data, it is convenient to store it as key-value pairs, where the elements are identified and (sometimes) sorted by their keys, but the majority of the data is contained in the value. One example is storing student information by Social Security Number: your SSN is the key, and your full student record is the value. Another common example is a dictionary. You look up a word by its spelling and are given a definition. Here, the key is the word itself, and the value is its definition.

Data is inserted into a Map via the put(Object key, Object value) method, which inserts that key-value pair into the Map. We can then look up a key with the get(Object key) method, which returns the associated value, or null if the key is not present. You can also remove an entry by key with remove(Object key). Much like a Set, a Map cannot contain duplicate keys, and each key maps to only one value; calling put with a key that is already present replaces the old value. Again, as with the Set, you cannot assume that the key-value pairs are stored in any particular order. There is a TreeMap class, where the entries are sorted by key, but again, you CANNOT use this class. You MUST use the HashMap class, which uses a hash table to look up the keys. Note that there are two different contains methods: containsKey(Object key) and containsValue(Object value). Since entries are stored by key, containsKey runs in constant time on average. However, containsValue must do a brute-force search of the entire hash table to locate a particular value.
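To tie put, get, and the two contains methods together, here is a sketch that counts how often each word appears in a line. The class WordCountDemo and the method countWords are names invented for this example, and the input line is arbitrary.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical demo class; the names here are illustrative only.
class WordCountDemo {
    // Maps each whitespace-separated word in the line to its number of occurrences.
    static Map<String, Integer> countWords(String line) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            String word = st.nextToken();
            if (counts.containsKey(word)) {
                // The key already exists, so put() replaces the old count.
                counts.put(word, counts.get(word) + 1);
            } else {
                counts.put(word, 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("to be or not to be");
        System.out.println(counts.get("to"));         // 2
        System.out.println(counts.get("is"));         // null
        System.out.println(counts.containsKey("or")); // true
        System.out.println(counts.containsValue(3));  // false
    }
}
```

The containsKey check goes straight to the key's hash bucket, while the containsValue call at the end has to look at every entry.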