May 19, 2005 (Lecture 4)

The Garbage Collector

When objects, such as Strings or arrays, are no longer referenced, they can no longer be used. In other words, if no variable (or constant) holds the name of an object, then no part of the program knows that it exists. If this happens, the object can't be named, so it can't be used. It is nothing more than wasted space.
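As a minimal sketch of how an object loses its last name, consider the following. The class and method names (UnreachableDemo, replace) are made up for illustration and are not from the lecture.

```java
class UnreachableDemo {
    // Returns the only array that is still reachable when the method ends.
    static int[] replace() {
        int[] data = new int[] {1, 2, 3};  // first array, named by data
        data = new int[] {4, 5, 6};        // data now names a second array
        // No variable refers to the first array any more.
        // It cannot be used again; it is just wasted space.
        return data;
    }

    public static void main(String[] args) {
        int[] survivor = replace();
        System.out.println(survivor[0]);   // prints 4
    }
}
```

After the second assignment, there is no way for the program to get back to the array holding 1, 2, 3.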

What happens to the poor nameless objects in Java? All of those objects that were once used but now aren't and, since there are no longer any references to them, can't be?

The answer is that Java has a garbage collector that can throw them away and allow the storage to be reused. Many people suggest that the garbage collector is a "background process" and that it steals a few cycles here and a few cycles there and cleans up objects as a program runs.

Myth killing time: The garbage collector is not a background or low-priority process. In the simple model used here, it runs only when the system cannot find enough free memory to satisfy an allocation. Although it might seem best to "steal cycles" for this task, or to amortize the cost, collection is really expensive, and even more so while things are still changing. So, instead of trying to "spread it out", Java hopes for the best and tries to avoid garbage collecting. What to say? It is a nasty job. Not even Java wants to do it.

One other related item. Think back to our early discussion of references and the virtual binding table. The program never knows, or needs to know, where an object actually lives in memory. This allows the system to reorganize objects in memory to pack them together and avoid fragmentation -- small pieces of memory in between allocated objects. Again, this is very time consuming -- but, in a crisis, it is a powerful tool.

Strings vs. StringBuffers

Last class we reviewed the String class. We were reminded that String instances are immutable. Once created, the objects themselves cannot be changed. Instead, the would-be mutators actually return new objects.

  // The original object doesn't change. Instead, the reference is changed
  // to identify a different object.
  s = s.toUpperCase();
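One way to see that the original object really is untouched is to keep a second reference to it. This is only a sketch; the names ImmutableStringDemo, demo and alias are chosen for illustration.

```java
class ImmutableStringDemo {
    // Returns { what the old reference sees, what the new reference sees }.
    static String[] demo() {
        String s = "hello";
        String alias = s;        // a second reference to the same String object
        s = s.toUpperCase();     // s now identifies a new String object
        return new String[] { alias, s };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println(r[0]);  // prints hello
        System.out.println(r[1]);  // prints HELLO
    }
}
```

The object named by alias still holds "hello": toUpperCase() built a new String rather than changing the old one.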

The designers of Java defined String this way because it is mechanically hard to grow the internal storage of a String in place. Instead, in languages such as C++, where Strings can be changed, such growth results in the creation of new internal storage, the copying of the data from the old storage to the new storage, and the deallocation of the old storage.

In Java, the programmer need not be concerned with the deallocation of the space -- Java is garbage collected. As a result, there was less to be gained by encapsulating this operation.

But, as it turns out, many mutations do not result in the growth of the String -- they can be done "in place". For this reason, Java also provides a mutable string class, the StringBuffer. It is much like a C++ string. Changes that don't require growth can occur "in place", and changes that do require growth happen as they do in C++: automatically and transparently to the programmer.

A StringBuffer can be initialized empty, or by passing a String or StringBuffer in as a parameter to the constructor. In this way, one can convert a String into a StringBuffer. Similarly, one can convert a StringBuffer into a String using toString():

  StringBuffer sb = new StringBuffer ("Hello world");
  String s = sb.toString();
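By contrast with String, a StringBuffer changes in place, so every reference to it sees the change. The sketch below uses the standard setCharAt() and append() methods; the class name StringBufferDemo and the variable alias are illustrative only.

```java
class StringBufferDemo {
    static String demo() {
        StringBuffer sb = new StringBuffer("Hello world");
        StringBuffer alias = sb;   // second reference to the same buffer

        sb.setCharAt(0, 'J');      // same length, so no growth is needed
        sb.append("!");            // the buffer grows by itself if it has to

        // alias sees both changes because there is only one object.
        return alias.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints Jello world!
    }
}
```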
  

Are Arrays Primitives or Objects?

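In Java, arrays are considered to be first-class objects. They combine not only the simple, indexable data that is familiar to any C or C++ programmer, but also behavior: every array carries a length field and inherits the methods of Object. Note that calling toString() directly on an array does not list its elements; the utility method java.util.Arrays.toString() converts an entire array to a String representation.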

Also consider array reference variables and array allocation:

  // intList is a reference variable. It contains the name of an int array
  int [] intList; 

  // An array of int is allocated by new, just like other objects
  intList = new int[10];
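The object-like side of arrays can be sketched as follows. The class name ArrayObjectDemo and the method describe() are invented for this example; Arrays.toString() is the standard utility in java.util (available from Java 5).

```java
import java.util.Arrays;

class ArrayObjectDemo {
    static String describe() {
        int[] intList = new int[3];       // elements start out as 0
        int[] copy = intList.clone();     // arrays support clone(), like other objects
        copy[0] = 7;                      // changes the copy only

        return Arrays.toString(intList) + " "
             + Arrays.toString(copy) + " "
             + intList.length;            // length is a field, not a method
    }

    public static void main(String[] args) {
        Object o = new int[3];                    // an array can be treated as an Object
        System.out.println(o instanceof int[]);   // prints true
        System.out.println(describe());           // prints [0, 0, 0] [7, 0, 0] 3
    }
}
```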
  
  

But arrays in Java are also something of a kludge. Most classes that are part of the Java language can be implemented within Java, itself. This isn't the case for an array. Consider these properties of arrays:

Ultimately, in Java, arrays are first-class objects -- but they are also a huge special case. I imagine that the designers of Java reached this compromise because they wanted all aggregate types to be first-class objects, but also wanted to preserve the well-known and well-loved syntax of C, C++, and many other prior languages.

Vectors and ArrayLists

Arrays are tremendously versatile data structures. They give us random (indexed) access, sequential access, and flexibility. Arrays of Objects can serve as general containers. Arrays are great.

But, they are also quite raw. The fact that they are bounded and cumbersome to "grow" is the torment of many a programmer. And, while flexible, they aren't intuitive. Consider using an array as a list: How does one simply add an item? Or see if one is there?
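To make the pain concrete, here is a rough sketch of what "add" and "is it there?" cost when done by hand on a raw array. IntBag is a hypothetical class written only for this illustration; the doubling policy is one common choice, not the only one.

```java
class IntBag {
    private int[] items = new int[2];  // deliberately small to force growth
    private int count = 0;             // number of slots actually in use

    void add(int value) {
        if (count == items.length) {
            // The array is full: allocate a bigger one and copy everything over.
            int[] bigger = new int[items.length * 2];
            System.arraycopy(items, 0, bigger, 0, count);
            items = bigger;            // the old array becomes garbage
        }
        items[count++] = value;
    }

    boolean contains(int value) {
        // Linear search over the used part of the array.
        for (int i = 0; i < count; i++) {
            if (items[i] == value) {
                return true;
            }
        }
        return false;
    }

    int size() {
        return count;
    }
}
```

Every array-based container ends up carrying this same bookkeeping: a separate count, a growth step, and a search loop.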

In 15-100 you spent much time, at least during the final exam, developing array-based containers. But, these containers were a bit esoteric and special-purpose. How about a generic container that can hold nearly anything? A container that can be used intuitively to insert, remove and search for items? One that grows automatically. And one that is well-known and part of the language's environment.

Java actually has two such classes: Vector and ArrayList. For our purposes, they are virtually alike. They are standard Java containers that are basically value-added arrays. Vector is the "original" and ArrayList came later, as part of a more general, far-reaching framework for collections.

Although Vector and ArrayList do basically the same thing (there is a subtle difference for concurrent programming), they sometimes have slightly different methods or method names. Basically, the older Vector class was designed to be familiar to the C++ programmers entering the language, whereas the ArrayList was designed to be consistent with a larger framework of collections.
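The following sketch performs the same three steps (insert, remove, look up) on each class. It uses Vector's older method names (addElement, removeElement, elementAt) next to ArrayList's collection-style names (add, remove, get); the class name ContainerDemo is made up, and the generic type parameter assumes Java 5 or later.

```java
import java.util.ArrayList;
import java.util.Vector;

class ContainerDemo {
    static String vectorDemo() {
        Vector<String> v = new Vector<String>();
        v.addElement("a");          // grows automatically as needed
        v.addElement("b");
        v.addElement("c");
        v.removeElement("b");       // later items shift down
        return v.elementAt(1) + v.size() + v.contains("b");
    }

    static String arrayListDemo() {
        ArrayList<String> list = new ArrayList<String>();
        list.add("a");
        list.add("b");
        list.add("c");
        list.remove("b");
        return list.get(1) + list.size() + list.contains("b");
    }

    public static void main(String[] args) {
        System.out.println(vectorDemo());     // prints c2false
        System.out.println(arrayListDemo());  // prints c2false
    }
}
```

Vector also accepts the newer names (add, get, remove), so code written against the collection-style methods works with either class.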

Java API Specification

In class, we walked through the functionality of the Vector and ArrayList. But, this was more of an exercise in reviewing the Java API Documentation than anything else. You are the "Application Programmer". The libraries are your "Interface". Hence API. The API documentation tells you everything you need to know to make use of the features of Java's classes.

The message here is that you don't need to know everything off of the top of your head -- it is okay, and encouraged, to explore the documentation and to learn. The documentation is one of the greatest things about programming in Java.