Assignment #7 - Memory Tracer

Assignment #7 - Memory Tracer Due April 3, 2008 at 11:59PM(Thursday)

Files From Class

Overview

This lab asks you to implement a tool that can be included within a C source file in order to check memory usage for a few specific types of errors:

Freeing a pointer that has never been malloc'ed
Freeing a pointer that has already been freed (without being rellocated)
Leaking memory at the end, and only at the end, of the program

It should work much like the memory trace utility that we wrote in class, except that it should keep track of the memory allocation, and report only problems, rather than reporting each and every request and release.

How Do We Track this?

In order to do this, it should keep track of all "active" allocations by address, adding them to some list when upon call to malloc() and removing them upon call to free(). Free should produce a message to stderr if, and only if, it is asked to free an address which is not known to be "active".
In order to distinguish between an allocation which has been freed more than once from a bogus pointer which never identified allocated space, the utility should also keep track of freed allocations, also by address. So, upon call to free(), an address should not only be removed from the list of allocations, but also added to the list of freed allocations. This way, if a user tries to free it again, the utility can determine that it is a multiple free, not an otherwise bogus pointer.
This also means that, upon malloc(), the address of the allocation should be removed from the list of "freed allocations", if present. If it isn't removed, an appropriate call to free might appear as a duplicate free.
The utility should also warn upon call to exit(), if any allocations remain within the allocated list. Specifically, it should report the address of each allocation which has not been freed. This will require exit() be replaced as was malloc and free.
For this to work, instead of returning from main, exit() must be called. Use a standard int main, but call exit() before your return 0. This means control will never reach return 0, but the compiler will be happy.

Format of Output

Messages should be as in the example below:
    Free requested of unallocated pointer 0x5A456312 at file.c:123
    Free requested of already freed pointer 0x5A456312 (125 bytes) at file.c:123
    Allocation at address 0x5A456312 (125 bytes) not freed at program termination
In the case of a duplicate free, report both the location of the free and the location of the original malloc. This can be done by leaving the information in the "active" list upon a free. If this is done, both sets of information are available -- and we can determine that the allocation is freed by its presence within the "freed" list alone.
    Free requested of already freed pointer 0x5A456312 (125 bytes) at file.c:123
    ...original allocation was at file.c:123

The Preprocessor

In class we discussed the use of #undef and #define to clear out any existing definition of malloc and free and to also redefine them to map to our own versions.
In addition, this lab will make use of the preprocessor predefined names "__FILE__" and "__LINE__". When the preprocessor encounters these names, it replaces them with the name of the file and the line number, respectively. "__FILE__" is a string literal and "__LINE__" is a decimal constant.
If you'd like to see how they work, try adding this at any point within your favorite program:
  /* prints the printf's file and line number */
  printf ("%s:%d\n", __FILE__, __LINE__); 
  
In order to incorporate this into your malloc and free, you'll want to change the malloc and free macros that we wrote in class to capture the line and file at the point where the program called malloc or free. Remember that the macro is expanded "in place", so the "__LINE__" will be observed where malloc is called within the code, not where malloc is defined as a macro.
Below is an example of the modified macro.
  #define malloc(x) mymalloc(x, __FILE__, __LINE__)
  

What to Track

Our main goal is to track memory addresses. But, reporting a problem to the user by memory address is not super-helpful. It doesn't answer the question, "Where did the problem occur?". And, it doesn't give any clue about "Which object was allocated?"
So, as you've noticed by our output, we're actually tracking a few things:

Memory address
Allocation size
The file and line number of the malloc/free

In order to do this, you'll probably want to organize the information you need into a struct and keep track of instances of that struct, rather than pure addresses. this way, everything is in one place.

Keeping Track of Addresses: The Hash Table Approach

This assignment requires that you keep track of two different things: the "active" pointers and "freed" pointers. If you are familiar with hash tables, they are the logical choice of data structure for this assignment. One hash table can be used for each purpose.
If you take this route, you might want to consider the hash function shown below. It derives an index into the hash table directly from the memory address. As discussed in class, it strips off the least-significant bits (right most) part of an address -- because these bits are affected by "alignment". And, it also strips of the high-order bits of an address -- because they are affected by the "big picture" location of the stack in memory. And, within the framed bits, it favors the lower bits, which change more rapidly, so the hash table won't fill sequentially as the stack grows upward. It isn't a really good hash function -- but it is simple and good enough.
  #define hash(address,table_size) ( ((address) / 16) % (table_size) )
  
Also, you'll want to pick a hash table size that is "plenty big". This should be a configuration constant (#define or const), so that it can be changed. As yourself how many pointers you'll have in your program, estimate very high, and give your self several times that number -- and never less than a few hundred.

Keeping Track of Addresses: The Linked-List Approach

If you are unfamilar with hash tables, or just scared of them, you may use linked lists and search them brute-force. It isn't pretty -- but we do know that not everyone yet feels comfortable enough with hash tables. There is no penalty for this approach nor is there extra credit for using hash tables.

We're Here To Help!

As always -- remember, we're here to help!