October 29, 2008 (Lecture 17)

Today's Example

"const"

The "const" qualifier can be used to give the compiler a hint about how a variable is to be used. It indicates that the variable is a "constant" -- that its value will not be changed by the programmer. If the programmer does try to assign to the variable directly, the compiler rejects the assignment with an error. If the programmer instead sidesteps the qualifier, for example by pointing a non-const pointer at the constant, it will usually generate a compile-time "warning" ("discards qualifier") rather than an "error" -- the code will still compile. But, the warning properly indicates that there is a problem with the semantics.

We often use const, instead of #define, to define constants. When we do this for global constants, we want to put the definitions into the "globals.c" file and the matching "extern" declarations into the "globals.h" file:

  const double PI = 3.14;
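
Here is one way the globals.h / globals.c split might look, collapsed into a single file so that it compiles on its own. The file boundaries are marked in comments, and circleArea() is a hypothetical function added only to show the constant in use:

```c
/* ---- globals.h: the "extern" declaration that other files include ---- */
extern const double PI;

/* ---- globals.c: the one and only definition ---- */
const double PI = 3.14;

/* ---- any other .c file that includes globals.h ---- */

/* Hypothetical helper, not from the lecture: reads PI but cannot change it. */
double circleArea(double radius) {
  /* PI = 3.0; would be rejected by the compiler: PI is const */
  return PI * radius * radius;
}
```

Because the header carries only the "extern" declaration, every file that includes it shares the single PI defined in globals.c.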
  

There are some trade-offs involved in choosing between "const" and #define:

"const *"

The declaration "const *" is a bit confusing. It declares a "pointer to a constant", not a "constant pointer". In other words, the value being pointed to can't be changed through the pointer, but the pointer itself can still be reassigned.

If one points a "const *" pointer at data that isn't actually declared as "const", this isn't an error. Instead, it just generates a constant view of the data. In other words, although the value might be changeable -- it isn't changeable via that particular pointer. The opposite, pointing a non-const pointer at data that is declared "const", is, of course, not safe -- and not allowed.

It is always legal to copy the value of a "const" variable into a non-const variable -- the copy is a separate variable, so this doesn't place the constant value in any danger.

Consider the example below:

  int x = 5;
  int y = 6;

  int *ip = &x;
  const int *cip = &x;

  /* Legal */
  x = 7; 
  *ip = 8; 

  /* NOT Legal */
  *cip = 9;
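
The "constant view" idea can also be sketched as something that compiles and runs. The function names below are made up for illustration, and the illegal write is left in a comment so that the file still builds:

```c
/* Returns the value seen through a "const int *" after the same int
 * has been changed through an ordinary "int *". */
int readThroughConstView(void) {
  int x = 5;
  int *ip = &x;
  const int *cip = &x;   /* constant view of non-const data */

  *ip = 8;               /* legal: x itself is not const */
  /* *cip = 9; */        /* would not compile: no writes via cip */

  return *cip;           /* the view sees the new value */
}

/* Copies a const value into a non-const variable and changes the copy. */
int copyOfConst(void) {
  const int limit = 10;
  int copy = limit;      /* legal: copy is a separate variable */
  copy = copy + 1;       /* limit is still 10 */
  return copy;
}
```

Notice that the value read through the const pointer does change; "const" here restricts what the pointer may do, not what may happen to the data.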
  

"const" Pointer

What if we do want a "constant pointer", a pointer which cannot be reassigned? Well, we can do this -- but the syntax looks really weird.

The name of the game here is that C, like other programming languages, is designed to make the common case convenient. As a consequence, the syntax for declaring constants favors constant values, not constant pointers -- constant pointers are just much less common.

Also do note that, just like any other constant variable, constant pointers should be initialized at declaration. If they aren't -- they can't later be assigned a value -- so they are basically not useful.

So, let's take a look at this by example:

  int x = 5;
  int y = 6;

  /* This declares a "constant pointer", the pointer itself can't be 
   * reassigned 
   */
  int * const cp = &x;

  /* This declares a constant pointer to constant data. 
   * The pointer can't be assigned. And, the value, itself, can't be 
   * changed via this pointer 
   */
  int const * const cpc = &x;

  cp = &y; /* NOT legal */
  *cp = 7; /* Legal */

  cpc = &y; /* NOT legal */
  *cpc = 7; /* NOT legal */
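
A trick that may help with this syntax is to read the declaration from right to left: "int * const cp" reads as "cp is a constant pointer to an int". The sketch below, with a made-up function name, keeps only the legal operations live and leaves the illegal ones in comments. It writes "const int *" where the example above writes "int const *"; the two spellings mean the same thing:

```c
/* Returns the sum of x and y after writing to x through a constant pointer. */
int writeThroughConstPointer(void) {
  int x = 5;
  int y = 6;

  int * const cp = &x;          /* cp is const; the int it points to is not */
  const int * const cpc = &y;   /* cpc is const, and so is the int as seen via cpc */

  *cp = 7;                      /* legal: changes x */
  /* cp = &y;  */               /* would not compile: cp can't be reassigned */
  /* *cpc = 7; */               /* would not compile: no writes via cpc */

  return *cp + *cpc;            /* 7 + 6 */
}
```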
  

"const" and Function Arguments

On many occasions we use "const" when we pass arguments to functions by reference -- but don't intend to change them. This, for example, often happens when we pass strings into functions. Since we don't have a first-class string type, but instead must use a pointer to an array of chars, strings are always passed by reference.

So, if a programmer is looking at a function prototype and notices that a string is being passed into the function -- it is unclear if the function intends to change the string or just read it. We can clarify the intent, and also protect against accidental misuse, by marking the parameter as "const".

Consider strcpy(), a function which copies from one string to another. Notice that the src is annotated as "const", but the destination isn't.

  char *strcpy (char *dest, const char *src);
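
To see what the annotation buys us inside the function, here is a minimal sketch of a strcpy()-style copy. copyString() is a hypothetical stand-in written for these notes, not the library's actual implementation:

```c
/* Hypothetical stand-in for strcpy(): copies src, including its '\0',
 * into dest and returns dest. The caller must make dest large enough. */
char *copyString(char *dest, const char *src) {
  char *out = dest;
  while ((*out++ = *src++) != '\0') {
    /* src itself may advance; only the chars it points to are read-only */
  }
  return dest;
}
```

If the body accidentally tried to write through src, the compiler would refuse, which is exactly the protection the prototype advertises to the caller.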
  

We run into an analogous situation when we pass structs by reference. We often do this, whether or not the function needs to change the struct, just for performance reasons. But, when we don't actually intend to change the struct, we should note that it is "const":

  int printRecord (const struct studentRec *student) {...}
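
A filled-in version might look something like the sketch below. The fields of struct studentRec are assumptions made up for illustration, since the real layout isn't shown here:

```c
#include <stdio.h>

/* Hypothetical record layout; the lecture does not show the real fields. */
struct studentRec {
  char name[32];
  int id;
};

/* Prints one record and returns the number of characters written.
 * "const" promises the caller that the record is only read. */
int printRecord(const struct studentRec *student) {
  /* student->id = 0; would not compile: the struct is const via this pointer */
  return printf("%d: %s\n", student->id, student->name);
}
```

Only a pointer is copied on the call, so we keep the performance benefit without giving the function permission to modify the caller's record.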
  

We can pass "constant pointers" into functions. If we do this, it prevents the function, internally, from assigning the pointer to the address of a different object. But, in practice, this is almost never done.

First, since the pointer itself is passed by value, the caller is protected from any changes -- they only affect the function's local copy. Second, it muddies up the interface -- the internal constant use is exposed to the caller, who doesn't care. If it really is important to mark the variables as constant as a measure of safety within the function, they can be assigned to constant locals -- without it leaking out to the interface.
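
The "constant locals" approach might look like the following sketch. sumValues() and its variable names are hypothetical:

```c
#include <stddef.h>

/* The interface only promises not to modify the ints. */
int sumValues(const int *values, size_t count) {
  /* Local constant pointers: fixed for the rest of the function,
   * but invisible in the prototype. */
  const int * const first = values;
  const int * const last = values + count;

  int total = 0;
  for (const int *p = first; p != last; ++p) {
    total += *p;
  }
  return total;
}
```

The prototype stays clean, and the function still can't accidentally move its own begin and end markers.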

"const" and Return Values

This is a short section: There is no such thing as a meaningful "const" return value. A return value is an "rvalue" by definition -- it can't be assigned to, anyway.

Linked List Implementation

For fun, we went through and annotated various parts of our linked list code with "const". The updated version is linked at the top.

Guidelines

For our purposes, we should always use "const" in the circumstances below; all other uses are optional:

Memory Errors In C

By now, you guys have probably realized that the most insidious errors in C programs are very often memory-related problems. These problems are nasty because they are related to the language and environment -- not the problem that one is trying to solve.

We see memory errors in a lot of different ways, a few of which are listed below:

So, it is pretty clear that if we use memory that is not properly allocated, one of three things can happen:

But, what if we "leak" memory? Well, the textbook answer is that eventually the system will run out of memory and either malloc() will fail or the program will be killed by the OS for exceeding some resource limit. And, this can surely happen.

But, these days most VM systems are backed by not only a large amount of RAM -- but a truly huge amount of disk. Well before malloc() fails or the system kills off a process, things are likely to slow down, perhaps exponentially, due to paging.

You'll learn about paging in 15-213 and, in depth, in OS. But, to make a long story short, when a computer doesn't have enough memory, it plays a shell game and temporarily frees some memory by writing pages of memory off to disk. Then, should they be needed in the future, they can be read back in -- perhaps after writing out other pages to make room. This shell game dramatically hampers system performance because the disk, which is being used in place of RAM, is much, much, much, much slower.

Those of you who were in class got to hear a story about my master's project. For expediency one morning, I used a malloc in place of a static allocation. I knew I should remove it, but never got around to figuring out how big a buffer I needed. And, I never freed it, because, well, it was there only temporarily, anyway.

Well, I forgot about it and the software rolled out to our project's sponsor. And, with large enough inputs, my software became slow. I optimized the code. I added caching. I restructured large portions of the code. I tried to improve the algorithm.

Months after my graduation, my advisor took a look at it. Puzzled by the behavior, he started using some tools to analyze the situation. And, among those tools, he used strace -- which traces system calls. He found that in just a few seconds of execution, brk() was called some 20,000 times. You'll recall that brk() is the system call that malloc() uses when it runs out of memory to request more from the OS.

He replaced my sloppy malloc() call with a proper static allocation -- and the problem was gone. It would have been similarly fixed if he had simply freed the allocated space at the end of the work loop. But, in truth, malloc() should only be used when a static allocation won't do. Static allocations are "born" with the program. But malloc() is dynamic and costs time during execution. And, in my case, it didn't make sense to free something at the bottom of a loop only to reallocate it again moments later at the top.
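
The two fixes can be sketched roughly as follows. This is not the code from the project; the function names, the buffer size, and the memset() standing in for the real work are all assumptions made for illustration:

```c
#include <stdlib.h>
#include <string.h>

#define WORK_BUF_SIZE 4096   /* hypothetical size, chosen for illustration */

/* Static allocation: the buffer exists for the whole run, no malloc at all. */
size_t workWithStaticBuffer(int passes) {
  static char buf[WORK_BUF_SIZE];
  size_t bytesTouched = 0;
  for (int i = 0; i < passes; i++) {
    memset(buf, i, sizeof buf);       /* stand-in for the real work */
    bytesTouched += sizeof buf;
  }
  return bytesTouched;
}

/* Dynamic allocation done properly: every malloc() is paired with a free(). */
size_t workWithMalloc(int passes) {
  size_t bytesTouched = 0;
  for (int i = 0; i < passes; i++) {
    char *buf = malloc(WORK_BUF_SIZE);
    if (buf == NULL) {
      break;                          /* out of memory: stop early */
    }
    memset(buf, i, WORK_BUF_SIZE);
    bytesTouched += WORK_BUF_SIZE;
    free(buf);                        /* leave this out and each pass leaks */
  }
  return bytesTouched;
}
```

Both versions keep the memory footprint flat no matter how many passes run. The static version also skips the allocator entirely, which is why it was the better fit for a buffer that is needed on every pass.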

Valgrind

Valgrind is a tremendous tool for finding memory problems in C programs. For those who might be familiar, it is similar to IBM's Rational Purify tool. Regardless, it can help you to find tons of different problems, and, of particular concern to us:

It is a dynamic, or runtime, analysis tool. This means that it analyzes your code while it is actually running. Basically, when you run a program using valgrind, it, at runtime, injects its code into your program (or vice-versa, really), so that it is able to trace your code.

But, like all runtime analysis tools, it checks only the code that actually runs -- not all paths. So, in any execution, it won't find problems, for example, in error handlers that don't happen to be exercised or in features that aren't invoked.
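
To make that concrete, consider the hypothetical function below, which contains a deliberate leak on its error path only. It is a made-up example for these notes, not code from any assignment:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical function with a deliberate bug on its error path.
 * Returns 0 on success, -1 if name does not fit in dest. */
int storeName(char *dest, size_t destSize, const char *name) {
  char *scratch = malloc(strlen(name) + 1);
  if (scratch == NULL) {
    return -1;
  }
  strcpy(scratch, name);

  if (strlen(scratch) >= destSize) {
    return -1;          /* DELIBERATE BUG: scratch is leaked, on this path only */
  }

  strcpy(dest, scratch);
  free(scratch);
  return 0;
}
```

A run that only ever passes short names exercises the clean path, so valgrind has nothing to report for this function. The leak shows up only in a run that passes a name too long for dest, which is why it pays to run valgrind against test inputs that reach the error handlers, too.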

This is different than, for example, splint, which is a static tool. It analyzes the source code, rather than the execution. But, as it turns out, unless you program using a very formal and restricted style, runtime tools generally provide a better analysis.

In class we took a look at an excellent tutorial from the kind folks at cprogramming.com. I refer you there for a primer on valgrind: