Return to the lecture notes index

Lecture 7 (September 16, 2014)

Synchronization: Sharing Resources and Coordination

Cooperating processes (or threads) often share resources. Some of these resources can be concurrently used by any number of processes. Others can only be used by one process at a time.

The air in this room can be shared by everyone without coordination -- we don't have to coordinate our breathing. But the printer wouldn't be much use to any of us if we all used it at the same time. I'm not sure exactly how that would work -- perhaps it would print all of our images superimposed, or perhaps a piece here-or-there from each of our jobs. Either way, we would want to do something to coordinate our use of the resource. Another example is an intersection -- unless we have some way of controlling our use of it (stop signs? traffic lights?) -- smack!

The policy that defines how a resource should be shared is known as a sharing discipline. The most common sharing discipline is mutual exclusion, but many others are possible. When the mutual exclusion policy is used, the use of a resource by one process excludes its use by any other process.

When we think about solving a problem in an environment that involves concurrency, we often need to ask the question, "What resources are in play?". Resources can be physical resources, such as a printer, or logical resources, such as a linked list. We then need to follow this up with the question, "Which of these resources can be freely shared, like air, and which require coordination, like the printer?" Those that require coordination are known as critical resources.

How do we manipulate a resource from within a program? With code, of course. Any portion of a program that manipulates a critical resource in a way that requires coordination is known as a critical section. Sometimes critical sections may appear across several lines of code or statements, but sometimes they may be hidden within a single line of code. One of the classic examples of a critical section is code which loads a value into a register from memory, updates it, and writes it back to memory. This might exist, for example, in the single statement, "i++;"

Characteristics of a Solution

A solution to the mutual exclusion problem must satisfy three conditions:
  1. Mutual Exclusion: Only one process can be in the critical section at a time -- otherwise, what critical section?
  2. Progress: No process is forced to wait for an available resource -- otherwise very wasteful.
  3. Bounded Waiting: No process can wait forever for a resource -- otherwise an easy solution: no one gets in.

Special Instructions

The most common synchronization primitives used in conventional monolithic systems involve hardware support. Specifically, the processor provides very simple instructions that can give us a limited guarantee of mutual exclusion in hardware. From these small guarantees, we can build more complex constructs in software. This is the approach that is, virtually universally, in use in modern hardware.

Test-and-Set

One such instruction, in its simplest form, is the test-and-set instruction. This instruction allows the atomic testing and setting of a value.

The semantics of the instruction are below. Please remember that it is atomic: it is not interruptible.

    TS (<mem loc>)
    {
       if (<mem loc> == 0)
       {
           <mem loc> = 1;
           return 0;
       }
       else
       {
          return 1;
       }
    }

Given the test-and-set instruction, we can build a simple synchronization primitive called the mutex or spin-lock. A process busy-waits until the test-and-set succeeds, at which time it can move into the critical section. When done, it can mark the critical section as available by setting the mutex's value back to 0 -- it is assumed that this operation is atomic.

    Acquire_mutex(<mutex>) /* Before entering critical section */
    {
        while (TS(<mutex>))
        ;
    }

    Release_mutex(<mutex>) /* After exiting critical section */
    {
        <mutex> = 0;
    }
    

Compare-and-Swap

Real-world hardware generally implements a slightly more general version of the test-and-set, such as the compare-and-swap described below:

    CS (<mem loc>, <expected value>, <new value>)
    {
        if (<mem loc> == <expected value>)
        {
           <mem loc> = <new value>;
           return 0;
        }
        else
        {
           return 1;
        }
    }

Note that test-and-set is just a special case of compare-and-swap:

TS(x) == CS (x, 0, 1)

The pseudo-code below illustrates the creation of a mutex (spin lock) using compare-and-swap:

    Acquire_mutex(<mutex>) /* Before entering critical section */
    {
       while (CS (<mutex>, 1, 0))
       ;
    }

    Release_mutex(<mutex>) /* After exiting critical section */
    {
       <mutex> =  1;
    }

    

Counting Semaphores

Now that we have hardware support, and a very basic primitive, the mutex, we can build higher-level synchronization constructs that can make our life easier.

The first of these higher-level primitives that we'll discuss is a new type of variable called a semaphore. It is initially set to an integer value. After initialization, its value can only be affected by two operations:

P(x) was named from the Dutch word proberen, which means to test.

V(x) was named from the Dutch word verhogen, which means to increment.

The pseudo-code below illustrates the semantics of the two semaphore operations. This time the operations are made to be atomic outside of hardware using the hardware support that we discussed earlier -- but more on that soon.

    /* proberen - to test */
    P(sem)
    {
       while (sem <= 0)
       ;
       sem = sem - 1;
    }


    /* verhogen - to increment */
    V(sem)
    {
       sem = sem + 1;
    }
    

In order to ensure that the critical sections within the semaphores themselves remain protected, we make use of mutexes. In this way we again grow a smaller guarantee into a larger one:

    P (csem) {
       while (1)  {
          Acquire_mutex (csem.mutex);
          if (csem.value <= 0) {
             Release_mutex (csem.mutex);
             continue;
          } 
          else {
              csem.value = csem.value - 1;
              Release_mutex (csem.mutex);
              break;
          }
       }
    }


    V (csem) 
    {
        Acquire_mutex (csem.mutex);
        csem.value = csem.value + 1;
        Release_mutex (csem.mutex);
    }
    

But let's carefully consider our implementation of P(csem). If contention is high and/or the critical section is large, we could spend a great deal of time spinning. Many processes could occupy the CPU and do nothing but waste cycles, while the process that holds the critical section sits in the runnable queue, waiting for its chance to run and release it. This busy-waiting makes already high resource contention worse.

But all is not lost. With the help of the OS, we can implement semaphores so that the calling process will block instead of spin in the P() operation and wait for the V() operation to "wake it up" making it runnable again.

The pseudo-code below shows the implementation of such a semaphore, called a blocking semaphore:

    P (csem) {
       while (1)  {
          Acquire_mutex (csem.mutex);
          if (csem.value <= 0) {
             insert_queue (getpid(), csem.queue);
             Release_mutex_and_block (csem.mutex); /* atomic: lost wake-up */
          } 
          else {
              csem.value = csem.value - 1;
              Release_mutex (csem.mutex);
              break;
          }
       }
    }


    V (csem) 
    {
        Acquire_mutex (csem.mutex);

        csem.value = csem.value + 1;
        dequeue_and_wakeup (csem.queue);

        Release_mutex (csem.mutex);
    }
    

Please notice that the P()ing process must atomically become unrunnable and release the mutex. This is because of the risk of a lost wakeup. Imagine the case where these were two different operations: release_mutex(csem.mutex) and sleep(). If a context switch occurred in between the release_mutex() and the sleep(), it would be possible for another process to perform a V() operation and attempt to dequeue_and_wakeup() the first process. Unfortunately, the first process isn't yet asleep, so it misses the wake-up -- instead, when it runs again, it immediately goes to sleep with no one left to wake it up.

Operating systems generally provide this support in the form of a sleep() system call that takes the mutex as a parameter. The kernel can then release the mutex and put the process to sleep in an environment free of interruptions (or otherwise protected).

Boolean Semaphores

In many cases, it isn't necessary to count resources -- there is only one. A special type of semaphore, called a boolean semaphore, may be used for this purpose. Boolean semaphores may only have a value of 0 or 1. In most systems, boolean semaphores are just a special case of counting semaphores, also known as general semaphores.

Bottom Line?

This has been a nice trip down 15-213 memory lane, with a little more detail. But, what is critically important to us is that we realize that all of these primitives rely upon atomic shared memory.

Unfortunately, in distributed systems, we don't naturally have that. So, we are in a chicken-and-egg situation. Without atomic shared memory, we can't implement synchronization -- and without synchronization, we can't implement atomic shared memory.

Next class, we're going to learn to break out of this paradox. For now, you know why these techniques won't work in distributed systems -- and have a deeper understanding for interview questions, even if you don't take OS.