February 4, 2008 (Lecture 10)

Reading

Concurrency Control: An Introduction

We discussed concurrency and concurrency control in 15-213, and beat the topics to death in 15-410. So, right now, I'm just going to give the "Nickel tour".

Life is simple if we all have our own kingdom. Everything is ours. There's never a reason to share. No compromises ever need to be made. We get what we want, whenever we want it.

But, in the real world, this isn't always the case. Often -- like it or not -- we have to share. And, how we share is particularly important. If we don't do it well, feelings get hurt and things get broken.

The situation is more-or-less the same for software. In many cases, different things are going on and these things need to share certain resources. We call these different, independent things threads of control, a term which includes traditional threads, traditional processes, and other models for work-in-progress. A resource is an abstraction for anything that we might have to share. Another way of viewing a resource is as an abstraction for any reason to wait. Some resources might be associated with physical things, such as printers, terminals, the user, or the network. Other reasons to wait are more abstract resources -- pieces of software that model objects not present in the physical world.

When we've got the potential for more than one thread of control to be chasing the same resource, we need to have some way of resolving contention. In other words, we need some way to figure out who gets the resource and when. Not only do we need such a policy, but we also need some mechanism to enforce it.

Concurrency Control: Mutual Exclusion and Other Policies

The policy that is used to divvy up a resource among its contenders depends largely on the resource and the application. From the perspective of the resource, it might only be possible for one user to use it at a time. Or, it might be possible for a certain fixed number of users to access it at a time. Or, the number of users might depend on the character of the users. We played with all sorts of possibilities in 15-412.

From the perspective of prioritizing the users, there might be all sorts of different scenarios. Perhaps certain users are more important than others and, as a consequence, should be given priority. Or, perhaps it is more efficient for users to be scheduled in a certain order. Or perhaps fairness is our goal and users should be scheduled in exactly the same order as the request is made. Or, perhaps it doesn't matter, and any order will do.

Although the potential policies are endless, we are going to focus on only one of them -- the most important in practice. We are going to focus on finding a mechanism to implement mutual exclusion without priority. Mutually exclusive access to a resource implies that all of the users agree (hence mutual) that the use of the resource by one excludes the concurrent use by another (hence exclusion). Given a mechanism to implement mutual exclusion, we can actually construct much more complex systems.

Characteristics of a Solution

We often like to say that, for a policy and corresponding mechanism to ensure mutual exclusion in a meaningful and useful way, it must ensure three things: mutual exclusion, progress, and bounded wait.

The first of these three is meaningful on its face: If more than one user can access the resource at the same time, we are clearly not ensuring correct concurrency control.

The second, progress, though it takes a bit of explaining, is very common sense. A good solution to concurrency control allows the resource to be used, if available. It is not meaningful to declare, "We're implementing mutual exclusion -- no one can use the resource." Satisfying the progress requirement means that, if the resource is available, and if there is at least one user, someone should get the resource.

The third characteristic also makes sense: If someone wants a resource, they should eventually get it. Again, it's not quite fair to declare, "Sure, we're implementing mutual exclusion -- only process 0 can use the resource." This condition means that, among other things, deadlocks and livelocks should not occur.

Mutual Exclusion: Why Re-Invent the Wheel

Okay, great, so we're building distributed systems. Why re-invent the wheel? In 15-213 and 15-412, we discussed many techniques, the least common denominator of which includes simple mutexes and semaphores.

The problem with these tools in the context of distributed systems is that they rely on shared state: in the case of a simple mutex, a boolean variable; in the case of a semaphore, a shared queue. These, in turn, imply some type of shared memory.

The problem with this is that shared memory itself requires synchronization -- a type of concurrency control. Each of the multiple parties must see the same queue and value before any decisions can be made, and this type of synchronization is exactly what we are trying to construct.
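
To make the shared-state problem concrete, here is a minimal single-machine sketch (plain Python threading; the counter and worker names are just for illustration). The lock works only because every thread sees the same lock object in the same memory -- exactly what separate machines don't have.

    import threading

    counter = 0                  # shared state: lives in one address space
    lock = threading.Lock()      # the mutex itself is also shared state

    def worker():
        global counter
        for _ in range(100000):
            with lock:           # works only because all threads see the same lock object
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)               # 400000 -- but only within a single machine's memory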

As a result, we are going to construct a new set of tools. By comparison, they'll seem a bit crude. But, in practice -- they work. And, we can build more convenient tools from them (at some expense), if we so desire.

Base Case: Centralized Approach

Although centralized approaches have their standard collection of shortcomings, including scalability, fault-tolerance, and accessibility, they provide a useful starting point for discussion. So we'll begin with a centralized approach to ensuring mutual exclusion for a critical section: a single coordinator node receives REQUEST messages, grants the critical section to one requester at a time, queues the others, and hands the critical section to the next waiting requester when it receives a RELEASE.
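
As a rough sketch of this base case, the coordinator might look something like the following. The message names (REQUEST, GRANT, RELEASE) and the send() stub are assumptions for illustration, not a particular system's API, and failure handling is omitted.

    from collections import deque

    def send(node, msg):
        # stand-in for a real network send; here we just print
        print(f"to {node}: {msg}")

    class Coordinator:
        def __init__(self):
            self.holder = None        # node currently in the critical section
            self.waiting = deque()    # FIFO queue of waiting requesters

        def on_request(self, node):
            if self.holder is None:
                self.holder = node
                send(node, "GRANT")
            else:
                self.waiting.append(node)   # node blocks until granted

        def on_release(self, node):
            assert node == self.holder
            self.holder = None
            if self.waiting:
                self.holder = self.waiting.popleft()
                send(self.holder, "GRANT")

    c = Coordinator()
    c.on_request("A"); c.on_request("B")   # A gets GRANT, B waits
    c.on_release("A")                       # B gets GRANT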

Leases

When possible, especially in distributed environments, which are inherently failure-prone, we don't want to give a user a permanent right to a resource. The user might die or become inaccessible, in which case the whole system stops.

Instead, we prefer to grant renewable leases with liberal terms. The basic idea is that we give the resource to the user only for a limited amount of time. Once this time has passed, the user needs to renew the lease in order to maintain access to the shared resource. Within the last ten years or so, almost all mutual exclusion and resource allocation systems have taken this approach, which is especially well suited for centralized approaches.

The amount of time for the lease should be long enough that it isn't affected by reasonable drift among synchronized physical clocks. But, it should be short enough that the time wasted after the end of the task and before the lease expires is minimal. It is also possible to allow the user to relinquish a lease early.
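
A minimal sketch of the lease bookkeeping on the granting side, assuming a made-up 30-second term and ignoring enforcement and clock drift:

    import time

    LEASE_TERM = 30.0   # seconds; assumed term, long relative to clock drift

    class LeaseManager:
        def __init__(self):
            self.holder = None
            self.expires = 0.0

        def acquire(self, node, now=None):
            now = time.time() if now is None else now
            # grant if free, expired, or the current holder is re-acquiring
            if self.holder is None or now >= self.expires or node == self.holder:
                self.holder, self.expires = node, now + LEASE_TERM
                return True
            return False               # someone else holds a live lease

        def renew(self, node, now=None):
            now = time.time() if now is None else now
            if node == self.holder and now < self.expires:
                self.expires = now + LEASE_TERM   # extend the term
                return True
            return False               # lease already lost

        def release(self, node):
            if node == self.holder:
                self.holder, self.expires = None, 0.0   # relinquish early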

The other problem is enforcement -- the user must be unable to access the resource after the credential expires. There are basically two ways of doing this. The leasing agent can tell the resource who the lessee is and what the term is, or the lessor can give the lessee a copy of the lease to present to the resource.

In either case, cryptography is needed to ensure that the parties are who they claim to be and that the lease's content is not altered. We'll discuss how this can be accomplished in more detail later. But, for now, let me just offer that it is often done using public key cryptography.

This can be used to authenticate the parties, such as the lessor, the lessee, or the resource, and it can also be used to make the lease unalterable by the lessee.
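
As a placeholder for the public-key machinery we'll discuss later, here is a sketch that uses an HMAC -- a secret shared by the lessor and the resource, but not the lessee -- so that the lease can't be forged or altered by the lessee. The field names and term are made up for illustration.

    import hmac, hashlib, json, time

    # Shared by the lessor and the resource only; a stand-in for a real
    # public-key signature, which the notes cover later.
    LESSOR_KEY = b"example-shared-secret"

    def issue_lease(lessee, resource, term=30.0):
        lease = {"lessee": lessee, "resource": resource,
                 "expires": time.time() + term}
        body = json.dumps(lease, sort_keys=True).encode()
        tag = hmac.new(LESSOR_KEY, body, hashlib.sha256).hexdigest()
        return lease, tag          # lessee presents both to the resource

    def verify_lease(lease, tag):
        body = json.dumps(lease, sort_keys=True).encode()
        expected = hmac.new(LESSOR_KEY, body, hashlib.sha256).hexdigest()
        return hmac.compare_digest(tag, expected) and time.time() < lease["expires"]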

Timestamp Approach (Lamport)

Another approach to mutual exclusion involves sending messages to all nodes and ordering requests using Lamport logical time. The first such approach was described by Lamport. It requires that ties be broken using host id (or a similar value) to ensure that there is a total ordering among events.

This approach is based on the notion of a global priority queue of requests for the critical section. This queue is ordered by the logical time of the request. Unlike the central algorithm we discussed as the "base case", this approach calls for each node to maintain a copy of this queue. The copies are maintained in a consistent way using a request-reply protocol.

When a node wants access to the critical section, it sends a REQUEST message to every other node. This message can be sent via a multicast or a collection of unicasts. This message contains the logical time of the request. When a participant receives this request, it adds it to its priority queue and sends a REPLY message to the requesting node.

The requesting node takes no action until it receives all of the replies. This ensures that the request has been entered into all of the queues, and that, at least with respect to this request, the queues are consistent. Once it receives all of the replies, the request is free to go -- once its turn arrives.

If the critical section is available (the queue was previously empty), the request can go as soon as the last REPLY is received. If the critical section is in use, the request must wait.

When a node exits the critical section, it removes itself from its own queue and sends a RELEASE message to every other participant, perhaps by multicast. This message directs these nodes to remove the now-completed request for the critical section from their queues. It also directs them to "peek" at their queue.

If the first request in a host's queue is its own, it enters the critical section. Otherwise, it does nothing. A host can enter the critical section when it is at the head of its own queue, because the REPLYs ensure that its request will also be at the head of every other node's queue.

The RELEASE message does not need an ACK or a REPLY, because it does not matter if its arrival is delayed. Since we are assuming a reliable unicast or multicast, the RELEASE will eventually reach each participant. We don't care if it arrives late -- this doesn't break the correctness of the algorithm. In the worst case, it is delayed in its arrival to the next requestor to enter the critical section. In this case, the critical section will go unused until the RELEASE arrives and is processed by the host. In the other cases, it delays the host in "peeking" at the queue, but this is without consequence -- the delayed host wasn't going to enter the critical section, anyway.
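
Putting the pieces together, each participant might keep state along these lines. This is only a sketch of the bookkeeping, assuming a reliable send() supplied from outside; the message formats are invented for illustration and failure handling is omitted.

    import heapq

    class LamportMutexNode:
        def __init__(self, my_id, peers, send):
            self.id = my_id
            self.peers = peers            # ids of all other nodes
            self.send = send              # send(dest, msg), assumed reliable
            self.clock = 0                # Lamport logical clock
            self.queue = []               # heap of (timestamp, node_id) requests
            self.replies_pending = set()  # peers we still need a REPLY from

        def request(self):
            self.clock += 1
            heapq.heappush(self.queue, (self.clock, self.id))
            self.replies_pending = set(self.peers)
            for p in self.peers:
                self.send(p, ("REQUEST", self.clock, self.id))

        def on_message(self, msg):
            kind, ts, sender = msg
            self.clock = max(self.clock, ts) + 1        # Lamport clock update
            if kind == "REQUEST":
                heapq.heappush(self.queue, (ts, sender))
                self.send(sender, ("REPLY", self.clock, self.id))
            elif kind == "REPLY":
                self.replies_pending.discard(sender)
            elif kind == "RELEASE":
                self.queue = [e for e in self.queue if e[1] != sender]
                heapq.heapify(self.queue)
            # "peek": enter only if our request is at the head and all REPLYs are in
            if self.can_enter():
                self.enter_critical_section()

        def can_enter(self):
            return (not self.replies_pending and self.queue
                    and self.queue[0][1] == self.id)

        def release(self):
            self.queue = [e for e in self.queue if e[1] != self.id]
            heapq.heapify(self.queue)
            self.clock += 1
            for p in self.peers:
                self.send(p, ("RELEASE", self.clock, self.id))

        def enter_critical_section(self):
            print(f"node {self.id} enters the critical section")

Note that ties are broken by the (timestamp, node_id) tuple ordering, which is how the host id gives us a total order.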

But wait! Why do we need the REPLY to the REQUEST, then? Can't we just get rid of it? Well, not exactly. The problem is that a reliable protocol guarantees that a message will eventually arrive at its destination, but makes no guarantees about when. The protocol may retransmit the information many, many times, over many, many timeout periods, before successfully delivering the message.

In the case of the RELEASE message, timing is not critical. But this is not the case for the REQUEST message. The REQUEST message must be received before the requesting node can enter the critical section. This is the only way of ensuring that all nodes will see the same head of the queue, should a RELEASE message arrive. Otherwise, two different hosts could look at their queues, determine that they are at the head, and enter the critical section -- disaster. This disaster could be detected after-the-fact when the belated REQUEST arrives -- but that is too late, since mutual exclusion has already been violated.

This approach requires 3(N - 1) messages per request: the REQUEST, REPLY, and RELEASE must each be exchanged with every other node. It isn't very fault-tolerant, either: even a single failed host can disable the system -- it can't REPLY.
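
For example, with N = 5 participants, each entry into the critical section costs 3 x 4 = 12 messages: 4 REQUESTs out, 4 REPLYs back, and 4 RELEASEs out.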