November 20, 2008 (Lecture 23)

What is Distributed Shared Memory (DSM)?

Before we discuss distributed shared memory (DSM) in the context of distributed systems, let's think back to 15-213 for a few minutes. Let's recall the introduction to virtual memory vis-a-vis physical memory.

So, what is virtual memory? Well, like everything else discussed in an operating systems course, it is an abstraction. It is a model of memory that incorporates the useful characteristics, perhaps idealized, while neglecting the details that convolute things.

For example, we discussed a model for paged virtual memory that was addressable, readable, and writable, just like real memory, while also providing better protection, a larger address space, and no external fragmentation.

Well, DSM is also an abstraction; it is nothing more than a type of virtual memory. Like local physical memory, DSM is addressable, and it may be readable and/or writable. Much like the virtual memory model that we used last semester, it may be larger than physical memory, and it may provide protection mechanisms that control how its elements can be accessed.

So, what are the defining properties of DSM? Well, it is both distributed and shared. Distributed suggests that the objects stored within the memory must, in some capacity, be stored on more than one host. Shared suggests that these objects must be accessible by more than one host.

Now, let's consider the important elements of the design of a DSM. What questions does one need to ask in order to understand the model that is implemented by a particular system? Several different questions come to mind. Let's take a look.

What Is Shared? (What Is The Unit of Sharing?)

A memory stores things. But, what are these things? Do I ask my memory for a byte? A word? A page? A segment? Or, perhaps some higher-level object?

In a traditional virtual memory system, we can address a single word, but we move memory around in collections of words known as pages. In physical memory, we manipulate things at the word or byte level. Some operating systems are object-oriented and objects are atomic units in memory (consider the Smalltalk environment, for example).

All of these are possibilities in a distributed system as well. Bytes (or words) are a very fine level of granularity. This level of granularity might be useful if many small objects are allocated, because it will reduce fragmentation. It might also be useful if spatial locality is very poor, because it will reduce the data moved with each memory access. But, if spatial locality is strong, it might make more sense to treat adjacent bytes together, especially if external fragmentation is a concern -- this might suggest a paged scheme. Or, if internal fragmentation is a concern and strong compiler support is available, segmentation might be indicated. If the software is object-oriented, distributed shared objects might make more sense than a more rudimentary model.
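To make the granularity tradeoff concrete, here is a toy sketch (the class and function names are invented for illustration, not any real DSM's API). It contrasts fetching a single byte from a remote host with fetching the whole enclosing page: the finer granularity moves far less data per access, while the page fetch amortizes its cost only if nearby bytes are used soon afterward.

```python
# Toy model of a remote host's memory, counting "network" traffic.
PAGE_SIZE = 4096

class RemoteStore:
    def __init__(self, size):
        self.data = bytearray(size)
        self.bytes_transferred = 0  # traffic this store has served

    def fetch(self, addr, length):
        self.bytes_transferred += length
        return bytes(self.data[addr:addr + length])

def read_byte_granularity(store, addr):
    # Fine granularity: move exactly one byte per access.
    return store.fetch(addr, 1)[0]

def read_page_granularity(store, addr):
    # Page granularity: move the whole enclosing page (and, in a real
    # DSM, cache it so later nearby reads are free).
    base = (addr // PAGE_SIZE) * PAGE_SIZE
    page = store.fetch(base, PAGE_SIZE)
    return page[addr - base]

store = RemoteStore(2 * PAGE_SIZE)
read_byte_granularity(store, 10)   # costs 1 byte of traffic
read_page_granularity(store, 10)   # costs 4096 bytes of traffic
print(store.bytes_transferred)     # 4097
```

If the program touches many bytes of that page soon after, the page fetch wins; if it touches only the one byte, the fine granularity wins. That is exactly the locality argument above.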

"What is shared?" is in fact a very critical, perhaps the most critical question in the design of a DSM.

Protection and Security

Much as is the case with a traditional memory system, protection is a concern. It is often useful (or absolutely mandatory) for memory to be protected from accidental misuse (for example, writing over an executable) and intentional misuse (for example, writing over your own grades in my gradebook). These concerns suggest the familiar requirement for protection mechanisms that can implement a reasonable, user-friendly policy.

Some well-known DSM models make extensive use of immutable read-only objects. Such systems include Linda and JavaSpaces. These systems allow objects to be created but never changed (think about all of the problems that this mitigates).
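To see why immutability mitigates so many problems, here is a minimal sketch of a write-once store in the spirit of Linda or JavaSpaces (the class and method names here are hypothetical, not either system's real API). Because an object can never change after creation, no replica can ever hold a stale version, and concurrent readers need no locking.

```python
class WriteOnceSpace:
    """Objects may be created and read, but never modified in place."""
    def __init__(self):
        self._objects = {}

    def put(self, key, value):
        if key in self._objects:
            raise ValueError("object %r is immutable" % (key,))
        self._objects[key] = value

    def get(self, key):
        return self._objects[key]

space = WriteOnceSpace()
space.put("grade", "A")
print(space.get("grade"))    # A
try:
    space.put("grade", "F")  # a second write is rejected
except ValueError:
    print("rejected")
```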

DSMs also have another concern: security. Whereas the memory bus of a traditional system is unlikely to be attacked by sniffers, imposters, or "men in the middle", this isn't necessarily the case with a distributed system. For this reason, a DSM might want to include some type of encryption and/or authentication when sending memory elements over the network.
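As one hedged sketch of the authentication half, a DSM could attach an HMAC to each page it ships, so an imposter or man-in-the-middle cannot silently alter the page in transit. Key distribution is ignored here; the sketch simply assumes the two hosts already share a secret key.

```python
import hmac
import hashlib

KEY = b"shared-secret"  # assumed pre-shared between the two hosts

def seal(page: bytes) -> bytes:
    # Prepend a 32-byte SHA-256 HMAC tag to the page before sending.
    tag = hmac.new(KEY, page, hashlib.sha256).digest()
    return tag + page

def open_sealed(message: bytes) -> bytes:
    # Recompute the tag on receipt; reject the page if it doesn't match.
    tag, page = message[:32], message[32:]
    expected = hmac.new(KEY, page, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message failed authentication")
    return page

msg = seal(b"page contents")
assert open_sealed(msg) == b"page contents"

tampered = msg[:-1] + b"X"   # an attacker flips the last byte in transit
try:
    open_sealed(tampered)
except ValueError:
    print("tampering detected")
```

Authentication alone doesn't hide the contents, of course; if sniffing is also a concern, the page would be encrypted as well.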

Who Runs the Smoke and Mirrors?

DSMs are great, but everyone wants them to be someone else's problem. For example, DSMs can be implemented in hardware, as is the case in many traditional multiprocessor systems. They can also be implemented by the operating system, as is the case with Amoeba. Another approach is to hide them in a software layer between the application and the operating system. Although they are not DSMs, both CORBA and Java RMI, with which you may be familiar, are middleware solutions of this kind. Applications can certainly implement their own DSM by performing their own message passing.

Hardware support, although preferred by computer scientists, isn't practical on a heterogeneous network composed of commodity devices. Punting the problem off to the application isn't a good answer, because it is a lot of work for any single application developer and isn't readily reused. Furthermore, without at least some lower-level support, the interface can never be really clean. Operating-system-level support is reusable and convenient for application programmers -- but it is hard to find a single shoe that fits all applications. This makes middleware a popular choice -- it is reusable, but also replaceable.

Putting DSM On Sale

So, we can implement a DSM with atomic consistency -- but it is expensive. What happens if we don't need atomic consistency? Atomic consistency guarantees that all reads see the most recently written value. But this might not be a requirement for any particular application. If we can relax this guarantee, we can implement DSM more cheaply. Let's consider some common options:
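Before looking at the specific options, here is a toy illustration (with invented names) of why relaxing the guarantee is cheaper: each write is applied locally and propagated lazily, so a write costs one local operation instead of a synchronous round to every replica. The price is that a read on another host may return a stale value until propagation happens.

```python
class LazyReplica:
    """One host's copy of the shared memory, with lazy propagation."""
    def __init__(self):
        self.store = {}
        self.pending = []   # updates not yet pushed to peers

    def write(self, key, value):
        self.store[key] = value        # local and cheap
        self.pending.append((key, value))

    def sync_to(self, peer):
        # Propagation happens later, in a batch, not per-write.
        for key, value in self.pending:
            peer.store[key] = value
        self.pending = []

a, b = LazyReplica(), LazyReplica()
a.write("x", 1)
print(b.store.get("x"))   # None -- stale read, the update hasn't arrived
a.sync_to(b)
print(b.store.get("x"))   # 1
```

Atomic consistency would forbid that `None`: the write to "x" would have to reach (or invalidate) every copy before completing.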

Weak Consistency Models

The consistency models that we have discussed so far are called unsynchronized consistency models. What this really means is that consistency, the synchronization of data, is handled without the involvement of the programmer. Another collection of consistency models, known as synchronized consistency models, a.k.a. weak consistency models, requires the involvement of the programmer. These models are unfortunate in the sense that they don't allow the DSM to remain entirely transparent to the application developer. But, the good thing is that the application developer can provide the DSM system with information that makes its job much simpler and cheaper.
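The flavor of programmer involvement can be sketched as an acquire/release discipline (all names below are hypothetical, for illustration only): the DSM only promises that updates become visible after the reader acquires what the writer has released, so it can buffer writes locally and publish them in one batch instead of synchronizing on every write.

```python
class SyncDSM:
    """Authoritative copy of the shared data, updated only at release."""
    def __init__(self):
        self.master = {}

    def host(self):
        return Host(self)

class Host:
    def __init__(self, dsm):
        self.dsm = dsm
        self.cache = {}        # this host's local working copy

    def acquire(self):
        # The programmer signals "I need up-to-date data now":
        # pull everything published by earlier releases.
        self.cache = dict(self.dsm.master)

    def write(self, key, value):
        self.cache[key] = value   # purely local until release

    def release(self):
        # The programmer signals "my updates may now be seen":
        # publish all buffered writes in one batch.
        self.dsm.master.update(self.cache)

    def read(self, key):
        return self.cache.get(key)

dsm = SyncDSM()
writer, reader = dsm.host(), dsm.host()

writer.acquire()
writer.write("x", 42)
reader.acquire()
print(reader.read("x"))   # None -- the writer has not released yet
writer.release()
reader.acquire()
print(reader.read("x"))   # 42
```

The extra information the programmer supplies is exactly the acquire and release calls: between them, the DSM is free to do no synchronization work at all.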