January 28, 2004 (Lecture 9)

Vector Logical Time

Vector logical time can be used to detect causality violations after the fact. Let's discuss vector logical time -- and then take a look at how we can detect prior causality violations by comparing the current local time with the timestamp of an incoming message.

As with Lamport logical time, each host maintains its own notion of the local time and updates it using the timestamps placed by the sender onto messages. But with vector logical time, the time contains more information -- it is a vector with one entry for each host. In other words, this vector contains not only the event count for the host itself, but also the last-known event counts for each and every other host.

The only entry in a message's timestamp that is guaranteed to be up-to-date is the entry that represents the sender. For this reason, it is possible that the receiver may have a more up-to-date understanding of the logical time on some of the other hosts. This would be the case if, for example, the receiver has heard from a third host more recently than the sender has.

As a result, when a host receives a message, it merges its own time vector with the timestamp sent with the message -- it selects the higher of the two values for each element. This ensures that the receiver ends up with information that is at least as up-to-date as the sender's, without losing anything it already knew.
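The update and merge behavior described above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the lecture: the class name `VectorClock`, its method names, and the convention that both sending and receiving count as local events are assumptions.

```python
class VectorClock:
    """Vector logical clock for one host in a fixed group of n hosts."""

    def __init__(self, n_hosts, host_id):
        self.host_id = host_id        # index of this host's own entry
        self.time = [0] * n_hosts     # last-known event count for every host

    def tick(self):
        """Record a local event by advancing this host's own entry."""
        self.time[self.host_id] += 1

    def send(self):
        """Count the send as an event and return a copy to use as the timestamp."""
        self.tick()
        return list(self.time)

    def receive(self, timestamp):
        """Merge element by element (take the higher value), then count the receive."""
        self.time = [max(mine, theirs) for mine, theirs in zip(self.time, timestamp)]
        self.tick()


# Three hosts: P0 sends to P1, which then sends to P2.
p0, p1, p2 = (VectorClock(3, i) for i in range(3))
m1 = p0.send()       # timestamp [1, 0, 0]
p1.receive(m1)       # P1 merges and ticks: [1, 1, 0]
m2 = p1.send()       # timestamp [1, 2, 0]
p2.receive(m2)       # P2 merges and ticks: [1, 2, 1]
print(p2.time)
```

Note that P2 learns about P0's event without ever hearing from P0 directly; the knowledge travels along with P1's timestamp.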

Below is a summary of the rules for vector logical clocks:

Recall this example from earlier:

Let's label it in vector time, just for practice:

Comparing Vector Timestamps

When comparing vector timestamps, we compare them by comparing each element in one timestamp to the corresponding element in the other timestamp.

The above definition of vector timestamp comparison ensures both of the following properties:

EventA "happens before" EventB ==> Vector_Timestamp (EventA) < Vector_Timestamp (EventB)

Vector_Timestamp (EventA) < Vector_Timestamp (EventB) ==> EventA "happened before" EventB
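A minimal sketch of the element-by-element comparison, assuming the usual definition: one timestamp is less than another if none of its elements is greater and at least one is strictly smaller. Two different timestamps that are not ordered either way are concurrent. The function names `vt_less` and `vt_concurrent` are hypothetical.

```python
def vt_less(a, b):
    """True if timestamp a is less than timestamp b."""
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def vt_concurrent(a, b):
    """True if the timestamps differ and neither is less than the other."""
    return list(a) != list(b) and not vt_less(a, b) and not vt_less(b, a)


print(vt_less((1, 0, 0), (2, 0, 2)))        # True
print(vt_less((2, 0, 2), (2, 0, 2)))        # False: equal timestamps are not ordered
print(vt_concurrent((1, 0, 0), (0, 1, 0)))  # True: each is ahead in one element
```

Unlike Lamport timestamps, this ordering is only partial, which is what allows the comparison to tell "happened before" apart from "concurrent".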

Detecting Causality Violations Using Vector Timestamps

We can detect a causality violation using vector timestamps by comparing the timestamp of a newly received message to the local time. If the message's timestamp is less than the local time vector, a (potential) causality violation has occurred.

Why? For the local time to have advanced such that it is ahead of the timestamp of the newly received message, a prior message must have advanced the local time. The sender of that prior message must already have known about the newly arrived message -- by sending it or by receiving it -- before it sent its prior message to us. Thus a (potential) causality violation occurred.

Admittedly, this doesn't fix the problem -- but at least we have a way of detecting and logging the problem. This will make it much easier to isolate and debug our system -- or at least to take mitigating action to ensure that the output from the system is correct.

Now, let's consider this familiar example again:

This time, let's label it using vector logical time and vector timestamps:

Notice that the timestamp on M1 indicates a causality violation. M1's timestamp is (1,0,0). The local time on P2 is (2,0,2). (1,0,0) is less than (2,0,2). This indicates that a causality violation has occurred -- someone who had already seen M1 sent P2 a message, and P2 received that message before it received M1.

If the timestamps are concurrent, this does not represent a problem -- the messages are unrelated.
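The check can be sketched as follows, using the timestamps from the example above. The helper names `is_less` and `causality_violation` are hypothetical, and the sketch assumes the check is made before the incoming timestamp is merged into the local time.

```python
def is_less(a, b):
    """True if no element of a exceeds b and at least one is strictly smaller."""
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def causality_violation(local_time, msg_timestamp):
    """True if the incoming message's timestamp is less than the local time."""
    return is_less(msg_timestamp, local_time)


local_p2 = (2, 0, 2)   # P2's local time when M1 arrives
m1 = (1, 0, 0)         # timestamp carried by M1

print(causality_violation(local_p2, m1))         # True: log a potential violation
print(causality_violation(local_p2, (0, 1, 0)))  # False: the timestamps are concurrent
```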

Matrix Logical Clocks

Before we leave time to discuss communication, let me mention one more detail. There is actually another type of logical clock that is one step more encompassing than a vector logical clock -- the matrix logical clock. Much like a vector clock maintains the simple logical time for each host, a matrix clock maintains a vector of the vector clocks for each host.

Every time a message is exchanged, the sending host tells us not only what it knows about the global state of time, but also what other hosts have told it that they know about the global state of time -- reliable gossip.

This is useful in applications such as checkpointing and recovery, and garbage collection. In these cases, having a lower bound on what another host knows can prove useful by enabling the disposal of unusable objects. In the case of garbage collection, these are objects that no other object can reference. In the case of recovery, they are logs and/or checkpoints that are no longer needed.
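As a rough sketch of the idea, assuming one common formulation of matrix clocks: each host keeps one row per host, where row j is its best knowledge of host j's vector clock, and the minimum down a column gives the lower bound mentioned above. The names `MatrixClock` and `known_by_all`, and the choice to count sends and receives as events, are assumptions made for illustration.

```python
class MatrixClock:
    """Matrix logical clock for one host in a fixed group of n hosts."""

    def __init__(self, n_hosts, host_id):
        self.host_id = host_id
        # rows[j] is this host's best knowledge of host j's vector clock.
        self.rows = [[0] * n_hosts for _ in range(n_hosts)]

    def tick(self):
        self.rows[self.host_id][self.host_id] += 1

    def send(self):
        """Count the send as an event and return a copy of the whole matrix."""
        self.tick()
        return [list(row) for row in self.rows]

    def receive(self, sender_id, matrix):
        # Gossip: absorb what the sender knew about every host's vector.
        for j, row in enumerate(matrix):
            self.rows[j] = [max(a, b) for a, b in zip(self.rows[j], row)]
        # Our own vector also merges with the sender's own vector.
        own = self.rows[self.host_id]
        self.rows[self.host_id] = [max(a, b) for a, b in zip(own, matrix[sender_id])]
        self.tick()

    def known_by_all(self, k):
        """Lower bound on host k's event count that every host is known to have seen."""
        return min(row[k] for row in self.rows)


a, b = MatrixClock(2, 0), MatrixClock(2, 1)
b.receive(0, a.send())
before = a.known_by_all(0)   # 0: a cannot yet tell whether b saw its first event
a.receive(1, b.send())
after = a.known_by_all(0)    # 1: b's reply shows that b saw a's first event
print(before, after)
```

Once `known_by_all(k)` reaches some value, anything that was kept only so that other hosts could catch up to that point on host k can be discarded.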

We'll discuss matrix time in more detail when we discuss checkpointing and recovery -- it is much easier to understand with a clear application.