In centralized systems, where one or more processors share a common bus, time isn't much of a concern. The entire system shares the same understanding of time: right or wrong, it is consistent.
In distributed systems, this is not the case. Unfortunately, each system has its own timer that drives its clock. These timers are based either on the oscillation of a quartz crystal or an equivalent IC. Although they are reasonably precise, stable, and accurate, they are not perfect. This means that the clocks will drift away from the true time. Each timer has different characteristics -- characteristics that might change with time, temperature, etc. This implies that each system's time will drift away from the true time at a different rate -- and perhaps in a different direction (slow or fast).
How Often Do We Need To Resynchronize Clocks?
Coordinating physical clocks among several systems is possible, but it can never be exact. In distributed systems, we must be willing to accept some drift away from the "real" time on each clock.
A typical real-time clock within a computer has a relative error of approximately 10^-5. This means that if the clock is configured for 100 ticks/second, we should expect 360,000 +/- 4 ticks per hour. This worst-case relative error is known as the clock's maximum drift rate.
Since different clocks can drift in different directions, the worst case is that two clocks in a system will drift in opposite directions. In this case the difference between these clocks can grow at twice the relative error. Using our example relative error of 10^-5, this suggests that two clocks can drift apart by up to 8 ticks/hour. So, for example, if we want all clocks in this system to be within 4 ticks of each other, we must synchronize them twice per hour.
A general formula expressing these ideas follows:

largest_synchronization_interval = maximum_acceptable_difference / (2 * maximum_drift_rate)
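This calculation can be sketched in a few lines of Python. The function and variable names here are illustrative, not part of any standard API; the numbers come from the example above.

```python
def largest_sync_interval(max_acceptable_difference, max_drift_rate):
    """Worst case: two clocks drift in opposite directions, so they
    separate at twice the individual drift rate."""
    return max_acceptable_difference / (2 * max_drift_rate)

# Example from the text: relative error 1e-5 at 100 ticks/second.
ticks_per_hour = 100 * 3600                 # 360,000 ticks/hour
drift_per_hour = 1e-5 * ticks_per_hour      # ~3.6, i.e. about 4 ticks/hour

# To keep all clocks within 4 ticks of each other:
interval = largest_sync_interval(4, 4)      # ticks, ticks/hour
print(interval)                             # 0.5 hours -> twice per hour
```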
Cristian's Algorithm

Cristian's Algorithm is one approach to synchronizing physical clocks using a time server. The time server is a special host that contains the reference time -- the time that is considered to be (most) correct. Typically time servers are themselves synchronized against another source, such as a UTC time server based on an atomic clock.
Cristian's algorithm is a client-pull mechanism. Clients ask the time server for the current time. When the server's response is received, the client tries to adjust it for the transit delay, and then adjusts its own clock.
To understand the reason that the adjustment for the transit time is needed, please remember that the time server is replying with the current time as of the time that it received the request -- not the time that the request was dispatched and not the time that the reply was received by the client.
In order to better estimate the current time, the host needs to estimate how long it has been since the time server replied. Typically this is done by assuming that the client's time is reasonably accurate over short intervals and that the latency of the link is approximately symmetrical (the request takes as long to get to the server as the reply takes to get back). Given these assumptions, the client can measure the round trip time (RTT) of the request using its local clock, add half of this time to the server's timestamp, and subtract the local time at which the reply arrived. The result is a good estimate of the error in the local clock. In other words:
local_clock_error = server_timestamp + (response_received_timestamp - request_dispatched_timestamp)/2 - response_received_timestamp

local_clock_error = server_timestamp + RTT/2 - response_received_timestamp
If the server is known to have a predictable latency in answering requests once it receives them, an adjustment can also be made for this:
local_clock_error = server_timestamp + (RTT - server_latency)/2 - response_received_timestamp
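The client-side arithmetic can be sketched as follows. This is a minimal sketch assuming symmetric latency; the function name and the example timestamps are made up for illustration.

```python
def estimate_clock_error(request_sent, server_timestamp, response_received,
                         server_latency=0.0):
    """All times in seconds, measured on the local clock except
    server_timestamp. Returns how far the local clock is behind (+)
    or ahead (-) of the server."""
    rtt = response_received - request_sent
    # The server sampled its clock roughly halfway through the
    # (latency-adjusted) round trip.
    estimated_now = server_timestamp + (rtt - server_latency) / 2
    return estimated_now - response_received

# Local clock reads 10.2 when a reply stamped 105.0 arrives,
# 0.2 seconds after the request was sent: the clock is ~94.9s behind.
error = estimate_clock_error(10.0, 105.0, 10.2)
```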
These estimates can be further improved by adjusting for unusually large RTTs. Long RTTs are likely to be less symmetric, making the estimated error less accurate. One approach might be to keep track of recent RTTs and repeat requests if the RTT appears to be an outlier. Another approach is to use an adaptive average that weights the new estimate of the error against the aggregate of prior samples:
avg_clock_error_0 = local_clock_error_0

avg_clock_error_n = (weight * local_clock_error_n) + (1 - weight) * avg_clock_error_(n-1)
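This exponentially weighted average is easy to sketch. The weight value is a tuning knob, not something prescribed by the algorithm; the sample values are invented for illustration.

```python
def update_avg_error(prev_avg, new_error, weight=0.125):
    """Blend the newest error sample into the running average.
    A small weight makes the average resistant to noisy samples."""
    return weight * new_error + (1 - weight) * prev_avg

avg = 0.040                              # first sample seeds the average
for sample in (0.050, 0.038, 0.120):     # 0.120 is a noisy outlier
    avg = update_avg_error(avg, sample)
# The outlier shifts the average only slightly, since it carries
# only 12.5% of the weight.
```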
If the local clock is slower than the reference clock, the new time value can simply be adopted -- in general, clocks can be advanced without causing many problems. But if the local clock is fast, adopting an earlier time is probably not a good idea. Doing so could reverse the apparent order of some events by giving an event a lower timestamp than its predecessor. In the case of a fast local clock, the clock should be slowed down until the error is absorbed -- the time should never be set backward.
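The step-versus-slew decision can be sketched like this. The function, the sign convention (positive error means the local clock is behind), and the slew fraction are all assumptions for illustration, not part of Cristian's algorithm itself.

```python
def plan_adjustment(local_clock_error, slew_fraction=0.1):
    """Decide how to correct the local clock.
    local_clock_error > 0 means the local clock is behind the server."""
    if local_clock_error >= 0:
        # Slow clock: jumping forward is safe, so step immediately.
        return ("step", local_clock_error)
    # Fast clock: never step backward. Instead lengthen each tick
    # (run the clock 10% slower here) until the error is absorbed.
    return ("slew", slew_fraction)

print(plan_adjustment(2.5))    # ('step', 2.5)
print(plan_adjustment(-1.0))   # ('slew', 0.1)
```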
The Berkeley Algorithm is an approach that is applicable if no time server is available. This approach is somewhat centralized. One active server polls all machines, adjusts their reported times based on the RTT and service time (if known), computes an average, and then tells each machine to speed up or to slow down. The average time may be a better estimate of the real time than the clock of any one host, but it will still drift. The polling period needs to be set, as before, to ensure that the maximum drift between adjustments is acceptable.
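The coordinator's averaging round can be sketched as follows. The data structure (a dict of host names to RTT-corrected clock readings) and the function name are assumptions for illustration.

```python
def berkeley_offsets(reported):
    """Average all readings (including the coordinator's own) and
    return the signed adjustment each host must apply to reach
    the average."""
    avg = sum(reported.values()) / len(reported)
    return {host: avg - t for host, t in reported.items()}

# RTT-corrected readings gathered by the coordinator:
readings = {"master": 100.0, "a": 102.0, "b": 101.0}
offsets = berkeley_offsets(readings)
# Average is 101.0: master speeds up by 1.0, a slows by 1.0, b is unchanged.
```

Note that the coordinator sends each host its *offset* rather than the average itself, so the RTT of this second message matters less.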
Berkeley Algorithm w/Broadcast
Instead of polling, the active server could broadcast to all servers periodically. At the end of the same period, it could collect the times that it received, discard outliers, make the same adjustments for RTT and server latency as before, and then tell the hosts to adjust their clocks as with the Berkeley or Cristian's approach. At the end of the period, the whole thing is repeated. As with the Berkeley approach, the average time can drift away from the "real time," and the period needs to be sufficiently small to ensure an acceptable level of relative drift.
Multiple Time Servers
If multiple time servers are available, a scheme similar to the Berkeley approach or the variation above can be used to better adjust for the inaccuracy introduced by the RTT-based estimation and to remain more resilient in the event of failure.
Real-world Synchronization: NTP
Clock synchronization in the real world is based on techniques like the two we just described, but often includes additional statistical filtering and a more complex model for finding a time server. In addition, security becomes a concern -- it is necessary to prevent a malicious user from polluting the reference time.