March 18, 2010 (Lecture 16)

Let's Be Optimistic

The replica management techniques that we have discussed so far are said to be pessimistic techniques. This is because they sacrifice a lot of efficiency in order to prevent conflicts from occurring. Just like the pessimists that you and I know, they are very concerned about the bad things that can happen.
Optimistic techniques assume that problems are far less likely and try to optimize for the common case. In some cases, they may allow for stale data or other inconsistencies.
Today we are going to talk about two such techniques. The first offers no guarantees at all, and the second allows us to detect conflicts, but doesn't always prevent them.

Best-Effort Replication

The replication methods that we have discussed so far ensure that stale data is never received by a client. In most applications, this is a useful requirement. But sometimes it is okay to deliver stale data, at least for a while. Consider a password database. A user is likely to remember the old password for a while after changing passwords. So if the password change doesn't immediately reach some host, a properly educated user will simply use the old password there until the change arrives.
It is important to note that even best-effort replication strategies should provide protection against conflicts in the event of partitioning. Reading stale data is okay, but when a new value arrives, we must be able to determine whether it is an update or a value even more stale than our own.
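One way a replica might tell an update from a stale value is to attach a version number to every value. The sketch below assumes that versions are assigned by a single writer and only ever increase; the Replica class and its receive method are hypothetical names used for illustration, not part of any particular system.

```python
class Replica:
    """Holds one replicated value plus the version it was written at."""

    def __init__(self):
        self.value = None
        self.version = 0  # assumption: real versions start at 1 and only increase

    def receive(self, value, version):
        """Apply an incoming value only if it is newer than ours.

        Returns True if the value was applied, False if it was stale.
        """
        if version > self.version:
            self.value = value
            self.version = version
            return True
        return False  # same or older version: keep what we have


replica = Replica()
replica.receive("first", 1)
replica.receive("third", 3)
replica.receive("second", 2)  # arrives late; it is ignored
```

A single counter only orders values that come from one writer. If several sites can accept writes during a partition, a plain counter cannot distinguish a conflict from an update, and something richer is needed.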
Allowing best-effort consistency relaxes the requirement that a write reach more than one site before it completes. For example, it is possible to update only one "master" site and allow that site to asynchronously update "slave" sites. It is also unnecessary to use a read quorum, since a stale value is acceptable.
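The master/slave arrangement can be sketched as follows. This is a minimal single-process illustration, not a real replication protocol: the Master and Slave classes, the pending queue, and the propagate method are all assumptions made for the example, and propagate stands in for whatever background mechanism would push updates over the network.

```python
from collections import deque


class Slave:
    def __init__(self):
        self.value = None
        self.version = 0

    def apply(self, value, version):
        # Ignore anything that is not newer than what we already hold.
        if version > self.version:
            self.value, self.version = value, version

    def read(self):
        # No read quorum: answer from the local copy, which may be stale.
        return self.value


class Master:
    def __init__(self, slaves):
        self.value = None
        self.version = 0
        self.slaves = slaves
        self.pending = deque()  # updates accepted but not yet pushed out

    def write(self, value):
        # The write completes as soon as the master has it.
        self.version += 1
        self.value = value
        for slave in self.slaves:
            self.pending.append((slave, value, self.version))

    def propagate(self, limit=None):
        """Push up to `limit` queued updates (all of them if limit is None)."""
        sent = 0
        while self.pending and (limit is None or sent < limit):
            slave, value, version = self.pending.popleft()
            slave.apply(value, version)
            sent += 1
        return sent


s1, s2 = Slave(), Slave()
master = Master([s1, s2])
master.write("old-password")
master.propagate()
master.write("new-password")
master.propagate(limit=1)  # only s1 has been updated so far
```

At the end of this run, s1 returns the new value while s2 still returns the old one. That window of staleness is exactly what best-effort replication tolerates.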
Epidemic algorithms are one approach to best-effort replication control. This approach models the spread of disease. There are three types of systems:

Infectious: A system that knows of an update and is trying to spread it.
Susceptible: A system that has not seen an update before and as a consequence can contract it.
Immune: A system that has contracted an update and is consequently immune from getting the same update again, but that is no longer infectious.

When a system learns of an update, it becomes infectious. An infectious system contacts other systems and lets them know about the update. Some of the systems that are contacted will be immune. Immune systems have already seen the update; they take no action. Others will be susceptible. When a susceptible system is contacted by an infectious system, it updates its own copy of the data, becomes infectious, and starts to contact other systems to spread the update. Eventually an infectious system stops trying to spread the update. Depending on the implementation, the system may become immune after some number n of attempts to spread the disease, after some number m of failures to spread the disease, or after its failure rate reaches some threshold.
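The behavior described above can be sketched as a small simulation. This is only one of the variants mentioned: each infectious system becomes immune after a fixed number of attempts. The function name spread_update, its parameters, and the choice of a uniformly random peer are assumptions made for the example.

```python
import random

SUSCEPTIBLE, INFECTIOUS, IMMUNE = "susceptible", "infectious", "immune"


def spread_update(num_systems, max_attempts, seed=None):
    """Simulate one update spreading through num_systems (>= 2) systems.

    Each infectious system contacts one random peer per turn and becomes
    immune after max_attempts contacts. Returns how many systems ended up
    with the update.
    """
    rng = random.Random(seed)
    state = [SUSCEPTIBLE] * num_systems
    attempts = [0] * num_systems

    state[0] = INFECTIOUS  # system 0 is the first to learn of the update
    active = [0]           # infectious systems, taking turns in order

    while active:
        sender = active.pop(0)

        # Pick a random peer other than the sender itself.
        peer = rng.randrange(num_systems - 1)
        if peer >= sender:
            peer += 1

        if state[peer] == SUSCEPTIBLE:
            state[peer] = INFECTIOUS  # peer takes the update and starts spreading
            active.append(peer)
        # Infectious or immune peers have already seen it; they take no action.

        attempts[sender] += 1
        if attempts[sender] >= max_attempts:
            state[sender] = IMMUNE    # stop spreading, keep the update
        else:
            active.append(sender)

    return sum(1 for s in state if s != SUSCEPTIBLE)


reached = spread_update(num_systems=100, max_attempts=3, seed=42)
print(f"{reached} of 100 systems received the update")
```

Running this with different values of max_attempts gives a feel for the trade-off: more attempts mean more messages, but fewer systems left without the update.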
One characteristic of epidemic algorithms is that there is no guarantee that all systems will eventually receive the update. This is because the spread of the update is uncoordinated. Some systems simply may never be contacted. The longer a system remains infectious, the more systems will get the update and the more likely it is that all systems will get the update. But there are no guarantees.