February 20, 2008 (Lecture 16)

Reading

Coulouris, et al: 12.4.2 - 12.4.8, 13.4-13.7

Safe Schedules

Transactions must execute as if in isolation. That doesn't necessarily prohibit concurrency among transactions -- it just implies that this concurrency shouldn't have any effect on the results or the state of the system. In fact, as long as the results and other state are the same as for some serial execution of the transactions, the transactions can be interleaved, executed concurrently, or both.
Transaction processing systems (TPSs) contain a transaction scheduler that dispatches the transactions and allows them to execute. This scheduler isn't necessarily FIFO, and it doesn't necessarily dispatch only one at a time. Instead, it tries to maximize the amount of work that gets done. One popular measure of the performance of a TPS is the number of transactions per second (TPS). Yes, unfortunately, TPS is also its abbreviation.
In discussing the scheduler, it is helpful to ask the question, "What is a schedule?" It is an ordering of events. The transaction scheduler's job is to execute the individual operations that compose the transactions in an order that is efficient and preserves the property of isolation. As a result, it is the schedule of individual operations that is our concern.

Safe Concurrency: Serial Schedules and Serializability

A serial schedule is a schedule that executes all of the operations from one transaction before moving on to the operations of another transaction. In other words, the transactions are executed in series. An interleaved schedule is a schedule in which the operations of an individual transaction are executed in order with respect to the same transaction, but without the restriction that the transactions be scheduled as a whole. In other words, interleaving allows the scheduling of any operation, as long as the operations of the same transaction are not reordered.
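To make the two definitions concrete, here is a minimal sketch in Python. It assumes a schedule is just a list of (transaction, operation) pairs; the function names and the operation labels such as "r(x)" are made up for illustration and are not part of any real TPS.

```python
def is_serial(schedule):
    """True if every transaction's operations form one contiguous run."""
    finished = set()   # transactions we have already moved past
    current = None     # transaction whose operations are running now
    for txn, _op in schedule:
        if txn != current:
            if txn in finished:
                return False          # txn resumed after another one ran
            if current is not None:
                finished.add(current)
            current = txn
    return True


def preserves_program_order(schedule, transactions):
    """True if the schedule holds exactly each transaction's operations,
    in the order that transaction issued them."""
    seen = {txn: [] for txn in transactions}
    for txn, op in schedule:
        if txn not in seen:
            return False
        seen[txn].append(op)
    return all(seen[txn] == list(ops) for txn, ops in transactions.items())


# Hypothetical transactions and schedules, for illustration only.
transactions = {"T1": ["r(x)", "w(x)"], "T2": ["r(y)", "w(y)"]}
serial = [("T1", "r(x)"), ("T1", "w(x)"), ("T2", "r(y)"), ("T2", "w(y)")]
interleaved = [("T1", "r(x)"), ("T2", "r(y)"), ("T1", "w(x)"), ("T2", "w(y)")]
reordered = [("T1", "w(x)"), ("T1", "r(x)"), ("T2", "r(y)"), ("T2", "w(y)")]

print(is_serial(serial), is_serial(interleaved))             # True False
print(preserves_program_order(interleaved, transactions))    # True
print(preserves_program_order(reordered, transactions))      # False
```

Both `serial` and `interleaved` are legal schedules in the sense above; `reordered` is not a schedule of these transactions at all, because it runs T1's operations out of order.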
Some interleaved schedules are safe, whereas others may result in violations of the isolation property. Safe interleaved schedules are known as serializable schedules. This is because an interleaved schedule is only safe if it is equivalent to a serial schedule -- that's why they call it serial-izable.
What did I mean when I wrote "is equivalent"? An interleaved schedule is equivalent to a serial schedule if transactions that contain conflicting operations are not interleaved. Operations are said to be conflicting if the results differ depending on their order.
This means that an interleaved schedule is serializable if, and only if, each pair of conflicting operations occurs in the same order as it would in some serial schedule.
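The definition of a conflict ("the results differ depending on their order") can be tried out directly. The sketch below is an illustration only: it assumes operations are plain Python functions acting on a dictionary, and it checks order dependence from a single starting state, whereas a real conflict definition has to hold for every state. The names `order_matters`, `read_x`, `read_y`, and `write_y` are hypothetical.

```python
import copy


def order_matters(op_a, op_b, start_state):
    """Run both operations in both orders from the same starting state and
    report whether the final state or any returned value differs."""
    def run(first, second):
        state = copy.deepcopy(start_state)      # never touch the caller's state
        results = {first.__name__: first(state)}
        results[second.__name__] = second(state)
        return state, results
    return run(op_a, op_b) != run(op_b, op_a)


# Hypothetical operations on a tiny key-value store.
def read_x(store):
    return store["x"]


def read_y(store):
    return store["y"]


def write_y(store):
    store["y"] = 99


store = {"x": 1, "y": 2}
print(order_matters(read_y, write_y, store))  # True: the read sees 2 or 99
print(order_matters(read_x, write_y, store))  # False: different objects
print(order_matters(read_x, read_y, store))   # False: two reads never conflict
```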

Serializability Graphs

We can check whether a schedule is serializable by building a serializability graph. The Fundamental Theorem of Serializability states that a schedule H is serializable if and only if its serializability graph, SG(H), is acyclic.
So how do we build a serializability graph?

Create a node for each transaction
Draw an edge from T_i to T_j if and only if some operation in T_i conflicts with an operation in T_j and the operation in T_i occurs before the operation in T_j in the given schedule. The example below will make this a bit clearer.
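The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: operations are modeled as (transaction, kind, object) triples with kind "r" or "w", and two operations conflict when they touch the same object and at least one is a write. That read/write rule is an assumption of the sketch, and other operation types (directory operations, for instance) would need their own conflict rule. All names here are made up.

```python
def conflicts(a, b):
    """Assumed conflict rule: same object, at least one write."""
    (_, kind_a, obj_a), (_, kind_b, obj_b) = a, b
    return obj_a == obj_b and "w" in (kind_a, kind_b)


def serialization_graph(schedule, conflicts=conflicts):
    """Step 1: one node per transaction. Step 2: edge Ti -> Tj when an
    operation of Ti conflicts with a later operation of Tj."""
    graph = {op[0]: set() for op in schedule}
    for i, earlier in enumerate(schedule):
        for later in schedule[i + 1:]:
            if earlier[0] != later[0] and conflicts(earlier, later):
                graph[earlier[0]].add(later[0])
    return graph


def has_cycle(graph):
    """Depth-first search; reaching a node still on the stack means a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY or (color[nxt] == WHITE and visit(nxt)):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


def is_serializable(schedule):
    return not has_cycle(serialization_graph(schedule))


# Hypothetical schedules: both transactions read x, then both write it.
unsafe = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
safe = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]

print(serialization_graph(unsafe))   # {'T1': {'T2'}, 'T2': {'T1'}}
print(is_serializable(unsafe))       # False
print(is_serializable(safe))         # True
```

The quadratic pairwise scan is fine for small examples; it is not meant as an efficient way to build the graph.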

Example: Directory Operations

Let's take a careful look and make sure that we understand the source of the problem in schedule H₁. The fragment below shows only the relevant portion of the schedule:

. . . L₁(x) . . . L₁(y) . . . D₃(y) . . . D₃(y) . . . E₁(y) . . .

If we look at the fragment, we see that L₁(x) conflicts with D₃(y). Since L₁(x) occurs before D₃(y) in H₁, an equivalent serial schedule must execute T₁ before T₃. But if we look further ahead in the trace, we see that D₃(y) occurs before E₁(y). Since these operations also conflict, T₃ must occur before T₁ in an equivalent serial schedule. Both of these statements cannot be true. If T₁ executes before T₃, T₃ cannot execute before T₁. This schedule cannot be converted to an equivalent serial schedule. It is not serializable -- it is not safe.
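The contradiction above is exactly a cycle in the serializability graph: one edge from T₁ to T₃ and one from T₃ back to T₁. As a small sketch, we can hand those two ordering constraints to Python's standard `graphlib` module (Python 3.9 or later) and ask for a serial order. The helper name `equivalent_serial_order` is made up for this illustration, and the constraints are typed in by hand from the discussion above rather than derived from the directory operations themselves.

```python
from graphlib import CycleError, TopologicalSorter


def equivalent_serial_order(precedes):
    """precedes: list of (before, after) transaction pairs.
    Returns one serial order satisfying every pair, or None if the
    constraints contradict each other (the graph has a cycle)."""
    sorter = TopologicalSorter()
    for before, after in precedes:
        sorter.add(after, before)   # 'after' must come later than 'before'
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


# T1 must precede T3, and T3 must precede T1.
print(equivalent_serial_order([("T1", "T3"), ("T3", "T1")]))  # None

# With only the first constraint, a serial order exists.
print(equivalent_serial_order([("T1", "T3")]))                # ['T1', 'T3']
```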

Locking and Serializability

Remember two-phase locking from last class? Although it is handled by the transaction manager, it does allow for the interleaving of operations. Are the schedules that it generates serializable?
The answer to this is yes. Two-phase locking ensures that transactions which use the same objects cannot execute concurrently. This ensures that no conflicts can happen. If interleaved transactions don't share, we know that we are safe.
Two-phase locking ensures serializable schedules using what is known as inconsistency prevention. Prevention techniques constrain the transactions to ensure that conflicting operations can never happen. Two-phase locking does this by preventing transactions that share objects from executing concurrently. Although inconsistency prevention is effective, it is also expensive. Perfectly safe sharing may be prevented -- this unnecessarily reduces concurrency.
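Here is a toy, single-threaded sketch of the idea, assuming exclusive locks only and a shared dictionary as the lock table. The class name and methods are hypothetical and this is not how any particular transaction manager implements locking; a real one would block the caller instead of returning False and would handle deadlock.

```python
class TwoPhaseTransaction:
    """Toy model: locks may be acquired only before the first release."""

    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table   # shared dict: object -> owner name
        self.held = set()
        self.shrinking = False         # becomes True at the first unlock

    def lock(self, obj):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock after unlocking")
        owner = self.lock_table.get(obj)
        if owner not in (None, self.name):
            return False               # held by someone else; would wait
        self.lock_table[obj] = self.name
        self.held.add(obj)
        return True

    def unlock(self, obj):
        self.shrinking = True          # growing phase is over for good
        if obj in self.held:
            self.held.discard(obj)
            del self.lock_table[obj]


locks = {}
t1 = TwoPhaseTransaction("T1", locks)
t2 = TwoPhaseTransaction("T2", locks)

print(t1.lock("x"))   # True: T1 now owns x
print(t2.lock("x"))   # False: T2 shares x with T1, so it has to wait
print(t2.lock("y"))   # True: y is not shared, so T2 may proceed
t1.unlock("x")
print(t2.lock("x"))   # True: x is free again
```

Notice that T2 is turned away from x even if both transactions only wanted to read it. That is the kind of perfectly safe sharing that prevention gives up.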