Transactions

Our solution to those problems is to define a set of rules and guarantees about operations. We will do this by using transactions. A transaction 1 is a sequence of multiple actions that should be executed as a single, logical, atomic unit. Transactions guarantee the ACID properties to avoid the problems discussed above:

Motivation for Concurrent Execution

It seems as though concurrency causes a lot of trouble, so why bother with it? There are two main advantages that come with implementing concurrency in our DBMS:

  1. Increase in Throughput (transactions per second): If the DBMS is running on a single core, one Xact can use the CPU while another Xact is reading to or writing from the disk. If the DBMS is running on a multicore processor, we can ideally scale throughput with respect to the number of processors.
  2. Decrease in Latency (response time per transaction): Multiple transactions can run at the same time, so one transaction’s latency need not be dependent on another unrelated transaction.

Concurrency Control

In this note we will discuss how to enforce the isolation property of transactions (we will learn how the other properties are enforced in the note about recovery). To do this, we will analyze transaction schedules which show the order that operations are executed in. These operations include: Begin, Read, Write, Commit and Abort. The easiest way to ensure isolation is to run all the operations of one transaction to completion before beginning the operations of next transaction. This is called a serial schedule. For example, the following schedule is a serial schedule because \(T1\)’s operations run completely before \(T2\) runs.

image

The problem with these schedules, however, is that it is not efficient to wait for an entire transaction to finish before starting another one. Ideally, we want to get the same results as a serial schedule (because we know serial schedules are correct) while also getting the performance benefits of running schedules simultaneously. Basically, we are looking for a schedule that is equivalent to a serial schedule. For schedules to be equivalent they must satisfy the following three rules:

  1. They involve the same transactions
  2. Operations are ordered the same way within the individual transactions
  3. They each leave the database in the same state

If we find a schedule whose results are equivalent to a serial schedule, we call the schedule serializable. For example, the following schedule is serializable because it is equivalent to the schedule above. You can work through the following schedule and see that resources \(A\) and \(B\) end up with the same value as the serial schedule above.

image

Conflict Serializability

Now the question is: how do we ensure that two schedules leave the database in the same final state without running through the entire schedule to see what the result is? We can do this by looking for conflicting operations. For two operations to conflict they must satisfy the following three rules:

  1. The operations are from different transactions
  2. Both operations operate on the same resource
  3. At least one operation is a write

We then check if the two schedules order every pair of conflicting operations in the same way. If they do, we know for sure that the database will end up in the same final state. When two schedules order their conflicting operations in the same way the schedules are said to be conflict equivalent, which is a stronger condition than being equivalent. Now that we have a way of ensuring that two schedules leave the database in the same final state, we can check if a schedule is conflict equivalent to a serial schedule without running through the entire schedule. We call a schedule that is conflict equivalent to some serial schedule conflict serializable. Note: if a schedule \(S\) is conflict serializable then it implies that is serializable. 2 .

Conflict Dependency Graph

Now we have a way of checking if a schedule is serializable! We can check if the schedule is conflict equivalent to some serial schedule because conflict serializable implies serializable. We can check conflict serializability by building a dependency graph. Dependency graphs have the following structure: