The Various Flavors of Two-Phase Commits — Explained

Internally, NuoDB uses a two-phase commit protocol to manage the durability of user data. NuoDB also supports the X/Open XA protocol for synchronizing global transactions across multiple data stores, and XA is itself often referred to as two-phase commit. The fundamental principles of the two protocols are similar, but they serve different purposes. Let us explore the difference between them.

A SINGLE TRANSACTION ACROSS TWO RESOURCES

Let us explore a simple use case. A simple application takes messages from one data source (outgoing_messages) and writes them to a new data source (incoming_messages). This is a fairly common use case if you read messages from a message queue such as Apache ActiveMQ and you write them to a database (such as NuoDB). The following code snippet shows this example in the corresponding SQL form.

SQL> SELECT * FROM outgoing_messages;
 
 ID     MSG    
 --- --------- 
 
  1  message 1 
 
SQL> START TRANSACTION;
SQL> SELECT * FROM outgoing_messages;
 
 ID     MSG    
 --- --------- 
 
  1  message 1 
 
SQL> DELETE FROM outgoing_messages WHERE id=1;
SQL> INSERT INTO incoming_messages(id, msg) VALUES(1, 'message 1');
SQL> commit;

When executed in a single relational database, the two statements (insert and delete) are expected to behave according to the ACID guarantees. They either both succeed or they both fail. Throughout this article, we focus on the A(tomic) guarantee of ACID.

XA abstracts away the statements and transaction lifetime for scenarios where the tables live in different data stores. The following example is a simplified version of an ActiveMQ consumer that receives a message and writes it to NuoDB. Due to space constraints in this article, the code does not contain any setup or failure handling.

javax.jms.MessageConsumer consumer = xasession.createConsumer(queue);
        MessageListener listener = new MessageListener() {
            @Override
            public void onMessage(Message msg) {
                TextMessage msg1 = (TextMessage) msg;

                NuoXADataSource nuodbDs = new NuoXADataSource();
                XAConnection nuodbXAconn = nuodbDs.getXAConnection(DBA_USER, DBA_PASSWORD);

                XAResource mqRes = xasession.getXAResource();
                XAResource nuoRes = nuodbXAconn.getXAResource();

                // Setup omitted: xid is created and both resources are enlisted with start(xid, ...);
                // nuodbStmt is a Statement obtained from nuodbXAconn.getConnection().
                nuodbStmt.executeUpdate(String.format("insert into incoming_messages(id, msg) values(1, '%s')", msg1.getText()));

                // Mark the work on both resources as finished.
                mqRes.end(xid, XAResource.TMSUCCESS);
                nuoRes.end(xid, XAResource.TMSUCCESS);

                // Phase one: ask both resources to prepare.
                mqRes.prepare(xid);
                nuoRes.prepare(xid);

                // Phase two: commit (false = not a one-phase commit).
                mqRes.commit(xid, false);
                nuoRes.commit(xid, false);
            }
        };

INTERNAL TWO-PHASE COMMIT IN NUODB

When you commit a transaction in NuoDB, you wait for roughly one network round trip between the transaction engine (TE) that executed the transaction and the slowest storage manager (SM). For a refresher on the NuoDB architecture, please read this article. The default commit protocol in NuoDB is SAFE, so every transaction needs to be confirmed by all SMs. It is also possible to change the commit protocol to a weaker guarantee that waits for only a subset of SMs, effectively trading durability guarantees for lower latency.
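
To make the latency trade-off concrete, here is a minimal sketch. It is not NuoDB code: it assumes made-up round-trip times from a TE to three SMs and models the commit wait as the time until the required number of acknowledgments has arrived. The names CommitLatency and waitMillis are illustrative only.

```java
import java.util.Arrays;

// Illustrative model only: commit wait time as a function of how many SM acks are required.
final class CommitLatency {

    // Returns the time until `requiredAcks` storage managers have answered,
    // given each SM's round-trip time in milliseconds (hypothetical values).
    static long waitMillis(long[] roundTripMillis, int requiredAcks) {
        if (requiredAcks < 1 || requiredAcks > roundTripMillis.length) {
            throw new IllegalArgumentException("requiredAcks out of range");
        }
        long[] sorted = roundTripMillis.clone();
        Arrays.sort(sorted);
        // The k-th fastest reply arrives after the k-th smallest round trip.
        return sorted[requiredAcks - 1];
    }

    public static void main(String[] args) {
        long[] rtt = {2, 5, 40}; // assumed round trips to three SMs
        System.out.println("all SMs (SAFE-like): " + waitMillis(rtt, rtt.length) + " ms");
        System.out.println("any one SM:          " + waitMillis(rtt, 1) + " ms");
    }
}
```

With these assumed numbers, waiting for all three SMs costs 40 ms, while waiting for a single SM costs 2 ms. The slowest SM dominates only when every SM must confirm.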

The commit protocol contains three messages. First, the TE informs all the SMs about the commit intent (a pre-commit). Then the TE waits for confirmation (a commit ack) from all the SMs. The third message is broadcast to all engines in the cluster, informing them that the commit was successful. Because the TE does not wait for the third message to be acknowledged, the commit returns control to the user application as soon as the commit acks have arrived (commit succeeds). Once the commit message reaches remote transaction engines, future transactions started on those engines will see all effects of the committed transaction.
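
The message flow described above can be sketched as a small in-memory simulation. This is not NuoDB's implementation: StorageManager, Engine, and commit are hypothetical stand-ins, and the calls run synchronously, so the "no wait for the third message" aspect is only modeled by the fact that onCommit returns nothing to wait on.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation of the three messages; not NuoDB's implementation.
final class ThreeMessageCommit {

    // Hypothetical stand-in for a storage manager.
    static final class StorageManager {
        final List<Long> preCommitted = new ArrayList<>();

        // Message 1 arrives (pre-commit); the return value models message 2 (commit ack).
        boolean preCommit(long txnId) {
            preCommitted.add(txnId);
            return true;
        }
    }

    // Hypothetical stand-in for any engine that learns about the commit.
    static final class Engine {
        final List<Long> visible = new ArrayList<>();

        // Message 3: the transaction becomes visible to transactions started later.
        void onCommit(long txnId) {
            visible.add(txnId);
        }
    }

    // Runs the protocol from the committing TE's point of view.
    // Returns true once every SM has acknowledged, which is when the user's commit returns.
    static boolean commit(long txnId, List<StorageManager> sms, List<Engine> engines) {
        for (StorageManager sm : sms) {
            if (!sm.preCommit(txnId)) {
                return false; // failure handling is out of scope for this sketch
            }
        }
        // Broadcast without awaiting any acknowledgment.
        for (Engine e : engines) {
            e.onCommit(txnId);
        }
        return true;
    }

    public static void main(String[] args) {
        List<StorageManager> sms = List.of(new StorageManager(), new StorageManager());
        List<Engine> engines = List.of(new Engine(), new Engine());
        System.out.println("committed: " + commit(42L, sms, engines));
        System.out.println("visible on engine 0: " + engines.get(0).visible);
    }
}
```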

SAFE commit guarantees that a transaction will be durable even in the face of catastrophic engine failures. The Reliable Broadcast protocol guarantees that a transaction either gets committed everywhere or nowhere. The third message is not necessary for durability as NuoDB can deduce the correct state of a transaction during recovery from failure. As long as at least one SM survives a catastrophic event, the transaction can be recovered.

The pre-commit message contains the order in which the transaction is going to be committed in the version vector of that engine. If you are unsure what version vectors are and why NuoDB is using them, don’t hesitate to ask in the comment section (or wait for a dedicated blog post on that topic). The important bit is that the order has already been defined and is going to be the same on all engines. All transaction engines will see all commits from the same transaction engine in the same order.

If the transaction engine fails catastrophically, the cluster reconciles any ongoing transactions and finishes the commit protocol automatically. This means that if a transaction engine fails before the commit message is broadcast, a replacement message will be generated and the transaction will be made visible. This is an important contrast to XA that we explore further in the following sections.

To summarize: the first two messages are used for durability. The last message is used to make the change visible across the cluster.

XA TWO-PHASE COMMIT EXPLAINED

XA uses the same three messages to commit a transaction across multiple data stores (in our case, a message queue and NuoDB). First, it sends a pre-commit, which informs the data store about the intent to commit. The data store answers with a commit ack and thereby guarantees that it will always be able to commit the transaction in the future. That means any subsequent conflicting transactions have to wait until the XA transaction is resolved before they can be executed. Once all data stores answer with a commit ack, the transaction can be globally committed. If any of the data stores fails to send a commit ack, all data stores are instructed to abandon the pre-committed transaction.

Once all data stores acknowledge the pre-commit, the commit message is broadcast to all data stores. Because of unpredictable network latencies and process lifetimes, this message reaches each data store at a different, arbitrary time. It is also possible that one of the data stores has crashed, leaving the transaction committed in only some of the stores. Once the crashed data store comes back up, the transaction manager needs to reconcile the state. This means that the XA transaction manager needs to persist state for a potentially long period of time.
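
The two phases can be sketched as coordinator logic over a hypothetical Participant interface. The interface only loosely mirrors the prepare, commit, and rollback steps of XA. It is not the javax.transaction.xa API, and it leaves out the persistent state a real transaction manager needs.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative coordinator logic; a real XA transaction manager also persists its decisions.
final class TwoPhaseCoordinator {

    // Hypothetical participant; loosely mirrors the prepare/commit/rollback steps of XA.
    interface Participant {
        boolean prepare();   // true = "I promise I can commit"
        void commit();
        void rollback();
    }

    // Returns true if the global transaction committed, false if it was abandoned.
    static boolean run(List<Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        for (Participant p : participants) {
            if (p.prepare()) {
                prepared.add(p);
            } else {
                // One refusal abandons the transaction everywhere it was prepared.
                for (Participant q : prepared) {
                    q.rollback();
                }
                return false;
            }
        }
        // Phase two: participants are told one after another, so each one
        // commits at a slightly different time.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    // Simple in-memory participant that records what happened to it.
    static final class Recording implements Participant {
        final boolean vote;
        String state = "active";
        Recording(boolean vote) { this.vote = vote; }
        public boolean prepare() { state = vote ? "prepared" : "refused"; return vote; }
        public void commit() { state = "committed"; }
        public void rollback() { state = "rolled back"; }
    }

    public static void main(String[] args) {
        Recording queue = new Recording(true);
        Recording database = new Recording(false);
        System.out.println("committed: " + run(List.of(queue, database)));
        System.out.println("queue: " + queue.state + ", database: " + database.state);
    }
}
```

In this run the database refuses to prepare, so the queue, which had already prepared, is rolled back and nothing is committed.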

How good is the promise of a commit ack in XA? A data store is not allowed to retract its guarantee to commit the change after it has acknowledged the pre-commit. In practice, depending on such a guarantee has its limits. Disks fail, cables get cut, and power outages can last a long time. While most data stores can internally guarantee that a pre-committed transaction will always be able to commit, no data store can guarantee that this state will never be lost to arbitrary external factors. As such, XA is hard to depend on.

In case of catastrophic failure, when the XA pre-commit is lost and the XA commit can never be completed, undoing a committed transaction from other XA participants can be extremely hard. A torn transaction is a transaction that did not become visible atomically. In other words, only some part of the transaction (such as a single statement) successfully mutated the state of the data store, while others did not. In the case of a catastrophic failure, XA violates A(tomicity) guarantees.

While most applications should be developed with failure in mind, we understand that torn transactions are not the norm. Let us, therefore, explore the normal case without any failure. Even without failure, XA is never globally Atomic. Let’s return to our original example. The application reads the message from a message queue and writes it to NuoDB. Depending on how the participants receive and process network packets, at any given time the message might exist in both participants, in neither of them, or in exactly one of them. To make the argument less abstract, let’s imagine another application that reads the state from both data stores at the same time. If XA were globally ACID, there would only be two possible outcomes: the message is in the message queue but not in NuoDB, or the message is not in the queue but is in NuoDB.
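
The extra outcomes can be illustrated with a small sketch. It models the queue and the database as in-memory sets and lets a second reader look at both right after only one of them has applied its local commit. The class and method names are hypothetical. The point is only that the intermediate view depends on which participant commits first.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: what a second reader can observe between the two commit messages.
final class ObservableStates {

    static String observe(Set<String> queue, Set<String> database, String msg) {
        boolean inQueue = queue.contains(msg);
        boolean inDb = database.contains(msg);
        if (inQueue && inDb) return "both";
        if (!inQueue && !inDb) return "neither";
        return inQueue ? "queue only" : "database only";
    }

    // Applies only the first of the two local commits, then lets the observer read.
    static String midCommitView(boolean queueCommitsFirst) {
        Set<String> queue = new HashSet<>(Set.of("message 1"));
        Set<String> database = new HashSet<>();
        if (queueCommitsFirst) {
            queue.remove("message 1");      // queue commits the removal
        } else {
            database.add("message 1");      // database commits the insert
        }
        return observe(queue, database, "message 1"); // observer reads here
    }

    public static void main(String[] args) {
        System.out.println("queue first:    " + midCommitView(true));
        System.out.println("database first: " + midCommitView(false));
    }
}
```

If the queue commits first, the observer finds the message in neither store. If the database commits first, it finds the message in both. A globally atomic transaction would allow neither view.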

XA transactions are Isolated (the state transition is not observed until the final commit message within one data store), Consistent (same as other transactions, the state transition needs to lead to a valid state within a data store) and Durable (once committed, the state does not get lost unless catastrophic failure happens). XA transactions are not Atomic.

SINGLE TRANSACTION ACROSS MULTIPLE MACHINES

Let us return to the internal NuoDB commit protocol. Since NuoDB has the ability to recover from failure and the Reliable Broadcast protocol guarantees that all messages make it to all engines, the user is never required to undo a partially committed transaction manually. All transactions are ACID in NuoDB. There is no need for a Transaction Manager that would have to persist state.

A transaction in NuoDB can never be torn. A single transaction always executes on the same transaction engine. Even if a transaction modifies multiple resources (tables), those modifications are part of the same transaction. This is an important distinction for readers who are familiar with two-phase commits in NoSQL systems.

SUMMARY

As you can see, NuoDB separates the distributed state transition from the durability guarantees as part of the two-phase commit. The first two messages (pre-commit and commit ack) guarantee that the commit protocol has been satisfied and that a transaction can be recovered after a failure. The third message (commit) makes the change atomically visible across the cluster.

While similar to XA, the internal two-phase protocol used by NuoDB does not suffer from the same limitations.

NuoDB has supported X/Open XA transactions since version 3.0. When using XA, be aware of its inherent violations of global atomicity.

This article first appeared at the NuoDB Tech Blog under the name The Various Flavors of Two-Phase Commits — Explained

