Amazon QLDB concurrency model - Amazon Quantum Ledger Database (Amazon QLDB)

Amazon QLDB concurrency model

Amazon QLDB is intended to address the needs of high-performance online transaction processing (OLTP) workloads. QLDB supports SQL-like query capabilities, and delivers full ACID transactions. In addition, QLDB data items are documents, delivering schema flexibility and intuitive data modeling. With a journal at the core, QLDB makes it easy to access the complete and verifiable history of all changes to your data, and to stream coherent transactions to other data services as needed.

Optimistic concurrency control

In QLDB, concurrency control is implemented using optimistic concurrency control (OCC). OCC operates on the principle that multiple transactions can frequently complete without interfering with each other.

Using OCC, transactions in QLDB don't acquire locks on database resources and operate with full serializable isolation. QLDB executes concurrent transactions in a serial manner, such that it produces the same effect as if those transactions were executed serially.

Before committing, each transaction performs a validation check to ensure that no other committed transaction has modified the snapshot of data that it's reading. If this check reveals conflicting modifications, or the state of the data snapshot changes, the committing transaction is rejected. However, the transaction can be restarted.

When a transaction writes to QLDB, the validation checks of the OCC model are implemented by QLDB itself. If a transaction can't be written to the journal due to a failure in the verification phase of OCC, QLDB returns an OccConflictException to the application layer. The application software is responsible for ensuring that the transaction is restarted. The application should abort the rejected transaction and then retry the whole transaction from the start.

For a code example that shows how to catch and handle OCC conflicts, see Handling OCC conflict exceptions.

Using indexes to limit OCC conflicts

As a best practice, you should run statements with a WHERE predicate clause that filters on a document id or an indexed field. QLDB requires an index to efficiently look up a document. Without an index or a document id, QLDB needs to run a table scan when reading documents. This can cause more query latency and also increases the chances of an OCC conflict.

For example, consider a table named Vehicle that has an index on the VIN field only. It contains the following documents.

VIN Make Model Color
"1N4AL11D75C109151" "Audi" "A5" "Silver"
"KM8SRDHF6EU074761" "Tesla" "Model S" "Blue"
"3HGGK5G53FM761765" "Ducati" "Monster 1200" "Yellow"
"1HVBBAANXWH544237" "Ford" "F 150" "Black"
"1C4RJFAG0FC625797" "Mercedes" "CLK 350" "White"

Two concurrent users named Alice and Bob are working with the same table in a ledger. They want to update two different documents, as follows.

Alice:

UPDATE Vehicle AS v SET v.Color = 'Blue' WHERE v.VIN = '1N4AL11D75C109151'

Bob:

UPDATE Vehicle AS v SET v.Color = 'Red' WHERE v.Make = 'Tesla' AND v.Model = 'Model S'

Suppose that Alice and Bob start their transactions at the same time. Alice's UPDATE statement does an indexed lookup on the VIN field, so it only needs to read that one document. Alice finishes and successfully commits her transaction first.

Bob's statement filters on non-indexed fields, so it does a table scan and encounters an OccConflictException. This is because Alice's committed transaction modified the data snapshot that Bob's statement is reading, which includes every document in the table—not only the document that he is updating.

Insertion OCC conflicts

OCC conflicts can include documents that are newly inserted—not only documents that previously existed. Consider the following diagram, in which two concurrent users (Alice and Bob) are working with the same table in a ledger. They both want to insert a new document only under the condition that a predicate value does not yet exist.


                Amazon QLDB optimistic concurrency control (OCC) diagram showing an example
                    of a conflict exception between two concurrent users.

In this example, both Alice and Bob execute the following SELECT and INSERT statements within a single transaction. Their application executes the INSERT statement only if the SELECT statement returns no results.

SELECT * FROM Vehicle v WHERE v.VIN = 'ABCDE12345EXAMPLE'
INSERT INTO Vehicle VALUE { 'VIN' : 'ABCDE12345EXAMPLE', 'Type' : 'Wagon', 'Year' : 2019, 'Make' : 'Subaru', 'Model' : 'Outback', 'Color' : 'Gray' }

Suppose that Alice and Bob start their transactions at the same time. Both of their SELECT queries return no existing document with a VIN of ABCDE12345EXAMPLE. So, their applications proceed with the INSERT statement.

Alice finishes and successfully commits her transaction first. Then, Bob tries to commit his transaction, but QLDB rejects it and throws an OccConflictException. This is because Alice's committed transaction modified the result set of Bob's SELECT query, and OCC detects this conflict before committing Bob's transaction.

Bob can then retry his whole transaction from the start. But his next SELECT query will return the document that Alice inserted, so Bob's application won't execute the INSERT.

Making transactions idempotent

The insert transaction in the previous section is also an example of an idempotent transaction. In other words, executing the same transaction multiple times produces identical results. If Bob executes the INSERT without first checking if the document already exists, the table might end up with duplicate documents that have the same VIN.

Consider other retry scenarios in addition to OCC conflicts. For example, it's possible that QLDB successfully commits a transaction on the server side, but the client times out while waiting for a response. As a best practice, make your write transactions idempotent to avoid any unexpected side effects in the case of retries.

Concurrent sessions management

If you have experience using a relational database management system (RDBMS), you might be familiar with concurrent connection limits. QLDB doesn't have the same concept of a traditional RDBMS connection because transactions are executed with HTTP request and response messages.

In QLDB, the analogous concept is an active session. A session manages information about your data transaction requests to a ledger. An active session is one that is actively executing a transaction—or a session that recently executed a transaction such that QLDB anticipates it will execute another transaction immediately. QLDB supports one actively executing transaction per session.

The limit of concurrent active sessions per ledger is defined in Quotas in Amazon QLDB. After this limit is reached, any session that tries to start a transaction will result in an error.

For best practices for using the QLDB Driver in your application to configure a session pool, see Configuring the QldbDriver object in Amazon QLDB Driver recommendations.