Amazon QLDB concurrency model - Amazon Quantum Ledger Database (Amazon QLDB)

Amazon QLDB concurrency model

Amazon QLDB is intended to address the needs of high-performance online transaction processing (OLTP) workloads. QLDB supports SQL-like query capabilities, and delivers full ACID transactions. In addition, QLDB data items are documents, delivering schema flexibility and intuitive data modeling. With a journal at the core, QLDB makes it easy to access the complete and verifiable history of all changes to your data, and to stream coherent transactions to other data services as needed.

Optimistic concurrency control

In QLDB, concurrency control is implemented using optimistic concurrency control (OCC). OCC operates on the principle that multiple transactions can frequently complete without interfering with each other.

Using OCC, transactions in QLDB don't acquire locks on database resources and operate with full serializable isolation. QLDB executes concurrent transactions in a serial manner, such that it produces the same effect as if those transactions were executed serially.

Before committing, each transaction performs a validation check to ensure that no other committed transaction has modified the snapshot of data that it's reading. If this check reveals conflicting modifications, or the state of the data snapshot changes, the committing transaction is rejected. However, the transaction can be restarted.

When a transaction writes to QLDB, the validation checks of the OCC model are implemented by QLDB itself. If a transaction can't be written to the journal due to a failure in the verification phase of OCC, QLDB returns an OccConflictException to the application layer. The application software is responsible for ensuring that the transaction is restarted. The application should abort the rejected transaction and then retry the whole transaction from the start.

For a code example that shows how to catch and handle OCC conflicts, see Handling OCC conflict exceptions.

Using indexes to avoid full table scans

In QLDB, every SELECT query is processed in a transaction and is subject to a transaction timeout limit.

As a best practice, you should run statements with a WHERE predicate clause that filters on an indexed field or a document ID. QLDB requires an equality operator on an indexed field to efficiently look up a document; for example, WHERE indexedField = 123 or WHERE indexedField IN (456, 789).

Without this indexed lookup, QLDB needs to do a full table scan when reading documents. This can cause query latency and transaction timeouts, and also increases the chances of an OCC conflict with competing transactions.

For example, consider a table named Vehicle that has an index on the VIN field only. It contains the following documents.

VIN Make Model Color
"1N4AL11D75C109151" "Audi" "A5" "Silver"
"KM8SRDHF6EU074761" "Tesla" "Model S" "Blue"
"3HGGK5G53FM761765" "Ducati" "Monster 1200" "Yellow"
"1HVBBAANXWH544237" "Ford" "F 150" "Black"
"1C4RJFAG0FC625797" "Mercedes" "CLK 350" "White"

Two concurrent users named Alice and Bob are working with the same table in a ledger. They want to update two different documents, as follows.


UPDATE Vehicle AS v SET v.Color = 'Blue' WHERE v.VIN = '1N4AL11D75C109151'


UPDATE Vehicle AS v SET v.Color = 'Red' WHERE v.Make = 'Tesla' AND v.Model = 'Model S'

Suppose that Alice and Bob start their transactions at the same time. Alice's UPDATE statement does an indexed lookup on the VIN field, so it only needs to read that one document. Alice finishes and successfully commits her transaction first.

Bob's statement filters on non-indexed fields, so it does a table scan and encounters an OccConflictException. This is because Alice's committed transaction modified the data snapshot that Bob's statement is reading, which includes every document in the table—not only the document that he is updating.

Insertion OCC conflicts

OCC conflicts can include documents that are newly inserted—not only documents that previously existed. Consider the following diagram, in which two concurrent users (Alice and Bob) are working with the same table in a ledger. They both want to insert a new document only under the condition that a predicate value does not yet exist.

                Amazon QLDB optimistic concurrency control (OCC) diagram showing an example
                    of a conflict exception between two concurrent users.

In this example, both Alice and Bob run the following SELECT and INSERT statements within a single transaction. Their application runs the INSERT statement only if the SELECT statement returns no results.

INSERT INTO Vehicle VALUE { 'VIN' : 'ABCDE12345EXAMPLE', 'Type' : 'Wagon', 'Year' : 2019, 'Make' : 'Subaru', 'Model' : 'Outback', 'Color' : 'Gray' }

Suppose that Alice and Bob start their transactions at the same time. Both of their SELECT queries return no existing document with a VIN of ABCDE12345EXAMPLE. So, their applications proceed with the INSERT statement.

Alice finishes and successfully commits her transaction first. Then, Bob tries to commit his transaction, but QLDB rejects it and throws an OccConflictException. This is because Alice's committed transaction modified the result set of Bob's SELECT query, and OCC detects this conflict before committing Bob's transaction.

Bob can then retry his whole transaction from the start. But his next SELECT query will return the document that Alice inserted, so Bob's application won't run the INSERT.

Making transactions idempotent

The insert transaction in the previous section is also an example of an idempotent transaction. In other words, running the same transaction multiple times produces identical results. If Bob runs the INSERT without first checking if the document already exists, the table might end up with duplicate documents that have the same VIN.

Consider other retry scenarios in addition to OCC conflicts. For example, it's possible that QLDB successfully commits a transaction on the server side, but the client times out while waiting for a response. As a best practice, make your write transactions idempotent to avoid any unexpected side effects in the case of retries.

Concurrent sessions management

If you have experience using a relational database management system (RDBMS), you might be familiar with concurrent connection limits. QLDB doesn't have the same concept of a traditional RDBMS connection because transactions are executed with HTTP request and response messages.

In QLDB, the analogous concept is an active session. A session is conceptually similar to a user login—it manages information about your data transaction requests to a ledger. An active session is one that is actively running a transaction—or a session that recently finished a transaction such that the service anticipates it will start another transaction immediately. QLDB supports one actively running transaction per session.

The limit of concurrent active sessions per ledger is defined in Quotas and limits in Amazon QLDB. After this limit is reached, any session that tries to start a transaction will result in an error (LimitExceededException).

For information about the lifecycle of a session and how the QLDB driver handles sessions when running data transactions, see Session management in Amazon QLDB. For best practices for configuring a session pool in your application using the QLDB driver, see Configuring the QldbDriver object in Amazon QLDB driver recommendations.