Amazon QLDB driver for Python – Cookbook reference - Amazon Quantum Ledger Database (Amazon QLDB)

Amazon QLDB driver for Python – Cookbook reference

This reference guide shows common use cases of the Amazon QLDB driver for Python. It provides Python code examples that demonstrate how to use the driver to run basic create, read, update, and delete (CRUD) operations. It also includes code examples for processing Amazon Ion data. In addition, this guide highlights best practices for making transactions idempotent and implementing uniqueness constraints.

Note

Where applicable, some use cases have different code examples for each supported major version of the QLDB driver for Python.

Importing the driver

The following code example imports the driver.

3.x
from pyqldb.driver.qldb_driver import QldbDriver import amazon.ion.simpleion as simpleion
2.x
from pyqldb.driver.pooled_qldb_driver import PooledQldbDriver import amazon.ion.simpleion as simpleion
Note

This example also imports the Amazon Ion package (amazon.ion.simpleion). You need this package to process Ion data when running some data operations in this reference. To learn more, see Working with Amazon Ion.

Instantiating the driver

The following code example creates an instance of the driver that connects to a specified ledger name using default settings.

3.x
qldb_driver = QldbDriver(ledger_name='vehicle-registration')
2.x
qldb_driver = PooledQldbDriver(ledger_name='vehicle-registration')

CRUD operations

QLDB runs create, read, update, and delete (CRUD) operations as part of a transaction.

Warning

As a best practice, make your write transactions strictly idempotent.

Making transactions idempotent

We recommend that you make write transactions idempotent to avoid any unexpected side effects in the case of retries. A transaction is idempotent if it can run multiple times and produce identical results each time.

For example, consider a transaction that inserts a document into a table named Person. The transaction should first check whether or not the document already exists in the table. Without this check, the table might end up with duplicate documents.

Suppose that QLDB successfully commits the transaction on the server side, but the client times out while waiting for a response. If the transaction isn't idempotent, the same document could be inserted more than once in the case of a retry.

Using indexes to avoid full table scans

We also recommend that you run statements with a WHERE predicate clause using an equality operator on an indexed field or a document ID; for example, WHERE indexedField = 123 or WHERE indexedField IN (456, 789). Without this indexed lookup, QLDB needs to do a table scan, which can lead to transaction timeouts or optimistic concurrency control (OCC) conflicts.

For more information about OCC, see Amazon QLDB concurrency model.

Implicitly created transactions

The pyqldb.driver.qldb_driver.execute_lambda method accepts a lambda function that receives an instance of pyqldb.execution.executor.Executor, which you can use to run statements. The instance of Executor wraps an implicitly created transaction.

You can run statements within the lambda function by using the execute_statement method of the transaction executor. The driver implicitly commits the transaction when the lambda function returns.

Note

The execute_statement method supports both Amazon Ion types and Python native types. If you pass a Python native type as an argument to execute_statement, the driver converts it to an Ion type using the amazon.ion.simpleion module (provided that conversion for the given Python data type is supported). For supported data types and conversion rules, see the simpleion source code.

The following sections show how to run basic CRUD operations, specify custom retry logic, and implement uniqueness constraints.

Creating tables

def create_table(transaction_executor): transaction_executor.execute_statement("CREATE TABLE Person") qldb_driver.execute_lambda(lambda executor: create_table(executor))

Creating indexes

def create_index(transaction_executor): transaction_executor.execute_statement("CREATE INDEX ON Person(GovId)") qldb_driver.execute_lambda(lambda executor: create_index(executor))

Reading documents

# Assumes that Person table has documents as follows: # { "GovId": "TOYENC486FH", "FirstName": "Brent" } def read_documents(transaction_executor): cursor = transaction_executor.execute_statement("SELECT * FROM Person WHERE GovId = 'TOYENC486FH'") for doc in cursor: print(doc["GovId"]) # prints TOYENC486FH print(doc["FirstName"]) # prints Brent qldb_driver.execute_lambda(lambda executor: read_documents(executor))

Using query parameters

The following code example uses a native type query parameter.

cursor = transaction_executor.execute_statement("SELECT * FROM Person WHERE GovId = ?", 'TOYENC486FH')

The following code example uses an Ion type query parameter.

name = ion.loads('Brent') cursor = transaction_executor.execute_statement("SELECT * FROM Person WHERE FirstName = ?", name)

The following code example uses multiple query parameters.

cursor = transaction_executor.execute_statement("SELECT * FROM Person WHERE GovId = ? AND FirstName = ?", 'TOYENC486FH', "Brent")

The following code example uses a list of query parameters.

gov_ids = ['TOYENC486FH','ROEE1','YH844'] cursor = transaction_executor.execute_statement("SELECT * FROM Person WHERE GovId IN (?,?,?)", *gov_ids)
Note

When you run a query without an indexed lookup, it invokes a full table scan. In this example, we recommend having an index on the GovId field to optimize performance. Without an index on GovId, queries can have more latency and can also lead to OCC conflict exceptions or transaction timeouts.

Inserting documents

The following code example inserts native data types.

def insert_documents(transaction_executor, arg_1): # Check if doc with GovId:TOYENC486FH exists # This is critical to make this transaction idempotent cursor = transaction_executor.execute_statement("SELECT * FROM Person WHERE GovId = ?", 'TOYENC486FH') # Check if there is any record in the cursor first_record = next(cursor, None) if first_record: # Record already exists, no need to insert pass else: transaction_executor.execute_statement("INSERT INTO Person ?", arg_1) doc_1 = { 'FirstName': "Brent", 'GovId': 'TOYENC486FH', } qldb_driver.execute_lambda(lambda executor: insert_documents(executor, doc_1))

The following code example inserts Ion data types.

def insert_documents(transaction_executor, arg_1): # Check if doc with GovId:TOYENC486FH exists # This is critical to make this transaction idempotent cursor = transaction_executor.execute_statement("SELECT * FROM Person WHERE GovId = ?", 'TOYENC486FH') # Check if there is any record in the cursor first_record = next(cursor, None) if first_record: # Record already exists, no need to insert pass else: transaction_executor.execute_statement("INSERT INTO Person ?", arg_1) doc_1 = { 'FirstName': 'Brent', 'GovId': 'TOYENC486FH', } # create a sample Ion doc ion_doc_1 = simpleion.loads(simpleion.dumps(doc_1))) qldb_driver.execute_lambda(lambda executor: insert_documents(executor, ion_doc_1))

This transaction inserts a document into the Person table. Before inserting, it first checks if the document already exists in the table. This check makes the transaction idempotent in nature. Even if you run this transaction multiple times, it won't cause any unintended side effects.

Note

In this example, we recommend having an index on the GovId field to optimize performance. Without an index on GovId, statements can have more latency and can also lead to OCC conflict exceptions or transaction timeouts.

Inserting multiple documents in one statement

To insert multiple documents by using a single INSERT statement, you can pass a parameter of type list to the statement as follows.

# people is a list transaction_executor.execute_statement("INSERT INTO Person ?", people)

You don't enclose the variable placeholder (?) in double angle brackets ( <<...>> ) when passing a list. In manual PartiQL statements, double angle brackets denote an unordered collection known as a bag.

Updating documents

The following code example uses native data types.

def update_documents(transaction_executor, gov_id, name): transaction_executor.execute_statement("UPDATE Person SET FirstName = ? WHERE GovId = ?", name, gov_id) gov_id = 'TOYENC486FH' name = 'John' qldb_driver.execute_lambda(lambda executor: update_documents(executor, gov_id, name))

The following code example uses Ion data types.

def update_documents(transaction_executor, gov_id, name): transaction_executor.execute_statement("UPDATE Person SET FirstName = ? WHERE GovId = ?", name, gov_id) # Ion datatypes gov_id = simpleion.loads('TOYENC486FH') name = simpleion.loads('John') qldb_driver.execute_lambda(lambda executor: update_documents(executor, gov_id, name))
Note

In this example, we recommend having an index on the GovId field to optimize performance. Without an index on GovId, statements can have more latency and can also lead to OCC conflict exceptions or transaction timeouts.

Deleting documents

The following code example uses native data types.

def delete_documents(transaction_executor, gov_id): cursor = transaction_executor.execute_statement("DELETE FROM Person WHERE GovId = ?", gov_id) gov_id = 'TOYENC486FH' qldb_driver.execute_lambda(lambda executor: delete_documents(executor, gov_id))

The following code example uses Ion data types.

def delete_documents(transaction_executor, gov_id): cursor = transaction_executor.execute_statement("DELETE FROM Person WHERE GovId = ?", gov_id) # Ion datatypes gov_id = simpleion.loads('TOYENC486FH') qldb_driver.execute_lambda(lambda executor: delete_documents(executor, gov_id))
Note

In this example, we recommend having an index on the GovId field to optimize performance. Without an index on GovId, statements can have more latency and can also lead to OCC conflict exceptions or transaction timeouts.

Running multiple statements in a transaction

# This code snippet is intentionally trivial. In reality you wouldn't do this because you'd # set your UPDATE to filter on vin and insured, and check if you updated something or not. def do_insure_car(transaction_executor, vin): cursor = transaction_executor.execute_statement( "SELECT insured FROM Vehicles WHERE vin = ? AND insured = FALSE", vin) first_record = next(cursor, None) if first_record: transaction_executor.execute_statement( "UPDATE Vehicles SET insured = TRUE WHERE vin = ?", vin) return True else: return False def insure_car(qldb_driver, vin_to_insure): return qldb_driver.execute_lambda( lambda executor: do_insure_car(executor, vin_to_insure))

Retry logic

The driver's execute_lambda method has a built-in retry mechanism that retries the transaction if a retryable exception occurs (such as timeouts or OCC conflicts).

3.x

The maximum number of retry attempts and the backoff strategy are configurable.

The default retry limit is 4, and the default backoff strategy is Exponential Backoff and Jitter with a base of 10 milliseconds. You can set the retry configuration per driver instance and also per transaction by using an instance of pyqldb.config.retry_config.RetryConfig.

The following code example specifies retry logic with a custom retry limit and a custom backoff strategy for an instance of the driver.

from pyqldb.config.retry_config import RetryConfig from pyqldb.driver.qldb_driver import QldbDriver # Configuring retry limit to 2 retry_config = RetryConfig(retry_limit=2) qldb_driver = QldbDriver("test-ledger", retry_config=retry_config) # Configuring a custom backoff which increases delay by 1s for each attempt. def custom_backoff(retry_attempt, error, transaction_id): return 1000 * retry_attempt retry_config_custom_backoff = RetryConfig(retry_limit=2, custom_backoff=custom_backoff) qldb_driver = QldbDriver("test-ledger", retry_config=retry_config_custom_backoff)

The following code example specifies retry logic with a custom retry limit and a custom backoff strategy for a particular lambda execution. This configuration for execute_lambda overrides the retry logic that is set for the driver instance.

from pyqldb.config.retry_config import RetryConfig from pyqldb.driver.qldb_driver import QldbDriver # Configuring retry limit to 2 retry_config_1 = RetryConfig(retry_limit=4) qldb_driver = QldbDriver("test-ledger", retry_config=retry_config_1) # Configuring a custom backoff which increases delay by 1s for each attempt. def custom_backoff(retry_attempt, error, transaction_id): return 1000 * retry_attempt retry_config_2 = RetryConfig(retry_limit=2, custom_backoff=custom_backoff) # The config `retry_config_1` will be overriden by `retry_config_2` qldb_driver.execute_lambda(lambda txn: txn.execute_statement("CREATE TABLE Person"), retry_config_2)
2.x

The maximum number of retry attempts is configurable. You can configure the retry limit by setting the retry_limit property when initializing PooledQldbDriver.

The default retry limit is 4.

Implementing uniqueness constraints

QLDB doesn't support unique indexes, but you can implement this behavior in your application.

Suppose that you want to implement a uniqueness constraint on the GovId field in the Person table. To do this, you can write a transaction that does the following:

  1. Assert that the table has no existing documents with a specified GovId.

  2. Insert the document if the assertion passes.

If a competing transaction concurrently passes the assertion, only one of the transactions will commit successfully. The other transaction will fail with an OCC conflict exception.

The following code example shows how to implement this uniqueness constraint logic.

def insert_documents(transaction_executor, gov_id, document): # Check if doc with GovId = gov_id exists cursor = transaction_executor.execute_statement("SELECT * FROM Person WHERE GovId = ?", gov_id) # Check if there is any record in the cursor first_record = next(cursor, None) if first_record: # Record already exists, no need to insert pass else: transaction_executor.execute_statement("INSERT INTO Person ?", document) qldb_driver.execute_lambda(lambda executor: insert_documents(executor, gov_id, document))
Note

In this example, we recommend having an index on the GovId field to optimize performance. Without an index on GovId, statements can have more latency and can also lead to OCC conflict exceptions or transaction timeouts.

Working with Amazon Ion

The following sections show how to use the Amazon Ion module to process Ion data.

Importing the Ion module

import amazon.ion.simpleion as simpleion

Creating Ion types

The following code example creates an Ion object from Ion text.

ion_text = '{GovId: "TOYENC486FH", FirstName: "Brent"}' ion_obj = simpleion.loads(ion_text) print(ion_obj['GovId']) # prints TOYENC486FH print(ion_obj['Name']) # prints Brent

The following code example creates an Ion object from a Python dict.

a_dict = { 'GovId': 'TOYENC486FH', 'FirstName': "Brent" } ion_obj = simpleion.loads(simpleion.dumps(a_dict)) print(ion_obj['GovId']) # prints TOYENC486FH print(ion_obj['FirstName']) # prints Brent

Getting an Ion binary dump

# ion_obj is an Ion struct print(simpleion.dumps(ion_obj)) # b'\xe0\x01\x00\xea\xee\x97\x81\x83\xde\x93\x87\xbe\x90\x85GovId\x89FirstName\xde\x94\x8a\x8bTOYENC486FH\x8b\x85Brent'

Getting an Ion text dump

# ion_obj is an Ion struct print(simpleion.dumps(ion_obj, binary=False)) # prints $ion_1_0 {GovId:'TOYENC486FH',FirstName:"Brent"}

For more information about working with Ion, see the Amazon Ion documentation on GitHub. For more code examples of working with Ion in QLDB, see Working with Amazon Ion data types in Amazon QLDB.