Streaming journal data from Amazon QLDB - Amazon Quantum Ledger Database (Amazon QLDB)

Amazon QLDB uses an immutable transactional log, known as a journal, for data storage. The journal tracks every change to your committed data and maintains a complete and verifiable history of changes over time.

You can create a stream in QLDB that captures every document revision that is committed to your journal and delivers this data to Amazon Kinesis Data Streams in near-real time. A QLDB stream is a continuous flow of data from your ledger's journal to a Kinesis data stream resource.

You can then use the Kinesis streaming platform or the Kinesis Client Library to consume your stream, process the data records, and analyze their contents. A QLDB stream writes your data to Kinesis Data Streams in three types of records: control, block summary, and revision details. For more information, see QLDB stream records in Kinesis.
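As a rough illustration of how a consumer might separate the three record types, the following Python sketch groups records by type. It assumes that each record has already been decoded into a dictionary with a `recordType` field and a `payload` field, and that the type names are `CONTROL`, `BLOCK_SUMMARY`, and `REVISION_DETAILS`. These names are assumptions here, so check them against QLDB stream records in Kinesis. The function name `route_records` is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

# Assumed record type names; verify them in "QLDB stream records in Kinesis".
CONTROL = "CONTROL"
BLOCK_SUMMARY = "BLOCK_SUMMARY"
REVISION_DETAILS = "REVISION_DETAILS"
KNOWN_TYPES = (CONTROL, BLOCK_SUMMARY, REVISION_DETAILS)


def route_records(records: Iterable[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Group the payloads of already-decoded stream records by record type.

    Decoding the raw Kinesis data into dictionaries is outside the scope
    of this sketch.
    """
    grouped: Dict[str, List[Any]] = defaultdict(list)
    for record in records:
        record_type = record.get("recordType")
        if record_type not in KNOWN_TYPES:
            # Fail loudly instead of silently dropping unknown records.
            raise ValueError(f"unexpected record type: {record_type!r}")
        grouped[record_type].append(record["payload"])
    return dict(grouped)
```

A consumer could then pass only the revision details payloads to its business logic and use the control and block summary payloads for bookkeeping.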

Common use cases

Streaming lets you use QLDB as a single, verifiable source of truth while integrating your journal data with other services. The following are some of the common use cases supported by QLDB journal streams:

  • Event-driven architecture – Build applications in an event-driven architectural style with decoupled components. For example, a bank can use AWS Lambda functions to implement a notification system that alerts customers when their account balance drops below a threshold. In such a system, the account balances are maintained in a QLDB ledger, and any balance changes are recorded in the journal. The AWS Lambda function can trigger the notification logic upon consuming a balance update event that is committed to the journal and sent to a Kinesis data stream.

  • Real-time analytics – Build Kinesis consumer applications that run real-time analytics on event data. With this capability, you can gain insights in near-real time and respond quickly to a changing business environment. For example, an ecommerce website can analyze product sales data and stop advertisements for a discounted product as soon as sales reach a limit.

  • Historical analytics – Take advantage of the journal-oriented architecture of Amazon QLDB by replaying historical event data. You can choose to start a QLDB stream as of any point in time in the past, in which case all revisions since that time are delivered to Kinesis Data Streams. Using this feature, you can build Kinesis consumer applications that run analytics jobs on historical data. For example, an ecommerce website can run ad hoc analytics to generate past sales metrics that were not previously captured.

  • Replication to purpose-built databases – Connect QLDB ledgers to other purpose-built data stores using QLDB journal streams. For example, use the Kinesis streaming data platform to integrate with Amazon Elasticsearch Service, which can provide full text search capabilities for QLDB documents. You can also build custom Kinesis consumer applications to replicate your journal data to other purpose-built databases that provide different materialized views. For example, replicate to Amazon Aurora for relational data or to Amazon Neptune for graph-based data.
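To make the event-driven example above more concrete, here is a minimal Python sketch of the notification check that a consumer (such as an AWS Lambda function) might run on each revision. It assumes the revision details payload has already been decoded into a dictionary with `tableInfo.tableName` and `revision.data` fields. The table name `Accounts`, the document fields `accountId` and `balance`, the threshold value, and the function name are all hypothetical.

```python
from decimal import Decimal
from typing import Any, Dict, Optional

ACCOUNTS_TABLE = "Accounts"              # hypothetical table name
LOW_BALANCE_THRESHOLD = Decimal("100")   # hypothetical threshold


def low_balance_alert(
    payload: Dict[str, Any],
    threshold: Decimal = LOW_BALANCE_THRESHOLD,
    table_name: str = ACCOUNTS_TABLE,
) -> Optional[str]:
    """Return an alert message if the revision drops a balance below the
    threshold, or None if no alert is needed."""
    # Ignore revisions that belong to other tables.
    if payload.get("tableInfo", {}).get("tableName") != table_name:
        return None

    # A revision without user data (for example, a deletion) has no balance.
    data = payload.get("revision", {}).get("data")
    if not data or "balance" not in data:
        return None

    balance = Decimal(str(data["balance"]))
    if balance < threshold:
        return (
            f"Account {data.get('accountId')} balance {balance} "
            f"is below {threshold}"
        )
    return None
```

Because the function only inspects one decoded revision and returns a message, the same check could be reused from a Lambda handler or from a KCL record processor. Sending the notification itself is left out of this sketch.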

Consuming your stream

Use Kinesis Data Streams to continuously consume, process, and analyze large streams of data records. In addition to Kinesis Data Streams, the Kinesis streaming data platform includes Amazon Kinesis Data Firehose and Amazon Kinesis Data Analytics. You can use this platform to send data records directly to services such as Amazon Elasticsearch Service (Amazon ES), Amazon Redshift, Amazon Simple Storage Service (Amazon S3), or Splunk. For more information, see Kinesis Data Streams consumers in the Amazon Kinesis Data Streams Developer Guide.

You can also use the Kinesis Client Library (KCL) to build a stream consumer application to process data records in a custom way. The KCL simplifies coding by providing useful abstractions above the low-level Kinesis Data Streams API. To learn more about the KCL, see Developing consumers using the Kinesis Client Library in the Amazon Kinesis Data Streams Developer Guide.

Delivery guarantee

QLDB streams provide an at-least-once delivery guarantee. Each data record that is produced by a QLDB stream is delivered to Kinesis Data Streams at least once, so the same record can appear in a Kinesis data stream multiple times. If your use case cannot tolerate duplicates, you must implement deduplication logic in the consumer application layer.

There are also no ordering guarantees. In some circumstances, QLDB blocks and revisions can be produced in a Kinesis data stream out of order. For more information, see Handling duplicate and out-of-order records.
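One possible way to tolerate both duplicates and out-of-order delivery is to make the consumer idempotent by tracking the highest revision version seen for each document. The following Python sketch assumes each decoded revision carries a `metadata` dictionary with the document `id` and a monotonically increasing `version`. The class name `LatestRevisionView` and the in-memory dictionary are illustrative only; a real consumer would likely persist this state.

```python
from typing import Any, Dict, Optional, Tuple


class LatestRevisionView:
    """Keeps only the newest revision of each document.

    Applying the same revision twice, or applying an older revision after
    a newer one, leaves the view unchanged.
    """

    def __init__(self) -> None:
        # document id -> (version, data)
        self._latest: Dict[str, Tuple[int, Any]] = {}

    def apply(self, revision: Dict[str, Any]) -> bool:
        """Apply a revision. Return True if the view changed."""
        metadata = revision["metadata"]
        doc_id, version = metadata["id"], metadata["version"]

        current = self._latest.get(doc_id)
        if current is not None and version <= current[0]:
            # Duplicate (same version) or stale (older version): skip it.
            return False

        self._latest[doc_id] = (version, revision.get("data"))
        return True

    def get(self, doc_id: str) -> Optional[Any]:
        """Return the newest known data for a document, or None."""
        entry = self._latest.get(doc_id)
        return None if entry is None else entry[1]
```

This approach suits consumers that only need the current state of each document. A consumer that must process every revision exactly once and in order would instead need to buffer records and track which versions it has already handled.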

Getting started with streams

The following is a high-level overview of the steps that are required to get started with streaming journal data to Kinesis Data Streams:

  1. Create a Kinesis Data Streams resource. For instructions, see Creating and updating data streams in the Amazon Kinesis Data Streams Developer Guide.

  2. Create an IAM role that QLDB can assume to get write permissions for the Kinesis data stream. For instructions, see Stream permissions in QLDB.

  3. Create a QLDB journal stream. For instructions, see Creating and managing streams in QLDB.

  4. Consume the Kinesis data stream, as described in the previous section Consuming your stream. For code examples that show how to use the Kinesis Client Library or AWS Lambda, see Developing with streams in QLDB.
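Steps 1 through 3 above can be sketched with the AWS SDK for Python (Boto3). The following is a minimal sketch, not a definitive setup: the resource names, shard count, and start time are placeholders, and the list of Kinesis actions in the permissions policy is an assumption that you should confirm in Stream permissions in QLDB. The helper functions only build request data, and `create_stream_resources` makes the actual calls, so it requires Boto3 and AWS credentials.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict


def qldb_trust_policy() -> str:
    """Trust policy that lets the QLDB service assume the role."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "qldb.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    })


def kinesis_write_policy(stream_arn: str) -> str:
    """Permissions policy for writing to one Kinesis data stream.

    The action list is an assumption; see "Stream permissions in QLDB".
    """
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["kinesis:PutRecord*", "kinesis:DescribeStream",
                       "kinesis:ListShards"],
            "Resource": stream_arn,
        }],
    })


def build_stream_request(ledger: str, role_arn: str, kinesis_arn: str,
                         stream_name: str, start: datetime) -> Dict[str, Any]:
    """Build the arguments for the QLDB StreamJournalToKinesis operation."""
    return {
        "LedgerName": ledger,
        "RoleArn": role_arn,
        "InclusiveStartTime": start,  # can be a point in time in the past
        "KinesisConfiguration": {"StreamArn": kinesis_arn},
        "StreamName": stream_name,
    }


def create_stream_resources(ledger: str, name: str) -> str:
    """Run steps 1-3 and return the QLDB stream ID (placeholder names)."""
    import boto3  # imported here so the helpers above work without Boto3

    kinesis, iam, qldb = (boto3.client(s) for s in ("kinesis", "iam", "qldb"))

    # Step 1: create the Kinesis data stream and wait until it exists.
    kinesis.create_stream(StreamName=name, ShardCount=1)
    kinesis.get_waiter("stream_exists").wait(StreamName=name)
    summary = kinesis.describe_stream_summary(StreamName=name)
    kinesis_arn = summary["StreamDescriptionSummary"]["StreamARN"]

    # Step 2: create the IAM role that QLDB assumes.
    role = iam.create_role(RoleName=f"{name}-role",
                           AssumeRolePolicyDocument=qldb_trust_policy())
    iam.put_role_policy(RoleName=f"{name}-role", PolicyName="kinesis-write",
                        PolicyDocument=kinesis_write_policy(kinesis_arn))

    # Step 3: create the QLDB journal stream, starting from now.
    request = build_stream_request(ledger, role["Role"]["Arn"], kinesis_arn,
                                   name, datetime.now(timezone.utc))
    return qldb.stream_journal_to_kinesis(**request)["StreamId"]
```

Note that a newly created IAM role can take a short time to become usable, so in practice the last call might need a retry. To replay historical data, pass an earlier `start` value to `build_stream_request`.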