Amazon Quantum Ledger Database (Amazon QLDB)
Developer Guide

Data Verification in Amazon QLDB

With Amazon QLDB, you can trust that the history of changes to your application data is accurate. QLDB uses an immutable transactional log, known as a journal, for data storage. The journal tracks every change to your data and maintains a complete and verifiable history of changes over time.

Sourcing data from the journal, QLDB uses a cryptographic hash function (SHA-256) with a Merkle tree–based model to generate a secure output file of your ledger's full hash chain. This output file is known as a digest and acts as a fingerprint of your data’s entire change history as of a point in time. It enables you to look back and validate the integrity of your data revisions relative to that fingerprint.
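
The following is a minimal sketch of why a hash chain makes history tamper evident: each block hash covers the previous block hash, so changing any earlier block changes every hash that follows it. The field layout and chaining shown here are illustrative only, not the exact QLDB journal format.

import hashlib

def sha256(data: bytes) -> bytes:
    """Return the SHA-256 hash of the given bytes."""
    return hashlib.sha256(data).digest()

# Illustrative only: a simplified hash chain, not the exact QLDB journal format.
revisions = [b'revision 1', b'revision 2', b'revision 3']

previous_hash = b'\x00' * 32          # placeholder hash for the first block
chain = []
for block_data in revisions:
    # Each block hash covers the block's data plus the previous block hash,
    # so altering any earlier block invalidates every later hash.
    block_hash = sha256(previous_hash + block_data)
    chain.append(block_hash)
    previous_hash = block_hash

print([h.hex()[:8] for h in chain])   # abbreviated hashes for readability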

What Kind of Data Can You Verify in QLDB?

In QLDB, each ledger has exactly one journal. A journal can have multiple strands, which are partitions of the journal.

Note

QLDB currently supports journals with a single strand only.

A block is an object that is committed to the journal strand during a transaction. This block contains entry objects, which represent the document revisions that resulted from the transaction. These entries are the objects whose integrity you can verify in QLDB.

The following diagram illustrates this journal structure.


Diagram: Amazon QLDB journal structure showing a set of chained blocks that make up a strand, with each block's sequence number and block hash.

The diagram shows that a block is committed to the strand during a transaction and results in document revision entries within that block. It also shows that each block is hashed for verification and has a sequence number to specify its address within the strand.
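
As a rough model of this structure, the following sketch represents a block and its entries as simple Python data classes. The field names are simplified and illustrative; they are not the exact schema of a QLDB journal block.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Entry:
    """A document revision that resulted from a transaction (illustrative)."""
    document_id: str
    revision_hash: bytes   # SHA-256 hash of the revision's contents

@dataclass
class Block:
    """A block committed to a journal strand during a transaction (illustrative)."""
    sequence_no: int       # the block's address within the strand
    block_hash: bytes      # hash used for verification
    entries: List[Entry] = field(default_factory=list)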

What Does Data Integrity Mean?

Data integrity in QLDB means that your ledger's journal is in fact immutable. In other words, your data (specifically, each document revision) is in a state where the following are true:

  1. It exists at the same location in your journal where it was first written.

  2. It hasn't been altered in any way since it was written.

Why Does a Merkle Tree Enable Verification?

Fundamentally, a Merkle tree is a tree data structure in which each leaf node represents a hash of a data block. Each non-leaf node is a hash of its child nodes. Commonly used in blockchains, a Merkle tree enables efficient verification of large datasets with an audit proof mechanism. For more information about Merkle trees, see the Merkle tree Wikipedia page. To learn more about Merkle audit proofs and for an example use case, see How Log Proofs Work on the Certificate Transparency site.
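
To make the structure concrete, here is a minimal sketch of building a Merkle root from a list of leaf hashes. It uses plain concatenation of sibling hashes and duplicates the last node at odd levels; these are common textbook conventions and not necessarily how QLDB combines hashes internally.

import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaf_hashes):
    """Compute a Merkle root by repeatedly hashing pairs of child nodes."""
    level = list(leaf_hashes)
    if not level:
        raise ValueError("at least one leaf hash is required")
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])   # duplicate the last node at odd levels
        # Each non-leaf node is the hash of its two child nodes.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

leaves = [sha256(doc) for doc in (b'doc 1', b'doc 2', b'doc 3', b'doc 4')]
print(merkle_root(leaves).hex())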

Amazon QLDB uses a modified binary hash tree, based on the Merkle tree, in which the leaf nodes are the set of all document hashes in your journal. The root node represents the digest of the entire journal as of a point in time. QLDB uses SHA-256 as its cryptographic hash function, so each hash is a 256-bit value.

Using a Merkle audit proof, you can verify that a document revision exists in your journal and is in the correct position relative to the digest, without having to check your ledger's entire document history. You do this by traversing the tree from a leaf node to the root, which means that you only need to compute the node hashes along this audit path. This process has a time complexity of O(log n), where n is the number of leaf nodes in the tree. A proof in QLDB is simply the ordered list of node hashes required to mathematically transform a given leaf node hash (a document revision hash) into the root hash (the digest).

Hash Tree Diagrams

The following diagram illustrates the Amazon QLDB hash tree model. It shows a set of block hashes that rolls up to the top root node, which represents the digest of a journal strand. In a ledger with a single-strand journal, this root node is also the digest of the entire ledger.


Diagram: Amazon QLDB hash tree for a set of block hashes in a journal strand.

Suppose that node A is the block that contains the document revision whose hash you want to verify. The hashes B, E, and G make up the ordered list that QLDB provides in your proof. These hashes are required to recalculate the digest from hash A.

To do this, start with hash A and concatenate it with hash B. Then, hash the result to compute D. Next, use D and E to compute F. Finally, use F and G to compute the digest. The verification is successful if your recalculated digest matches the expected value. It's not mathematically feasible to reverse engineer the hashes in a proof. Therefore, this exercise proves that the contents of your document were indeed written in this journal location relative to the digest.
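
The following sketch walks through that same calculation in code. For illustration, it assumes that each parent node is the SHA-256 hash of its two children concatenated in the order described above; QLDB's client-side verification logic handles the exact pairing and ordering for you, so treat this as a model of the arithmetic, not as the official verification routine. The leaf and proof values are hypothetical.

import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# Hypothetical leaf and proof hashes, standing in for A and the proof (B, E, G).
hash_a = sha256(b'document revision in block A')
proof = [sha256(b'B'), sha256(b'E'), sha256(b'G')]

# Recalculate the digest: A + B -> D, D + E -> F, F + G -> digest.
current = hash_a
for sibling in proof:
    current = sha256(current + sibling)

recalculated_digest = current
# Verification succeeds if the recalculated digest matches the trusted, saved digest.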

What Is the Verification Process in QLDB?

Before you can verify data, you must request a digest from your ledger and save it for later. Any document revision that was committed at or before the latest block covered by that digest is eligible for verification against it.

Then, you request a proof from Amazon QLDB for an eligible document revision that you want to verify. Using this proof, you call a client-side API to recalculate the digest, starting with your revision hash. As long as the previously saved digest is known and trusted outside of QLDB, the integrity of your document is proven if your recalculated digest hash matches the saved digest hash.

Note

What you are specifically proving is that the document revision was not altered between the time that you saved this digest and when you run the verification.
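
As a sketch of this workflow using the AWS SDK for Python (Boto3), the following requests a digest and then a proof for one revision. The ledger name, block address, and document ID are placeholders, and the response fields shown in the comments should be checked against the current QLDB API reference; the final hash recalculation is left to your client-side verification code.

import boto3

qldb = boto3.client('qldb')

# Step 1: Request a digest and save it (and its tip address) somewhere you trust.
digest_response = qldb.get_digest(Name='my-ledger')        # placeholder ledger name
saved_digest = digest_response['Digest']
digest_tip_address = digest_response['DigestTipAddress']   # Amazon Ion text value

# Step 2: Request a proof for the revision that you want to verify. The block
# address and document ID below are placeholders for values that you would look
# up from your ledger (for example, by querying its committed views).
revision_response = qldb.get_revision(
    Name='my-ledger',
    BlockAddress={'IonText': '{strandId: "exampleStrandId", sequenceNo: 5}'},
    DocumentId='exampleDocumentId',
    DigestTipAddress=digest_tip_address,
)
proof = revision_response['Proof']          # ordered node hashes for the audit path
revision = revision_response['Revision']    # the revision, including its hash

# Step 3: Recalculate the digest from the revision hash and the proof hashes
# (client-side), and compare the result with saved_digest.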

For step-by-step guides on how to request a digest from your ledger and then verify your data, see the following: