Journal export output in QLDB - Amazon Quantum Ledger Database (Amazon QLDB)

An Amazon QLDB journal export job writes two manifest files in addition to the data objects that contain your journal blocks. These are all saved in the Amazon S3 bucket that you provided in your export request. The following sections describe the format and contents of each output object.

Note

If you specify JSON as the output format of your export job, QLDB down-converts the Amazon Ion journal data to JSON in your exported data objects. For more information, see Down-converting to JSON.

Manifest files

Amazon QLDB creates two manifest files in the provided S3 bucket for each export request. The initial manifest file is created as soon as you submit the export request. The final manifest file is written after the export is complete. You can use these files to check the status of your export jobs in Amazon S3.

The format for the contents of the manifest files corresponds to the requested output format for the export.
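As a minimal sketch of using the manifests to check status, the following Python function classifies an export job from a list of object keys that you have already retrieved from the bucket. It relies only on the manifest file-naming conventions described in the Initial manifest and Final manifest sections. The function name export_status and the status labels are hypothetical and aren't part of any QLDB API, and the sketch assumes a single-strand journal, so one final manifest is treated as completion.

```python
def export_status(object_keys, export_id):
    """Classify an export job from the manifest object names found in S3.

    object_keys: iterable of S3 object keys (for example, from a bucket listing).
    export_id:   the exportId that QLDB assigned to the export job.
    The returned labels are illustrative only, not QLDB status values.
    """
    # Compare on the file name only, so that any prefix is ignored.
    names = [key.rsplit("/", 1)[-1] for key in object_keys]

    started = f"{export_id}.started.manifest" in names
    completed = any(
        name.startswith(f"{export_id}.") and name.endswith(".completed.manifest")
        for name in names
    )

    if completed:
        return "COMPLETED"
    if started:
        return "STARTED"
    return "NOT_FOUND"


keys = ["journalExport/8UyXulxccYLAsbN1aon7e4.started.manifest"]
print(export_status(keys, "8UyXulxccYLAsbN1aon7e4"))
```

For an authoritative status, the DescribeJournalS3Export API operation remains the better source. This sketch only reflects which manifest objects exist.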

Initial manifest

The initial manifest indicates that your export job has started. It contains the input parameters that you passed to the request. In addition to the Amazon S3 destination and the start and end time parameters for the export, this file also contains an exportId. The exportId is a unique ID that QLDB assigns to each export job.

The file-naming convention is as follows.

s3://DOC-EXAMPLE-BUCKET/prefix/exportId.started.manifest

The following is an example of an initial manifest file and its contents in Ion text format.

s3://DOC-EXAMPLE-BUCKET/journalExport/8UyXulxccYLAsbN1aon7e4.started.manifest
{
  ledgerName:"my-example-ledger",
  exportId:"8UyXulxccYLAsbN1aon7e4",
  inclusiveStartTime:2019-04-15T00:00:00.000Z,
  exclusiveEndTime:2019-04-15T22:00:00.000Z,
  bucket:"DOC-EXAMPLE-BUCKET",
  prefix:"journalExport",
  objectEncryptionType:"NO_ENCRYPTION",
  outputFormat:"ION_TEXT"
}

The initial manifest includes the outputFormat only if it was specified in the export request. If you don't specify the output format, the exported data defaults to ION_TEXT format.

The DescribeJournalS3Export API operation and the content type of the exported Amazon S3 objects also indicate the output format.

Final manifest

The final manifest indicates that your export job for a particular journal strand has completed. The export job writes a separate final manifest file for each strand.

Note

In Amazon QLDB, a strand is a partition of your ledger's journal. QLDB currently supports journals with a single strand only.

The final manifest includes an ordered list of data object keys that were written during the export. The file-naming convention is as follows.

s3://DOC-EXAMPLE-BUCKET/prefix/exportId.strandId.completed.manifest

The strandId is a unique ID that QLDB assigns to the strand. The following is an example of a final manifest file and its contents in Ion text format.

s3://DOC-EXAMPLE-BUCKET/journalExport/8UyXulxccYLAsbN1aon7e4.JdxjkR9bSYB5jMHWcI464T.completed.manifest
{
  keys:[
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.1-4.ion",
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.5-10.ion",
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.11-12.ion",
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.13-20.ion",
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.21-21.ion"
  ]
}
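Because the key list is ordered, one possible sanity check is to confirm that the sequence number ranges in consecutive keys join without gaps, as they do in the preceding example (1-4, 5-10, 11-12, and so on). The following Python sketch assumes that the keys have already been read from the manifest into a list of strings. The helper names sequence_ranges and find_gaps are hypothetical, and the expectation that ranges are contiguous is inferred from the example rather than stated as a guarantee.

```python
import re

# Matches the trailing ".startSn-endSn.ion" or ".startSn-endSn.json" of a key.
KEY_PATTERN = re.compile(r"\.(\d+)-(\d+)\.(?:ion|json)$")


def sequence_ranges(keys):
    """Extract (startSn, endSn) from each data object key, preserving order."""
    ranges = []
    for key in keys:
        match = KEY_PATTERN.search(key)
        if match is None:
            raise ValueError(f"unexpected data object key: {key}")
        ranges.append((int(match.group(1)), int(match.group(2))))
    return ranges


def find_gaps(keys):
    """Return (previous_end, next_start) pairs where consecutive ranges don't join."""
    ranges = sequence_ranges(keys)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:]):
        if next_start != prev_end + 1:
            gaps.append((prev_end, next_start))
    return gaps


manifest_keys = [
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.1-4.ion",
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.5-10.ion",
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.11-12.ion",
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.13-20.ion",
    "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.21-21.ion",
]
print(find_gaps(manifest_keys))  # an empty list means no gaps were found
```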

Data objects

Amazon QLDB writes journal data objects in the provided Amazon S3 bucket in either the text or binary representation of Amazon Ion format, or in JSON Lines text format.

In JSON Lines format, each block in an exported data object is a valid JSON object that is delimited by a newline. You can use this format to directly integrate JSON exports with analytics tools such as Amazon Athena and AWS Glue because these services can parse newline-delimited JSON automatically. For more information about the format, see JSON Lines.
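To illustrate the newline-delimited layout, the following Python sketch reads blocks from JSON Lines text by using only the standard library. It assumes that the object's contents are already available as an iterable of text lines (for example, an open file). The function name iter_blocks and the two-line sample are hypothetical and heavily abbreviated compared to a real exported block.

```python
import json


def iter_blocks(lines):
    """Yield one parsed journal block per non-empty line of JSON Lines text."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate a trailing blank line
            yield json.loads(line)


# Abbreviated, made-up sample: a real block contains many more fields.
sample = (
    '{"blockAddress": {"strandId": "JdxjkR9bSYB5jMHWcI464T", "sequenceNo": 1}}\n'
    '{"blockAddress": {"strandId": "JdxjkR9bSYB5jMHWcI464T", "sequenceNo": 2}}\n'
)

for block in iter_blocks(sample.splitlines()):
    print(block["blockAddress"]["sequenceNo"])
```

Parsing line by line keeps memory use proportional to a single block rather than to the whole data object.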

Data object names

A journal export job writes these data objects with the following naming convention.

s3://DOC-EXAMPLE-BUCKET/prefix/yyyy/mm/dd/hh/strandId.startSn-endSn.ion|.json

  • The output data of each export job is broken up into chunks.

  • yyyy/mm/dd/hh – The date and time when you submitted the export request. Objects that are exported within the same hour are grouped under the same Amazon S3 prefix.

  • strandId – The unique ID of the particular strand that contains the journal block that is being exported.

  • startSn-endSn – The sequence number range that is included in the object. A sequence number specifies the location of a block within a strand.

For example, suppose that you specify the following path.

s3://DOC-EXAMPLE-BUCKET/journalExport/

Your export job creates an Amazon S3 data object that looks similar to the following. This example shows an object name in Ion format.

s3://DOC-EXAMPLE-BUCKET/journalExport/2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.1-5.ion
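The naming convention can be split back into its components with a regular expression. The following Python sketch is one way to do that. The names DataObjectName and parse_data_object_key are hypothetical, and the pattern assumes that a strand ID contains no slash or period characters, as in the examples on this page.

```python
import re
from typing import NamedTuple


class DataObjectName(NamedTuple):
    year: int
    month: int
    day: int
    hour: int
    strand_id: str
    start_sn: int
    end_sn: int
    extension: str


# yyyy/mm/dd/hh/strandId.startSn-endSn.ion|.json, anchored at the end of the key
# so that any leading prefix (such as "journalExport/") is ignored.
_NAME_PATTERN = re.compile(
    r"(\d{4})/(\d{2})/(\d{2})/(\d{2})/([^/.]+)\.(\d+)-(\d+)\.(ion|json)$"
)


def parse_data_object_key(key):
    """Split a data object key or URI into the parts of the naming convention."""
    match = _NAME_PATTERN.search(key)
    if match is None:
        raise ValueError(f"not a journal data object name: {key}")
    year, month, day, hour, strand_id, start_sn, end_sn, extension = match.groups()
    return DataObjectName(
        int(year), int(month), int(day), int(hour),
        strand_id, int(start_sn), int(end_sn), extension,
    )


name = parse_data_object_key(
    "s3://DOC-EXAMPLE-BUCKET/journalExport/2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.1-5.ion"
)
print(name.strand_id, name.start_sn, name.end_sn)
```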

Data object contents

Each data object contains journal block objects with the following format.

{
  blockAddress: {
    strandId: String,
    sequenceNo: Int
  },
  transactionId: String,
  blockTimestamp: Datetime,
  blockHash: SHA256,
  entriesHash: SHA256,
  previousBlockHash: SHA256,
  entriesHashList: [ SHA256 ],
  transactionInfo: {
    statements: [
      {
        //PartiQL statement object
      }
    ],
    documents: {
      //document-table-statement mapping object
    }
  },
  revisions: [
    {
      //document revision object
    }
  ]
}

A block is an object that is committed to the journal during a transaction. A block contains transaction metadata along with entries that represent the document revisions that were committed in the transaction and the PartiQL statements that committed them.

The following is an example of a block with sample data in Ion text format. For information about the fields in a block object, see Journal contents in Amazon QLDB.

Note

This block example is provided for informational purposes only. The hashes shown aren't real calculated hash values.

{
  blockAddress:{
    strandId:"JdxjkR9bSYB5jMHWcI464T",
    sequenceNo:1234
  },
  transactionId:"D35qctdJRU1L1N2VhxbwSn",
  blockTimestamp:2019-10-25T17:20:21.009Z,
  blockHash:{{WYLOfZClk0lYWT3lUsSr0ONXh+Pw8MxxB+9zvTgSvlQ=}},
  entriesHash:{{xN9X96atkMvhvF3nEy6jMSVQzKjHJfz1H3bsNeg8GMA=}},
  previousBlockHash:{{IAfZ0h22ZjvcuHPSBCDy/6XNQTsqEmeY3GW0gBae8mg=}},
  entriesHashList:[
    {{F7rQIKCNn0vXVWPexilGfJn5+MCrtsSQqqVdlQxXpS4=}},
    {{C+L8gRhkzVcxt3qRJpw8w6hVEqA5A6ImGne+E7iHizo=}}
  ],
  transactionInfo:{
    statements:[
      {
        statement:"CREATE TABLE VehicleRegistration",
        startTime:2019-10-25T17:20:20.496Z,
        statementDigest:{{3jeSdejOgp6spJ8huZxDRUtp2fRXRqpOMtG43V0nXg8=}}
      },
      {
        statement:"CREATE INDEX ON VehicleRegistration (VIN)",
        startTime:2019-10-25T17:20:20.549Z,
        statementDigest:{{099D+5ZWDgA7r+aWeNUrWhc8ebBTXjgscq+mZ2dVibI=}}
      },
      {
        statement:"CREATE INDEX ON VehicleRegistration (LicensePlateNumber)",
        startTime:2019-10-25T17:20:20.560Z,
        statementDigest:{{B73tVJzVyVXicnH4n96NzU2L2JFY8e9Tjg895suWMew=}}
      },
      {
        statement:"INSERT INTO VehicleRegistration ?",
        startTime:2019-10-25T17:20:20.595Z,
        statementDigest:{{ggpon5qCXLo95K578YVhAD8ix0A0M5CcBx/W40Ey/Tk=}}
      }
    ],
    documents:{
      '8F0TPCmdNQ6JTRpiLj2TmW':{
        tableName:"VehicleRegistration",
        tableId:"BPxNiDQXCIB5l5F68KZoOz",
        statements:[3]
      }
    }
  },
  revisions:[
    {
      hash:{{FR1IWcWew0yw1TnRklo2YMF/qtwb7ohsu5FD8A4DSVg=}}
    },
    {
      blockAddress:{
        strandId:"JdxjkR9bSYB5jMHWcI464T",
        sequenceNo:1234
      },
      hash:{{t8Hj6/VC4SBitxnvBqJbOmrGytF2XAA/1c0AoSq2NQY=}},
      data:{
        VIN:"1N4AL11D75C109151",
        LicensePlateNumber:"LEWISR261LL",
        State:"WA",
        City:"Seattle",
        PendingPenaltyTicketAmount:90.25,
        ValidFromDate:2017-08-21,
        ValidToDate:2020-05-11,
        Owners:{
          PrimaryOwner:{ PersonId:"GddsXfIYfDlKCEprOLOwYt" },
          SecondaryOwners:[]
        }
      },
      metadata:{
        id:"8F0TPCmdNQ6JTRpiLj2TmW",
        version:0,
        txTime:2019-10-25T17:20:20.618Z,
        txId:"D35qctdJRU1L1N2VhxbwSn"
      }
    }
  ]
}

In the revisions field, some revision objects might only contain a hash value and no other attributes. These are internal-only system revisions that don't contain user data. An export job includes these revisions in their respective blocks because the hashes of these revisions are part of the journal's full hash chain. The full hash chain is required for cryptographic verification.
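When you process blocks for analytics rather than verification, you might want to skip these hash-only revisions. The following Python sketch shows one way to separate them, assuming that a block has already been parsed into a Python dictionary (for example, from a JSON export). The function names is_system_revision and user_revisions are hypothetical, and the sample block is abbreviated with placeholder hash strings.

```python
def is_system_revision(revision):
    """Return True if the revision carries only a hash and no other attributes."""
    return set(revision.keys()) == {"hash"}


def user_revisions(block):
    """Return the revisions in a block that contain more than a hash."""
    return [
        revision
        for revision in block.get("revisions", [])
        if not is_system_revision(revision)
    ]


# Abbreviated sample block; the hash values are placeholders, not real hashes.
sample_block = {
    "revisions": [
        {"hash": "placeholder-hash-1"},
        {
            "hash": "placeholder-hash-2",
            "data": {"VIN": "1N4AL11D75C109151"},
            "metadata": {"id": "8F0TPCmdNQ6JTRpiLj2TmW", "version": 0},
        },
    ]
}

for revision in user_revisions(sample_block):
    print(revision["metadata"]["id"])
```

Keep the original blocks intact if you also plan to verify the export, because the hash-only revisions are part of the hash chain.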

Down-converting to JSON

If you specify JSON as the output format of your export job, QLDB down-converts the Amazon Ion journal data to JSON in your exported data objects. However, converting Ion to JSON is lossy in certain cases where your data uses the rich Ion types that don't exist in JSON.

For details about Ion to JSON conversion rules, see Down-converting to JSON in the Amazon Ion Cookbook.
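As a rough illustration of what the loss of type information can mean in practice, the following Python sketch restores two values after reading a JSON export. It assumes, based on the Ion Cookbook rules, that Ion blobs such as blockHash arrive as base64-encoded JSON strings and that Ion timestamps such as blockTimestamp arrive as JSON strings in the same text form shown in the Ion examples on this page. Verify both assumptions against your own exported data. The function names decode_hash and parse_timestamp are hypothetical.

```python
import base64
from datetime import datetime, timezone


def decode_hash(value):
    """Decode a base64 string (assumed form of a down-converted blob) to bytes."""
    return base64.b64decode(value)


def parse_timestamp(value):
    """Parse a UTC timestamp string with fractional seconds and a trailing Z.

    Assumes the form 2019-10-25T17:20:21.009Z. Other timestamp precisions
    or offsets would need additional handling.
    """
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


block_hash = decode_hash("WYLOfZClk0lYWT3lUsSr0ONXh+Pw8MxxB+9zvTgSvlQ=")
block_time = parse_timestamp("2019-10-25T17:20:21.009Z")
print(len(block_hash), block_time.isoformat())
```

After conversion, nothing in the JSON itself distinguishes these strings from ordinary string values, so your code has to know from the block format which fields to decode.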

Export processor library (Java)

QLDB provides an extensible framework for Java that streamlines the processing of exports in Amazon S3. This framework library handles the work of reading an export's output and iterating through the exported blocks in sequential order. To use this export processor, see the GitHub repository awslabs/amazon-qldb-export-processor-java.