Journal export output in QLDB
Important
End of support notice: Existing customers will be able to use Amazon QLDB until end of support on 07/31/2025. For more details, see Migrate an Amazon QLDB Ledger to Amazon Aurora PostgreSQL.
An Amazon QLDB journal export job writes two manifest files in addition to the data objects that contain your journal blocks. These are all saved in the Amazon S3 bucket that you provided in your export request. The following sections describe the format and contents of each output object.
Note
If you specify JSON as the output format of your export job, QLDB down-converts the Amazon Ion journal data to JSON in your exported data objects. For more information, see Down-converting to JSON.
Manifest files
Amazon QLDB creates two manifest files in the provided S3 bucket for each export request. The initial manifest file is created as soon as you submit the export request. The final manifest file is written after the export is complete. You can use these files to check the status of your export jobs in Amazon S3.
The format for the contents of the manifest files corresponds to the requested output format for the export.
Initial manifest
The initial manifest indicates that your export job has started. It contains the input parameters that you passed to the request. In addition to the Amazon S3 destination and the start and end time parameters for the export, this file also contains an exportId. The exportId is a unique ID that QLDB assigns to each export job.
The file-naming convention is as follows.
s3://amzn-s3-demo-bucket/prefix/exportId.started.manifest
The following is an example of an initial manifest file and its contents in Ion text format.
s3://amzn-s3-demo-bucket/journalExport/8UyXulxccYLAsbN1aon7e4.started.manifest
{
    ledgerName:"my-example-ledger",
    exportId:"8UyXulxccYLAsbN1aon7e4",
    inclusiveStartTime:2019-04-15T00:00:00.000Z,
    exclusiveEndTime:2019-04-15T22:00:00.000Z,
    bucket:"amzn-s3-demo-bucket",
    prefix:"journalExport",
    objectEncryptionType:"NO_ENCRYPTION",
    outputFormat:"ION_TEXT"
}
The initial manifest includes the outputFormat only if it was specified in the export request. If you don't specify the output format, the exported data defaults to ION_TEXT format.
The DescribeJournalS3Export API operation and the content type of the exported Amazon S3 objects also indicate the output format.
Final manifest
The final manifest indicates that your export job for a particular journal strand has completed. The export job writes a separate final manifest file for each strand.
Note
In Amazon QLDB, a strand is a partition of your ledger's journal. QLDB currently supports journals with a single strand only.
The final manifest includes an ordered list of data object keys that were written during the export. The file naming convention is as follows.
s3://amzn-s3-demo-bucket/prefix/exportId.strandId.completed.manifest
The strandId is a unique ID that QLDB assigns to the strand.
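The two manifest naming conventions can be sketched as a small helper. This is a minimal illustration, not part of any QLDB library: the function name manifest_keys is hypothetical, and it assumes the prefix is joined to the file name with a single slash, as the examples on this page show.

```python
def manifest_keys(prefix: str, export_id: str, strand_id: str) -> dict:
    """Build the S3 object keys of the initial and final manifest files.

    Assumption: the prefix and the file name are separated by one "/",
    matching the examples on this page.
    """
    base = prefix.rstrip("/")
    return {
        "initial": f"{base}/{export_id}.started.manifest",
        "final": f"{base}/{export_id}.{strand_id}.completed.manifest",
    }


keys = manifest_keys(
    "journalExport", "8UyXulxccYLAsbN1aon7e4", "JdxjkR9bSYB5jMHWcI464T"
)
print(keys["initial"])
print(keys["final"])
```

Checking whether an object exists at each of these keys, with the S3 client of your choice, is one way to tell whether an export job has started or completed.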
The following is an example of a final manifest file and its contents in Ion text format.
s3://amzn-s3-demo-bucket/journalExport/8UyXulxccYLAsbN1aon7e4.JdxjkR9bSYB5jMHWcI464T.completed.manifest
{
    keys:[
        "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.1-4.ion",
        "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.5-10.ion",
        "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.11-12.ion",
        "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.13-20.ion",
        "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.21-21.ion"
    ]
}
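As an illustration of how the ordered key list might be consumed, the following sketch turns a final manifest into full S3 URIs. It rests on two assumptions: the export used JSON output, so the manifest is JSON with the same keys field as the Ion example above, and the keys are relative to the export prefix, as the examples on this page suggest. The function name data_object_uris and the sample manifest are hypothetical.

```python
import json


def data_object_uris(manifest_text: str, bucket: str, prefix: str) -> list:
    """Return the S3 URIs of the data objects, in manifest order.

    Assumption: each key in the manifest is relative to the export prefix.
    """
    manifest = json.loads(manifest_text)
    base = prefix.rstrip("/")
    return [f"s3://{bucket}/{base}/{key}" for key in manifest["keys"]]


# Hypothetical JSON-format final manifest with two data objects.
sample_manifest = json.dumps({
    "keys": [
        "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.1-4.json",
        "2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.5-10.json",
    ]
})

uris = data_object_uris(sample_manifest, "amzn-s3-demo-bucket", "journalExport")
for uri in uris:
    print(uri)
```

An Ion-format manifest would need an Ion parser in place of json.loads.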
Data objects
Amazon QLDB writes journal data objects in the provided Amazon S3 bucket in either the text or binary representation of Amazon Ion format, or in JSON Lines text format.
In JSON Lines format, each block in an exported data object is a valid JSON object that is delimited by a newline. You can use this format to directly integrate JSON exports with analytics tools such as Amazon Athena and AWS Glue because these services can parse newline-delimited JSON automatically. For more information about the format, see JSON Lines.
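Because each line is a complete JSON object, a JSON Lines data object can be read one block at a time. The following is a minimal sketch using only the Python standard library. The function name read_blocks is hypothetical, and the two sample lines are abbreviated, made-up blocks rather than real export output.

```python
import json


def read_blocks(lines):
    """Yield one journal block (as a dict) per non-empty JSON Lines row."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Abbreviated, made-up blocks; a real block contains many more fields.
sample_lines = [
    '{"blockAddress": {"strandId": "JdxjkR9bSYB5jMHWcI464T", "sequenceNo": 1}}',
    '{"blockAddress": {"strandId": "JdxjkR9bSYB5jMHWcI464T", "sequenceNo": 2}}',
]

sequence_numbers = [
    block["blockAddress"]["sequenceNo"] for block in read_blocks(sample_lines)
]
print(sequence_numbers)
```

The same function accepts an open file object, because iterating over a text file yields its lines.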
Data object names
A journal export job writes these data objects with the following naming convention.
s3://amzn-s3-demo-bucket/prefix/yyyy/mm/dd/hh/strandId.startSn-endSn.ion|.json
- The output data of each export job is broken up into chunks.
- yyyy/mm/dd/hh – The date and time when you submitted the export request. Objects that are exported within the same hour are grouped under the same Amazon S3 prefix.
- strandId – The unique ID of the particular strand that contains the journal block that is being exported.
- startSn-endSn – The sequence number range that is included in the object. A sequence number specifies the location of a block within a strand.
For example, suppose that you specify the following path.
s3://amzn-s3-demo-bucket/journalExport/
Your export job creates an Amazon S3 data object that looks similar to the following. This example shows an object name in Ion format.
s3://amzn-s3-demo-bucket/journalExport/2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.1-5.ion
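The naming convention above can be taken apart with a regular expression. The following sketch is one possible way to do so. The pattern and the function name parse_data_object_name are illustrative assumptions derived from the convention shown here, not an official parser.

```python
import re

# Pattern derived from: prefix/yyyy/mm/dd/hh/strandId.startSn-endSn.ion|.json
DATA_OBJECT_PATTERN = re.compile(
    r"(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/(?P<hour>\d{2})/"
    r"(?P<strand_id>[^/.]+)\.(?P<start_sn>\d+)-(?P<end_sn>\d+)\.(?P<ext>ion|json)$"
)


def parse_data_object_name(name: str) -> dict:
    """Split a data object key or URI into its date, strand, and range parts."""
    match = DATA_OBJECT_PATTERN.search(name)
    if match is None:
        raise ValueError(f"not a journal export data object name: {name}")
    parts = match.groupdict()
    # Sequence numbers are integers; the date parts stay as zero-padded text.
    parts["start_sn"] = int(parts["start_sn"])
    parts["end_sn"] = int(parts["end_sn"])
    return parts


parsed = parse_data_object_name(
    "s3://amzn-s3-demo-bucket/journalExport/2019/04/15/22/JdxjkR9bSYB5jMHWcI464T.1-5.ion"
)
print(parsed["strand_id"], parsed["start_sn"], parsed["end_sn"], parsed["ext"])
```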
Data object contents
Each data object contains journal block objects with the following format.
{
    blockAddress: {
        strandId: String,
        sequenceNo: Int
    },
    transactionId: String,
    blockTimestamp: Datetime,
    blockHash: SHA256,
    entriesHash: SHA256,
    previousBlockHash: SHA256,
    entriesHashList: [ SHA256 ],
    transactionInfo: {
        statements: [
            {
                //PartiQL statement object
            }
        ],
        documents: {
            //document-table-statement mapping object
        }
    },
    revisions: [
        {
            //document revision object
        }
    ]
}
A block is an object that is committed to the journal during a transaction. A block contains transaction metadata along with entries that represent the document revisions that were committed in the transaction and the PartiQL statements that committed them.
The following is an example of a block with sample data in Ion text format. For information about the fields in a block object, see Journal contents in Amazon QLDB.
Note
This block example is provided for informational purposes only. The hashes shown aren't real calculated hash values.
{
    blockAddress:{
        strandId:"JdxjkR9bSYB5jMHWcI464T",
        sequenceNo:1234
    },
    transactionId:"D35qctdJRU1L1N2VhxbwSn",
    blockTimestamp:2019-10-25T17:20:21.009Z,
    blockHash:{{WYLOfZClk0lYWT3lUsSr0ONXh+Pw8MxxB+9zvTgSvlQ=}},
    entriesHash:{{xN9X96atkMvhvF3nEy6jMSVQzKjHJfz1H3bsNeg8GMA=}},
    previousBlockHash:{{IAfZ0h22ZjvcuHPSBCDy/6XNQTsqEmeY3GW0gBae8mg=}},
    entriesHashList:[
        {{F7rQIKCNn0vXVWPexilGfJn5+MCrtsSQqqVdlQxXpS4=}},
        {{C+L8gRhkzVcxt3qRJpw8w6hVEqA5A6ImGne+E7iHizo=}}
    ],
    transactionInfo:{
        statements:[
            {
                statement:"CREATE TABLE VehicleRegistration",
                startTime:2019-10-25T17:20:20.496Z,
                statementDigest:{{3jeSdejOgp6spJ8huZxDRUtp2fRXRqpOMtG43V0nXg8=}}
            },
            {
                statement:"CREATE INDEX ON VehicleRegistration (VIN)",
                startTime:2019-10-25T17:20:20.549Z,
                statementDigest:{{099D+5ZWDgA7r+aWeNUrWhc8ebBTXjgscq+mZ2dVibI=}}
            },
            {
                statement:"CREATE INDEX ON VehicleRegistration (LicensePlateNumber)",
                startTime:2019-10-25T17:20:20.560Z,
                statementDigest:{{B73tVJzVyVXicnH4n96NzU2L2JFY8e9Tjg895suWMew=}}
            },
            {
                statement:"INSERT INTO VehicleRegistration ?",
                startTime:2019-10-25T17:20:20.595Z,
                statementDigest:{{ggpon5qCXLo95K578YVhAD8ix0A0M5CcBx/W40Ey/Tk=}}
            }
        ],
        documents:{
            '8F0TPCmdNQ6JTRpiLj2TmW':{
                tableName:"VehicleRegistration",
                tableId:"BPxNiDQXCIB5l5F68KZoOz",
                statements:[3]
            }
        }
    },
    revisions:[
        {
            hash:{{FR1IWcWew0yw1TnRklo2YMF/qtwb7ohsu5FD8A4DSVg=}}
        },
        {
            blockAddress:{
                strandId:"JdxjkR9bSYB5jMHWcI464T",
                sequenceNo:1234
            },
            hash:{{t8Hj6/VC4SBitxnvBqJbOmrGytF2XAA/1c0AoSq2NQY=}},
            data:{
                VIN:"1N4AL11D75C109151",
                LicensePlateNumber:"LEWISR261LL",
                State:"WA",
                City:"Seattle",
                PendingPenaltyTicketAmount:90.25,
                ValidFromDate:2017-08-21,
                ValidToDate:2020-05-11,
                Owners:{
                    PrimaryOwner:{
                        PersonId:"GddsXfIYfDlKCEprOLOwYt"
                    },
                    SecondaryOwners:[]
                }
            },
            metadata:{
                id:"8F0TPCmdNQ6JTRpiLj2TmW",
                version:0,
                txTime:2019-10-25T17:20:20.618Z,
                txId:"D35qctdJRU1L1N2VhxbwSn"
            }
        }
    ]
}
In the revisions field, some revision objects might only contain a hash value and no other attributes. These are internal-only system revisions that don't contain user data. An export job includes these revisions in their respective blocks because the hashes of these revisions are part of the journal's full hash chain. The full hash chain is required for cryptographic verification.
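When processing blocks for analytics rather than verification, you might want to skip the hash-only system revisions. The following sketch assumes a block that has already been loaded into a Python dict (for example, from a JSON export). The function name user_revisions and the abbreviated sample block are hypothetical.

```python
def user_revisions(block: dict) -> list:
    """Return the revisions that carry more than just a hash value."""
    return [
        revision
        for revision in block.get("revisions", [])
        if set(revision) != {"hash"}  # hash-only entries are system revisions
    ]


# Abbreviated sample block modeled on the example above.
sample_block = {
    "revisions": [
        {"hash": "FR1IWcWew0yw1TnRklo2YMF/qtwb7ohsu5FD8A4DSVg="},
        {
            "hash": "t8Hj6/VC4SBitxnvBqJbOmrGytF2XAA/1c0AoSq2NQY=",
            "data": {"VIN": "1N4AL11D75C109151"},
            "metadata": {"id": "8F0TPCmdNQ6JTRpiLj2TmW", "version": 0},
        },
    ]
}

kept = user_revisions(sample_block)
print(len(kept), kept[0]["metadata"]["id"])
```

Keep the unfiltered blocks if you intend to verify the hash chain, because the system revision hashes are part of it.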
Down-converting to JSON
If you specify JSON as the output format of your export job, QLDB down-converts the Amazon Ion journal data to JSON in your exported data objects. However, converting Ion to JSON is lossy in certain cases where your data uses the rich Ion types that don't exist in JSON.
For details about Ion to JSON conversion rules, see Down-converting to JSON.
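As one illustration of what down-conversion means for a consumer, the sketch below recovers the raw bytes of a hash from a JSON export. It assumes that Ion blob values such as blockHash appear in the JSON output as Base64-encoded strings, which is how the Ion down-conversion rules describe blobs. Verify this against your own export before relying on it. The sample line is abbreviated and reuses the illustrative hash from the block example above.

```python
import base64
import json

# Abbreviated sample line; assumes the blob was written as a Base64 string.
sample_line = '{"blockHash": "WYLOfZClk0lYWT3lUsSr0ONXh+Pw8MxxB+9zvTgSvlQ="}'

block = json.loads(sample_line)

# In JSON the value is only a string; the Ion blob type is no longer recorded,
# so the consumer has to know that this field holds Base64-encoded bytes.
hash_bytes = base64.b64decode(block["blockHash"])
print(type(block["blockHash"]).__name__, len(hash_bytes))
```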
Export processor library (Java)
QLDB provides an extensible framework for Java that streamlines the processing of exports in Amazon S3. This framework library handles the work of reading an export's output and iterating through the exported blocks in sequential order. To use this export processor, see the GitHub repository awslabs/amazon-qldb-export-processor-java.