AWS Encryption SDK Message Format Reference

The information on this page is a reference for building your own encryption library that is compatible with the AWS Encryption SDK. If you are not building your own compatible encryption library, you likely do not need this information.

To use the AWS Encryption SDK in one of the supported programming languages, see Programming Languages.

The encryption operations in the AWS Encryption SDK return a single data structure or message that contains the encrypted data (ciphertext) and all encrypted data keys. To understand this data structure, or to build libraries that read and write it, you need to understand the message format.

The message format consists of at least two parts: a header and a body. In some cases, the message format consists of a third part, a footer. The message format defines an ordered sequence of bytes in network byte order, also called big-endian format. The message format begins with the header, followed by the body, followed by the footer (when there is one).

Header Structure

The message header contains the encrypted data keys and information about how the message body is formed. The following table describes the fields that form the header. The bytes are appended in the order shown.

Header Structure

Field                     Length, in bytes
Version                   1
Type                      1
Algorithm ID              2
Message ID                16
AAD Length                2
AAD                       Variable. Equal to the value specified in the previous 2 bytes (AAD Length).
Encrypted Data Key Count  2
Encrypted Data Key(s)     Variable. Determined by the number of encrypted data keys and the length of each.
Content Type              1
Reserved                  4
IV Length                 1
Frame Length              4
Header Authentication     Variable. Determined by the algorithm that generated the message.

Version

The version of this message format. The current version is 1.0, encoded as the byte 01 in hexadecimal notation.

Type

The type of this message format. The type indicates the kind of structure. The only supported type is described as customer authenticated encrypted data. Its type value is 128, encoded as byte 80 in hexadecimal notation.

Algorithm ID

An identifier for the algorithm used. It is a 2-byte value interpreted as a 16-bit unsigned integer. For more information about the algorithms, see AWS Encryption SDK Algorithms Reference.

Message ID

A randomly generated 128-bit value that identifies the message. The Message ID:

  • Uniquely identifies the encrypted message.

  • Weakly binds the message header to the message body.

  • Provides a mechanism to securely reuse a data key with multiple encrypted messages.

  • Protects against accidental reuse of a data key or the wearing out of keys in the AWS Encryption SDK.
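The first four header fields have fixed lengths, so they can be read in one step. The following is a minimal sketch, not the SDK's own code. The names `HeaderPrefix` and `parse_header_prefix` are hypothetical, and the function assumes that `data` begins at the first byte of the message.

```python
import struct
from typing import NamedTuple


class HeaderPrefix(NamedTuple):
    version: int
    message_type: int
    algorithm_id: int
    message_id: bytes


# ">" selects big-endian (network byte order) with no padding.
# B = 1-byte unsigned, H = 2-byte unsigned, 16s = 16 raw bytes.
_PREFIX = struct.Struct(">BBH16s")


def parse_header_prefix(data: bytes) -> HeaderPrefix:
    """Read Version, Type, Algorithm ID, and Message ID (20 bytes total)."""
    if len(data) < _PREFIX.size:
        raise ValueError("header is shorter than the 20-byte fixed prefix")
    version, message_type, algorithm_id, message_id = _PREFIX.unpack_from(data, 0)
    if version != 0x01:
        raise ValueError(f"unsupported version byte: {version:#04x}")
    if message_type != 0x80:
        raise ValueError(f"unsupported type byte: {message_type:#04x}")
    return HeaderPrefix(version, message_type, algorithm_id, message_id)
```

The AAD Length field starts immediately after these 20 bytes.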

AAD Length

The length of the additional authenticated data (AAD). It is a 2-byte value interpreted as a 16-bit unsigned integer that specifies the number of bytes that contain the AAD.

AAD

The additional authenticated data. The AAD is an encoding of the encryption context, an array of key-value pairs where each key and value is a string of UTF-8 encoded characters. The encryption context is converted to a sequence of bytes and used for the AAD value.

When the algorithms with signing are used, the encryption context must contain the key-value pair {'aws-crypto-public-key', Qtxt}. Qtxt represents the elliptic curve point Q compressed according to SEC 1 version 2.0 and then base64-encoded. The encryption context can contain additional values, but the maximum length of the constructed AAD is 2^16 - 1 bytes.

The following table describes the fields that form the AAD. Key-value pairs are sorted, by key, in ascending order according to UTF-8 character code. The bytes are appended in the order shown.

AAD Structure

Field                 Length, in bytes
Key-Value Pair Count  2
Key Length            2
Key                   Variable. Equal to the value specified in the previous 2 bytes (Key Length).
Value Length          2
Value                 Variable. Equal to the value specified in the previous 2 bytes (Value Length).

Key-Value Pair Count

The number of key-value pairs in the AAD. It is a 2-byte value interpreted as a 16-bit unsigned integer, so the maximum number of key-value pairs in the AAD is 2^16 - 1.

Key Length

The length of the key for the key-value pair. It is a 2-byte value interpreted as a 16-bit unsigned integer that specifies the number of bytes that contain the key.

Key

The key for the key-value pair. It is a sequence of UTF-8 encoded bytes.

Value Length

The length of the value for the key-value pair. It is a 2-byte value interpreted as a 16-bit unsigned integer that specifies the number of bytes that contain the value.

Value

The value for the key-value pair. It is a sequence of UTF-8 encoded bytes.
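The AAD layout described above can be sketched as a small serializer. This is an illustration only: the function name `serialize_aad` is hypothetical, and the sketch reads "sorted according to UTF-8 character code" as sorting on the UTF-8 encoded bytes of each key. How an empty encryption context is encoded is not covered by the tables above; this sketch writes a pair count of 0 in that case, which is an assumption.

```python
import struct
from typing import Mapping


def serialize_aad(encryption_context: Mapping[str, str]) -> bytes:
    """Encode key-value pairs as: count, then (key length, key, value length, value) per pair."""
    # Encode first, then sort, so that ordering follows the UTF-8 byte values of the keys.
    pairs = sorted(
        (key.encode("utf-8"), value.encode("utf-8"))
        for key, value in encryption_context.items()
    )
    out = bytearray(struct.pack(">H", len(pairs)))  # Key-Value Pair Count
    for key, value in pairs:
        out += struct.pack(">H", len(key)) + key      # Key Length, Key
        out += struct.pack(">H", len(value)) + value  # Value Length, Value
    if len(out) > 0xFFFF:
        raise ValueError("constructed AAD exceeds 2^16 - 1 bytes")
    return bytes(out)
```

The length of the returned bytes is the value that goes in the AAD Length field of the header.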

Encrypted Data Key Count

The number of encrypted data keys. It is a 2-byte value interpreted as a 16-bit unsigned integer.

Encrypted Data Key(s)

A sequence of encrypted data keys. The length of the sequence is determined by the number of encrypted data keys and the length of each. The sequence contains at least one encrypted data key.

The following table describes the fields that form each encrypted data key. The bytes are appended in the order shown.

Encrypted Data Key Structure

Field                            Length, in bytes
Key Provider ID Length           2
Key Provider ID                  Variable. Equal to the value specified in the previous 2 bytes (Key Provider ID Length).
Key Provider Information Length  2
Key Provider Information         Variable. Equal to the value specified in the previous 2 bytes (Key Provider Information Length).
Encrypted Data Key Length        2
Encrypted Data Key               Variable. Equal to the value specified in the previous 2 bytes (Encrypted Data Key Length).

Key Provider ID Length

The length of the key provider identifier. It is a 2-byte value interpreted as a 16-bit unsigned integer that specifies the number of bytes that contain the key provider ID.

Key Provider ID

The key provider identifier. It is used to indicate the provider of the encrypted data key and intended to be extensible.

Key Provider Information Length

The length of the key provider information. It is a 2-byte value interpreted as a 16-bit unsigned integer that specifies the number of bytes that contain the key provider information.

Key Provider Information

The key provider information. It is determined by the key provider.

When AWS KMS is the key provider, the following are true:

  • This value contains the Amazon Resource Name (ARN) of the AWS KMS customer master key (CMK).

  • This value is always the full CMK ARN, regardless of which key identifier (key ID, alias, etc.) was specified when calling the master key provider.

Encrypted Data Key Length

The length of the encrypted data key. It is a 2-byte value interpreted as a 16-bit unsigned integer that specifies the number of bytes that contain the encrypted data key.

Encrypted Data Key

The encrypted data key. It is the data encryption key encrypted by the key provider.
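Each encrypted data key is three length-prefixed fields in a row, so a reader can reuse one helper for all of them. The following is a rough sketch under that reading. The names `EncryptedDataKey`, `_read_prefixed`, and `parse_encrypted_data_keys` are hypothetical, and `offset` is assumed to point at the Encrypted Data Key Count field.

```python
import struct
from typing import List, NamedTuple, Tuple


class EncryptedDataKey(NamedTuple):
    provider_id: bytes
    provider_info: bytes
    ciphertext: bytes


def _read_prefixed(data: bytes, offset: int) -> Tuple[bytes, int]:
    """Read a 2-byte big-endian length, then that many bytes. Returns (field, next offset)."""
    if offset + 2 > len(data):
        raise ValueError("truncated length field")
    (length,) = struct.unpack_from(">H", data, offset)
    start, end = offset + 2, offset + 2 + length
    if end > len(data):
        raise ValueError("truncated variable-length field")
    return data[start:end], end


def parse_encrypted_data_keys(
    data: bytes, offset: int = 0
) -> Tuple[List[EncryptedDataKey], int]:
    """Read the key count and every encrypted data key that follows it."""
    if offset + 2 > len(data):
        raise ValueError("truncated encrypted data key count")
    (count,) = struct.unpack_from(">H", data, offset)
    if count < 1:
        raise ValueError("the sequence must contain at least one encrypted data key")
    offset += 2
    keys = []
    for _ in range(count):
        provider_id, offset = _read_prefixed(data, offset)
        provider_info, offset = _read_prefixed(data, offset)
        ciphertext, offset = _read_prefixed(data, offset)
        keys.append(EncryptedDataKey(provider_id, provider_info, ciphertext))
    return keys, offset
```

The returned offset points at the Content Type byte, which is the next header field.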

Content Type

The type of encrypted content, either non-framed or framed.

Non-framed content is not broken into parts; it is a single encrypted blob. Non-framed content is type 1, encoded as the byte 01 in hexadecimal notation.

Framed content is broken into equal-length parts; each part is encrypted separately. Framed content is type 2, encoded as the byte 02 in hexadecimal notation.

Reserved

A reserved sequence of 4 bytes. This value must be 0. It is encoded as the bytes 00 00 00 00 in hexadecimal notation (that is, a 4-byte sequence of a 32-bit integer value equal to 0).

IV Length

The length of the initialization vector (IV). It is a 1-byte value interpreted as an 8-bit unsigned integer that specifies the number of bytes that contain the IV. This value is determined by the IV bytes value of the algorithm that generated the message.

Frame Length

The length of each frame of framed content. It is a 4-byte value interpreted as a 32-bit unsigned integer that specifies the number of bytes that form each frame. When the content is non-framed—that is, when the value of the content type field is 1—this value must be 0.
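The four fields from Content Type through Frame Length occupy 10 fixed bytes and carry the constraints stated above. A minimal sketch of reading and checking them follows. The names `BodyDescriptor` and `parse_body_descriptor` are hypothetical, and `offset` is assumed to point at the Content Type byte.

```python
import struct
from typing import NamedTuple

NON_FRAMED = 0x01
FRAMED = 0x02


class BodyDescriptor(NamedTuple):
    content_type: int
    iv_len: int
    frame_len: int


# Content Type (1), Reserved (4), IV Length (1), Frame Length (4), all big-endian.
_FIELDS = struct.Struct(">B4sBI")


def parse_body_descriptor(data: bytes, offset: int = 0) -> BodyDescriptor:
    if len(data) - offset < _FIELDS.size:
        raise ValueError("truncated header: expected 10 bytes")
    content_type, reserved, iv_len, frame_len = _FIELDS.unpack_from(data, offset)
    if content_type not in (NON_FRAMED, FRAMED):
        raise ValueError(f"unknown content type: {content_type:#04x}")
    if reserved != b"\x00\x00\x00\x00":
        raise ValueError("reserved bytes must be 00 00 00 00")
    if content_type == NON_FRAMED and frame_len != 0:
        raise ValueError("frame length must be 0 for non-framed content")
    return BodyDescriptor(content_type, iv_len, frame_len)
```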

Header Authentication

The header authentication is determined by the algorithm that generated the message. The header authentication is calculated over the entire header up to, but not including, the header authentication structure. It consists of an IV and an authentication tag. The bytes are appended in the order shown.

Header Authentication Structure

Field               Length, in bytes
IV                  Variable. Determined by the IV bytes value of the algorithm that generated the message.
Authentication Tag  Variable. Determined by the authentication tag bytes value of the algorithm that generated the message.

IV

The initialization vector (IV) used to calculate the header authentication tag. It is a unique value generated only for this use.

Authentication Tag

The authentication value for the header. It is used to authenticate the header fields up to, but not including, the header authentication structure.
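Because the header authentication structure is the last part of the header, a reader that already holds the complete header bytes can separate the authenticated fields from the IV and tag by length alone. The sketch below assumes that `iv_len` and `tag_len` have been looked up from the algorithm identified by the Algorithm ID; that lookup is not shown, and the names `HeaderAuth` and `split_header_auth` are hypothetical.

```python
from typing import NamedTuple


class HeaderAuth(NamedTuple):
    authenticated_bytes: bytes  # every header byte before the authentication structure
    iv: bytes
    tag: bytes


def split_header_auth(header: bytes, iv_len: int, tag_len: int) -> HeaderAuth:
    """Split a complete header into its authenticated fields, IV, and authentication tag."""
    if iv_len < 0 or tag_len < 0:
        raise ValueError("lengths must be non-negative")
    split = len(header) - (iv_len + tag_len)
    if split < 0:
        raise ValueError("header is shorter than its authentication structure")
    return HeaderAuth(
        authenticated_bytes=header[:split],
        iv=header[split:split + iv_len],
        tag=header[split + iv_len:],
    )
```

A verifier would recompute the tag over `authenticated_bytes` with the given IV and compare it to `tag`; the cryptographic step itself depends on the algorithm and is outside this sketch.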

Body Structure

The message body contains the encrypted data, called the ciphertext. The structure of the body depends on the content type (non-framed or framed). The following sections describe the format of the message body for each content type.

Non-Framed Data

Non-framed data is encrypted in a single blob with a unique IV and body AAD. The following table describes the fields that form non-framed data. The bytes are appended in the order shown.

Non-Framed Body Structure

Field                     Length, in bytes
IV                        Variable. Equal to the value specified in the IV Length byte of the header.
Encrypted Content Length  8
Encrypted Content         Variable. Equal to the value specified in the previous 8 bytes (Encrypted Content Length).
Authentication Tag        Variable. Determined by the algorithm implementation used.

IV

The initialization vector (IV) to use with the encryption algorithm.

Encrypted Content Length

The length of the encrypted content, or ciphertext. It is an 8-byte value interpreted as a 64-bit unsigned integer that specifies the number of bytes that contain the encrypted content.

Technically, the maximum allowed value is 2^63 - 1, or 8 exbibytes (8 EiB). However, in practice the maximum value is 2^36 - 32, or 64 gibibytes (64 GiB), due to restrictions imposed by the implemented algorithms.

Note

The Java implementation of this SDK further restricts this value to 2^31 - 1, or 2 gibibytes (2 GiB), due to restrictions in the language.

Encrypted Content

The encrypted content (ciphertext) as returned by the encryption algorithm.

Authentication Tag

The authentication value for the body. It is used to authenticate the body fields up to, but not including, the authentication tag itself.
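A non-framed body can be read with one 8-byte length lookup. The following is an illustrative sketch only. The names `NonFramedBody` and `parse_non_framed_body` are hypothetical, `iv_len` is assumed to come from the IV Length byte of the header, and `tag_len` is assumed to come from the algorithm in use. The sketch applies the practical 2^36 - 32 limit described above.

```python
import struct
from typing import NamedTuple, Tuple

MAX_NON_FRAMED_LEN = 2**36 - 32  # practical limit on Encrypted Content Length


class NonFramedBody(NamedTuple):
    iv: bytes
    ciphertext: bytes
    tag: bytes


def parse_non_framed_body(
    body: bytes, iv_len: int, tag_len: int
) -> Tuple[NonFramedBody, int]:
    """Return the parsed body and the offset of the first byte after it."""
    if len(body) < iv_len + 8:
        raise ValueError("truncated body: missing IV or content length")
    iv = body[:iv_len]
    # Q = 8-byte unsigned integer, big-endian.
    (content_len,) = struct.unpack_from(">Q", body, iv_len)
    if content_len > MAX_NON_FRAMED_LEN:
        raise ValueError("encrypted content length exceeds 2^36 - 32 bytes")
    start = iv_len + 8
    end = start + content_len
    if len(body) < end + tag_len:
        raise ValueError("truncated body: missing content or authentication tag")
    parsed = NonFramedBody(iv, body[start:end], body[end:end + tag_len])
    return parsed, end + tag_len
```

When the message has a footer, it begins at the returned offset.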

Framed Data

Framed data is divided into equal-length parts, except for the last part. Each frame is encrypted separately with a unique IV and body AAD.

There are two kinds of frames: regular and final. A final frame is always used, even when the content fits into a single regular frame. In that case, the final frame contains no data—that is, a content length of 0.

The following tables describe the fields that form the frames. The bytes are appended in the order shown.

Framed Body Structure, Regular Frame

Field               Length, in bytes
Sequence Number     4
IV                  Variable. Equal to the value specified in the IV Length byte of the header.
Encrypted Content   Variable. Equal to the value specified in the Frame Length of the header.
Authentication Tag  Variable. Determined by the algorithm used, as specified in the Algorithm ID of the header.

Sequence Number

The frame sequence number. It is an incremental counter number for the frame, encoded as a 4-byte value interpreted as a 32-bit unsigned integer.

Framed data must start at sequence number 1. Subsequent frames must be in order and must contain an increment of 1 of the previous frame. Otherwise, the decryption process stops and reports an error.

IV

The initialization vector (IV) for the frame. The IV is a randomly generated value of length specified by the algorithm used.

Encrypted Content

The encrypted content (ciphertext) for the frame, as returned by the encryption algorithm.

Authentication Tag

The authentication value for the frame. It is used to authenticate the frame fields up to, but not including, the authentication tag itself.

Framed Body Structure, Final Frame

Field                     Length, in bytes
Sequence Number End       4
Sequence Number           4
IV                        Variable. Equal to the value specified in the IV Length byte of the header.
Encrypted Content Length  4
Encrypted Content         Variable. Equal to the value specified in the previous 4 bytes (Encrypted Content Length).
Authentication Tag        Variable. Determined by the algorithm used, as specified in the Algorithm ID of the header.

Sequence Number End

An indicator for the final frame. The value is encoded as the 4 bytes FF FF FF FF in hexadecimal notation.

Sequence Number

The frame sequence number. It is an incremental counter number for the frame, encoded as a 4-byte value interpreted as a 32-bit unsigned integer.

Framed data must start at sequence number 1. Subsequent frames must be in order and must contain an increment of 1 of the previous frame. Otherwise, the decryption process stops and reports an error.

IV

The initialization vector (IV) for the frame. The IV is a randomly generated value of length specified by the algorithm used.

Encrypted Content Length

The length of the encrypted content. It is a 4-byte value interpreted as a 32-bit unsigned integer that specifies the number of bytes that contain the encrypted content for the frame.

Encrypted Content

The encrypted content (ciphertext) for the frame, as returned by the encryption algorithm.

Authentication Tag

The authentication value for the frame. It is used to authenticate the frame fields up to, but not including, the authentication tag itself.
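The two frame layouts can be told apart by their first 4 bytes: FF FF FF FF marks a final frame, and anything else is the sequence number of a regular frame. The generator below is a sketch of that reading, not the SDK's implementation. The names `Frame` and `iter_frames` are hypothetical; `iv_len` and `frame_len` are assumed to come from the header, and `tag_len` from the algorithm in use.

```python
import struct
from typing import Iterator, NamedTuple

FINAL_FRAME_MARKER = 0xFFFFFFFF  # Sequence Number End


class Frame(NamedTuple):
    sequence_number: int
    iv: bytes
    ciphertext: bytes
    tag: bytes
    is_final: bool


def _u32(buf: bytes, offset: int) -> int:
    """Read a 4-byte big-endian unsigned integer, failing cleanly on truncation."""
    if offset + 4 > len(buf):
        raise ValueError("truncated 4-byte field")
    return struct.unpack_from(">I", buf, offset)[0]


def iter_frames(body: bytes, iv_len: int, frame_len: int, tag_len: int) -> Iterator[Frame]:
    offset, expected = 0, 1  # framed data must start at sequence number 1
    while True:
        word = _u32(body, offset)
        offset += 4
        is_final = word == FINAL_FRAME_MARKER
        if is_final:
            sequence_number = _u32(body, offset)  # real sequence number follows the marker
            offset += 4
        else:
            sequence_number = word
        if sequence_number != expected:
            raise ValueError(f"expected frame {expected}, found {sequence_number}")
        iv = body[offset:offset + iv_len]
        offset += iv_len
        if is_final:
            content_len = _u32(body, offset)  # only final frames carry an explicit length
            offset += 4
        else:
            content_len = frame_len
        ciphertext = body[offset:offset + content_len]
        offset += content_len
        tag = body[offset:offset + tag_len]
        offset += tag_len
        if offset > len(body):
            raise ValueError("truncated frame")
        yield Frame(sequence_number, iv, ciphertext, tag, is_final)
        if is_final:
            return
        expected += 1
```

Iteration stops after the final frame, so any bytes that follow it (such as a footer) are left for the caller to handle.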

Footer Structure

When the algorithms with signing are used, the message format contains a footer. The message footer contains a signature calculated over the message header and body. The following table describes the fields that form the footer. The bytes are appended in the order shown.

Footer Structure

Field             Length, in bytes
Signature Length  2
Signature         Variable. Equal to the value specified in the previous 2 bytes (Signature Length).

Signature Length

The length of the signature. It is a 2-byte value interpreted as a 16-bit unsigned integer that specifies the number of bytes that contain the signature.

Signature

The signature. It is used to authenticate the header and body of the message.
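Reading the footer is a single length-prefixed field. A brief sketch follows, with the hypothetical name `parse_footer`. It assumes that `footer` holds every byte after the message body and, as a strictness choice of this sketch rather than a stated rule, rejects trailing bytes after the signature.

```python
import struct


def parse_footer(footer: bytes) -> bytes:
    """Return the signature bytes from a message footer."""
    if len(footer) < 2:
        raise ValueError("truncated footer: missing signature length")
    (signature_len,) = struct.unpack_from(">H", footer, 0)
    # Assumption: the footer is the last part of the message, so nothing may follow it.
    if len(footer) != 2 + signature_len:
        raise ValueError("footer length does not match the signature length field")
    return footer[2:]
```

Verifying the signature requires the public key carried in the encryption context and is not shown here.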