Understand custom prefixes for Amazon S3 objects
Objects delivered to Amazon S3 follow the name format of <evaluated prefix><suffix>.
You can specify a custom prefix that includes expressions that are evaluated at runtime. A custom prefix that you specify overrides the default prefix of YYYY/MM/dd/HH.
You can use expressions of the following form in your custom prefix: !{namespace:value}, where namespace can be one of the following, as explained in the following sections.
- firehose
- timestamp
- partitionKeyFromQuery
- partitionKeyFromLambda
If a prefix ends with a slash, it appears as a folder in the Amazon S3 bucket. For more information, see Amazon S3 Object Name Format in the Amazon Data Firehose Developer Guide.
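The expression form described above can be illustrated with a short Python sketch that pulls the namespace and value out of each expression in a prefix. This is not Firehose's own parser: the regular expression and the helper name find_expressions are assumptions made for illustration, and the sketch assumes a value never contains a closing brace.

```python
import re

# Assumed shape of one !{namespace:value} expression (illustrative only).
EXPRESSION = re.compile(r"!\{(?P<namespace>[A-Za-z]+):(?P<value>[^}]*)\}")

# The four namespaces listed in this section.
KNOWN_NAMESPACES = {
    "firehose",
    "timestamp",
    "partitionKeyFromQuery",
    "partitionKeyFromLambda",
}


def find_expressions(prefix: str) -> list[tuple[str, str]]:
    """Return (namespace, value) pairs in order of appearance."""
    pairs = []
    for match in EXPRESSION.finditer(prefix):
        namespace = match.group("namespace")
        if namespace not in KNOWN_NAMESPACES:
            raise ValueError(f"unknown namespace: {namespace}")
        pairs.append((namespace, match.group("value")))
    return pairs


print(find_expressions(
    "myPrefix/result=!{firehose:error-output-type}/!{timestamp:yyyy/MM/dd}"
))
```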
timestamp namespace
Valid values for this namespace are strings that are valid Java DateTimeFormatter patterns. For example, in the year 2018, the expression !{timestamp:yyyy} evaluates to 2018.
When evaluating timestamps, Firehose uses the approximate arrival timestamp of the oldest record that's contained in the Amazon S3 object being written.
By default, the timestamp is in UTC, but you can specify a different time zone. For example, to use Japan Standard Time instead of UTC, set the time zone to Asia/Tokyo in the AWS Management Console or in the API parameter setting (CustomTimeZone). For the list of supported time zones, see Amazon S3 Object Name Format.
If you use the timestamp namespace more than once in the same prefix expression, every instance evaluates to the same instant in time.
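As a rough illustration of the behavior described in this section, the following Python sketch evaluates timestamp expressions against a single instant, so every occurrence in a prefix resolves to the same moment. It is only an approximation: it maps a handful of Java DateTimeFormatter letters (yyyy, MM, dd, HH, mm) to Python's strftime, ignores single-quoted literals, and uses a fixed UTC+9 offset as a stand-in for Asia/Tokyo. The function name evaluate_timestamps and the sample arrival time are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Illustrative subset of Java DateTimeFormatter letters mapped to strftime.
JAVA_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H", "mm": "%M"}
TOKEN = re.compile("|".join(sorted(JAVA_TO_STRFTIME, key=len, reverse=True)))
TIMESTAMP_EXPR = re.compile(r"!\{timestamp:([^}]*)\}")


def evaluate_timestamps(prefix, instant, tz=timezone.utc):
    """Replace every !{timestamp:...} using the same instant, shown in tz."""
    local = instant.astimezone(tz)

    def render(match):
        pattern = match.group(1)
        return TOKEN.sub(lambda t: local.strftime(JAVA_TO_STRFTIME[t.group(0)]), pattern)

    return TIMESTAMP_EXPR.sub(render, prefix)


# Hypothetical arrival timestamp of the oldest record in the object.
arrival = datetime(2018, 8, 27, 22, 30, tzinfo=timezone.utc)
jst = timezone(timedelta(hours=9))  # fixed offset standing in for Asia/Tokyo

template = "logs/!{timestamp:yyyy/MM/dd}/hour=!{timestamp:HH}/"
print(evaluate_timestamps(template, arrival))          # logs/2018/08/27/hour=22/
print(evaluate_timestamps(template, arrival, tz=jst))  # logs/2018/08/28/hour=07/
```

Passing one instant into the function, rather than reading the clock inside each substitution, is what keeps all occurrences consistent with one another.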
firehose namespace
There are two values that you can use with this namespace: error-output-type and random-string. The following table explains how to use them.
Conversion | Description | Example input | Example output | Notes |
---|---|---|---|---|
error-output-type | Evaluates to one of the following strings, depending on the configuration of your Firehose stream and the reason for the failure: {processing-failed, AmazonOpenSearchService-failed, splunk-failed, format-conversion-failed, http-endpoint-failed}. If you use it more than once in the same expression, every instance evaluates to the same error string. | myPrefix/result=!{firehose:error-output-type}/!{timestamp:yyyy/MM/dd} | myPrefix/result=processing-failed/2018/08/03 | The error-output-type value can only be used in the ErrorOutputPrefix field. |
random-string | Evaluates to a random string of 11 characters. If you use it more than once in the same expression, every instance evaluates to a new random string. | myPrefix/!{firehose:random-string}/ | myPrefix/046b6c7f-0b/ | You can use it with both prefix types. You can place it at the beginning of the format string to get a randomized prefix, which is sometimes necessary for attaining extremely high throughput with Amazon S3. |
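
The difference between the two values in the table can be sketched in Python: error-output-type resolves to the same string at every occurrence, while random-string is regenerated each time. How Firehose actually produces the random string is not stated here; taking the first 11 characters of a UUID is an assumption chosen only because it resembles the example output. The function name evaluate_firehose is hypothetical.

```python
import re
import uuid

FIREHOSE_EXPR = re.compile(r"!\{firehose:([^}]*)\}")


def evaluate_firehose(prefix, error_output_type=None):
    """Replace !{firehose:...} expressions in a prefix (illustrative only)."""

    def render(match):
        value = match.group(1)
        if value == "random-string":
            # Assumption: an 11-character slice of a UUID, new per occurrence.
            return str(uuid.uuid4())[:11]
        if value == "error-output-type":
            if error_output_type is None:
                raise ValueError("error-output-type needs a failure reason")
            # The same string is used for every occurrence.
            return error_output_type
        raise ValueError(f"unsupported firehose value: {value}")

    return FIREHOSE_EXPR.sub(render, prefix)


print(evaluate_firehose("myPrefix/!{firehose:random-string}/"))
print(evaluate_firehose(
    "myPrefix/result=!{firehose:error-output-type}/", "processing-failed"
))
```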
partitionKeyFromLambda and partitionKeyFromQuery namespaces
For dynamic partitioning, you must use the following expression format in your S3 bucket prefix: !{namespace:value}, where namespace can be either partitionKeyFromQuery or partitionKeyFromLambda, or both.
If you use inline parsing to create the partitioning keys for your source data, you must specify an S3 bucket prefix value that consists of expressions in the following format: "partitionKeyFromQuery:keyID". If you use an AWS Lambda function to create the partitioning keys for your source data, you must specify an S3 bucket prefix value that consists of expressions in the following format: "partitionKeyFromLambda:keyID". For more information, see "Choose Amazon S3 for Your Destination" in Creating an Amazon Firehose stream.
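To make the two partitioning namespaces concrete, the following Python sketch substitutes partition key values into a prefix. It models only the string substitution, not how Firehose extracts keys. The key IDs customer_id and region, their values, and the function name evaluate_partition_keys are all hypothetical.

```python
import re

PARTITION_EXPR = re.compile(
    r"!\{(partitionKeyFromQuery|partitionKeyFromLambda):([^}]*)\}"
)


def evaluate_partition_keys(prefix, query_keys, lambda_keys):
    """Fill in partition key expressions from two key/value mappings."""
    sources = {
        "partitionKeyFromQuery": query_keys,    # keys from inline parsing
        "partitionKeyFromLambda": lambda_keys,  # keys from a Lambda function
    }

    def render(match):
        namespace, key_id = match.groups()
        try:
            return sources[namespace][key_id]
        except KeyError:
            raise ValueError(f"no value for {namespace}:{key_id}") from None

    return PARTITION_EXPR.sub(render, prefix)


# Hypothetical key IDs and values for a single record.
result = evaluate_partition_keys(
    "customer=!{partitionKeyFromQuery:customer_id}/region=!{partitionKeyFromLambda:region}/",
    query_keys={"customer_id": "42"},
    lambda_keys={"region": "eu"},
)
print(result)  # customer=42/region=eu/
```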
Semantic rules
The following rules apply to Prefix and ErrorOutputPrefix expressions.
- For the timestamp namespace, any character that isn't in single quotes is evaluated. In other words, any string escaped with single quotes in the value field is taken literally.
- If you specify a prefix that doesn't contain a timestamp namespace expression, Firehose appends the expression !{timestamp:yyyy/MM/dd/HH/} to the value in the Prefix field.
- The sequence !{ can only appear in !{namespace:value} expressions.
- ErrorOutputPrefix can be null only if Prefix contains no expressions. In this case, Prefix evaluates to <specified-prefix>yyyy/MM/DDD/HH/ and ErrorOutputPrefix evaluates to <specified-prefix><error-output-type>YYYY/MM/DDD/HH/. DDD represents the day of the year.
- If you specify an expression for ErrorOutputPrefix, you must include at least one instance of !{firehose:error-output-type}.
- Prefix can't contain !{firehose:error-output-type}.
- Neither Prefix nor ErrorOutputPrefix can be greater than 512 characters after they're evaluated.
- If the destination is Amazon Redshift, Prefix must not contain expressions and ErrorOutputPrefix must be null.
- When the destination is Amazon OpenSearch Service or Splunk, and no ErrorOutputPrefix is specified, Firehose uses the Prefix field for failed records.
- When the destination is Amazon S3, the Prefix and ErrorOutputPrefix in the Amazon S3 destination configuration are used for successful records and failed records, respectively. If you use the AWS CLI or the API, you can use ExtendedS3DestinationConfiguration to specify an Amazon S3 backup configuration with its own Prefix and ErrorOutputPrefix.
- When you use the AWS Management Console and set the destination to Amazon S3, Firehose uses the Prefix and ErrorOutputPrefix in the destination configuration for successful records and failed records, respectively. If you specify a prefix using expressions, you must specify the error prefix including !{firehose:error-output-type}.
- When you use ExtendedS3DestinationConfiguration with the AWS CLI, the API, or AWS CloudFormation, if you specify an S3BackupConfiguration, Firehose doesn't provide a default ErrorOutputPrefix.
- You cannot use partitionKeyFromLambda and partitionKeyFromQuery namespaces when creating ErrorOutputPrefix expressions.
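Several of these rules can be checked mechanically before a stream is created. The following Python sketch is one reading of a subset of them: where !{ may appear, the null ErrorOutputPrefix rule, the error-output-type rules, and the ban on partition key namespaces in ErrorOutputPrefix. It does not cover the 512-character limit (which applies after evaluation) or the destination-specific rules, and it is not Firehose's own validation. The function name check_prefixes and the message strings are hypothetical.

```python
import re

EXPR = re.compile(r"!\{([A-Za-z]+):([^}]*)\}")
ERROR_TYPE = "!{firehose:error-output-type}"


def check_prefixes(prefix, error_output_prefix):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []

    # The sequence '!{' may only open a well-formed expression.
    for name, text in (("Prefix", prefix), ("ErrorOutputPrefix", error_output_prefix or "")):
        if text.count("!{") != len(EXPR.findall(text)):
            problems.append(f"{name}: '!{{' outside an expression")

    if ERROR_TYPE in prefix:
        problems.append("Prefix can't contain !{firehose:error-output-type}")

    if error_output_prefix is None:
        if EXPR.findall(prefix):
            problems.append("ErrorOutputPrefix can't be null when Prefix contains expressions")
    else:
        error_exprs = EXPR.findall(error_output_prefix)
        if error_exprs and ERROR_TYPE not in error_output_prefix:
            problems.append("ErrorOutputPrefix must include !{firehose:error-output-type}")
        if any(ns.startswith("partitionKeyFrom") for ns, _ in error_exprs):
            problems.append("ErrorOutputPrefix can't use partition key namespaces")

    return problems


print(check_prefixes("!{timestamp:yyyy/MM/dd}", None))
print(check_prefixes("logs/!{timestamp:yyyy}/", "errors/!{firehose:error-output-type}/"))
```

Returning a list of messages instead of raising on the first violation lets a caller report every problem in a configuration at once.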
Example prefixes
Input | Evaluated prefix (at 10:30 AM UTC on Aug 27, 2018) |
---|---|
|
|
|
Invalid input: ErrorOutputPrefix can't be null when Prefix contains expressions |
|
|
|
|
|
|