Specify agent configuration settings - Amazon Data Firehose

Specify agent configuration settings

The agent supports two mandatory configuration settings, filePattern and deliveryStream, plus optional configuration settings for additional features. You can specify both mandatory and optional configuration settings in /etc/aws-kinesis/agent.json.
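As a minimal sketch, a configuration file containing only the two mandatory settings might look like the following. The enclosing flows array follows the layout of the agent's sample configuration and is not described in this section, so treat it as an assumption. The file path and stream name are placeholders. JSON does not allow comments, so these notes appear here rather than in the file itself.

```json
{
  "flows": [
    {
      "filePattern": "/tmp/app.log*",
      "deliveryStream": "yourdeliverystream"
    }
  ]
}
```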

Whenever you change the configuration file, you must stop and start the agent, using the following commands:

sudo service aws-kinesis-agent stop
sudo service aws-kinesis-agent start

Alternatively, you can restart the agent with a single command:

sudo service aws-kinesis-agent restart

The following are the general configuration settings.

Configuration Setting Description
assumeRoleARN

The Amazon Resource Name (ARN) of the role to be assumed by the user. For more information, see Delegate Access Across AWS Accounts Using IAM Roles in the IAM User Guide.

assumeRoleExternalId

An optional identifier that determines who can assume the role. For more information, see How to Use an External ID in the IAM User Guide.

awsAccessKeyId

AWS access key ID that overrides the default credentials. This setting takes precedence over all other credential providers.

awsSecretAccessKey

AWS secret key that overrides the default credentials. This setting takes precedence over all other credential providers.

cloudwatch.emitMetrics

If set to true, the agent emits metrics to CloudWatch.

Default: true

cloudwatch.endpoint

The regional endpoint for CloudWatch.

Default: monitoring.us-east-1.amazonaws.com

firehose.endpoint

The regional endpoint for Amazon Data Firehose.

Default: firehose.us-east-1.amazonaws.com

sts.endpoint

The regional endpoint for the AWS Security Token Service.

Default: https://sts.amazonaws.com

userDefinedCredentialsProvider.classname

If you define a custom credentials provider, provide its fully-qualified class name using this setting. Don't include .class at the end of the class name.

userDefinedCredentialsProvider.location

If you define a custom credentials provider, use this setting to specify the absolute path of the jar that contains the custom credentials provider. The agent also looks for the jar file in the following location: /usr/share/aws-kinesis-agent/lib/.
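The general settings above could be combined as in the following sketch, which points the agent at a Region other than the default and assumes a role. It assumes that general settings sit at the top level of agent.json, next to a flows array holding the per-file settings. The account ID, role name, external ID, file path, and stream name are all hypothetical placeholders.

```json
{
  "cloudwatch.emitMetrics": true,
  "cloudwatch.endpoint": "monitoring.us-west-2.amazonaws.com",
  "firehose.endpoint": "firehose.us-west-2.amazonaws.com",
  "assumeRoleARN": "arn:aws:iam::111122223333:role/ExampleAgentRole",
  "assumeRoleExternalId": "example-external-id",
  "flows": [
    {
      "filePattern": "/tmp/app.log*",
      "deliveryStream": "yourdeliverystream"
    }
  ]
}
```

Keeping the CloudWatch and Firehose endpoints in the same Region is a choice made for this example, not a requirement stated in this section.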

The following are the flow configuration settings.

Configuration Setting Description
aggregatedRecordSizeBytes

Specify this setting to make the agent aggregate records and then put them to the Firehose stream in one operation. Set it to the size, in bytes, that the aggregate record should reach before the agent puts it to the Firehose stream.

Default: 0 (no aggregation)

dataProcessingOptions

The list of processing options applied to each parsed record before it is sent to the Firehose stream. The processing options are performed in the specified order. For more information, see Pre-process data with Agents.

deliveryStream

[Required] The name of the Firehose stream.

filePattern

[Required] A glob for the files that need to be monitored by the agent. Any file that matches this pattern is picked up by the agent automatically and monitored. For all files matching this pattern, grant read permission to aws-kinesis-agent-user. For the directory containing the files, grant read and execute permissions to aws-kinesis-agent-user.

Important

The agent picks up any file that matches this pattern. To ensure that the agent doesn't pick up unintended records, choose this pattern carefully.

initialPosition

The position in the file at which the agent starts parsing. Valid values are START_OF_FILE and END_OF_FILE.

Default: END_OF_FILE

maxBufferAgeMillis

The maximum time, in milliseconds, for which the agent buffers data before sending it to the Firehose stream.

Value range: 1,000–900,000 (1 second to 15 minutes)

Default: 60,000 (1 minute)

maxBufferSizeBytes

The maximum amount of data, in bytes, that the agent buffers before sending it to the Firehose stream.

Value range: 1–4,194,304 (4 MB)

Default: 4,194,304 (4 MB)

maxBufferSizeRecords

The maximum number of records that the agent buffers before sending them to the Firehose stream.

Value range: 1–500

Default: 500

minTimeBetweenFilePollsMillis

The time interval, in milliseconds, at which the agent polls and parses the monitored files for new data.

Value range: 1 or more

Default: 100

multiLineStartPattern

The pattern for identifying the start of a record. A record is made of a line that matches the pattern and any following lines that don't match the pattern. The valid values are regular expressions. By default, each new line in the log files is parsed as one record.

skipHeaderLines

The number of lines at the beginning of each monitored file that the agent skips when parsing.

Value range: 0 or more

Default: 0 (zero)

truncatedRecordTerminator

The string that the agent uses to truncate a parsed record when the record size exceeds the Amazon Data Firehose record size limit (1,000 KB).

Default: '\n' (newline)
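To show how several flow settings fit together, the following sketch configures one flow that reads each file from the beginning, skips a header line, sends buffered data sooner than the defaults, and treats any line beginning with a date such as 2024-01-31 as the start of a new record. The enclosing flows array, the file path, and the stream name are assumptions or placeholders. The SINGLELINE processing option comes from the agent's pre-processing documentation rather than from this section. The regular expression is illustrative, and this section does not confirm how the agent anchors the pattern against a line.

```json
{
  "flows": [
    {
      "filePattern": "/var/log/example-app/*.log",
      "deliveryStream": "example-stream",
      "initialPosition": "START_OF_FILE",
      "skipHeaderLines": 1,
      "maxBufferAgeMillis": 30000,
      "maxBufferSizeRecords": 250,
      "multiLineStartPattern": "^\\d{4}-\\d{2}-\\d{2}",
      "dataProcessingOptions": [
        {
          "optionName": "SINGLELINE"
        }
      ]
    }
  ]
}
```

The buffer values stay inside the documented ranges (1,000 to 900,000 milliseconds and 1 to 500 records). Backslashes in the pattern are doubled because the regular expression is embedded in a JSON string.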