AWS Lambda Event Sources

An event source mapping is an AWS Lambda resource that reads from an event source and invokes a Lambda function. You can use event source mappings to process items from a stream or queue in services that don’t invoke Lambda functions directly. For the services that Lambda provides event source mappings for, and more about Lambda event sources in general, see the AWS Lambda Developer Guide.

This module includes classes that allow using various AWS services as event sources for AWS Lambda via the high-level lambda.addEventSource(source) API.

NOTE: In most cases, it is also possible to use the resource APIs to invoke an AWS Lambda function. This library provides a uniform API for all Lambda event sources regardless of the underlying mechanism they use.

The following code sets up a Lambda function with an SQS queue event source:

import aws_cdk.aws_sqs as sqs
from aws_cdk.aws_lambda_event_sources import SqsEventSource

# fn: lambda.Function

queue = sqs.Queue(self, "MyQueue")
event_source = SqsEventSource(queue)
fn.add_event_source(event_source)

event_source_id = event_source.event_source_mapping_id
event_source_mapping_arn = event_source.event_source_mapping_arn

The eventSourceMappingId property contains the event source mapping id. This will be a token that will resolve to the final value at the time of deployment.

The eventSourceMappingArn property contains the event source mapping ARN. This will be a token that will resolve to the final value at the time of deployment.

SQS

Amazon Simple Queue Service (Amazon SQS) allows you to build asynchronous workflows. For more information about Amazon SQS, see Amazon Simple Queue Service. You can configure AWS Lambda to poll for these messages as they arrive and then pass the event to a Lambda function invocation. To view a sample event, see Amazon SQS Event.

To set up Amazon Simple Queue Service as an event source for AWS Lambda, you first create or update an Amazon SQS queue and select custom values for the queue parameters. The following parameters will impact Amazon SQS’s polling behavior:

  • visibilityTimeout: How long a received message stays hidden from other consumers. A message that is not deleted becomes visible again, and eligible for retry, once this timeout elapses.

  • batchSize: Determines how many records are buffered before invoking your Lambda function.

  • maxBatchingWindow: The maximum amount of time to gather records before invoking the Lambda function. This increases the likelihood of a full batch at the cost of delayed processing.

  • maxConcurrency: The maximum concurrency setting limits the number of concurrent instances of the function that an Amazon SQS event source can invoke.

  • enabled: If the SQS event source mapping should be enabled. The default is true.

import aws_cdk.aws_sqs as sqs
from aws_cdk import Duration
from aws_cdk.aws_lambda_event_sources import SqsEventSource
# fn: lambda.Function


queue = sqs.Queue(self, "MyQueue",
    visibility_timeout=Duration.seconds(30)
)

fn.add_event_source(SqsEventSource(queue,
    batch_size=10,  # default
    max_batching_window=Duration.minutes(5),
    report_batch_item_failures=True
))
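
With report_batch_item_failures enabled, the function can tell Lambda which messages in a batch failed so that only those are retried. The handler below is a minimal sketch of that response format, assuming JSON message bodies; process_message and its "order_id" check are hypothetical stand-ins for real business logic.

```python
import json


def process_message(body: dict) -> None:
    # Hypothetical business logic: reject messages without an "order_id".
    if "order_id" not in body:
        raise ValueError("missing order_id")


def handler(event, context):
    """Process an SQS batch and report only the messages that failed."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            # Only the messages listed here are returned to the queue for retry.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Returning an empty batchItemFailures list marks the whole batch as successful.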

S3

You can write Lambda functions to process S3 bucket events, such as the object-created or object-deleted events. For example, when a user uploads a photo to a bucket, you might want Amazon S3 to invoke your Lambda function so that it reads the image and creates a thumbnail for the photo.

You can use the bucket notification configuration feature in Amazon S3 to configure the event source mapping, identifying the bucket events that you want Amazon S3 to publish and which Lambda function to invoke.

import aws_cdk.aws_s3 as s3
from aws_cdk.aws_lambda_event_sources import S3EventSource
# fn: lambda.Function


bucket = s3.Bucket(self, "mybucket")

fn.add_event_source(S3EventSource(bucket,
    events=[s3.EventType.OBJECT_CREATED, s3.EventType.OBJECT_REMOVED],
    filters=[s3.NotificationKeyFilter(prefix="subdir/")]
))

In the example above, S3EventSource accepts a Bucket as its parameter. However, functions such as from_bucket_name and from_bucket_arn return an IBucket, which S3EventSource does not accept. In that case, consider using S3EventSourceV2 instead, which accepts an IBucket.

import aws_cdk.aws_s3 as s3
from aws_cdk.aws_lambda_event_sources import S3EventSourceV2
# fn: lambda.Function


bucket = s3.Bucket.from_bucket_name(self, "Bucket", "amzn-s3-demo-bucket")

fn.add_event_source(S3EventSourceV2(bucket,
    events=[s3.EventType.OBJECT_CREATED, s3.EventType.OBJECT_REMOVED],
    filters=[s3.NotificationKeyFilter(prefix="subdir/")]
))
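
On the function side, each S3 notification arrives as a list of records. The following handler is a minimal sketch, not part of this module, that pulls the event name, bucket name, and object key out of each record. It assumes the standard S3 notification event shape; the returned tuple layout is an illustrative choice.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Return (event name, bucket, key) for each S3 notification record."""
    results = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Object keys arrive URL-encoded (for example, spaces become "+").
        key = unquote_plus(s3_info["object"]["key"])
        results.append((record["eventName"], s3_info["bucket"]["name"], key))
    return results
```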

SNS

You can write Lambda functions to process Amazon Simple Notification Service notifications. When a message is published to an Amazon SNS topic, the service can invoke your Lambda function by passing the message payload as a parameter. Your Lambda function code can then process the event, for example by publishing the message to other Amazon SNS topics or by sending it to other AWS services.

This also enables you to trigger a Lambda function in response to Amazon CloudWatch alarms and other AWS services that use Amazon SNS.

For an example event, see Appendix: Message and JSON Formats and Amazon SNS Sample Event. For an example use case, see Using AWS Lambda with Amazon SNS from Different Accounts.

import aws_cdk.aws_sns as sns
import aws_cdk.aws_sqs as sqs
from aws_cdk.aws_lambda_event_sources import SnsEventSource

# topic: sns.Topic

# fn: lambda.Function

dead_letter_queue = sqs.Queue(self, "deadLetterQueue")
fn.add_event_source(SnsEventSource(topic,
    filter_policy={},
    dead_letter_queue=dead_letter_queue
))

When a user calls the SNS Publish API on a topic that your Lambda function is subscribed to, Amazon SNS will call Lambda to invoke your function asynchronously. Lambda will then return a delivery status. If there was an error calling Lambda, Amazon SNS will retry invoking the Lambda function up to three times. After three tries, if Amazon SNS still could not successfully invoke the Lambda function, then Amazon SNS will send a delivery status failure message to CloudWatch.
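
The message payload reaches the function under the "Sns" key of each record. The handler below is a minimal sketch of reading it, assuming the standard SNS event shape; treating the message as optional JSON and the shape of the returned dictionaries are illustrative choices.

```python
import json


def handler(event, context):
    """Collect the subject and decoded message of each SNS notification."""
    notifications = []
    for record in event.get("Records", []):
        sns = record["Sns"]
        message = sns["Message"]
        try:
            # Publishers often send JSON, but the payload is just a string.
            message = json.loads(message)
        except json.JSONDecodeError:
            pass
        notifications.append({"subject": sns.get("Subject"), "message": message})
    return notifications
```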

DynamoDB Streams

You can write Lambda functions to process change events from a DynamoDB Table. An event is emitted to a DynamoDB stream (if configured) whenever a write (Put, Delete, Update) operation is performed against the table. See Using AWS Lambda with Amazon DynamoDB for more information about configuring Lambda function event sources with DynamoDB.

To process events with a Lambda function, first create or update a DynamoDB table and enable a stream specification. Then, create a DynamoEventSource and add it to your Lambda function. The following parameters will impact Amazon DynamoDB’s polling behavior:

  • batchSize: Determines how many records are buffered before invoking your Lambda function. A value that is too high can increase your function’s memory usage; a value that is too low can keep it from keeping up with the incoming data velocity.

  • bisectBatchOnError: If a batch encounters an error, this will cause the batch to be split in two and have each new smaller batch retried, allowing the records in error to be isolated.

  • reportBatchItemFailures: Allow functions to return partially successful responses for a batch of records.

  • maxBatchingWindow: The maximum amount of time to gather records before invoking the Lambda function. This increases the likelihood of a full batch at the cost of delayed processing.

  • maxRecordAge: The maximum age of a record that will be sent to the function for processing. Records that exceed the max age will be treated as failures.

  • onFailure: If a record fails after all retries, or if the record age has exceeded the configured value, the record will be sent to the SQS queue or SNS topic that is specified here.

  • parallelizationFactor: The number of batches to concurrently process on each shard.

  • retryAttempts: The maximum number of times a record should be retried in the event of failure.

  • startingPosition: Determines where to begin consumption, either at the most recent record (‘LATEST’) or at the oldest record (‘TRIM_HORIZON’). ‘TRIM_HORIZON’ ensures that you process all available data, while ‘LATEST’ ignores all records that arrived prior to attaching the event source.

  • tumblingWindow: The duration in seconds of a processing window when using streams.

  • enabled: If the DynamoDB Streams event source mapping should be enabled. The default is true.

  • filters: Filters to apply before sending a change event from a DynamoDB table to a Lambda function. Events that are filtered out are not sent to the Lambda function.

import aws_cdk.aws_dynamodb as dynamodb
from aws_cdk.aws_lambda_event_sources import DynamoEventSource, SqsDlq

# table: dynamodb.Table

# fn: lambda.Function


dead_letter_queue = sqs.Queue(self, "deadLetterQueue")
fn.add_event_source(DynamoEventSource(table,
    starting_position=lambda_.StartingPosition.TRIM_HORIZON,
    batch_size=5,
    bisect_batch_on_error=True,
    on_failure=SqsDlq(dead_letter_queue),
    retry_attempts=10
))
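
To see why bisect_batch_on_error helps isolate bad records, the following local simulation may be useful. It is an illustrative sketch only, not Lambda's actual implementation: it assumes a failing batch is halved and each half retried until a single failing record remains, and it ignores retry limits and record age. The names isolate_failures and process are hypothetical.

```python
def isolate_failures(batch, process):
    """Recursively bisect a failing batch until the bad records are isolated.

    `process` is a hypothetical callable that raises if any record in the
    batch it receives cannot be handled.
    """
    try:
        process(batch)
        return []
    except Exception:
        if len(batch) == 1:
            return list(batch)  # a single record that still fails
        mid = len(batch) // 2
        return isolate_failures(batch[:mid], process) + isolate_failures(batch[mid:], process)


def process(batch):
    # Hypothetical processor: negative values are treated as poison records.
    if any(value < 0 for value in batch):
        raise ValueError("poison record in batch")
```

In this sketch the healthy halves succeed on retry, so only the records that keep failing would end up at the on_failure destination.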

The following code sets up a Lambda function with a DynamoDB event source. A filter is applied so that only INSERT events whose id attribute is a boolean equal to true are sent to the Lambda function.

import aws_cdk.aws_dynamodb as dynamodb
from aws_cdk.aws_lambda_event_sources import DynamoEventSource

# table: dynamodb.Table

# fn: lambda.Function

fn.add_event_source(DynamoEventSource(table,
    starting_position=lambda_.StartingPosition.LATEST,
    filters=[
        lambda_.FilterCriteria.filter({
            # Dictionary keys are matched against the record's field names as-is
            "eventName": lambda_.FilterRule.is_equal("INSERT"),
            "dynamodb": {
                "NewImage": {
                    "id": {"BOOL": lambda_.FilterRule.is_equal(True)}
                }
            }
        })
    ]
))
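
The filter keys mirror the shape of a DynamoDB stream record. The handler below is a minimal sketch showing where those fields sit in the event the function receives. It assumes the standard DynamoDB Streams event shape; the returned dictionary layout and the id_flag name are illustrative.

```python
def handler(event, context):
    """Return the keys of newly inserted items from a DynamoDB stream batch."""
    inserted = []
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # MODIFY and REMOVE events are skipped
        change = record["dynamodb"]
        # Attribute values are typed, e.g. {"S": "abc"} or {"BOOL": True}.
        inserted.append({
            "keys": change["Keys"],
            "id_flag": change.get("NewImage", {}).get("id", {}).get("BOOL"),
        })
    return inserted
```

Checking eventName in the handler is redundant when the filter is in place, but it keeps the function safe if the filter is later removed.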

Kinesis

You can write Lambda functions to process streaming data in Amazon Kinesis Streams. For more information about Amazon Kinesis, see Amazon Kinesis Service. To learn more about configuring Lambda function event sources with Kinesis and to view a sample event, see Amazon Kinesis Event.

To set up Amazon Kinesis as an event source for AWS Lambda, you first create or update an Amazon Kinesis stream and select custom values for the event source parameters. The following parameters will impact Amazon Kinesis’s polling behavior:

  • batchSize: Determines how many records are buffered before invoking your Lambda function. A value that is too high can increase your function’s memory usage; a value that is too low can keep it from keeping up with the incoming data velocity.

  • bisectBatchOnError: If a batch encounters an error, this will cause the batch to be split in two and have each new smaller batch retried, allowing the records in error to be isolated.

  • reportBatchItemFailures: Allow functions to return partially successful responses for a batch of records.

  • maxBatchingWindow: The maximum amount of time to gather records before invoking the Lambda function. This increases the likelihood of a full batch at the cost of possibly delaying processing.

  • maxRecordAge: The maximum age of a record that will be sent to the function for processing. Records that exceed the max age will be treated as failures.

  • onFailure: If a record fails and consumes all retries, the record will be sent to the SQS queue or SNS topic that is specified here.

  • parallelizationFactor: The number of batches to concurrently process on each shard.

  • retryAttempts: The maximum number of times a record should be retried in the event of failure.

  • startingPosition: Will determine where to begin consumption. ‘LATEST’ will start at the most recent record and ignore all records that arrived prior to attaching the event source, ‘TRIM_HORIZON’ will start at the oldest record and ensure you process all available data, while ‘AT_TIMESTAMP’ will start reading records from a specified time stamp. Note that ‘AT_TIMESTAMP’ is only supported for Amazon Kinesis streams.

  • startingPositionTimestamp: The time stamp from which to start reading. Used in conjunction with startingPosition when set to ‘AT_TIMESTAMP’.

  • tumblingWindow: The duration in seconds of a processing window when using streams.

  • enabled: If the Kinesis event source mapping should be enabled. The default is true.

import aws_cdk.aws_kinesis as kinesis
from aws_cdk.aws_lambda_event_sources import KinesisEventSource

# my_function: lambda.Function


stream = kinesis.Stream(self, "MyStream")
my_function.add_event_source(KinesisEventSource(stream,
    batch_size=100,  # default
    starting_position=lambda_.StartingPosition.TRIM_HORIZON
))
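
Kinesis record data is base64-encoded in the event that Lambda delivers. The following is a minimal sketch of decoding a batch, assuming the standard Kinesis event shape and UTF-8 text payloads; decode_records is a hypothetical helper name.

```python
import base64


def decode_records(event):
    """Yield (partition key, sequence number, text payload) per record."""
    for record in event.get("Records", []):
        kinesis = record["kinesis"]
        # Record data is base64-encoded; UTF-8 text is assumed here.
        payload = base64.b64decode(kinesis["data"]).decode("utf-8")
        yield kinesis["partitionKey"], kinesis["sequenceNumber"], payload


def handler(event, context):
    """Return the decoded payload of every record in the batch."""
    return [payload for _, _, payload in decode_records(event)]
```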

Kafka

You can write Lambda functions to process data from either Amazon MSK or a self-managed Kafka cluster.

The following code sets up Amazon MSK as an event source for a Lambda function. Credentials will need to be configured to access the MSK cluster, as described in Username/Password authentication.

from aws_cdk.aws_secretsmanager import Secret
from aws_cdk.aws_lambda_event_sources import ManagedKafkaEventSource

# my_function: lambda.Function


# Your MSK cluster arn
cluster_arn = "arn:aws:kafka:us-east-1:0123456789019:cluster/SalesCluster/abcd1234-abcd-cafe-abab-9876543210ab-4"

# The Kafka topic you want to subscribe to
topic = "some-cool-topic"

# The secret that allows access to your MSK cluster
# You still have to make sure that it is associated with your cluster as described in the documentation
secret = Secret(self, "Secret", secret_name="AmazonMSK_KafkaSecret")
my_function.add_event_source(ManagedKafkaEventSource(
    cluster_arn=cluster_arn,
    topic=topic,
    secret=secret,
    batch_size=100,  # default
    starting_position=lambda_.StartingPosition.TRIM_HORIZON
))

The following code sets up a self-managed Kafka cluster as an event source. Username and password based authentication will need to be set up as described in Managing access and permissions.

from aws_cdk.aws_secretsmanager import Secret
from aws_cdk.aws_lambda_event_sources import SelfManagedKafkaEventSource

# The secret that allows access to your self hosted Kafka cluster
# secret: Secret

# my_function: lambda.Function


# The list of Kafka brokers
bootstrap_servers = ["kafka-broker:9092"]

# The Kafka topic you want to subscribe to
topic = "some-cool-topic"

# (Optional) The consumer group id to use when connecting to the Kafka broker. If omitted the UUID of the event source mapping will be used.
consumer_group_id = "my-consumer-group-id"
my_function.add_event_source(SelfManagedKafkaEventSource(
    bootstrap_servers=bootstrap_servers,
    topic=topic,
    consumer_group_id=consumer_group_id,
    secret=secret,
    batch_size=100,  # default
    starting_position=lambda_.StartingPosition.TRIM_HORIZON
))

If your self-managed Kafka cluster is only reachable via a VPC, also configure vpc, vpcSubnets, and securityGroup.
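
For both Amazon MSK and self-managed clusters, the function receives records grouped by topic and partition, with base64-encoded values. The handler below is a minimal sketch of reading them, assuming the standard Kafka event shape and UTF-8 text values; the returned tuple layout is an illustrative choice.

```python
import base64


def handler(event, context):
    """Return (topic, partition, offset, text value) for each Kafka record."""
    messages = []
    # Records are grouped under keys of the form "<topic>-<partition>".
    for records in event.get("records", {}).values():
        for record in records:
            value = base64.b64decode(record["value"]).decode("utf-8")
            messages.append((record["topic"], record["partition"], record["offset"], value))
    return messages
```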

You can specify event filtering for managed and self-managed Kafka clusters using the filters property:

from aws_cdk.aws_lambda_event_sources import ManagedKafkaEventSource

# my_function: lambda.Function


# Your MSK cluster arn
cluster_arn = "arn:aws:kafka:us-east-1:0123456789019:cluster/SalesCluster/abcd1234-abcd-cafe-abab-9876543210ab-4"

# The Kafka topic you want to subscribe to
topic = "some-cool-topic"
my_function.add_event_source(ManagedKafkaEventSource(
    cluster_arn=cluster_arn,
    topic=topic,
    starting_position=lambda_.StartingPosition.TRIM_HORIZON,
    filters=[
        lambda_.FilterCriteria.filter({
            "string_equals": lambda_.FilterRule.is_equal("test")
        })
    ]
))

By default, Lambda encrypts filter criteria using AWS managed keys. To encrypt the filters with a self-managed KMS key instead, specify the key using the filterEncryption property.

from aws_cdk.aws_lambda_event_sources import ManagedKafkaEventSource
from aws_cdk.aws_kms import Key

# my_function: lambda.Function


# Your MSK cluster arn
cluster_arn = "arn:aws:kafka:us-east-1:0123456789019:cluster/SalesCluster/abcd1234-abcd-cafe-abab-9876543210ab-4"

# The Kafka topic you want to subscribe to
topic = "some-cool-topic"

# Your self managed KMS key
my_key = Key.from_key_arn(self, "SourceBucketEncryptionKey", "arn:aws:kms:us-east-1:123456789012:key/<key-id>")
my_function.add_event_source(ManagedKafkaEventSource(
    cluster_arn=cluster_arn,
    topic=topic,
    starting_position=lambda_.StartingPosition.TRIM_HORIZON,
    filters=[
        lambda_.FilterCriteria.filter({
            "string_equals": lambda_.FilterRule.is_equal("test")
        })
    ],
    filter_encryption=my_key
))

You can also specify an S3 bucket as an “on failure” destination:

from aws_cdk.aws_lambda_event_sources import ManagedKafkaEventSource, S3OnFailureDestination
from aws_cdk.aws_s3 import IBucket

# bucket: IBucket
# my_function: lambda.Function


# Your MSK cluster arn
cluster_arn = "arn:aws:kafka:us-east-1:0123456789019:cluster/SalesCluster/abcd1234-abcd-cafe-abab-9876543210ab-4"

# The Kafka topic you want to subscribe to
topic = "some-cool-topic"

s3_on_failure_destination = S3OnFailureDestination(bucket)

my_function.add_event_source(ManagedKafkaEventSource(
    cluster_arn=cluster_arn,
    topic=topic,
    starting_position=lambda_.StartingPosition.TRIM_HORIZON,
    on_failure=s3_on_failure_destination
))

You can also set the configuration for provisioned pollers that read from the event source:

from aws_cdk.aws_lambda_event_sources import ManagedKafkaEventSource, ProvisionedPollerConfig

# Your MSK cluster arn
# cluster_arn: str

# my_function: lambda.Function


# The Kafka topic you want to subscribe to
topic = "some-cool-topic"
my_function.add_event_source(ManagedKafkaEventSource(
    cluster_arn=cluster_arn,
    topic=topic,
    starting_position=lambda_.StartingPosition.TRIM_HORIZON,
    provisioned_poller_config=ProvisionedPollerConfig(
        minimum_pollers=1,
        maximum_pollers=3
    )
))

Roadmap

Eventually, this module will support all the event sources described under Supported Event Sources in the AWS Lambda Developer Guide.