Amazon Elasticsearch Service
Developer Guide (API Version 2015-01-01)

Loading Streaming Data into Amazon Elasticsearch Service

You can load streaming data into your Amazon ES domain from Amazon S3 buckets, Amazon Kinesis streams, Amazon DynamoDB Streams, and Amazon CloudWatch Logs. For example, to load streaming data from Amazon S3 and Amazon Kinesis, you use a Lambda function as an event handler in the AWS Cloud. The Lambda function responds to new data by processing it and streaming the data to your domain.

Streaming data provides fresh data for search and analytic queries. Amazon S3 pushes event notifications to AWS Lambda. For more information, see Using AWS Lambda with Amazon S3 in the AWS Lambda Developer Guide. Amazon Kinesis requires AWS Lambda to poll for, or pull, event notifications. For more information, see Using AWS Lambda with Amazon Kinesis.

You should be familiar with these service integrations before attempting to use them to load streaming data into your Amazon ES domain. For more information about these services, see the AWS Lambda, Amazon S3, and Amazon Kinesis documentation.

Note

AWS Lambda is available in limited regions. For more information, see the list of AWS Lambda regions in the AWS General Reference.

Loading Streaming Data into Amazon ES from Amazon S3

You can integrate your Amazon ES domain with Amazon S3 and AWS Lambda. Any new data sent to an S3 bucket triggers an event notification to Lambda, which then runs your custom Java or Node.js application code. After your application processes the data, it streams the data to your domain. At a high level, setting up to load streaming data to Amazon ES requires the following steps:

  1. Creating a Lambda Deployment Package

  2. Configuring a Lambda Function

  3. Granting Authorization to Add Data to Your Amazon ES Domain

You also must create an Amazon S3 bucket and an Amazon ES domain. Setting up this integration path has the following prerequisites.

Prerequisite | Description
Amazon S3 Bucket | The event source that triggers your Lambda function. For more information, see Create a Bucket in the Amazon Simple Storage Service Getting Started Guide. The bucket must reside in the same AWS Region as your Amazon ES domain.
Amazon ES Domain | The destination for data after it is processed by the application code in your Lambda function. For more information, see Creating Amazon ES Domains.
Lambda Function | The Java or Node.js application code that runs when S3 pushes an event notification to Lambda. Amazon ES provides a sample application in Node.js, s3_lambda_es.js, that you can download to get started. See the Lambda sample code for Amazon ES.
Lambda Deployment Package | A .zip file that consists of your Java or Node.js application code and any dependencies. For information about the required folder hierarchy, see Creating a Lambda Deployment Package. For information about creating specific Lambda deployment packages, see Creating a Deployment Package (Node.js) and Creating a Deployment Package (Java).
Amazon ES Authorization | An IAM access policy that permits Lambda to add data to your domain. Attach the policy to the Amazon S3 execution role that you create as part of your Lambda function. For details, see Granting Authorization to Add Data to Your Amazon ES Domain.

Setting Up to Load Streaming Data into Amazon ES from Amazon S3

This section provides additional details about setting up the prerequisites for loading streaming data into Amazon ES from Amazon S3. After you finish configuring the integration, data streams automatically to your Amazon ES domain whenever new data is added to your Amazon S3 bucket.
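
As a rough illustration of what the Lambda function receives, the following sketch pulls the bucket name and object key out of an S3 event notification. The function name extractS3Objects is hypothetical and is not taken from s3_lambda_es.js. The event shape used here (Records[].s3.bucket.name and Records[].s3.object.key, with the key URL-encoded) follows the S3 event notification format.

```javascript
// Hypothetical helper (not part of s3_lambda_es.js): lists the objects
// named in an S3 event notification.
function extractS3Objects(event) {
  var records = (event && event.Records) || [];
  return records
    .filter(function (record) { return record.s3 !== undefined; })
    .map(function (record) {
      return {
        bucket: record.s3.bucket.name,
        // Object keys arrive URL-encoded, with spaces sent as '+'.
        key: decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '))
      };
    });
}
```

A handler could pass each bucket and key pair to the code that reads the object and forwards its contents to the domain.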

Creating a Lambda Deployment Package

Create a .zip file that contains your Lambda application code and any dependencies.

To create a deployment package:

  1. Create a directory structure like the following:

    eslambda
      \node_modules

    This example uses eslambda for the name of the top-level folder, but you can use any name. However, the subfolder must be named node_modules.

  2. Place your application source code in the eslambda folder.

  3. Add or edit the following four global variables:

    • endpoint, the Amazon ES domain endpoint.

    • region, the AWS Region in which you created your Amazon ES domain.

    • index, the name of the Amazon ES index to use for data that is streamed from Amazon S3.

    • doctype, the Amazon ES document type of the streamed data. For more information, see Mapping Types in the Elasticsearch documentation.

    The following example from s3_lambda_es.js configures the sample application to use the streaming-logs domain endpoint in the us-east-1 AWS Region:

    /* Globals */
    var esDomain = {
        endpoint: 'search-streaming-logs-okga24ftzsbz2a2hzhsqw73jpy.us-east-1.es.example.com',
        region: 'us-east-1',
        index: 'streaming-logs',
        doctype: 'apache'
    };

  4. Install any dependencies that are required by your application.

    For example, if you use Node.js, execute the following command for each third-party module that your application code loads with a require statement. Modules that are built into Node.js do not need to be installed:

    npm install <dependency>

  5. Verify that all runtime dependencies that are required by your application code are located in the node_modules folder.

  6. From inside the eslambda folder, execute the following command to package the application code and dependencies, so that your source file sits at the root of the archive:

    zip -r eslambda.zip *

    The name of the zip file must match the top-level folder.

For more information about creating Lambda deployment packages, see Creating a Deployment Package (Node.js) and Creating a Deployment Package (Java).
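
To show how the four global variables might be used together, here is a minimal sketch that turns them into the host and path of a request that adds one document to the index. The name buildIndexRequest is hypothetical, and the sample application may build its requests differently. Signing the request with the execution role's credentials is omitted.

```javascript
// Values copied from the example above; substitute your own domain settings.
var esDomain = {
  endpoint: 'search-streaming-logs-okga24ftzsbz2a2hzhsqw73jpy.us-east-1.es.example.com',
  region: 'us-east-1',
  index: 'streaming-logs',
  doctype: 'apache'
};

// Hypothetical helper: describes a POST that adds one document to the
// configured index and document type.
function buildIndexRequest(domain, doc) {
  return {
    method: 'POST',
    host: domain.endpoint,
    path: '/' + encodeURIComponent(domain.index) +
          '/' + encodeURIComponent(domain.doctype),
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(doc)
  };
}
```

With the values above, the request path is /streaming-logs/apache.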

Configuring a Lambda Function

Use AWS Lambda to create and configure your Lambda function. To do that, you can use either the AWS CLI or the AWS Lambda console. For a tutorial about creating and configuring a Lambda function using the AWS CLI, see Using AWS Lambda with Amazon S3. For configuration settings on the AWS Lambda console, see the following table.

Note

For more information about creating and configuring a Lambda function, see the AWS Lambda Developer Guide.

Function Configuration | Description
IAM Execution Role | The name of the IAM role that is used to execute actions on Amazon S3. While creating your Lambda function, the Lambda console automatically opens the IAM console to help you create the execution role. Later, you also must attach an IAM access policy to this role that permits Lambda to add data to your domain. For details, see Granting Authorization to Add Data to Your Amazon ES Domain.
Event Source | Specifies the S3 bucket as the event source for the Lambda function. For instructions, see the Using AWS Lambda with Amazon S3 tutorial. AWS Lambda automatically adds the necessary permissions for Amazon S3 to invoke your Lambda function from this event source. Optionally, specify a file suffix to filter what kinds of files, such as .log, trigger the Lambda function.
Handler | The name of the file that contains the application source code, but with the .handler file suffix. For example, if your application source code resides in a file named s3_lambda_es.js, you must configure the handler as s3_lambda_es.handler. For more information, see Getting Started in the AWS Lambda Developer Guide. Amazon ES provides a sample application in Node.js that you can download to get started: Lambda Sample Code for Amazon ES.
Timeout | The length of time that Lambda should wait before canceling an invocation request. The default value of three seconds is too short for the Amazon ES use case. We recommend configuring your timeout for 10 seconds.

For more function configuration details, see Configuring a Lambda Function in this guide. For general information, see Lambda Functions in the AWS Lambda Developer Guide.

Granting Authorization to Add Data to Your Amazon ES Domain

When you choose S3 Execution Role as the IAM role to execute actions on S3, Lambda opens the IAM console and helps you to create a new execution role. Lambda automatically adds the necessary permissions to invoke your Lambda function from this event source. After you create the role, open it in the IAM console and attach the following IAM access policy to the role. This grants permissions to Lambda to stream data to Amazon ES:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "es:*"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:es:us-west-2:123456789012:domain/streaming-logs/*"
    }
  ]
}

For more information about attaching IAM access policies to roles, see Tutorial: Create and Attach Your First Customer Managed Policy in the IAM User Guide.
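
The Resource value in the policy encodes the AWS Region, the account ID, and the domain name, so the Region in the ARN should be the one in which the domain was created. The following sketch generates the same policy document for any domain. The helper name buildEsAccessPolicy is hypothetical, and the arguments in the usage note are the placeholder values from the policy above.

```javascript
// Hypothetical helper: builds the access policy document for one domain.
function buildEsAccessPolicy(region, accountId, domainName) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Action: ['es:*'],
        Effect: 'Allow',
        // Grants access to every path under the named domain.
        Resource: 'arn:aws:es:' + region + ':' + accountId +
                  ':domain/' + domainName + '/*'
      }
    ]
  };
}
```

For example, JSON.stringify(buildEsAccessPolicy('us-west-2', '123456789012', 'streaming-logs'), null, 2) produces a document that can be pasted into the IAM console.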

Loading Streaming Data into Amazon ES from Amazon Kinesis

You can load streaming data from Amazon Kinesis to Amazon ES. This integration relies on AWS Lambda as an event handler in the cloud. Lambda polls your Amazon Kinesis stream for new data and automatically invokes your Lambda function when new data arrives. After your Lambda function finishes processing any new data, it streams the data to your Amazon ES domain.

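At a high level, setting up to stream data to Amazon ES requires the following steps:

  1. Creating a Lambda Deployment Package

  2. Configuring a Lambda Function

  3. Granting Authorization to Add Data to Your Amazon ES Domain

You also must create an Amazon Kinesis stream and an Amazon ES domain. Setting up this integration path has the following prerequisites.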

Prerequisite | Description
Amazon Kinesis Stream | The event source for your Lambda function. For instructions about creating Amazon Kinesis streams, see Amazon Kinesis Streams.
Amazon ES Domain | The destination for data after it is processed by the application code in your Lambda function. For more information, see Creating Amazon ES Domains in this guide.
Lambda Function | The Java or Node.js application code that runs when Lambda finds new data in your Amazon Kinesis stream. Amazon ES provides a sample application in Node.js, kinesis_lambda_es.js, that you can download to get started: Lambda Sample Code for Amazon ES.
Lambda Deployment Package | A .zip file that consists of your Java or Node.js application code and any dependencies. For information about the required folder hierarchy, see Creating a Lambda Deployment Package. For general information about creating Lambda deployment packages, see Creating a Deployment Package (Node.js) and Creating a Deployment Package (Java).
Amazon ES Authorization | An IAM access policy that permits Lambda to add data to your domain. Attach the policy to the Amazon Kinesis execution role that you create as part of your Lambda function. For details, see Granting Authorization to Add Data to Your Amazon ES Domain.

Setting Up to Load Streaming Data into Amazon ES from Amazon Kinesis

This section provides more details about setting up the prerequisites for loading streaming data from Amazon Kinesis into Amazon ES. After you finish configuring the integration, Lambda automatically streams data to your Amazon ES domain whenever new data is added to your Amazon Kinesis stream. You can also view this video to learn how to use Amazon Kinesis Firehose to load streaming data into Amazon ES.
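
Lambda delivers Amazon Kinesis data to the function as a batch of records whose payloads are base64-encoded. The following sketch decodes them into strings before any further processing. The function name decodeKinesisRecords is hypothetical and is not taken from kinesis_lambda_es.js, and the sketch assumes that the payloads are UTF-8 text, such as log lines.

```javascript
// Hypothetical helper (not part of kinesis_lambda_es.js): returns the
// payload of each Kinesis record as a UTF-8 string.
function decodeKinesisRecords(event) {
  var records = (event && event.Records) || [];
  return records.map(function (record) {
    // Each record's data field is base64-encoded.
    return Buffer.from(record.kinesis.data, 'base64').toString('utf8');
  });
}
```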

Creating a Lambda Deployment Package

Create a .zip file that contains your Lambda application code and any dependencies.

To create a deployment package:

  1. Create a directory structure like the following:

    eslambda
      \node_modules

    You can use any name for the top-level folder rather than eslambda. However, you must name the subfolder node_modules.

  2. Place your application source code in the eslambda folder.

  3. Add or edit the following global variables in your sample application:

    • endpoint, the Amazon ES domain endpoint.

    • region, the AWS Region in which you created your Amazon ES domain.

    • index, the name of the Amazon ES index to use for data that is streamed from Amazon Kinesis.

    • doctype, the Amazon ES document type of the streamed data. For more information, see Mapping Types in the Elasticsearch documentation.

    The following example from kinesis_lambda_es.js configures the sample application to use the streaming-logs Amazon ES domain endpoint in the us-east-1 AWS Region.

    /* Globals */
    var esDomain = {
        endpoint: 'search-streaming-logs-okga24ftzsbz2a2hzhsqw73jpy.us-east-1.es.example.com',
        region: 'us-east-1',
        index: 'streaming-logs',
        doctype: 'apache'
    };

  4. Install any dependencies that are required by your application.

    For example, if you use Node.js, execute the following command for each third-party module that your application code loads with a require statement. Modules that are built into Node.js do not need to be installed:

    npm install <dependency>

  5. Verify that all runtime dependencies that are required by your application code are located in the node_modules folder.

  6. From inside the eslambda folder, execute the following command to package the application code and dependencies, so that your source file sits at the root of the archive:

    zip -r eslambda.zip *

    The name of the zip file must match the top-level folder.

For more information about creating Lambda deployment packages, see Creating a Deployment Package (Node.js) and Creating a Deployment Package (Java).
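
Because one invocation usually carries a batch of Amazon Kinesis records, one way to use the index and doctype values is to send the whole batch in a single request. The following sketch formats documents for the Elasticsearch _bulk API. The name buildBulkBody is hypothetical, and whether the sample application uses _bulk is an assumption rather than something this guide states.

```javascript
// Hypothetical helper: formats documents as a body for the Elasticsearch
// _bulk API. Each document is preceded by an action line, and every line,
// including the last, ends with a newline.
function buildBulkBody(index, doctype, docs) {
  var action = JSON.stringify({ index: { _index: index, _type: doctype } });
  return docs.map(function (doc) {
    return action + '\n' + JSON.stringify(doc) + '\n';
  }).join('');
}
```

The resulting string would be sent as the body of a POST to the /_bulk path of the domain endpoint.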

Configuring a Lambda Function

Use AWS Lambda to create and configure your Lambda function. To do that, you can use either the AWS CLI or the AWS Lambda console. For a tutorial about creating and configuring a Lambda function using the AWS CLI, see Using AWS Lambda with Amazon Kinesis. For configuration settings on the AWS Lambda console, see the following table.

Note

For more information about creating and configuring a Lambda function, see the Getting Started tutorial in the AWS Lambda Developer Guide.

Configuration | Description
Amazon Kinesis stream | The event source of your Lambda function. For instructions, see Amazon Kinesis Streams.
IAM execution role | The name of the IAM role that is used to execute actions on Amazon Kinesis. While configuring your Lambda function, the Lambda console automatically opens the IAM console to help you create the execution role. Later, you also must attach an IAM access policy to this role that permits Lambda to send data to your Amazon ES domain. For details, see Granting Authorization to Add Data to Your Amazon ES Domain.
Handler | The name of the file that contains the application source code, but with the .handler file suffix. For example, if your application source code is in a file named kinesis_lambda_es.js, you must configure the handler as kinesis_lambda_es.handler. For more information, see Lambda Function Handler. Amazon ES provides a sample application in Node.js that you can download to get started: Lambda Sample Code for Amazon ES.
Timeout | The length of time that Lambda should wait before canceling an invocation request. The default value of three seconds is too short for this use case. We recommend configuring your timeout for 10 seconds.

For more information, see Lambda Functions in the AWS Lambda Developer Guide.
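
The timeout applies to the whole invocation, including the time spent sending data to the domain. As a small sketch, a function can check how much time is left before it starts another request by calling the getRemainingTimeInMillis method on the context object that Lambda passes to Node.js handlers. The helper name hasTimeRemaining and the threshold in the usage note are hypothetical.

```javascript
// Hypothetical helper: reports whether the invocation still has at least
// minMillis left before Lambda cancels it.
function hasTimeRemaining(context, minMillis) {
  return context.getRemainingTimeInMillis() >= minMillis;
}
```

For example, a handler might call hasTimeRemaining(context, 2000) before each request and stop early when it returns false.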

Granting Authorization to Add Data to Your Amazon ES Domain

When you choose Kinesis Execution Role as the IAM role to execute actions on Amazon Kinesis, Lambda opens the IAM console and requires you to create a new execution role. Lambda automatically adds the necessary permissions to invoke your Lambda function from this event source. After you create the role, open it in the IAM console and attach the following IAM access policy to the role so that Lambda has permissions to stream data to Amazon ES:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "es:*"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:es:us-west-2:123456789012:domain/streaming-logs/*"
    }
  ]
}

For more information about attaching IAM access policies to roles, see Tutorial: Create and Attach Your First Customer Managed Policy in the IAM User Guide.

Loading Streaming Data into Amazon ES from Amazon DynamoDB

You can load streaming data from Amazon DynamoDB Streams to your Amazon ES domain. To do that, use the DynamoDB Logstash input plugin and the Amazon ES Logstash output plugin. For instructions, see Logstash Plugin for Amazon DynamoDB in the Amazon DynamoDB Developer Guide.

Loading Streaming Data into Amazon ES from Amazon CloudWatch

You can load streaming data from CloudWatch Logs to your Amazon ES domain by using a CloudWatch Logs subscription. For information about Amazon CloudWatch subscriptions, see Real-time Processing of Log Data with Subscriptions. For configuration information, see Streaming CloudWatch Logs Data to Amazon Elasticsearch Service in the Amazon CloudWatch Developer Guide.