Amazon Elasticsearch Service
Developer Guide (API Version 2015-01-01)

What Is Amazon Elasticsearch Service?

Amazon Elasticsearch Service (Amazon ES) is a managed service that makes it easy to create a domain and deploy, operate, and scale Elasticsearch clusters in the AWS Cloud. Elasticsearch is a popular open-source search and analytics engine for use cases such as log analytics, real-time application monitoring, and clickstream analytics. With Amazon ES, you get direct access to the Elasticsearch open-source APIs, so your existing code and applications work seamlessly with the service. To learn more about Elasticsearch and its uses, see Getting Started in the Elasticsearch Reference.

Amazon ES provisions all the resources for your Elasticsearch cluster and launches the cluster. Amazon ES also automatically detects and replaces failed Elasticsearch nodes, reducing the overhead associated with self-managed infrastructures. You can scale your cluster with a single API call or a few clicks in the console.

To get started using the service, you create an Amazon ES domain. An Amazon ES domain is an Elasticsearch cluster in the AWS Cloud that has the compute and storage resources that you specify. For example, you can specify the number of instances, instance types, and storage options.

Additionally, Amazon ES offers the following benefits of a managed service:

  • Cluster scaling options

  • Self-healing clusters

  • Replication for high availability

  • Data durability

  • Enhanced security

  • Node monitoring

You can use the Amazon ES console to set up and configure your domain in minutes. If you prefer programmatic access, you can use the AWS SDKs or the AWS CLI.
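As a rough illustration of programmatic access, the following Python sketch assembles the compute and storage settings for a domain and, optionally, passes them to the `create_elasticsearch_domain` operation of the boto3 `es` client. The helper names, the domain name, the Elasticsearch version, the instance type, and the volume size are assumptions chosen for illustration, not recommendations.

```python
def build_domain_request(domain_name, instance_type, instance_count,
                         volume_size_gb, es_version="5.1"):
    """Assemble keyword arguments for boto3's create_elasticsearch_domain.

    All values are illustrative; choose ones that suit your workload.
    """
    return {
        "DomainName": domain_name,
        "ElasticsearchVersion": es_version,  # assumed version string
        "ElasticsearchClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp2",  # General Purpose (SSD)
            "VolumeSize": volume_size_gb,
        },
    }


def create_domain(params, region="us-east-1"):
    """Send the request. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3
    client = boto3.client("es", region_name=region)
    return client.create_elasticsearch_domain(**params)


if __name__ == "__main__":
    params = build_domain_request("my-domain", "m4.large.elasticsearch", 2, 20)
    print(params)
    # create_domain(params)  # uncomment to create a (billable) domain
```

Keeping the parameter builder separate from the API call makes it possible to review or log the configuration before any resources are created.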

There are no upfront costs to set up clusters, and you pay only for the service resources that you use.

Features of Amazon Elasticsearch Service

Amazon ES includes the following features:

  • Multiple configurations of CPU, memory, and storage capacity, known as instance types

  • Storage volumes for your data using Amazon EBS volumes

  • Multiple geographical locations for your resources, known as regions and Availability Zones

  • Cluster node allocation across two Availability Zones in the same region, known as zone awareness

  • Security with AWS Identity and Access Management (IAM) access control

  • Dedicated master nodes to improve cluster stability

  • Domain snapshots to back up and restore Amazon ES domains and replicate domains across Availability Zones

  • Data visualization using the Kibana tool

  • Integration with Amazon CloudWatch for monitoring Amazon ES domain metrics

  • Integration with AWS CloudTrail for auditing configuration API calls to Amazon ES domains

  • Integration with Amazon S3, Amazon Kinesis, and Amazon DynamoDB for loading streaming data into Amazon ES

Supported Elasticsearch Versions

Amazon ES currently supports five Elasticsearch versions:

Compared to earlier versions of Elasticsearch, the 5.x versions offer powerful features that make them faster, more secure, and easier to use. Here are some highlights:

  • Support for Painless scripting – Painless is an Elasticsearch built-in scripting language. Painless lets you run advanced queries against your data and automate operations like partial index updates in a fast, highly secure way.

  • Higher indexing performance – The 5.x versions of Elasticsearch provide better indexing capabilities that significantly increase the throughput of data updates.

  • Improved aggregations – The 5.x versions of Elasticsearch offer several aggregation improvements, such as recalculating aggregations only when the data changes. These versions also deliver faster query performance.

For more information about the differences among Elasticsearch versions, see the Elasticsearch documentation. For information about the Elasticsearch APIs that Amazon ES supports, see Supported Elasticsearch Operations.

If you start a new Elasticsearch project, we strongly recommend that you choose the latest supported Elasticsearch version. If you have an existing domain that uses an older Elasticsearch version, you can choose to keep the domain or migrate your data. For more information, see Migrating to a Different Elasticsearch Version.

Getting Started with Amazon Elasticsearch Service

To get started, sign up for an AWS account if you don't already have one. For more information, see Signing Up for AWS.

After you are set up with an account, complete the Getting Started tutorial for Amazon Elasticsearch Service. Consult the following introductory topics if you need more information while learning about the service.

Get Up and Running

Learn the Basics

Choose Instance Types and Storage

Stay Secure

Signing Up for AWS

If you're not already an AWS customer, your first step is to create an AWS account. If you already have an AWS account, you are automatically signed up for Amazon ES. Your AWS account enables you to access Amazon ES and other services in the AWS platform, such as Amazon S3 and Amazon EC2. There are no sign-up fees, and you don't incur charges until you create a domain. As with other AWS services, you pay only for the resources that you use.

To create an AWS account

  1. Go to https://aws.amazon.com, and then choose Sign In to the Console.

  2. To sign up, follow the instructions. You must enter payment information before you can use Amazon ES.

Accessing Amazon Elasticsearch Service

You can access Amazon ES through the Amazon ES console, the AWS SDKs, or the AWS CLI.

  • The Amazon ES console lets you create, configure, and monitor your domains. Using the console is the easiest way to get started with Amazon ES.

  • The AWS SDKs support all the Amazon ES API operations, making it easy to manage your domains using your preferred technology. The SDKs automatically sign requests as needed using your AWS credentials.

  • The AWS CLI wraps all the Amazon ES API operations, providing a simple way to create and configure domains. The AWS CLI automatically signs requests as needed using your AWS credentials.

For information about Elasticsearch APIs and features, see the Elasticsearch documentation.

Regions and Endpoints for Amazon Elasticsearch Service

Amazon ES provides regional endpoints for accessing the configuration API and domain-specific endpoints for accessing the search API. You use the configuration service to create and manage your domains. The region-specific configuration service endpoints have this format:

es.region.amazonaws.com

For example, es.us-east-1.amazonaws.com. For a list of supported regions, see Regions and Endpoints in the AWS General Reference.

Amazon ES provides a single service endpoint for both search and data services:

  • http://search-domainname-domainid.us-east-1.es.amazonaws.com

You use a domain's search endpoint to upload data and submit search requests.
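The two endpoint formats described above can be sketched as simple string templates. The function names and the sample domain name and ID below are hypothetical; in practice, you would read a domain's actual endpoint from the console or the configuration API rather than constructing it.

```python
def config_endpoint(region):
    """Regional endpoint for the configuration API."""
    return f"es.{region}.amazonaws.com"


def search_endpoint(domain_name, domain_id, region):
    """Domain-specific endpoint for uploading data and searching."""
    return f"search-{domain_name}-{domain_id}.{region}.es.amazonaws.com"


if __name__ == "__main__":
    print(config_endpoint("us-east-1"))
    # es.us-east-1.amazonaws.com
    print(search_endpoint("movies", "abc123", "us-east-1"))
    # search-movies-abc123.us-east-1.es.amazonaws.com
```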

Scaling in Amazon Elasticsearch Service

A domain has one or more Elasticsearch instances, each with a finite amount of RAM, CPU, and storage resources for indexing data and processing requests. The number of instances that you need for your domain depends on the documents in your collection and the volume and complexity of your Elasticsearch requests.

When you create a domain, you choose an initial number of Elasticsearch instances and an instance type. However, these initial choices might not be adequate as the quantity and size of data increase and as Elasticsearch requests increase in number and complexity. You can accommodate the growth by scaling your Amazon ES domain. The following table provides guidelines for scaling a domain.

Domain change: Increase in data quantity or increase in data size

Scaling guidelines: To scale for both increased data quantity and data size:

  • Choose a larger instance type or add instances

  • Increase the size of the EBS volume

Domain change: Increase in traffic due to Elasticsearch request volume and complexity

Scaling guidelines: To scale for increased traffic:

  • Choose a larger instance type

  • Add instances

  • Add replica shards

Replica shards provide failover. If a cluster node that contains a primary shard fails, Amazon ES promotes a replica shard to a primary shard. For more information about replica shards, see Shards and Replicas in the Elasticsearch documentation.

Scaling for Increased Data

Each Amazon ES domain has one or more search indices. The index stores data in one or more shards that are distributed across the search instances in your cluster. As your cluster grows, Amazon ES automatically migrates shards between search instances. However, the number of primary shards is fixed when the index is created. The number of primary shards defines the maximum amount of data that can be stored in an index. For more information about indices and shards, see Add an Index in the Elasticsearch documentation.

If you add instances, Amazon ES distributes index shards among the available search instances. For more information, see Scale Horizontally in the Elasticsearch documentation. Choosing a larger instance type provides larger local storage for your cluster. Use larger EBS volumes to accommodate larger indices.
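The following Python sketch shows how a scaling change might be expressed as a single configuration request. It builds the arguments for the `update_elasticsearch_domain_config` operation of the boto3 `es` client and includes only the settings that change. The helper name and the sample values are hypothetical, and the sketch assumes that the domain already uses EBS-based storage.

```python
def build_scaling_request(domain_name, instance_type=None,
                          instance_count=None, volume_size_gb=None):
    """Assemble arguments for boto3's update_elasticsearch_domain_config."""
    params = {"DomainName": domain_name}

    cluster = {}
    if instance_type is not None:
        cluster["InstanceType"] = instance_type    # larger instance type
    if instance_count is not None:
        cluster["InstanceCount"] = instance_count  # more instances
    if cluster:
        params["ElasticsearchClusterConfig"] = cluster

    if volume_size_gb is not None:
        # Larger EBS volume (size in GB) to accommodate larger indices
        params["EBSOptions"] = {"EBSEnabled": True,
                                "VolumeSize": volume_size_gb}
    return params


if __name__ == "__main__":
    # Grow a hypothetical domain to four instances with 100 GB volumes.
    print(build_scaling_request("my-domain", instance_count=4,
                                volume_size_gb=100))
    # To apply: boto3.client("es").update_elasticsearch_domain_config(**params)
```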

Scaling for Increased Traffic

Search and document retrieval requests can be served by primary or replica shards. The more replica shards a cluster has, the more search requests the cluster can handle. Larger instance types have more hardware resources, such as RAM and CPU, which allow each shard to perform better. For more information, see Shards and Replicas in the Elasticsearch documentation.
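To make the distinction between primary and replica shards concrete, the following Python sketch builds the JSON bodies that the Elasticsearch index APIs accept: one for creating an index with a fixed number of primary shards, and one for changing the replica count later. The helper names and the shard counts are illustrative assumptions.

```python
import json


def create_index_body(primary_shards, replicas):
    """Body for PUT /<index>. The primary shard count is fixed at creation."""
    return {"settings": {"index": {"number_of_shards": primary_shards,
                                   "number_of_replicas": replicas}}}


def update_replicas_body(replicas):
    """Body for PUT /<index>/_settings. Replicas can be changed at any time."""
    return {"index": {"number_of_replicas": replicas}}


def total_shard_copies(primary_shards, replicas):
    """Each primary shard has `replicas` additional copies."""
    return primary_shards * (1 + replicas)


if __name__ == "__main__":
    print(json.dumps(create_index_body(5, 1)))
    print(json.dumps(update_replicas_body(2)))
    print(total_shard_copies(5, 2))  # 15 shard copies can serve searches
```

Because every shard copy consumes storage and memory, raising the replica count usually goes hand in hand with adding instances or choosing a larger instance type.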

Signing Service Requests

If you use a programming language for which AWS provides an SDK, we recommend that you use the SDK to submit HTTP requests to AWS. The AWS SDKs greatly simplify the process of signing requests and save you a significant amount of time compared to natively accessing the Elasticsearch APIs. The SDKs integrate easily with your development environment and provide easy access to related commands. You can also use the Amazon ES console and the AWS CLI to submit signed requests with no additional effort.

If you choose to call the Elasticsearch APIs directly, you must sign your own requests. Configuration service requests must always be signed. Upload and search requests must be signed unless you configure anonymous access for those services. Use the following procedure to sign a request:

  1. Calculate a digital signature using a cryptographic hash function. The input must include the text of your request and your secret access key.

    The function returns a hash value based on your input.

  2. Include the digital signature in the Authorization header of your request.

    The service recalculates the signature using the same hash function and input that you used. If the resulting signature matches the signature in the request, the service processes the request. Otherwise, the service rejects the request.

Amazon ES supports authentication using AWS Signature Version 4. For more information, see Signature Version 4 Signing Process.

Note

The service ignores parameters passed in URLs for HTTP POST requests that are signed with Signature Version 4.
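For readers who sign requests themselves, the following Python sketch outlines the Signature Version 4 calculation for a simple request that has no query string and signs only the `host` and `x-amz-date` headers. It is a minimal illustration under those assumptions, not a complete signer: the function names and the sample credentials are hypothetical, the path is assumed to be already URI-encoded, and production code should rely on an AWS SDK where possible.

```python
import hashlib
import hmac


def _hmac(key, msg):
    """HMAC-SHA256 of a text message with a bytes key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key, date_stamp, region, service="es"):
    """Derive the Signature Version 4 signing key from the secret key."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def authorization_header(access_key, secret_key, host, region,
                         amz_date, method="GET", path="/", payload=""):
    """Build the Authorization header value for a simple request.

    amz_date uses the form YYYYMMDDTHHMMSSZ and must also be sent
    in the x-amz-date header of the request.
    """
    date_stamp = amz_date[:8]  # YYYYMMDD
    signed_headers = "host;x-amz-date"
    canonical_headers = f"host:{host}\nx-amz-date:{amz_date}\n"
    payload_hash = hashlib.sha256(payload.encode("utf-8")).hexdigest()

    # Canonical request: method, URI, query string (empty), headers,
    # signed header names, and the hash of the payload.
    canonical_request = "\n".join(
        [method, path, "", canonical_headers, signed_headers, payload_hash])

    scope = f"{date_stamp}/{region}/es/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()])

    key = signing_key(secret_key, date_stamp, region)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    return (f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
            f"SignedHeaders={signed_headers}, Signature={signature}")


if __name__ == "__main__":
    # Placeholder credentials and domain endpoint, for illustration only.
    print(authorization_header(
        "AKIDEXAMPLE", "EXAMPLESECRET",
        "search-movies-abc123.us-east-1.es.amazonaws.com",
        "us-east-1", "20150830T123600Z"))
```

The service repeats the same derivation on its side, so any difference in the canonical request (for example, a header that was changed after signing) produces a different signature and causes the request to be rejected.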

Choosing an Instance Type

An instance type defines the memory, CPU, storage capacity, and hourly cost for an instance, which is a virtual server in the AWS Cloud that is launched from an Amazon Machine Image (AMI). Choose the instance type and the number of instances based on the anticipated size of the Elasticsearch indices, shards, and replicas that you intend to create on your cluster.

For general information about instance types, see Instance Types in the Amazon EC2 documentation. To see a list of the instance types that Amazon ES supports, see Supported Instance Types.

For information about charges that you incur if you change the configuration of a cluster, see Pricing for Amazon Elasticsearch Service.

Using Amazon EBS Volumes for Storage

You have the option of configuring your Amazon ES domain to use an Amazon EBS volume for storing indices rather than the default storage provided by the instance. An Amazon EBS volume is a durable, block-level storage device that you can attach to a single instance. Amazon ES supports the following EBS volume types:

  • Magnetic

  • General Purpose (SSD)

  • Provisioned IOPS (SSD)

For an overview, see Amazon EBS Volumes in the Amazon EC2 documentation. For procedures that show you how to use Amazon EBS volumes for your Amazon ES domain, see Configuring EBS-based Storage. For information about the minimum and maximum size of supported EBS volumes in an Amazon ES domain, see EBS Volume Size Limits.

Amazon ES is commonly used with the following services:

AWS CloudTrail

Use AWS CloudTrail to get a history of the Amazon ES API calls and related events for your account. CloudTrail is a web service that records API calls from your accounts and delivers the resulting log files to your Amazon S3 bucket. You can also use CloudTrail to track changes that were made to your AWS resources. For more information, see Auditing Amazon Elasticsearch Service Domains with AWS CloudTrail.

Amazon CloudWatch

An Amazon ES domain automatically sends metrics to Amazon CloudWatch so that you can gather and analyze performance statistics. You can monitor these metrics by using the AWS CLI or the AWS SDKs. For more information, see Monitoring Cluster Metrics and Statistics with Amazon CloudWatch (Console).

Kinesis

Kinesis is a managed service that scales elastically for real-time processing of streaming data at a massive scale. Amazon ES provides Lambda sample code for integration with Kinesis. For more information, see Loading Streaming Data into Amazon ES From Kinesis.

Amazon S3

Amazon Simple Storage Service (Amazon S3) provides storage for the Internet. You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web. Amazon ES provides Lambda sample code for integration with Amazon S3. For more information, see Loading Streaming Data into Amazon ES from Amazon S3.

AWS IAM

AWS Identity and Access Management (IAM) is a web service that you can use to manage users and user permissions in AWS. You can use IAM to create user-based access policies for your Amazon ES domains. For more information, see the IAM documentation.

Amazon ES integrates with the following services to provide data ingestion:

AWS Lambda

AWS Lambda is a zero-administration compute platform for backend web developers that runs your code in the AWS Cloud. Amazon ES provides sample code to run on Lambda that integrates with Kinesis and Amazon S3. For more information, see Loading Streaming Data into Amazon ES.

Amazon DynamoDB

Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. Amazon ES provides a Logstash plugin to support DynamoDB Streams and to sign AWS service requests.

Pricing for Amazon Elasticsearch Service

With AWS, you pay only for what you use. For Amazon ES, you pay for each hour of use of an EC2 instance. You can also choose to pay for extra storage based on the cumulative size of the EBS volumes that are attached to the data nodes in your domain.

If you qualify for the AWS Free Tier, you receive up to 750 hours per month of use with the t2.micro.elasticsearch or t2.small.elasticsearch instance types. You also receive up to 10 GB of Amazon EBS storage (Magnetic or General Purpose). For more information, see AWS Free Tier.

Charges for Configuration Changes to a Cluster

If you change the configuration for a cluster, Amazon ES creates a cluster with the new configuration and copies the data from the old cluster to the new cluster. During the migration from the old cluster to the new cluster, you incur the following charges:

  • If you change the instance type, you are charged for both clusters for the first hour. After the first hour, you are charged only for the new cluster.

    Example: You change the configuration from three m3.xlarge instances to four m4.large instances. For the first hour, you are charged for both clusters (3 * m3.xlarge + 4 * m4.large). After the first hour, you are charged only for the new cluster (4 * m4.large).

  • If you don’t change the instance type, you are charged only for the larger cluster for the first hour. After the first hour, you are charged only for the new cluster.

    Example: You change the configuration from six m3.xlarge instances to three m3.xlarge instances. For the first hour, you are charged for the larger cluster (6 * m3.xlarge). After the first hour, you are charged only for the new cluster (3 * m3.xlarge).
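The two charging rules above can be expressed as a short calculation. The following Python sketch uses hypothetical hourly prices in cents, chosen only to keep the arithmetic simple; they are not actual Amazon ES prices, and the function name is an assumption made for illustration.

```python
# Hypothetical hourly prices in cents; NOT actual Amazon ES pricing.
HOURLY_PRICE_CENTS = {"m3.xlarge": 40, "m4.large": 15}


def first_hour_charge(old, new, prices=HOURLY_PRICE_CENTS):
    """Charge for the first hour after a configuration change.

    old and new are (instance_type, instance_count) tuples.
    """
    old_type, old_count = old
    new_type, new_count = new
    old_cost = old_count * prices[old_type]
    new_cost = new_count * prices[new_type]
    if old_type != new_type:
        # Instance type changed: both clusters are charged.
        return old_cost + new_cost
    # Same instance type: only the larger cluster is charged.
    return max(old_cost, new_cost)


if __name__ == "__main__":
    # 3 * m3.xlarge -> 4 * m4.large: 3*40 + 4*15 = 180 cents
    print(first_hour_charge(("m3.xlarge", 3), ("m4.large", 4)))
    # 6 * m3.xlarge -> 3 * m3.xlarge: max(240, 120) = 240 cents
    print(first_hour_charge(("m3.xlarge", 6), ("m3.xlarge", 3)))
```

After the first hour, the charge in both cases is simply the hourly cost of the new cluster.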