What Is Amazon Elasticsearch Service?
Amazon Elasticsearch Service (Amazon ES) is a managed service that makes it easy to create a domain and deploy, operate, and scale Elasticsearch clusters in the AWS Cloud. Elasticsearch is a popular open-source search and analytics engine for use cases such as log analytics, real-time application monitoring, and clickstream analytics. With Amazon ES, you get direct access to the Elasticsearch open-source APIs, so your existing code and applications work seamlessly with the service. Currently, Amazon ES supports Elasticsearch versions 1.5, 2.3, and 5.1. To learn more about Elasticsearch and its uses, see Getting Started in the Elasticsearch Reference.
Amazon ES provisions all the resources for your Elasticsearch cluster and launches the cluster. Amazon ES also automatically detects and replaces failed Elasticsearch nodes, reducing the overhead associated with self-managed infrastructures. You can scale your cluster with a single API call or a few clicks in the console.
To get started using the service, you create an Amazon ES domain. An Amazon ES domain is an Elasticsearch cluster in the AWS Cloud that has the compute and storage resources that you specify. For example, you can specify the number of instances, instance types, and storage options.
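The following is a minimal sketch of how those choices might be expressed when creating a domain with the AWS SDK for Python (boto3). The helper `build_domain_request` and the domain name `my-logs` are hypothetical and used only for illustration; the dictionary keys follow the parameters of the boto3 `create_elasticsearch_domain` operation. The sketch only builds the request, so it runs without AWS credentials.

```python
def build_domain_request(domain_name, instance_type, instance_count,
                         es_version="5.1", ebs_volume_gb=None):
    """Build keyword arguments for boto3's create_elasticsearch_domain call."""
    request = {
        "DomainName": domain_name,
        "ElasticsearchVersion": es_version,
        "ElasticsearchClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
        },
    }
    # EBS storage is optional; include it only when a volume size is given.
    if ebs_volume_gb is not None:
        request["EBSOptions"] = {
            "EBSEnabled": True,
            "VolumeType": "gp2",  # General Purpose (SSD)
            "VolumeSize": ebs_volume_gb,
        }
    return request


# Hypothetical domain: two instances, 20 GB of EBS storage per instance.
request = build_domain_request("my-logs", "t2.small.elasticsearch", 2,
                               ebs_volume_gb=20)
print(request)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("es").create_elasticsearch_domain(**request)
```

Keeping the request as plain data makes it easy to review the compute and storage settings before anything is provisioned.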
Additionally, Amazon ES offers the following benefits of a managed service:
- Cluster scaling options
- Replication for high availability
There are no upfront costs to set up clusters, and you pay only for the service resources that you use.
- Features of Amazon Elasticsearch Service
- Getting Started with Amazon Elasticsearch Service
- Signing Up for AWS
- Accessing Amazon Elasticsearch Service
- Regions and Endpoints for Amazon Elasticsearch Service
- Scaling in Amazon Elasticsearch Service
- Signing Service Requests
- Choosing an Instance Type
- Using Amazon EBS Volumes for Storage
- Choosing an Elasticsearch Version
- Related Services
- Pricing for Amazon Elasticsearch Service
Features of Amazon Elasticsearch Service
Amazon ES includes the following features:
- Multiple configurations of CPU, memory, and storage capacity, known as instance types
- Storage volumes for your data using Amazon EBS volumes
- Multiple geographical locations for your resources, known as regions and Availability Zones
- Cluster node allocation across two Availability Zones in the same region, known as zone awareness
- Security with AWS Identity and Access Management (IAM) access control
- Dedicated master nodes to improve cluster stability
- Domain snapshots to back up and restore Amazon ES domains and replicate domains across Availability Zones
- Data visualization using the Kibana tool
- Integration with Amazon CloudWatch for monitoring Amazon ES domain metrics
- Integration with AWS CloudTrail for auditing configuration API calls to Amazon ES domains
- Integration with Amazon S3, Amazon Kinesis, and Amazon DynamoDB for loading streaming data into Amazon ES
Getting Started with Amazon Elasticsearch Service
To get started, sign up for an AWS account if you don't already have one. For more information, see Signing Up for AWS.
After you are set up with an account, complete the Getting Started tutorial for Amazon Elasticsearch Service. Consult the following introductory topics if you need more information while learning about the service.
Get Up and Running
Learn the Basics
Choose Instance Types and Storage
Signing Up for AWS
If you're not already an AWS customer, your first step is to create an AWS account. If you already have an AWS account, you are automatically signed up for Amazon ES. Your AWS account enables you to access Amazon ES and other services in the AWS platform, such as Amazon S3 and Amazon EC2. There are no sign-up fees, and you don't incur charges until you create a domain. As with other AWS services, you pay only for the resources that you use.
To create an AWS account
1. Go to https://aws.amazon.com, and then choose Sign In to the Console.
2. To sign up, follow the instructions. You must enter payment information before you can use Amazon ES.
Accessing Amazon Elasticsearch Service
You can access Amazon ES through the Amazon ES console, the AWS SDKs, or the AWS CLI.
The Amazon ES console lets you create, configure, and monitor your domains and upload data. Using the console is the easiest way to get started with Amazon ES.
The AWS SDKs support all the Amazon ES API operations, making it easy to manage your domains using your preferred technology. The SDKs automatically sign requests as needed using your AWS credentials.
The AWS CLI wraps all the Amazon ES API operations, providing a simple way to create and configure domains. The AWS CLI automatically signs requests as needed using your AWS credentials.
For information about Elasticsearch APIs and features, see the Elasticsearch documentation.
Regions and Endpoints for Amazon Elasticsearch Service
Amazon ES provides regional endpoints for accessing the configuration API and domain-specific endpoints for accessing the search API. You use the configuration service to create and manage your domains. The region-specific configuration service endpoints have the format es.region.amazonaws.com, for example es.us-east-1.amazonaws.com. For a list of supported regions, see Regions and Endpoints in the AWS General Reference.
For each domain, Amazon ES provides a single domain-specific endpoint for both search and data services. You use a domain's search endpoint to upload data and submit search requests.
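As a rough sketch of the two kinds of endpoints, the following helpers build a regional configuration endpoint and the URLs for an upload and a search against a domain endpoint. The function names, the index and type names, and the value of `DOMAIN_ENDPOINT` are hypothetical placeholders; use the endpoint that the console or API reports for your own domain.

```python
from urllib.parse import quote, urlencode


def configuration_endpoint(region):
    """Regional endpoint of the configuration service."""
    return "es.{}.amazonaws.com".format(region)


def document_url(domain_endpoint, index, doc_type, doc_id):
    """URL for uploading (indexing) one document through the domain endpoint."""
    return "https://{}/{}/{}/{}".format(
        domain_endpoint, quote(index), quote(doc_type), quote(str(doc_id)))


def search_url(domain_endpoint, index, query):
    """URL for a URI search against one index through the domain endpoint."""
    return "https://{}/{}/_search?{}".format(
        domain_endpoint, quote(index), urlencode({"q": query}))


# Hypothetical placeholder; not a real domain endpoint.
DOMAIN_ENDPOINT = "search-example-domain.us-east-1.es.amazonaws.com"

print(configuration_endpoint("us-east-1"))
print(document_url(DOMAIN_ENDPOINT, "movies", "movie", 1))
print(search_url(DOMAIN_ENDPOINT, "movies", "title:star wars"))
```

The same domain endpoint serves both requests; only the path and query string differ.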
Scaling in Amazon Elasticsearch Service
A domain has one or more Elasticsearch instances, each with a finite amount of RAM, CPU, and storage resources for indexing data and processing requests. The number of instances that you need for your domain depends on the documents in your collection and the volume and complexity of your Elasticsearch requests.
When you create a domain, you choose an initial number of Elasticsearch instances and an instance type. However, these initial choices might not be adequate as the quantity and size of data increase and as Elasticsearch requests increase in number and complexity. You can accommodate the growth by scaling your Amazon ES domain. The following table provides guidelines for scaling a domain.
|Domain Change|Scaling Guidelines|
|Increase in data quantity or data size|To scale for both increased data quantity and data size, add instances, choose a larger instance type, or use larger EBS volumes. See Scaling for Increased Data.|
|Increase in traffic due to Elasticsearch request volume and complexity|To scale for increased traffic, add replica shards or choose a larger instance type. See Scaling for Increased Traffic.|
Replica shards provide failover. If a cluster node that contains a primary shard fails, Amazon ES promotes a replica shard to a primary shard. For more information about replica shards, see Shards and Replicas in the Elasticsearch documentation.
Scaling for Increased Data
Each Amazon ES domain has one or more search indices. The index stores data in one or more shards that are distributed across the search instances in your cluster. As your cluster grows, Amazon ES automatically migrates shards between search instances. However, the number of primary shards is fixed when the index is created. The number of primary shards defines the maximum amount of data that can be stored in an index. For more information about indices and shards, see Add an Index in the Elasticsearch documentation.
If you add instances, Amazon ES distributes index shards among the available search instances. For more information, see Scale Horizontally in the Elasticsearch documentation. Choosing a larger instance type provides larger local storage for your cluster. Use larger EBS volumes to accommodate larger indices.
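The relationship between shards and instances can be sketched as follows. The helper names are hypothetical, the request body uses the standard Elasticsearch index settings `number_of_shards` and `number_of_replicas`, and the calculation assumes that shards are spread evenly across instances, which is a simplification of how shards are actually balanced.

```python
import math


def create_index_body(primary_shards, replicas=1):
    """Request body for PUT /<index>. The primary shard count is fixed at creation."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }


def max_shards_per_instance(primary_shards, replicas, instance_count):
    """Largest number of shards on any instance, assuming an even spread."""
    total_shards = primary_shards * (1 + replicas)
    return math.ceil(total_shards / instance_count)


body = create_index_body(primary_shards=5, replicas=1)
print(body)

# Adding instances reduces how many shards each instance has to hold.
for instances in (2, 5, 10):
    print(instances, max_shards_per_instance(5, 1, instances))
```

Once every instance holds a single shard, adding more instances no longer spreads that index any further, which is why the number of primary shards is worth choosing with growth in mind.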
Scaling for Increased Traffic
Search and document retrieval requests can be served by primary or replica shards. The more replica shards a cluster has, the more search requests the cluster can handle. Larger instance types have more hardware resources, such as RAM and CPU, which allow each shard to perform better. For more information, see Shards and Replicas in the Elasticsearch documentation.
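Unlike the number of primary shards, the number of replicas can be changed on an existing index. The following sketch, with hypothetical helper names, builds the body of an Elasticsearch update-settings request (`PUT /<index>/_settings`) and counts how many copies of each shard are then available to serve searches.

```python
def replica_settings_body(replicas):
    """Request body for PUT /<index>/_settings to change the replica count."""
    return {"index": {"number_of_replicas": replicas}}


def searchable_copies_per_shard(replicas):
    """Each shard can be served by its primary or by any of its replicas."""
    return 1 + replicas


for replicas in (0, 1, 2):
    print(replica_settings_body(replicas), searchable_copies_per_shard(replicas))
```

Each additional replica also consumes storage and indexing capacity, so the replica count is a trade-off rather than a setting to maximize.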
Signing Service Requests
If you use a programming language that AWS provides an SDK for, we recommend that you use the SDK to submit HTTP requests to AWS. The AWS SDKs greatly simplify the process of signing requests, and save you a significant amount of time compared to natively accessing the Elasticsearch APIs. The SDKs integrate easily with your development environment and provide easy access to related commands. You also can use the Amazon ES console and AWS CLI to submit signed requests with no additional effort.
If you choose to call the Elasticsearch APIs directly, you must sign your own requests. Configuration service requests must always be signed. Requests to a domain's endpoint, such as uploads and searches, must be signed unless you configure anonymous access for the domain. Use the following procedure to sign a request:
1. Calculate a digital signature using a cryptographic hash function. The input must include the text of your request and your secret access key. The function returns a hash value based on your input.
2. Include the digital signature in the Authorization header of your request.
3. The service recalculates the signature using the same hash function and input that you used. If the resulting signature matches the signature in the request, the service processes the request. Otherwise, the service rejects the request.
Amazon ES supports authentication using AWS Signature Version 4. For more information, see Signature Version 4 Signing Process.
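The signing procedure can be sketched with the Python standard library. This is a simplified illustration of the Signature Version 4 key derivation and signature calculation, not a complete client: it signs a canonical request that is built by hand for a simple GET, and it sends nothing. The host name and function names are hypothetical, and the credentials are the placeholder values used in AWS documentation.

```python
import hashlib
import hmac


def _hmac_sha256(key, message):
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key, date_stamp, region, service="es"):
    """Derive the Signature Version 4 signing key from the secret access key."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(secret_key, amz_date, region, canonical_request, service="es"):
    """Return the credential scope and the hex signature for a canonical request."""
    date_stamp = amz_date[:8]
    scope = "/".join([date_stamp, region, service, "aws4_request"])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    key = derive_signing_key(secret_key, date_stamp, region, service)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    return scope, signature


def authorization_header(access_key, scope, signed_headers, signature):
    return "AWS4-HMAC-SHA256 Credential={}/{}, SignedHeaders={}, Signature={}".format(
        access_key, scope, signed_headers, signature)


# Placeholder credentials and a hypothetical host, for illustration only.
ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"
SECRET_KEY = "wJalrXUtnFEMI/K7MDENG+bPxRfiCYEXAMPLEKEY"
HOST = "search-example-domain.us-east-1.es.amazonaws.com"
AMZ_DATE = "20170101T000000Z"

# Canonical request for "GET /" with an empty body and two signed headers.
canonical_request = "\n".join([
    "GET", "/", "",
    "host:" + HOST,
    "x-amz-date:" + AMZ_DATE,
    "",
    "host;x-amz-date",
    hashlib.sha256(b"").hexdigest(),
])

scope, signature = sign(SECRET_KEY, AMZ_DATE, "us-east-1", canonical_request)
print(authorization_header(ACCESS_KEY, scope, "host;x-amz-date", signature))
```

Because the service repeats the same calculation, any difference in the request text, the date, or the secret access key produces a different signature and the request is rejected. In practice, an AWS SDK performs all of these steps for you.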
The service ignores parameters passed in URLs for HTTP POST requests that are signed with Signature Version 4.
Choosing an Instance Type
An instance type defines the memory, CPU, storage capacity, and hourly cost for an instance, a virtual server in the AWS Cloud that runs an Amazon Machine Image (AMI). Choose the instance type and the number of instances based on the anticipated size of the Elasticsearch indices, shards, and replicas that you intend to create on your cluster.
The t2.micro.elasticsearch instance type is supported only with Elasticsearch version 2.3 or 1.5.
The M3 instance type is not available in the us-east-2, ca-central-1, eu-west-2, ap-northeast-2, and ap-south-1 regions.
The I2 instance type is not available in the sa-east-1, ca-central-1, eu-west-2, and us-east-2 regions.
The R3 instance type is not available in the ca-central-1, eu-west-2, and sa-east-1 regions.
Using Amazon EBS Volumes for Storage
You have the option of configuring your Amazon ES domain to use an Amazon EBS volume for storing indices rather than the default storage provided by the instance. An Amazon EBS volume is a durable, block-level storage device that you can attach to a single instance. Amazon ES supports the following EBS volume types:
- General Purpose (SSD)
- Provisioned IOPS (SSD)
For an overview, see Amazon EBS Volumes in the Amazon EC2 documentation. For procedures that show you how to use Amazon EBS volumes for your Amazon ES domain, see Configuring EBS-based Storage. For information about the minimum and maximum size of supported EBS volumes in an Amazon ES domain, see EBS Volume Size Limits.
Choosing an Elasticsearch Version
Amazon ES currently supports three Elasticsearch versions: 1.5, 2.3, and 5.1. Compared to earlier versions of Elasticsearch, version 5.1 offers powerful features that make it faster, more secure, and easier to use. Here are some of the highlights:
- Support for Painless scripting – Painless is an Elasticsearch built-in scripting language. Painless lets you run advanced queries against your data and automate operations like partial index updates in a fast, highly secure way.
- Higher indexing performance – Elasticsearch 5.1 provides better indexing capabilities that significantly increase the throughput of data updates.
- Improved aggregations – Elasticsearch 5.1 offers several aggregation improvements, such as recalculating aggregations only when the data changes. It also delivers faster query performance.
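As an illustration of the partial updates mentioned in the highlights above, the following sketch builds the body of an Elasticsearch 5.1 update request (`POST /<index>/<type>/<id>/_update`) that uses a Painless script. The helper name and the `counter` field are hypothetical. The `inline` key is assumed here because it is the form used by the 5.1 update API; other Elasticsearch versions name this key differently.

```python
import json


def increment_counter_body(count):
    """Update request body that adds `count` to a document's counter field."""
    return {
        "script": {
            "lang": "painless",
            "inline": "ctx._source.counter += params.count",
            "params": {"count": count},
        }
    }


body = increment_counter_body(4)
print(json.dumps(body, indent=2))
```

Passing the increment through `params` rather than embedding it in the script text lets Elasticsearch reuse the compiled script across requests.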
For more information about the differences among Elasticsearch versions, see the Elasticsearch documentation. For information about the Elasticsearch APIs that Amazon ES supports for 1.5, 2.3, and 5.1, see Supported Elasticsearch Operations.
If you start a new Elasticsearch project, we strongly recommend that you choose version 5.1. If you have an existing 1.5 or 2.3 domain, you can choose to keep the domain or migrate your data to a new 5.1 domain. For more information, see Migrating to a Different Elasticsearch Version.
Related Services
Amazon ES is commonly used with the following services:
- AWS CloudTrail
Use AWS CloudTrail to get a history of the Amazon ES API calls and related events for your account. CloudTrail is a web service that records API calls from your accounts and delivers the resulting log files to your Amazon S3 bucket. You also can use CloudTrail to track changes that were made to your AWS resources. For more information, see CloudTrail Support.
- Amazon CloudWatch
An Amazon ES domain automatically sends metrics to Amazon CloudWatch so that you can gather and analyze performance statistics. You can monitor these metrics by using the AWS CLI or the AWS SDKs. For more information, see Monitoring an Amazon ES domain with Amazon CloudWatch.
- Amazon Kinesis
Amazon Kinesis is a managed service that scales elastically for real-time processing of streaming data at a massive scale. Amazon ES provides Lambda sample code for integration with Amazon Kinesis. For more information, see Streaming Data to Amazon ES From Amazon Kinesis.
- Amazon S3
Amazon Simple Storage Service (Amazon S3) provides storage for the Internet. You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web. Amazon ES provides Lambda sample code for integration with Amazon S3. For more information, see Streaming Data to Amazon ES from Amazon S3.
- AWS IAM
AWS Identity and Access Management (IAM) is a web service that you can use to manage users and user permissions in AWS. You can use IAM to create user-based access policies for your Amazon ES domains. For more information, see the IAM documentation.
Amazon ES integrates with the following services to provide data ingestion:
- AWS Lambda
AWS Lambda is a zero-administration compute platform for backend web developers that runs your code in the AWS Cloud. Amazon ES provides sample code to run on Lambda that integrates with Amazon Kinesis and Amazon S3. For more information, see Streaming Data to Amazon ES.
- Amazon DynamoDB
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. Amazon ES provides a Logstash plugin to support DynamoDB Streams and to sign AWS service requests.
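As a rough illustration of the Amazon CloudWatch integration described above, the following sketch builds the arguments for a boto3 CloudWatch `get_metric_statistics` call that reads a domain metric. The helper name, the domain name, and the account ID are hypothetical. The sketch assumes that Amazon ES metrics are published in the `AWS/ES` namespace with `DomainName` and `ClientId` dimensions, and that `CPUUtilization` is one of the metrics; check the monitoring topic for the metrics available to your domain.

```python
from datetime import datetime, timedelta, timezone


def metric_request(domain_name, account_id, metric_name, hours=1, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ES",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},  # AWS account ID
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # seconds per data point
        "Statistics": ["Average"],
    }


# Hypothetical domain name and account ID.
params = metric_request("my-logs", "123456789012", "CPUUtilization")
print(params)

# With boto3 installed and credentials configured:
#   import boto3
#   boto3.client("cloudwatch").get_metric_statistics(**params)
```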
Pricing for Amazon Elasticsearch Service
With AWS, you pay only for what you use. For Amazon ES, you pay for each hour of use of an EC2 instance. You also can choose to pay for extra storage based on the cumulative size of EBS volumes that are attached to the data nodes in your domain.
If you qualify for the AWS Free Tier, you receive up to 750 hours per month of use with the t2.micro.elasticsearch or t2.small.elasticsearch instance type. You also receive up to 10 GB of Amazon EBS storage (Magnetic or General Purpose). For more information, see AWS Free Tier.
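The pricing model can be sketched as simple arithmetic. The function below is a hypothetical estimator, and the rates in the example are made-up numbers, not actual prices. It also assumes that the Free Tier allowance is subtracted from the monthly totals, which is a simplification; see the pricing page for current rates and terms.

```python
FREE_TIER_INSTANCE_HOURS = 750  # per month, from the Free Tier description
FREE_TIER_EBS_GB = 10


def estimate_monthly_cost(instance_count, hours_in_month, hourly_rate,
                          ebs_gb_per_node=0, ebs_rate_per_gb_month=0.0,
                          free_tier=False):
    """Estimate a monthly bill from instance hours and attached EBS storage."""
    instance_hours = instance_count * hours_in_month
    ebs_gb = instance_count * ebs_gb_per_node
    if free_tier:
        # Assumption: the allowance is deducted from the monthly totals.
        instance_hours = max(0, instance_hours - FREE_TIER_INSTANCE_HOURS)
        ebs_gb = max(0, ebs_gb - FREE_TIER_EBS_GB)
    return instance_hours * hourly_rate + ebs_gb * ebs_rate_per_gb_month


# Made-up rates: 0.05 per instance hour and 0.10 per GB-month of EBS storage.
print(estimate_monthly_cost(2, 720, 0.05, ebs_gb_per_node=10,
                            ebs_rate_per_gb_month=0.10))
print(estimate_monthly_cost(1, 720, 0.05, ebs_gb_per_node=10,
                            ebs_rate_per_gb_month=0.10, free_tier=True))
```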