Amazon Elasticsearch Service
Developer Guide (API Version 2015-01-01)

Managing Amazon Elasticsearch Service Domains

As the size and number of documents in your Amazon Elasticsearch Service (Amazon ES) domain grow and as network traffic increases, you likely will need to update the configuration of your Elasticsearch cluster. To know when it's time to reconfigure your domain, you need to monitor domain metrics. You also have the option of managing your own index snapshots, auditing data-related API calls to your domain, and assigning tags to your domain. You can load data in bulk to your domain using the Logstash plugin provided by the service. This section describes how to perform these and other tasks related to managing your domains.

About Dedicated Master Nodes

Amazon Elasticsearch Service (Amazon ES) uses dedicated master nodes to increase cluster stability. A dedicated master node is a cluster node that performs cluster management tasks, but does not hold data or respond to data upload requests. This offloading of cluster management tasks increases the stability of your Elasticsearch clusters.

Note

We recommend that you allocate three dedicated master nodes for each Amazon ES domain in production.

A dedicated master node performs the following cluster management tasks:

  • Tracks all nodes in the cluster

  • Tracks the number of indices in the cluster

  • Tracks the number of shards belonging to each index

  • Maintains routing information for nodes in the cluster

  • Updates the cluster state after state changes, such as creating an index and adding or removing nodes in the cluster

  • Replicates changes to the cluster state across all nodes in the cluster

  • Monitors the health of all cluster nodes by sending heartbeat signals, periodic signals that monitor the availability of the data nodes in the cluster

The following illustration shows an Amazon ES domain with ten instances. Seven of the instances are data nodes and three are dedicated master nodes. Only one of the dedicated master nodes is active; the two gray dedicated master nodes wait as backup in case the active dedicated master node fails. All data upload requests are served by the seven data nodes, and all cluster management tasks are offloaded to the active dedicated master node.

Although the dedicated master instances do not process search and query requests, their size is highly correlated with the number of instances, indices, and shards that they can manage. For production clusters, we recommend the following sizing for dedicated master instances. These recommendations are based on typical workloads on the service and can vary based on your workload requirements.

Instance Count   Recommended Minimum Dedicated Master Instance

5–10             m3.medium.elasticsearch
                 The M3 instance type is not available in the us-east-2, ca-central-1, eu-west-2, ap-northeast-2, and ap-south-1 regions.

10–20            m4.large.elasticsearch

20–50            c4.xlarge.elasticsearch
                 The default limit is 20 instances per domain. To request an increase up to 100 instances per domain, create a case with the AWS Support Center.

50–100           c4.2xlarge.elasticsearch
                 The default limit is 20 instances per domain. To request an increase up to 100 instances per domain, create a case with the AWS Support Center.

For more information about specific instance types, including vCPU, memory, and pricing, see Amazon Elasticsearch Instance Prices. For information about charges that you incur if you change the configuration of a cluster, see Pricing for Amazon Elasticsearch Service.

Additionally, Amazon ES uses a blue/green deployment process when performing domain update operations, such as making configuration changes or performing software updates. While an update is in progress, the domain status is Processing. It's important to maintain sufficient capacity on the dedicated master nodes to handle the management overhead associated with blue/green updates.

To prevent overloading a dedicated master node, you can monitor usage with the Amazon CloudWatch metrics that are shown in the following table. Use a larger instance type for dedicated master nodes when these metrics reach their respective maximum values.

CloudWatch Metric Guideline
MasterCPUUtilization Measures the percentage utilization of the CPU for the dedicated master nodes. We recommend increasing the size of the instance type when this metric exceeds 40% with a domain status of Active and exceeds 60% with a domain status of Processing.
MasterJVMMemoryPressure Measures the percentage utilization of the JVM memory for the dedicated master nodes. We recommend increasing the size of the instance type when this metric exceeds 60% with a domain status of Active and exceeds 85% with a domain status of Processing.
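The guideline in the preceding table can be expressed as a small helper. The following is a minimal sketch; the thresholds come from the table above, the function name is illustrative, and fetching live metric values from CloudWatch (for example, with boto3) is not shown:

```python
# Thresholds from the dedicated master sizing guideline above:
# CPU 40%/60% and JVM 60%/85% for Active/Processing domain status.
def needs_larger_master(metric, value, domain_status):
    """Return True if the guideline suggests a larger master instance type."""
    thresholds = {
        ("MasterCPUUtilization", "Active"): 40.0,
        ("MasterCPUUtilization", "Processing"): 60.0,
        ("MasterJVMMemoryPressure", "Active"): 60.0,
        ("MasterJVMMemoryPressure", "Processing"): 85.0,
    }
    return value > thresholds[(metric, domain_status)]
```

For example, 45 percent CPU utilization exceeds the guideline for an Active domain but not for one that is Processing a blue/green update.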


Enabling Zone Awareness (Console)

Each AWS Region is a separate geographic area with multiple, isolated locations known as Availability Zones. To prevent data loss and minimize downtime in the event of node and data center failure, you can use the Amazon ES console to allocate nodes and replica index shards that belong to an Elasticsearch cluster across two Availability Zones in the same region. This allocation is known as zone awareness. If you enable zone awareness, you also must use the native Elasticsearch API to create replica shards for your cluster. Amazon ES distributes the replicas across the nodes in the Availability Zones, which increases the availability of your cluster. Enabling zone awareness for a cluster slightly increases network latencies.

Important

Zone awareness requires an even number of instances in the instance count. The default configuration for any index is a replica count of 1. If you specify a replica count of 0 for an index, there are no replica shards to distribute to the second Availability Zone, and enabling zone awareness doesn't protect that index from data loss.

The following illustration shows a four-node cluster with zone awareness enabled. The service places all the primary index shards in one Availability Zone and all the replica shards in the second Availability Zone.

To enable zone awareness (console)

  1. Go to https://aws.amazon.com, and then choose Sign In to the Console.

  2. Under Analytics, choose Elasticsearch Service.

  3. In the navigation pane, under My domains, choose your Amazon ES domain.

  4. Choose Configure cluster.

  5. In the Node configuration pane, choose Enable zone awareness.

  6. Choose Submit.

For more information, see Regions and Availability Zones in the EC2 documentation.
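Because zone awareness depends on replica shards, you set the replica count through the native Elasticsearch index settings API (a PUT request to <domain_endpoint>/<index>/_settings). The following is a minimal sketch of building the request body; the helper name is illustrative and the HTTP call itself is not shown:

```python
import json

def replica_settings(count):
    """JSON body for PUT <domain_endpoint>/<index>/_settings,
    setting the index's number_of_replicas."""
    return json.dumps({"index": {"number_of_replicas": count}})
```

Sending this body with a replica count of 1 or more gives zone awareness replicas to distribute to the second Availability Zone.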

Working with Manual Index Snapshots (AWS CLI)

Amazon Elasticsearch Service (Amazon ES) takes daily automated snapshots of the primary index shards in an Amazon ES domain, as described in Configuring Snapshots. However, you must contact the AWS Support team to restore an Amazon ES domain with an automated snapshot. If you need greater flexibility, you can take snapshots manually and manage them in a snapshot repository, an Amazon S3 bucket.

Manual snapshots provide a convenient way to migrate data across Amazon ES domains and to recover from failure. For more information, see Snapshot and Restore in the Elasticsearch documentation. The service supports restoring indices and creating new indices from manual snapshots taken on both Amazon ES domains and self-managed Elasticsearch clusters.

Manual snapshots also allow you to address a red cluster service error yourself without waiting for AWS Support to restore the latest automatic snapshot. Your snapshot repository can hold several snapshots, each identified by a unique name. For a complete description of manual index snapshots, see Snapshot and Restore in the Elasticsearch documentation.

Note

If your Amazon ES domain experiences a red cluster error, AWS Support might contact you to ask whether you want to address the problem with your own manual index snapshots or you want the support team to restore the latest automatic snapshot of the domain. If you don't respond within seven days, AWS Support restores the latest automatic snapshot.

Snapshot Prerequisites

To create and restore index snapshots manually, you must work with IAM and Amazon S3. Verify that you have met the following prerequisites before you attempt to take a snapshot.

Prerequisite Description
S3 bucket Stores manual snapshots for your Amazon ES domain. For more information, see Create a Bucket in the Amazon S3 Getting Started Guide.
IAM role Delegates permissions to Amazon Elasticsearch Service. The trust relationship for the role must specify Amazon Elasticsearch Service in the Principal statement. The role type must be Amazon EC2. For instructions, see Creating a Role to Delegate Permissions to an AWS Service in the IAM documentation. The IAM role also is required to register your snapshot repository with Amazon ES. Only IAM users with access to this role may register the snapshot repository. For more information, see Registering a Snapshot Repository.
IAM policy Specifies the actions that Amazon S3 may perform with your S3 bucket. The policy must be attached to the IAM role that delegates permissions to Amazon Elasticsearch Service. The policy must specify an S3 bucket in a Resource statement. For more information, see Creating Customer Managed Policies and Attaching Managed Policies in the Using IAM Guide.

S3 Bucket

Make a note of the Amazon Resource Name (ARN) for the S3 bucket where you store manual snapshots. You need it for the following:

  • Resource statement of the IAM policy that is attached to your IAM role

  • Python client that is used to register a snapshot repository

The following example shows an ARN for an S3 bucket:

arn:aws:s3:::es-index-backups

For more information, see Create a Bucket in the Amazon S3 Getting Started Guide.

IAM Role

The role must specify Amazon Elasticsearch Service, es.amazonaws.com, in a Service statement in its trust relationship, as shown in the following example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "es.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Note

Only IAM users with access to this role may register the snapshot repository.

For instructions, see Creating a Role to Delegate Permissions to an AWS Service in the Using IAM Guide.

IAM Policy

An IAM policy must be attached to the role. The policy must specify the S3 bucket that is used to store manual snapshots for your Amazon ES domain. The following example specifies the ARN of the es-index-backups bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::es-index-backups"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "iam:PassRole"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::es-index-backups/*"
      ]
    }
  ]
}

For instructions, see Creating Customer Managed Policies and Attaching Managed Policies in the Using IAM Guide.

Registering a Snapshot Repository

As an IAM user with access to the new role, you must register the snapshot repository with Amazon Elasticsearch Service before you take manual index snapshots. This one-time operation requires that you sign your AWS request with the IAM role that grants permissions to Amazon ES.

Note

You cannot use curl to perform this operation because it does not support AWS request signing. Instead, use the sample Python client to register your snapshot repository.

Sample Python Client

Save the following sample Python code as a Python file, such as snapshot.py. Registering the snapshot repository with the service is a one-time operation. You can use curl to take subsequent snapshots, as described in Taking Manual Snapshots.

You must update the following in the sample code:

region

AWS Region where you created the snapshot repository

endpoint

Endpoint for your Amazon ES domain

aws_access_key_id

IAM credential

aws_secret_access_key

IAM credential

path

Location of the snapshot repository

Note

The Python client requires that you install version 2.x of the boto package on the computer where you register your snapshot repository.

from boto.connection import AWSAuthConnection

# Registers a manual snapshot repository with an Amazon ES domain.
# Requires boto 2.x (see the preceding note).
class ESConnection(AWSAuthConnection):

    def __init__(self, region, **kwargs):
        super(ESConnection, self).__init__(**kwargs)
        self._set_auth_region_name(region)
        self._set_auth_service_name("es")

    def _required_auth_capability(self):
        return ['hmac-v4']

if __name__ == "__main__":
    client = ESConnection(
        region='us-east-1',
        host='search-weblogs-etrt4mbbu254nsfupy6oiytuz4.us-east-1.es.example.com',
        aws_access_key_id='my-access-key-id',
        aws_secret_access_key='my-access-key',
        is_secure=False)

    print 'Registering Snapshot Repository'
    resp = client.make_request(
        method='POST',
        path='/_snapshot/weblogs-index-backups',
        data='{"type": "s3", "settings": {"bucket": "es-index-backups", "region": "us-east-1", "role_arn": "arn:aws:iam::123456789012:role/MyElasticsearchRole"}}')
    body = resp.read()
    print body

Taking Manual Snapshots (AWS CLI)

You must specify two pieces of information when you create a snapshot:

  • Name of your snapshot repository

  • Name for the snapshot

To manually take a snapshot (AWS CLI)

  • Run the following command to manually take a snapshot:

    curl -XPUT 'http://<Elasticsearch_domain_endpoint>/_snapshot/snapshot_repository/snapshot_name'

The following example takes a snapshot named snapshot_1 and stores it in the weblogs-index-backups snapshot repository:

curl -XPUT 'http://<Elasticsearch_domain_endpoint>/_snapshot/weblogs-index-backups/snapshot_1'

Note

The time required to take a snapshot increases with the size of the Amazon ES domain. Long-running snapshot operations commonly encounter the following error: BotoServerError: 504 GATEWAY_TIMEOUT. Typically, you can ignore these errors and wait for the operation to complete successfully. Use the following command to verify the state of all snapshots of your domain:

curl -XGET 'http://<Elasticsearch_domain_endpoint>/_snapshot/<snapshot_repository>/_all?pretty'
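The JSON that this command returns can be checked programmatically while you wait for a long-running snapshot. The following is a minimal sketch that extracts each snapshot's state from the response body; the sample response is abbreviated and the function name is illustrative:

```python
import json

def snapshot_states(response_body):
    """Map snapshot name -> state from a GET _snapshot/<repo>/_all response."""
    doc = json.loads(response_body)
    return {s["snapshot"]: s["state"] for s in doc.get("snapshots", [])}

# Abbreviated example of a response body from the command above.
sample = '{"snapshots": [{"snapshot": "snapshot_1", "state": "SUCCESS"}]}'
```

A state of SUCCESS indicates that the snapshot completed; you can poll until every snapshot reaches that state.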

Restoring Manual Snapshots (AWS CLI)

To restore a snapshot, perform the following procedure.

To manually restore a snapshot (AWS CLI)

  1. Delete or rename all open indices in the Amazon ES domain.

    You cannot restore a snapshot of your indices to an Elasticsearch cluster that already contains indices with the same names. Currently, Amazon ES does not support the Elasticsearch _close API, so you must use one of the following alternatives:

    • Delete the indices on the same Amazon ES domain, then restore the snapshot

    • Restore the snapshot to a different Amazon ES domain

    The following example demonstrates how to delete the existing indices for the weblogs domain:

    curl -XDELETE 'http://search-weblogs-abcdefghijklmnojiu.us-east-1.example.com/_all'
  2. To restore a snapshot, run the following command:

    curl -XPOST 'http://<Elasticsearch_domain_endpoint>/_snapshot/snapshot_repository/snapshot_name/_restore'

    The following example restores snapshot_1 from the weblogs-index-backups snapshot repository:

    curl -XPOST 'http://search-weblogs-abcdefghijklmnojiu.us-east-1.example.com/_snapshot/weblogs-index-backups/snapshot_1/_restore'
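The _restore endpoint also accepts an optional request body that limits the restore to specific indices or renames them as they are restored, which is one way to avoid the name conflicts described in step 1. The following is a minimal sketch of building such a body; the indices, rename_pattern, and rename_replacement options come from the Elasticsearch snapshot and restore documentation, not this guide, and the helper name is illustrative:

```python
import json

def restore_body(indices, rename_pattern=None, rename_replacement=None):
    """JSON body for POST .../_snapshot/<repo>/<snapshot>/_restore."""
    body = {"indices": indices}
    # Optionally rename restored indices to avoid clashing with open indices.
    if rename_pattern and rename_replacement:
        body["rename_pattern"] = rename_pattern
        body["rename_replacement"] = rename_replacement
    return json.dumps(body)
```

For example, restore_body("logs", "(.+)", "restored_$1") restores the logs index under the name restored_logs.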

Monitoring Cluster Metrics and Statistics with Amazon CloudWatch (Console)

An Elasticsearch cluster is a collection of one or more data nodes, optional dedicated master nodes, and storage required to run Elasticsearch and operate your Amazon ES domain. Each node in an Elasticsearch cluster automatically sends performance metrics to Amazon CloudWatch in one-minute intervals. Use the Monitoring tab in the Amazon Elasticsearch Service console to view these metrics, provided at no charge.

Statistics provide you with broader insight into each metric. For example, view the Average statistic for the CPUUtilization metric to compute the average CPU utilization for all nodes in the cluster. Each metric falls into one of three categories: cluster metrics, dedicated master node metrics, and EBS volume metrics.

Note

The service archives the metrics for two weeks before discarding them.

To view configurable statistics for a metric (console)

  1. Go to https://aws.amazon.com, and then choose Sign In to the Console.

  2. Under Analytics, choose Elasticsearch Service.

  3. In the navigation pane, under My domains, choose your Amazon ES domain.

  4. Choose the Monitoring tab.

  5. Choose the metric that you want to view.

  6. From the Statistic list, select a statistic.

    For a list of relevant statistics for each metric, see the tables in Cluster Metrics. Some statistics are not relevant for a given metric. For example, the Sum statistic is not meaningful for the Nodes metric.

  7. Choose Update graph.

Cluster Metrics

Note

If metrics are unavailable in the Amazon Elasticsearch Service console, you can check your cluster metrics directly in Amazon CloudWatch.

The AWS/ES namespace includes the following metrics for clusters.

Metric Description
ClusterStatus.green

Indicates that all index shards are allocated to nodes in the cluster.

Relevant statistics: Minimum, Maximum

ClusterStatus.yellow

Indicates that the primary shards for all indices are allocated to nodes in a cluster, but the replica shards for at least one index are not. Single-node clusters always initialize with this cluster status because there is no second node to which a replica can be assigned. You can either increase your node count to obtain a green cluster status, or use the Elasticsearch API to set the number_of_replicas setting for your index to 0. For more information, see Configuring Amazon Elasticsearch Service Domains and Update Indices Settings in the Elasticsearch documentation.

Relevant statistics: Minimum, Maximum

ClusterStatus.red

Indicates that the primary and replica shards of at least one index are not allocated to nodes in a cluster. A common cause for this state is a lack of free storage space on one or more of the data nodes in the cluster. In turn, a lack of free storage space prevents the service from distributing replica shards to the affected data node or nodes and causes all new indices to start with a red cluster status. To recover, add EBS-based storage to existing data nodes, use larger instance types, or delete the indices and restore them from a snapshot. For more information, see Red Cluster Status.

Relevant statistics: Minimum, Maximum

Nodes

The number of nodes in the Amazon ES cluster.

Relevant statistics: Minimum, Maximum, Average

SearchableDocuments

The total number of searchable documents across all indices in the cluster.

Relevant statistics: Minimum, Maximum, Average

DeletedDocuments

The total number of deleted documents across all indices in the cluster.

Relevant statistics: Minimum, Maximum, Average

CPUUtilization

The maximum percentage of CPU resources used for data nodes in the cluster.

Relevant statistics: Maximum, Average

FreeStorageSpace

The free space, in megabytes, for all data nodes in the cluster. Amazon ES throws a ClusterBlockException when this metric reaches 0. To recover, delete indices, use larger instances, or add EBS-based storage to existing instances. To learn more, see Recovering from a Lack of Free Storage Space.

Relevant statistics: Minimum

ClusterUsedSpace

The total used space, in megabytes, for a cluster. You can view this metric in the Amazon CloudWatch console, but not in the Amazon ES console.

Relevant statistics: Minimum, Maximum

ClusterIndexWritesBlocked

Indicates whether your cluster is accepting or blocking incoming write requests. A value of 0 means that the cluster is accepting requests. A value of 1 means that it is blocking requests.

Many factors can cause a cluster to begin blocking requests. Some common factors include the following: FreeStorageSpace is too low, JVMMemoryPressure is too high, or CPUUtilization is too high. To alleviate this issue, consider adding more disk space or scaling your cluster.

Relevant statistics: Maximum

Note

You can view this metric in the Amazon CloudWatch console, but not the Amazon ES console.

JVMMemoryPressure

The maximum percentage of the Java heap used for all data nodes in the cluster.

Relevant statistics: Maximum

AutomatedSnapshotFailure

The number of failed automated snapshots for the cluster. A value of 1 indicates that no automated snapshot was taken for the domain in the previous 36 hours.

Relevant statistics: Minimum, Maximum

CPUCreditBalance

The remaining CPU credits available for data nodes in the cluster. A CPU credit provides the performance of a full CPU core for one minute. For more information, see CPU Credits in the Amazon EC2 Developer Guide. This metric is available only for the t2.micro.elasticsearch, t2.small.elasticsearch, and t2.medium.elasticsearch instance types.

Relevant statistics: Minimum

KibanaHealthyNodes

A health check for Kibana. A value of 1 indicates normal behavior. A value of 0 indicates that Kibana is inaccessible. In most cases, the health of Kibana mirrors the health of the cluster.

Relevant statistics: Minimum

Note

You can view this metric on the Amazon CloudWatch console, but not the Amazon ES console.

The following screenshot shows the cluster metrics that are described in the preceding table.
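If you retrieve these metrics programmatically, for example with the boto3 CloudWatch client's get_metric_statistics call, note that AWS/ES metrics are dimensioned by DomainName and ClientId (your AWS account ID); the dimensions are not listed in the table above and are an assumption of this sketch, as is the helper name:

```python
from datetime import datetime, timedelta

def metric_query(domain, account_id, metric, minutes=60):
    """Keyword arguments for CloudWatch get_metric_statistics (boto3).
    AWS/ES metrics use the DomainName and ClientId dimensions."""
    now = datetime.utcnow()
    return {
        "Namespace": "AWS/ES",
        "MetricName": metric,
        "Dimensions": [
            {"Name": "DomainName", "Value": domain},
            {"Name": "ClientId", "Value": account_id},
        ],
        "StartTime": now - timedelta(minutes=minutes),
        "EndTime": now,
        "Period": 60,  # nodes send metrics in one-minute intervals
        "Statistics": ["Maximum"],
    }
```

You would pass the resulting dictionary to a boto3 CloudWatch client, for example cloudwatch.get_metric_statistics(**metric_query("weblogs", "123456789012", "FreeStorageSpace")).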

Dedicated Master Node Metrics

The AWS/ES namespace includes the following metrics for dedicated master nodes.

Metric Description
MasterCPUUtilization

The maximum percentage of CPU resources used by the dedicated master nodes. We recommend increasing the size of the instance type when this metric reaches 60 percent.

Relevant statistics: Average

MasterFreeStorageSpace

This metric is not relevant and can be ignored. The service does not use master nodes as data nodes.

MasterJVMMemoryPressure

The maximum percentage of the Java heap used for all dedicated master nodes in the cluster. We recommend moving to a larger instance type when this metric reaches 85 percent.

Relevant statistics: Maximum

MasterCPUCreditBalance

The remaining CPU credits available for dedicated master nodes in the cluster. A CPU credit provides the performance of a full CPU core for one minute. For more information, see CPU Credits in the Amazon EC2 User Guide for Linux Instances. This metric is available only for the t2.micro.elasticsearch, t2.small.elasticsearch, and t2.medium.elasticsearch instance types.

Relevant statistics: Minimum

MasterReachableFromNode

A health check for MasterNotDiscovered exceptions. A value of 1 indicates normal behavior. A value of 0 indicates that /_cluster/health/ is failing.

Failures mean that the master node stopped or is not reachable. They are usually the result of a network connectivity issue or AWS dependency problem.

Relevant statistics: Minimum

Note

You can view this metric on the Amazon CloudWatch console, but not the Amazon ES console.

The following screenshot shows the dedicated master nodes metrics that are described in the preceding table.

EBS Volume Metrics

The AWS/ES namespace includes the following metrics for EBS volumes.

Metric Description
ReadLatency

The latency, in seconds, for read operations on EBS volumes.

Relevant statistics: Minimum, Maximum, Average

WriteLatency

The latency, in seconds, for write operations on EBS volumes.

Relevant statistics: Minimum, Maximum, Average

ReadThroughput

The throughput, in bytes per second, for read operations on EBS volumes.

Relevant statistics: Minimum, Maximum, Average

WriteThroughput

The throughput, in bytes per second, for write operations on EBS volumes.

Relevant statistics: Minimum, Maximum, Average

DiskQueueDepth

The number of pending input and output (I/O) requests for an EBS volume.

Relevant statistics: Minimum, Maximum, Average

ReadIOPS

The number of input and output (I/O) operations per second for read operations on EBS volumes.

Relevant statistics: Minimum, Maximum, Average

WriteIOPS

The number of input and output (I/O) operations per second for write operations on EBS volumes.

Relevant statistics: Minimum, Maximum, Average

The following screenshot shows the EBS volume metrics that are described in the preceding table.

Auditing Amazon Elasticsearch Service Domains with AWS CloudTrail

Amazon Elasticsearch Service (Amazon ES) is integrated with AWS CloudTrail, a service that logs all AWS API calls made by, or on behalf of, your AWS account. The log files are delivered to an Amazon S3 bucket that you create and configure with a bucket policy that grants CloudTrail permissions to write log files to the bucket. CloudTrail captures all Amazon ES configuration service API calls, including those submitted by the Amazon Elasticsearch Service console.

You can use the information collected by CloudTrail to monitor activity for your search domains. You can determine the request that was made to Amazon ES, the source IP address from which the request was made, who made the request, and when it was made. To learn more about CloudTrail, including how to configure and enable it, see the AWS CloudTrail User Guide. To learn more about how to create and configure an S3 bucket for CloudTrail, see Amazon S3 Bucket Policy for CloudTrail.

Note

CloudTrail logs events only for configuration-related API calls to Amazon Elasticsearch Service. Data-related APIs are not logged.

The following example shows a sample CloudTrail log for Amazon ES:

{
  "Records": [
    {
      "eventVersion": "1.03",
      "userIdentity": {
        "type": "Root",
        "principalId": "000000000000",
        "arn": "arn:aws:iam::000000000000:root",
        "accountId": "000000000000",
        "accessKeyId": "A*****************A"
      },
      "eventTime": "2015-07-31T21:28:06Z",
      "eventSource": "es.amazonaws.com",
      "eventName": "CreateElasticsearchDomain",
      "awsRegion": "us-east-1",
      "sourceIPAddress": "Your IP",
      "userAgent": "es/test",
      "requestParameters": {
        "elasticsearchClusterConfig": {},
        "snapshotOptions": {
          "automatedSnapshotStartHour": "0"
        },
        "domainName": "your-domain-name",
        "eBSOptions": {
          "eBSEnabled": false
        }
      },
      "responseElements": {
        "domainStatus": {
          "created": true,
          "processing": true,
          "aRN": "arn:aws:es:us-east-1:000000000000:domain/your-domain-name",
          "domainId": "000000000000/your-domain-name",
          "elasticsearchClusterConfig": {
            "zoneAwarenessEnabled": false,
            "instanceType": "m3.medium.elasticsearch",
            "dedicatedMasterEnabled": false,
            "instanceCount": 1
          },
          "deleted": false,
          "domainName": "your-domain-name",
          "domainVersion": "1.5",
          "accessPolicies": "",
          "advancedOptions": {
            "rest.action.multi.allow_explicit_index": "true"
          },
          "snapshotOptions": {
            "automatedSnapshotStartHour": "0"
          },
          "eBSOptions": {
            "eBSEnabled": false
          }
        }
      },
      "requestID": "05dbfc84-37cb-11e5-a2cd-fbc77a4aae72",
      "eventID": "c21da94e-f5ed-41a4-8703-9a5f49e2ec85",
      "eventType": "AwsApiCall",
      "recipientAccountId": "000000000000"
    }
  ]
}

Amazon Elasticsearch Service Information in CloudTrail

When CloudTrail logging is enabled in your AWS account, API calls made to Amazon Elasticsearch Service (Amazon ES) operations are tracked in log files. Amazon ES records are written together with other AWS service records in a log file. CloudTrail determines when to create and write to a new file based on a time period and file size.

All Amazon ES configuration service operations are logged. For example, calls to CreateElasticsearchDomain, DescribeElasticsearchDomain, and UpdateElasticsearchDomainConfig generate entries in the CloudTrail log files. Every log entry contains information about who generated the request. The user identity information in the log helps you determine whether the request was made with root or IAM user credentials, with temporary security credentials for a role or federated user, or by another AWS service. For more information, see the userIdentity field in the CloudTrail Event Reference.

You can store your log files in your bucket indefinitely, or you can define Amazon S3 lifecycle rules to archive or delete log files automatically. By default, your log files are encrypted using Amazon S3 server-side encryption (SSE). You can choose to have CloudTrail publish Amazon SNS notifications when new log files are delivered if you want to take quick action upon log file delivery. For more information, see Configuring Amazon SNS Notifications for CloudTrail. You also can aggregate Amazon ES log files from multiple AWS Regions and multiple AWS accounts into a single Amazon S3 bucket. For more information, see Receiving CloudTrail Log Files from Multiple Regions.

Understanding Amazon Elasticsearch Service Log File Entries

CloudTrail log files contain one or more log entries where each entry is made up of multiple JSON-formatted events. A log entry represents a single request from any source and includes information about the requested action, any parameters, the date and time of the action, and so on. The log entries are not guaranteed to be in any particular order—they are not an ordered stack trace of the public API calls. CloudTrail log files include events for all AWS API calls for your AWS account, not just calls to the Amazon ES configuration service API. However, you can read the log files and scan for eventSource es.amazonaws.com. The eventName element contains the name of the configuration service action that was called.
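Scanning for that event source can be automated. The following is a minimal sketch that filters a CloudTrail log file for Amazon ES configuration events; the sample records are abbreviated and the function name is illustrative:

```python
import json

def es_events(log_file_text):
    """Names of Amazon ES configuration API calls in a CloudTrail log file."""
    records = json.loads(log_file_text).get("Records", [])
    return [r["eventName"] for r in records
            if r.get("eventSource") == "es.amazonaws.com"]

# Abbreviated sample: one Amazon ES event among other AWS service events.
sample = json.dumps({"Records": [
    {"eventSource": "es.amazonaws.com", "eventName": "CreateElasticsearchDomain"},
    {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances"},
]})
```

Running es_events over the sample keeps only the CreateElasticsearchDomain entry and discards the EC2 event.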

Visualizing Data with Kibana

Kibana is a popular open-source visualization tool designed to work with Elasticsearch. Amazon Elasticsearch Service (Amazon ES) provides a default installation of Kibana with every Amazon ES domain. You can find a link to Kibana on your domain dashboard in the Amazon ES console. For more information about using Kibana to visualize your data, see the Kibana User Guide.

Note

To allow only certain users access to Kibana, you must configure an IP-based access policy. The default installation of Kibana does not support IAM authentication at this time. To learn more about IP-based access policies, see Configuring Access Policies.

Connecting a Local Kibana Server to Amazon ES

Many customers have invested significant time configuring their own local Kibana servers. Instead of repeating that work with the default Kibana instance that Amazon ES provides, you can configure your local Kibana server to connect to the service by making the following changes to kibana.yml:

kibana_index: ".kibana-4"
elasticsearch_url: "http://<elasticsearch_domain_endpoint>:80"

Be sure to include the http:// prefix before your Amazon ES domain endpoint.

Loading Bulk Data with the Logstash Plugin

The Logstash S3 input plugin provides a convenient way to use the bulk API to upload data into your Amazon ES domain. The service also supports all other standard Logstash input plugins that are provided by Elasticsearch. Amazon ES also supports two Logstash output plugins: the standard elasticsearch plugin and the logstash-output-amazon-es plugin, which signs and exports Logstash events to Amazon ES.

You must install your own local instance of Logstash and make the following changes in the Logstash configuration file to enable interaction with Amazon ES.

Configuration Field Input | Output Plugin Description
bucket Input Specifies the Amazon S3 bucket containing the data that you want to load into an Amazon ES domain.
region Input Specifies the AWS Region where the Amazon S3 bucket resides.
hosts Output Specifies the service endpoint for the target Amazon ES domain. You can find this service endpoint in the Amazon Elasticsearch Service console dashboard.
ssl Output Specifies whether to use SSL to connect to Amazon ES.
flush_size Output

By default, Logstash fills a buffer with 5,000 events before sending the entire batch onward. However, if your documents are large, approaching 100 MB in size, we recommend setting the flush_size option to a larger value to prevent the buffer from filling too quickly.

If you increase flush_size, we recommend also setting the Logstash LS_HEAP_SIZE environment variable to 2048 MB to prevent running out of memory.

For more information, see flush_size in the Elasticsearch documentation.

The following example configures Logstash to do the following:

  • Point the output plugin to an Amazon ES endpoint

  • Point the input plugin to the wikipedia-stats-log bucket in S3

  • Use SSL to connect to Amazon ES

Copy
input {
  s3 {
    bucket => "wikipedia-stats-log"
    access_key_id => "lizards"
    secret_access_key => "lollipops"
    region => "us-east-1"
  }
}
output {
  elasticsearch {
    hosts => "search-logs-demo0-cpxczkdpi4bkb4c44g3csyln5a.us-east-1.es.example.com"
    ssl => true
    flush_size => 250000
  }
}

The following example demonstrates the same configuration, but connects to Amazon ES without SSL:

Copy
input {
  s3 {
    bucket => "wikipedia-stats-log"
    access_key_id => "lizards"
    secret_access_key => "lollipops"
    region => "us-east-1"
  }
}
output {
  elasticsearch {
    hosts => "search-logs-demo0-cpxczkdpi4bkb4c44g3csyln5a.us-east-1.es.example.com"
    ssl => false
    flush_size => 250000
  }
}

Note

The service request in the preceding example must be signed. For more information about signing requests, see Signing Amazon Elasticsearch Service Requests. Use the logstash-output-amazon-es output plugin to sign and export Logstash events to Amazon ES. For instructions, see README.md.
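For comparison, an output section that uses the logstash-output-amazon-es plugin to sign requests might look like the following sketch. The field names shown here are based on the plugin's README and should be confirmed there; the endpoint and credentials are the placeholder values from the examples above:

```
output {
  amazon_es {
    hosts => ["search-logs-demo0-cpxczkdpi4bkb4c44g3csyln5a.us-east-1.es.example.com"]
    region => "us-east-1"
    # Credentials can also be supplied by the environment or an instance profile
    aws_access_key_id => "lizards"
    aws_secret_access_key => "lollipops"
  }
}
```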

Signing Amazon Elasticsearch Service Requests

If you're using a language for which AWS provides an SDK, we recommend that you use the SDK to submit Amazon Elasticsearch Service (Amazon ES) requests. All the AWS SDKs greatly simplify the process of signing requests, and save you a significant amount of time when compared with using the Amazon ES APIs directly. The SDKs integrate easily with your development environment and provide easy access to related commands.

If you choose to call the Amazon ES configuration service operations directly, you must sign your own requests. Configuration service requests must always be signed. Upload and search requests must be signed unless you configure anonymous access for those services.

To sign a request, you calculate a digital signature using a cryptographic hash function, whose input includes the text of your request and your secret access key. You include the resulting hash value in the request as your signature, as part of the Authorization header of your request.

After receiving your request, Amazon ES recalculates the signature using the same hash function and input that you used to sign the request. If the resulting signature matches the signature in the request, Amazon ES processes the request. Otherwise, the request is rejected.

Amazon ES supports authentication using AWS Signature Version 4. For more information, see Signature Version 4 Signing Process.
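The signing-key derivation that Signature Version 4 describes can be sketched in a few lines of Python. This is an illustrative sketch, not the service's implementation: the secret key is the placeholder example key from the Signature Version 4 documentation, the string to sign is a stand-in, and building the canonical request is elided.

```python
import hashlib
import hmac

def sign(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step in the signing-key derivation."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def signature_v4(secret_key: str, date: str, region: str, service: str,
                 string_to_sign: str) -> str:
    """Derive the signing key, then sign the (already built) string to sign."""
    k_date = sign(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = sign(k_date, region)
    k_service = sign(k_region, service)
    k_signing = sign(k_service, "aws4_request")
    return hmac.new(k_signing, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()

# Placeholder inputs; a real string to sign hashes the canonical request.
sig = signature_v4("wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
                   "20150830", "us-east-1", "es", "example-string-to-sign")
print(sig)  # 64 hexadecimal characters, placed in the Authorization header
```

Amazon ES recomputes this same value on receipt; the request is processed only if the two signatures match.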

Note

Amazon ES provides a Logstash output plugin to sign and export Logstash events to the service. Download the logstash-output-amazon-es plugin, and see README.md for instructions.

Tagging Amazon Elasticsearch Service Domains

You can use Amazon ES tags to add metadata to your Amazon ES domains. AWS does not apply any semantic meaning to your tags. Tags are interpreted strictly as character strings. All tags have the following elements.

Tag Element Description
Tag key The tag key is the required name of the tag. Tag keys must be unique for the Amazon ES domain to which they are attached. For a list of basic restrictions on tag keys and values, see User-Defined Tag Restrictions.
Tag value The tag value is an optional string value of the tag. Tag values can be null and do not have to be unique in a tag set. For example, you can have a key-value pair in a tag set of project/Trinity and cost-center/Trinity. For a list of basic restrictions on tag keys and values, see User-Defined Tag Restrictions.

Each Amazon ES domain has a tag set, which contains all the tags that are assigned to that Amazon ES domain. AWS does not automatically set any tags on Amazon ES domains. A tag set can contain up to ten tags, or it can be empty. If you add a tag to an Amazon ES domain that has the same key as an existing tag for a resource, the new value overwrites the old value.
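The tag-set rules above behave like a small dictionary with a size cap. The following sketch is illustrative only (the constant reflects the ten-tag limit stated above), and models the overwrite-on-duplicate-key behavior:

```python
MAX_TAGS = 10  # tag-set limit stated above

def add_tag(tag_set: dict, key: str, value: str) -> dict:
    """Add a tag; a duplicate key overwrites the old value."""
    if key not in tag_set and len(tag_set) >= MAX_TAGS:
        raise ValueError("tag set is full")
    tag_set[key] = value
    return tag_set

tags = {}
add_tag(tags, "project", "Trinity")
add_tag(tags, "project", "Salix")   # same key: value is overwritten
print(tags)  # {'project': 'Salix'}
```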

An Amazon ES domain tag is a name-value pair that you define and associate with an Amazon ES domain; the name is referred to as the key. You can use tags to assign arbitrary information to a domain and to track costs by grouping expenses for similarly tagged resources. A tag key can define a category, and the tag value can be an item in that category. For example, you could define a tag key of "project" and a tag value of "Salix," indicating that the Amazon ES domain is assigned to the Salix project. You could also use tags to designate Amazon ES domains as being used for test or production, using a key such as environment=test or environment=production. We recommend that you use a consistent set of tag keys to make it easier to track metadata that is associated with Amazon ES domains.

You also can use tags to organize your AWS bill to reflect your own cost structure. To do this, sign up to get your AWS account bill with tag key values included. Then, organize your billing information according to resources with the same tag key values to see the cost of combined resources. For example, you can tag several Amazon ES domains with key-value pairs, and then organize your billing information to see the total cost for each domain across several services. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management documentation.

Note

Tags are cached for authorization purposes. Because of this, additions and updates to tags on Amazon ES domains might take several minutes before they are available.

Working with Tags (Console)

Use the following procedure to create a resource tag.

To create a tag (console)

  1. Go to https://aws.amazon.com, and then choose Sign In to the Console.

  2. Under Analytics, choose Elasticsearch Service.

  3. In the navigation pane, choose your Amazon ES domain.

  4. On the domain dashboard, choose Manage tags.

  5. In the Key column, type a tag key.

  6. (Optional) In the Value column, type a tag value.

  7. Choose Submit.

Use the following procedure to delete a resource tag.

To delete a tag (console)

  1. Go to https://aws.amazon.com, and then choose Sign In to the Console.

  2. Under Analytics, choose Elasticsearch Service.

  3. In the navigation pane, choose your Amazon ES domain.

  4. On the domain dashboard, choose Manage tags.

  5. Next to the tag that you want to delete, choose Remove.

  6. Choose Submit.

For more information about using the console to work with tags, see Working with Tag Editor in the AWS Management Console Getting Started Guide.

Working with Tags (AWS CLI)

You can create resource tags using the AWS CLI with the --add-tags command.

Syntax

add-tags --arn=<domain_arn> --tag-list Key=<key>,Value=<value>

Parameter Description
--arn Amazon Resource Name (ARN) for the Amazon ES domain to which the tag is attached.
--tag-list Set of space-separated key-value pairs in the following format: Key=<key>,Value=<value>

Example

The following example creates two tags for the logs domain:

Copy
aws es add-tags --arn arn:aws:es:us-east-1:379931976431:domain/logs --tag-list Key=service,Value=Elasticsearch Key=instances,Value=m3.2xlarge
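The Key=<key>,Value=<value> items above map directly onto a list of key-value pairs. A hypothetical helper that parses them (for illustration only; this is not part of the AWS CLI) might look like:

```python
def parse_tag_list(args: list[str]) -> list[dict]:
    """Parse space-separated Key=<key>,Value=<value> items into tag dicts."""
    tags = []
    for item in args:
        # Split on the first comma only, so values may themselves contain commas.
        key_part, value_part = item.split(",", 1)
        tags.append({
            "Key": key_part.removeprefix("Key="),
            "Value": value_part.removeprefix("Value="),
        })
    return tags

print(parse_tag_list(["Key=service,Value=Elasticsearch",
                      "Key=instances,Value=m3.2xlarge"]))
# [{'Key': 'service', 'Value': 'Elasticsearch'},
#  {'Key': 'instances', 'Value': 'm3.2xlarge'}]
```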

You can remove tags from an Amazon ES domain using the remove-tags command.

Syntax

remove-tags --arn=<domain_arn> --tag-keys <key> [<key> ...]

Parameter Description
--arn Amazon Resource Name (ARN) for the Amazon ES domain to which the tag is attached.
--tag-keys Set of space-separated keys of the tags that you want to remove from the Amazon ES domain.

Example

The following example removes two tags from the logs domain that were created in the preceding example:

Copy
aws es remove-tags --arn arn:aws:es:us-east-1:379931976431:domain/logs --tag-keys service instances

You can view the existing tags for an Amazon ES domain with the list-tags command:

Syntax

list-tags --arn=<domain_arn>

Parameter Description
--arn Amazon Resource Name (ARN) for the Amazon ES domain to which the tags are attached.

Example

The following example lists all resource tags for the logs domain:

Copy
aws es list-tags --arn arn:aws:es:us-east-1:379931976431:domain/logs

Working with Tags (AWS SDKs)

The AWS SDKs (except the Android and iOS SDKs) support all the actions defined in the Amazon ES Configuration API Reference, including the AddTags, ListTags, and RemoveTags operations. For more information about installing and using the AWS SDKs, see AWS Software Development Kits.
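With the AWS SDK for Python (Boto3), for example, the AddTags, ListTags, and RemoveTags operations map onto the es client's add_tags, list_tags, and remove_tags methods. The helper below is a hypothetical sketch, not an SDK function; it accepts any object with those three methods, so in production you would pass boto3.client("es", region_name=...) and a real domain ARN:

```python
def sync_domain_tags(client, domain_arn: str, desired: dict) -> list:
    """Replace the domain's tag set with `desired` (a key -> value mapping).

    `client` is an Amazon ES client such as boto3.client("es"); the calls
    below mirror the ListTags, RemoveTags, and AddTags operations.
    """
    current = client.list_tags(ARN=domain_arn)["TagList"]
    stale = [t["Key"] for t in current if t["Key"] not in desired]
    if stale:
        client.remove_tags(ARN=domain_arn, TagKeys=stale)
    client.add_tags(ARN=domain_arn, TagList=[
        {"Key": k, "Value": v} for k, v in desired.items()
    ])
    return client.list_tags(ARN=domain_arn)["TagList"]
```

Because the helper only depends on the three method signatures, it can be exercised against a stub client in tests before being pointed at a live domain.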