
Cold storage for Amazon Elasticsearch Service

Cold storage lets you store any amount of infrequently accessed or historical data on your Amazon Elasticsearch Service (Amazon ES) domain and analyze it on demand, at a lower cost than other storage tiers. Cold storage is appropriate if you need to do periodic research or forensic analysis on your older data. Practical examples of data suitable for cold storage include infrequently accessed logs, data that must be preserved to meet compliance requirements, or logs that have historical value.

Similar to UltraWarm storage, cold storage is backed by Amazon S3. When you need to query cold data, you can selectively attach it to existing UltraWarm nodes. You can manage the migration and lifecycle of your cold data manually or with Index State Management policies.

Prerequisites

Cold storage has the following prerequisites:

  • Cold storage requires Elasticsearch version 7.9 or later.

  • To enable cold storage on an Amazon ES domain, you must also enable UltraWarm on the same domain.

  • To use cold storage, domains must have dedicated master nodes.

  • If your domain uses a T2 or T3 instance type for your data nodes, you can't use cold storage.

  • If the domain uses fine-grained access control, non-admin users must be mapped to the cold_manager role in Kibana in order to manage cold indices.

Note

The cold_manager role might not exist on some preexisting Amazon ES domains. If you don't see the role in Kibana, you need to manually create it.

Configure permissions

If you enable cold storage on a preexisting Amazon ES domain, the cold_manager role might not be defined on the domain. If the domain uses fine-grained access control, non-admin users must be mapped to this role in order to manage cold indices. To manually create the cold_manager role, perform the following steps:

  1. In Kibana, go to Security and choose Permissions.

  2. Choose Create action group and configure the following groups:

    Group name: cold_cluster
    Permissions:

      • cluster:monitor/nodes/stats

      • cluster:admin/ultrawarm*

      • cluster:admin/cold/*

    Group name: cold_index
    Permissions:

      • indices:monitor/stats

      • indices:data/read/minmax

      • indices:admin/ultrawarm/migration/get

      • indices:admin/ultrawarm/migration/cancel

  3. Choose Roles and Create role.

  4. Name the role cold_manager.

  5. For Cluster permissions, choose the cold_cluster group you created.

  6. For Index, enter *.

  7. For Index permissions, choose the cold_index group you created.

  8. Choose Create.

  9. After you create the role, map it to any user or backend role that manages cold indices.
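If you prefer to script this setup instead of clicking through Kibana, the same role can be expressed as a request body for the fine-grained access control REST API. The sketch below only builds the JSON body; the `_opendistro/_security/api/roles/cold_manager` endpoint and the exact body shape are assumptions based on the Open Distro Security plugin, not something this page confirms.

```python
import json

# Permissions from the cold_cluster and cold_index action groups above.
COLD_CLUSTER_PERMISSIONS = [
    "cluster:monitor/nodes/stats",
    "cluster:admin/ultrawarm*",
    "cluster:admin/cold/*",
]
COLD_INDEX_PERMISSIONS = [
    "indices:monitor/stats",
    "indices:data/read/minmax",
    "indices:admin/ultrawarm/migration/get",
    "indices:admin/ultrawarm/migration/cancel",
]

def build_cold_manager_role() -> dict:
    """Build a role body equivalent to steps 4-7 above (hypothetical API shape)."""
    return {
        "cluster_permissions": COLD_CLUSTER_PERMISSIONS,
        "index_permissions": [
            {
                "index_patterns": ["*"],  # step 6: Index = *
                "allowed_actions": COLD_INDEX_PERMISSIONS,
            }
        ],
    }

print(json.dumps(build_cold_manager_role(), indent=2))
```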

Cold storage requirements and performance considerations

Because cold storage uses Amazon S3, it incurs none of the overhead of hot storage, such as replicas, Linux reserved space, and Amazon ES reserved space. Cold storage doesn't have specific instance types because it doesn't have any compute capacity attached to it. You can store any amount of data in cold storage. Monitor the ColdStorageSpaceUtilization metric in Amazon CloudWatch to see how much cold storage space you're using.

Cold storage pricing

Similar to UltraWarm storage, with cold storage you pay only for data storage. There's no compute cost for cold data, and you aren't billed if there's no data in cold storage.

You don't incur any transfer charges when moving data between cold and warm storage. While indices are being migrated between warm and cold storage, you continue to pay for only one copy of the index. After the migration completes, the index is billed according to the storage tier it was migrated to. For more information about cold storage pricing, see Pricing for Amazon Elasticsearch Service.

Enabling cold storage

The console is the simplest way to create a domain that uses cold storage. While creating the domain, choose Enable cold storage. The same process works on existing domains as long as you meet the prerequisites. Even after the domain state changes from Processing to Active, cold storage might not be available for several hours.

You can also use the AWS CLI or configuration API to enable cold storage.

Sample CLI command

The following AWS CLI command creates a domain with three data nodes, three dedicated master nodes, cold storage enabled, and fine-grained access control enabled:

aws es create-elasticsearch-domain \
  --domain-name my-domain \
  --elasticsearch-version 7.10 \
  --elasticsearch-cluster-config ColdStorageOptions={Enabled=true},WarmEnabled=true,WarmCount=4,WarmType=ultrawarm1.medium.elasticsearch,InstanceType=r6g.large.elasticsearch,DedicatedMasterEnabled=true,DedicatedMasterType=r6g.large.elasticsearch,DedicatedMasterCount=3,InstanceCount=3 \
  --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=11 \
  --node-to-node-encryption-options Enabled=true \
  --encryption-at-rest-options Enabled=true \
  --domain-endpoint-options EnforceHTTPS=true,TLSSecurityPolicy=Policy-Min-TLS-1-2-2019-07 \
  --advanced-security-options Enabled=true,InternalUserDatabaseEnabled=true,MasterUserOptions='{MasterUserName=master-user,MasterUserPassword=master-password}' \
  --region us-east-2

For detailed information, see the AWS CLI Command Reference.

Sample configuration API request

The following request to the configuration API creates a domain with three data nodes, three dedicated master nodes, cold storage enabled, and fine-grained access control enabled:

POST https://es.us-east-2.amazonaws.com/2015-01-01/es/domain
{
  "ElasticsearchClusterConfig": {
    "InstanceCount": 3,
    "InstanceType": "r6g.large.elasticsearch",
    "DedicatedMasterEnabled": true,
    "DedicatedMasterType": "r6g.large.elasticsearch",
    "DedicatedMasterCount": 3,
    "ZoneAwarenessEnabled": true,
    "ZoneAwarenessConfig": {
      "AvailabilityZoneCount": 3
    },
    "WarmEnabled": true,
    "WarmCount": 4,
    "WarmType": "ultrawarm1.medium.elasticsearch",
    "ColdStorageOptions": {
      "Enabled": true
    }
  },
  "EBSOptions": {
    "EBSEnabled": true,
    "VolumeType": "gp2",
    "VolumeSize": 11
  },
  "EncryptionAtRestOptions": {
    "Enabled": true
  },
  "NodeToNodeEncryptionOptions": {
    "Enabled": true
  },
  "DomainEndpointOptions": {
    "EnforceHTTPS": true,
    "TLSSecurityPolicy": "Policy-Min-TLS-1-2-2019-07"
  },
  "AdvancedSecurityOptions": {
    "Enabled": true,
    "InternalUserDatabaseEnabled": true,
    "MasterUserOptions": {
      "MasterUserName": "master-user",
      "MasterUserPassword": "master-password"
    }
  },
  "ElasticsearchVersion": "7.10",
  "DomainName": "my-domain"
}

For detailed information, see Configuration API reference for Amazon Elasticsearch Service.

Managing cold indices in Kibana

You can manage hot, warm, and cold indices with the existing Kibana interface in your Amazon ES domain. Kibana enables you to migrate indices between warm and cold storage, and to monitor index migration status, without using the CLI or configuration API. For more information, see Managing indices in Kibana.

Migrating indices to cold storage

When you migrate indices to cold storage, you provide a time range for the data to make discovery easier. You can select a timestamp field based on the data in your index, manually provide a start and end timestamp, or choose not to specify one at all.

Parameter: timestamp_field
Supported value: The date/time field from the index mapping.
Description: The minimum and maximum values of the provided field are computed and stored as the start_time and end_time metadata for the cold index.

Parameter: start_time and end_time
Supported values: One of the following formats:

  • strict_date_optional_time. For example: yyyy-MM-dd'T'HH:mm:ss.SSSZ or yyyy-MM-dd

  • Epoch time in milliseconds

Description: The provided values are stored as the start_time and end_time metadata for the cold index.

If you don't want to specify a timestamp, add ?ignore=timestamp to the request instead.
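Both accepted timestamp formats can be produced from a Python datetime. This is a sketch for illustration; the helper names are not part of any API.

```python
from datetime import datetime, timezone

def to_strict_date_optional_time(dt: datetime) -> str:
    """Format as strict_date_optional_time, e.g. yyyy-MM-dd'T'HH:mm:ss.SSSZ."""
    # %f yields microseconds; trim to milliseconds and append the UTC "Z".
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"

def to_epoch_millis(dt: datetime) -> int:
    """Format as epoch time in milliseconds."""
    return int(dt.timestamp() * 1000)

start = datetime(2020, 3, 9, tzinfo=timezone.utc)
print(to_strict_date_optional_time(start))  # 2020-03-09T00:00:00.000Z
print(to_epoch_millis(start))               # 1583712000000
```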

The following request migrates a warm index to cold storage and provides start and end times for the data in that index:

POST _ultrawarm/migration/my-index/_cold
{
  "start_time": "2020-03-09",
  "end_time": "2020-03-09T23:00:00Z"
}

Then check the status of the migration:

GET _ultrawarm/migration/my-index/_status

You can migrate indices from warm to cold storage in batches of 10, with a maximum of 100 simultaneous requests. The migration process has the following states:

ACCEPTED_COLD_MIGRATION - The migration request is accepted and queued.
RUNNING_METADATA_MIGRATION - The migration request was selected for execution and metadata is migrating to cold storage.
FAILED_METADATA_MIGRATION - The attempt to add index metadata failed and all retries are exhausted.
PENDING_INDEX_DETACH - Index metadata migration to cold storage is complete. Preparing to detach the warm index state from the local cluster.
RUNNING_INDEX_DETACH - The warm index state is being removed from the local cluster. Upon success, the migration request completes.
FAILED_INDEX_DETACH - The index detach process failed and all retries are exhausted.
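When scripting migrations, it helps to collapse these states into a few coarse outcomes that a polling loop can act on. The sketch below is one way to do that; the state names come from the list above, but the classification scheme itself is an assumption for illustration.

```python
# Coarse buckets for warm-to-cold migration states, to drive polling logic.
QUEUED_OR_RUNNING = {
    "ACCEPTED_COLD_MIGRATION",
    "RUNNING_METADATA_MIGRATION",
    "PENDING_INDEX_DETACH",
    "RUNNING_INDEX_DETACH",
}
FAILED = {
    "FAILED_METADATA_MIGRATION",
    "FAILED_INDEX_DETACH",
}

def classify(state: str) -> str:
    """Return 'in_progress', 'failed', or 'unknown' for a migration state."""
    if state in QUEUED_OR_RUNNING:
        return "in_progress"
    if state in FAILED:
        return "failed"
    return "unknown"

print(classify("RUNNING_INDEX_DETACH"))       # in_progress
print(classify("FAILED_METADATA_MIGRATION"))  # failed
```

A loop would keep polling the `_status` endpoint while `classify` returns `in_progress` and stop (or retry the migration) on `failed`.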

Automating migrations to cold storage

You can use Index State Management to automate the migration process after an index reaches a certain age or meets other conditions. See the sample policy, which demonstrates how to automatically migrate indices from hot to UltraWarm to cold storage.

Note

An explicit timestamp_field is required in order to move indices to cold storage using an Index State Management policy.

Canceling migrations to cold storage

If a migration to cold storage is queued or in a failed state, you can cancel the migration using the following request:

POST _ultrawarm/migration/_cancel/my-index

If your domain uses fine-grained access control, you need the indices:admin/ultrawarm/migration/cancel permission to make this request.

Listing cold indices

Before querying, you can list the indices in cold storage to decide which ones to migrate to UltraWarm for further analysis. The following request lists all cold indices, sorted by index name:

GET _cold/indices/_search

Filtering

You can filter cold indices based on a prefix-based index pattern and time range offsets.

The following request lists indices that match the prefix pattern of event-*:

GET _cold/indices/_search
{
  "filters": {
    "index_pattern": "event-*"
  }
}

The following request returns indices with start_time and end_time metadata fields between 2019-03-01 and 2020-03-01:

GET _cold/indices/_search
{
  "filters": {
    "time_range": {
      "start_time": "2019-03-01",
      "end_time": "2020-03-01"
    }
  }
}

Sorting

You can sort cold indices by metadata fields such as index name or size. The following request lists all indices sorted by size in descending order:

GET _cold/indices/_search
{
  "sort_key": "size:desc"
}

Other valid sort keys are start_time:asc/desc, end_time:asc/desc, and index_name:asc/desc.

Pagination

You can paginate a list of cold indices. Configure the number of indices to be returned per page with the page_size parameter (default is 10). Every _search request on your cold indices returns a pagination_id which you can use for subsequent calls.

The following request paginates the results of a _search request of your cold indices and displays the next 100 results:

GET _cold/indices/_search?page_size=100
{
  "pagination_id": "je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY"
}

Migrating cold indices to warm storage

After you narrow down your list of cold indices with the filtering criteria in the previous section, migrate them back to UltraWarm where you can query the data and use it to create visualizations.

The following request migrates two cold indices back to warm storage:

POST _cold/migration/_warm
{
  "indices": "my-index1,my-index2"
}

To check the status of the migration and retrieve the migration ID, send the following request:

GET _cold/migration/_status

To get index-specific migration information, include the index name:

GET _cold/migration/my-index/_status

Rather than specifying an index, you can list the indices by their current migration status. Valid values are _failed, _accepted, and _all.

The following command gets the status of all indices in a single migration request:

GET _cold/migration/_status?migration_id=my-migration-id

Retrieve the migration ID using the status request. For detailed migration information, add &verbose=true.

You can migrate indices from cold to warm storage in batches of 10, with a maximum of 100 simultaneous requests. The migration process has the following states:

ACCEPTED_MIGRATION_REQUEST - The migration request is accepted and queued.
RUNNING_INDEX_CREATION - The migration request is picked up for processing and will create warm indices in the cluster.
PENDING_COLD_METADATA_CLEANUP - The warm index is created and the migration service will attempt to clean up cold metadata.
RUNNING_COLD_METADATA_CLEANUP - Cleaning up cold metadata from the indices migrated to warm storage.
FAILED_COLD_METADATA_CLEANUP - Failed to clean up metadata in the cold tier.
FAILED_INDEX_CREATION - Failed to create an index in the warm tier.
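Because each cold-to-warm migration request accepts at most 10 indices, a larger list has to be split into batches. The helper below is a sketch of that batching; the comma-joined `indices` string matches the request body shown earlier in this section.

```python
def batch_indices(names, batch_size=10):
    """Split index names into request bodies of at most batch_size indices each."""
    for i in range(0, len(names), batch_size):
        yield {"indices": ",".join(names[i:i + batch_size])}

# 12 indices need two requests: one with 10 names, one with 2.
bodies = list(batch_indices([f"my-index{n}" for n in range(12)]))
print(len(bodies))  # 2
print(bodies[1])    # {'indices': 'my-index10,my-index11'}
```

Each generated body would be sent in its own `POST _cold/migration/_warm` request, keeping the total number of in-flight requests under the documented limit of 100.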

Canceling migrations from cold to warm storage

If an index migration from cold to warm storage is queued or in a failed state, you can cancel it with the following request:

POST _cold/migration/my-index/_cancel

To cancel migration for a batch of indices (maximum of 10 at a time), specify the migration ID:

POST _cold/migration/_cancel?migration_id=my-migration-id

Retrieve the migration ID using the status request.

Updating cold index metadata

You can update the start_time and end_time fields for a cold index:

PATCH _cold/my-index
{
  "start_time": "2020-01-01",
  "end_time": "2020-02-01"
}

You can't update the timestamp_field of an index in cold storage.

Note

Kibana doesn't support the PATCH method. Use curl, Postman, or some other method to update cold metadata.

Deleting cold indices

If you're not using an ISM policy, you can delete cold indices manually. The following request deletes a cold index:

DELETE _cold/my-index

Disabling cold storage

The Amazon ES console is the simplest way to disable cold storage. Select the domain and choose Edit domain, then deselect Enable cold storage.

To use the AWS CLI or configuration API, under ColdStorageOptions, set "Enabled": false.

Before you disable cold storage, you must either delete all cold indices or migrate them back to warm storage; otherwise, the disable action fails.
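That cleanup pass can be sketched as below. The `client` is a stand-in for whatever HTTP client you use against the domain endpoint; the request paths match the requests shown earlier on this page, but the response shape and the client interface are assumptions for illustration.

```python
def drain_cold_tier(client, delete=False):
    """Migrate (or delete) every cold index so cold storage can be disabled."""
    listing = client.get("_cold/indices/_search")
    names = [idx["index"] for idx in listing.get("indices", [])]
    if delete:
        for name in names:
            client.delete(f"_cold/{name}")
    else:
        # Migrate back to warm in batches of 10, the documented maximum.
        for i in range(0, len(names), 10):
            client.post("_cold/migration/_warm",
                        {"indices": ",".join(names[i:i + 10])})
    return names

# Minimal stub client to show the call pattern without a live domain.
class StubClient:
    def __init__(self):
        self.calls = []
    def get(self, path):
        self.calls.append(("GET", path))
        return {"indices": [{"index": "old-logs"}]}
    def post(self, path, body):
        self.calls.append(("POST", path, body))
    def delete(self, path):
        self.calls.append(("DELETE", path))

stub = StubClient()
print(drain_cold_tier(stub))  # ['old-logs']
```

After the migrations (or deletions) complete, the disable action can proceed.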