Monitoring OpenSearch Service events with Amazon EventBridge

Amazon OpenSearch Service integrates with Amazon EventBridge to notify you of certain events that affect your domains. Events from AWS services are delivered to EventBridge in near real time. The same events are also sent to Amazon CloudWatch Events, the predecessor of Amazon EventBridge. You can write simple rules to indicate which events are of interest to you, and what automated actions to take when an event matches a rule. The actions that can be automatically triggered include the following:

  • Invoking an AWS Lambda function

  • Invoking an Amazon EC2 Run Command

  • Relaying the event to Amazon Kinesis Data Streams

  • Activating an AWS Step Functions state machine

  • Notifying an Amazon SNS topic or an Amazon SQS queue

For more information, see Get started with Amazon EventBridge in the Amazon EventBridge User Guide.
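
As a rough illustration of the kind of rule described above, the following Python sketch builds an EventBridge event pattern that matches OpenSearch Service events. The `source` and `detail-type` values come from the example events on this page; the helper name `build_event_pattern` and its parameters are hypothetical, and matching on `detail.severity` is just one possible filter.

```python
import json

# "aws.es" is the source value shown in the example events on this page.
OPENSEARCH_SOURCE = "aws.es"


def build_event_pattern(detail_types=None, severities=None):
    """Return an event pattern (as a dict) for OpenSearch Service events.

    Hypothetical helper: each pattern field lists the values that the
    corresponding event field is allowed to have.
    """
    pattern = {"source": [OPENSEARCH_SOURCE]}
    if detail_types:
        pattern["detail-type"] = list(detail_types)
    if severities:
        pattern["detail"] = {"severity": list(severities)}
    return pattern


pattern = build_event_pattern(
    detail_types=["Amazon OpenSearch Service Software Update Notification"],
    severities=["High"],
)
print(json.dumps(pattern, indent=2))
```

The resulting JSON string can be used as the event pattern when you create a rule, for example through the console or the `EventPattern` parameter of the `PutRule` API.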

Service software update events

OpenSearch Service sends events to EventBridge when one of the following service software update events occurs.

Service software update available

OpenSearch Service sends this event when a service software update is available.

Example

The following is an example event of this type:

{ "version": "0", "id": "01234567-0123-0123-0123-012345678901", "detail-type": "Amazon OpenSearch Service Software Update Notification", "source": "aws.es", "account": "123456789012", "time": "2016-11-01T13:12:22Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Service Software Update", "status": "Available", "severity": "Informational", "description": "Service software update [R20200330-p1] available." } }

Service software update started

OpenSearch Service sends this event when a service software update has started.

Example

The following is an example event of this type:

{ "version": "0", "id": "01234567-0123-0123-0123-012345678901", "detail-type": "Amazon OpenSearch Service Software Update Notification", "source": "aws.es", "account": "123456789012", "time": "2016-11-01T13:12:22Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Service Software Update", "status": "Started", "severity": "Informational", "description": "Service software update [R20200330-p1] started." } }

Service software update completed

OpenSearch Service sends this event when a service software update has completed.

Example

The following is an example event of this type:

{ "version": "0", "id": "01234567-0123-0123-0123-012345678901", "detail-type": "Amazon OpenSearch Service Software Update Notification", "source": "aws.es", "account": "123456789012", "time": "2016-11-01T13:12:22Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Service Software Update", "status": "Completed", "severity": "Informational", "description": "Service software update [R20200330-p1] completed." } }

Service software update failed

OpenSearch Service sends this event when a service software update failed.

Example

The following is an example event of this type:

{ "version": "0", "id": "01234567-0123-0123-0123-012345678901", "detail-type": "Amazon OpenSearch Service Software Update Notification", "source": "aws.es", "account": "123456789012", "time": "2016-11-01T13:12:22Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Service Software Update", "status": "Failed", "severity": "Medium", "description": "Service software update [R20200330-p1] failed." } }

Service software update required

OpenSearch Service sends this event when a service software update is required.

Example

The following is an example event of this type:

{ "version": "0", "id": "01234567-0123-0123-0123-012345678901", "detail-type": "Amazon OpenSearch Service Software Update Notification", "source": "aws.es", "account": "123456789012", "time": "2016-11-01T13:12:22Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Service Software Update", "status": "Required", "severity": "High", "description": "Service software update [R20200330-p1] available. Update will be automatically installed after [30/04/2020] if no action is taken." } }
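
A target such as a Lambda function might reduce these events to the few fields it acts on. The sketch below is one possible approach, not a prescribed handler: the function name `summarize_software_update` is hypothetical, the bracketed release tag is parsed on the assumption that real descriptions follow the sample format, and treating `Failed` and `Required` as the statuses that need attention is an illustrative policy.

```python
import re


def summarize_software_update(event):
    """Extract the fields of interest from a software update event."""
    detail = event["detail"]
    # The domain name is the part of the resource ARN after "domain/".
    domain = event["resources"][0].split(":domain/", 1)[1]
    # Assumption: the release tag is the first bracketed token in the
    # description, as in the sample events.
    match = re.search(r"\[([^\]]+)\]", detail["description"])
    return {
        "domain": domain,
        "status": detail["status"],
        "severity": detail["severity"],
        "release": match.group(1) if match else None,
        # Illustrative policy, not defined by the service.
        "needs_attention": detail["status"] in ("Failed", "Required"),
    }


# Trimmed copy of the "Service software update required" example above.
sample = {
    "detail-type": "Amazon OpenSearch Service Software Update Notification",
    "source": "aws.es",
    "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"],
    "detail": {
        "event": "Service Software Update",
        "status": "Required",
        "severity": "High",
        "description": "Service software update [R20200330-p1] available. "
        "Update will be automatically installed after [30/04/2020] "
        "if no action is taken.",
    },
}

summary = summarize_software_update(sample)
print(summary)
```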

Auto-Tune events

OpenSearch Service sends events to EventBridge when one of the following Auto-Tune events occurs.

Auto-Tune pending

OpenSearch Service sends this event when Auto-Tune has identified tuning recommendations for improved cluster performance and availability. You'll only see this event for domains with Auto-Tune disabled.

Example

The following is an example event of this type:

{ "version": "0", "id": "3acb26c8-397c-4c89-a80a-ce672a864c55", "detail-type": "Amazon OpenSearch Service Auto-Tune Notification", "source": "aws.es", "account": "123456789012", "time": "2020-10-30T22:06:31Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Auto-Tune Event", "severity": "Informational", "status": "Pending", "description": "Auto-Tune recommends new settings for your domain. Enable Auto-Tune to improve cluster stability and performance.", "scheduleTime": "{iso8601-timestamp}" } }

Auto-Tune started

OpenSearch Service sends this event when Auto-Tune begins to apply new settings to your domain.

Example

The following is an example event of this type:

{ "version": "0", "id": "3acb26c8-397c-4c89-a80a-ce672a864c55", "detail-type": "Amazon OpenSearch Service Auto-Tune Notification", "source": "aws.es", "account": "123456789012", "time": "2020-10-30T22:06:31Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Auto-Tune Events", "severity": "Informational", "status": "Started", "scheduleTime": "{iso8601-timestamp}", "startTime": "{iso8601-timestamp}", "description" : "Auto-Tune is applying new settings to your domain." } }

Auto-Tune requires a scheduled blue/green deployment

OpenSearch Service sends this event when Auto-Tune has identified tuning recommendations that require a scheduled blue/green deployment.

Example

The following is an example event of this type:

{ "version": "0", "id": "3acb26c8-397c-4c89-a80a-ce672a864c55", "detail-type": "Amazon OpenSearch Service Auto-Tune Notification", "source": "aws.es", "account": "123456789012", "time": "2020-10-30T22:06:31Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Auto-Tune Event", "severity": "Low", "status": "Pending", "startTime": "{iso8601-timestamp}", "description": "Auto-Tune has identified new settings for your domain that require a blue/green deployment. You can schedule the deployment for your preferred time." } }

Auto-Tune cancelled

OpenSearch Service sends this event when the Auto-Tune schedule has been cancelled because there are no pending tuning recommendations.

Example

The following is an example event of this type:

{ "version": "0", "id": "3acb26c8-397c-4c89-a80a-ce672a864c55", "detail-type": "Amazon OpenSearch Service Auto-Tune Notification", "source": "aws.es", "account": "123456789012", "time": "2020-10-30T22:06:31Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Auto-Tune Event", "severity": "Low", "status": "Cancelled", "scheduleTime": "{iso8601-timestamp}", "description": "Auto-Tune has cancelled the upcoming blue/green deployment." } }

Auto-Tune completed

OpenSearch Service sends this event when Auto-Tune has completed the blue/green deployment and the cluster is operational with new JVM settings in place.

Example

The following is an example event of this type:

{ "version": "0", "id": "3acb26c8-397c-4c89-a80a-ce672a864c55", "detail-type": "Amazon OpenSearch Service Auto-Tune Notification", "source": "aws.es", "account": "123456789012", "time": "2020-10-30T22:06:31Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Auto-Tune Event", "severity": "Informational", "status": "Completed", "completionTime": "{iso8601-timestamp}", "description": "Auto-Tune has completed the blue/green deployment and successfully applied the updated settings." } }

Auto-Tune disabled and changes reverted

OpenSearch Service sends this event when Auto-Tune has been disabled and the applied changes were rolled back.

Example

The following is an example event of this type:

{ "version": "0", "id": "3acb26c8-397c-4c89-a80a-ce672a864c55", "detail-type": "Amazon OpenSearch Service Auto-Tune Notification", "source": "aws.es", "account": "123456789012", "time": "2020-10-30T22:06:31Z", "region": "us-east-1", "resources": [ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail": { "event": "Auto-Tune Event", "severity": "Informational", "status": "Completed", "description": "Auto-Tune is now disabled. All settings have been reverted. Auto-Tune will continue to evaluate cluster performance and provide recommendations.", "completionTime": "{iso8601-timestamp}" } }

Auto-Tune disabled and changes retained

OpenSearch Service sends this event when Auto-Tune has been disabled and the applied changes were retained.

Example

The following is an example event of this type:

{ "version": "0", "id": "3acb26c8-397c-4c89-a80a-ce672a864c55", "detail-type": "Amazon OpenSearch Service Auto-Tune Notification", "source": "aws.es", "account": "123456789012", "time": "2020-10-30T22:06:31Z", "region": "us-east-1", "resources": ["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail": { "event": "Auto-Tune Event", "severity": "Informational", "status": "Completed", "description": "Auto-Tune is now disabled. The most-recent settings by Auto-Tune have been retained. Auto-Tune will continue to evaluate cluster performance and provide recommendations.", "completionTime": "{iso8601-timestamp}" } }
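
Several of the Auto-Tune examples above share a `status` value: three use `Completed` and two use `Pending`. The following sketch shows one way a consumer might tell them apart. It relies on phrases from the sample descriptions, which is an assumption and may not hold for real events; the function name and the returned labels are hypothetical.

```python
def classify_auto_tune(event):
    """Map an Auto-Tune event to a more specific, hypothetical label."""
    detail = event["detail"]
    status = detail["status"]
    description = detail.get("description", "")

    # Assumption: the "disabled" events are recognizable by the wording
    # used in the sample descriptions on this page.
    if status == "Completed" and "now disabled" in description:
        if "reverted" in description:
            return "disabled-reverted"
        if "retained" in description:
            return "disabled-retained"
    if status == "Pending" and "blue/green" in description:
        return "pending-blue-green"
    # Fall back to the plain status for the remaining events.
    return status.lower()


reverted = {
    "detail": {
        "status": "Completed",
        "description": "Auto-Tune is now disabled. All settings have been reverted.",
    }
}
print(classify_auto_tune(reverted))
```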

Cluster health events

OpenSearch Service sends certain events to EventBridge when your cluster's health is compromised.

Red cluster recovery started

OpenSearch Service sends this event after your cluster status has been continuously red for more than an hour. It attempts to automatically restore one or more red indexes from a snapshot in order to fix the cluster status.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Cluster Status Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":[ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail":{ "event":"Automatic Snapshot Restore for Red Indices", "status":"Started", "severity":"High", "description":"Your cluster status is red. We have started automatic snapshot restore for the red indices. No action is needed from your side. Red indices [red-index-0, red-index-1]" } }

Red cluster recovery partially completed

OpenSearch Service sends this event when it was only able to restore a subset of red indexes from a snapshot while attempting to fix a red cluster status.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Cluster Status Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":[ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail":{ "event":"Automatic Snapshot Restore for Red Indices", "status":"Partially Restored", "severity":"High", "description":"Your cluster status is red. We were able to restore the following Red indices from snapshot: [red-index-0]. Indices not restored: [red-index-1]. Please refer https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html#handling-errors-red-cluster-status for troubleshooting steps." } }

Red cluster recovery failed

OpenSearch Service sends this event when it fails to restore any indexes while attempting to fix a red cluster status.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Cluster Status Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":[ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail":{ "event":"Automatic Snapshot Restore for Red Indices", "status":"Failed", "severity":"High", "description":"Your cluster status is red. We were unable to restore the Red indices automatically. Indices not restored: [red-index-0, red-index-1]. Please refer https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html#handling-errors-red-cluster-status for troubleshooting steps." } }

Shards to be deleted

OpenSearch Service sends this event when it has attempted to automatically fix your red cluster status after it was continuously red for 14 days, but one or more indexes remain red. After 7 more days (21 total days of being continuously red), OpenSearch Service proceeds to delete unassigned shards on all red indexes.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Cluster Status Notification", "source":"aws.es", "account":"123456789012", "time":"2022-04-09T10:36:48Z", "region":"us-east-1", "resources":[ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail":{ "severity":"Medium", "description":"Your cluster status is red. Please fix the red indices as soon as possible. If not fixed by 2022-04-12 01:51:47+00:00, we will delete all unassigned shards, the unit of storage and compute, for these red indices to recover your domain and make it green. Please refer to https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html#handling-errors-red-cluster-status for troubleshooting steps. test_data, test_data1", "event":"Automatic Snapshot Restore for Red Indices", "status":"Shard(s) to be deleted" } }

Shards deleted

OpenSearch Service sends this event after your cluster status has been continuously red for 21 days. It proceeds to delete the unassigned shards (storage and compute) on all red indexes. For details, see Automatic remediation of red clusters.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Cluster Status Notification", "source":"aws.es", "account":"123456789012", "time":"2022-04-09T10:54:48Z", "region":"us-east-1", "resources":[ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail":{ "severity":"High", "description":"We have deleted unassigned shards, the unit of storage and compute, in red indices: index-1, index-2 because these indices were red for more than 21 days and could not be restored with the automated restore process. Please refer to https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html#handling-errors-red-cluster-status for troubleshooting steps.", "event":"Automatic Snapshot Restore for Red Indices", "status":"Shard(s) deleted" } }
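
The red cluster events above follow a timeline: a restore attempt after about an hour, a deletion warning after 14 days, and shard deletion after 21 days. The sketch below turns those thresholds into approximate dates for planning. The function name and dictionary keys are hypothetical, and the actual events might arrive later than these earliest times.

```python
from datetime import datetime, timedelta, timezone


def red_cluster_timeline(red_since):
    """Return the earliest expected time of each red cluster milestone.

    Thresholds are the ones stated on this page: more than an hour,
    14 days, and 21 days of continuously red status.
    """
    return {
        "restore_attempt_after": red_since + timedelta(hours=1),
        "deletion_warning_at": red_since + timedelta(days=14),
        "shard_deletion_at": red_since + timedelta(days=21),
    }


# Hypothetical time at which the cluster status turned red.
red_since = datetime(2024, 1, 1, tzinfo=timezone.utc)
timeline = red_cluster_timeline(red_since)
for milestone, when in timeline.items():
    print(milestone, when.isoformat())
```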

High shard count warning

OpenSearch Service sends this event when the average shard count across your hot data nodes has exceeded 90% of the recommended default limit of 1,000. Although later versions of Elasticsearch and OpenSearch support a configurable max shard count per node limit, we recommend you have no more than 1,000 shards per node. See Choosing the number of shards.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail":{ "event":"High Shard Count", "status":"Warning", "severity":"Low", "description":"One or more data nodes have close to 1000 shards. To ensure optimum performance and stability of your cluster, please refer to the best practice guidelines - https://docs.aws.amazon.com/opensearch-service/latest/developerguide/sizing-domains.html#bp-sharding." } }

Shard count limit exceeded

OpenSearch Service sends this event when the average shard count across your hot data nodes has exceeded the recommended default limit of 1,000. Although later versions of Elasticsearch and OpenSearch support a configurable max shard count per node limit, we recommend you have no more than 1,000 shards per node. See Choosing the number of shards.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail":{ "event":"High Shard Count", "status":"Warning", "severity":"Medium", "description":"One or more data nodes have more than 1000 shards. To ensure optimum performance and stability of your cluster, please refer to the best practice guidelines - https://docs.aws.amazon.com/opensearch-service/latest/developerguide/sizing-domains.html#bp-sharding." } }
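
To estimate ahead of time whether a domain is approaching these shard count thresholds, you can apply the same arithmetic yourself. The following is a minimal sketch that assumes you already know the total shard count and the number of hot data nodes; the function name and return labels are hypothetical.

```python
RECOMMENDED_SHARDS_PER_NODE = 1000
# 90% of the recommended limit, kept as an integer to avoid float rounding.
WARNING_THRESHOLD = RECOMMENDED_SHARDS_PER_NODE * 90 // 100


def shard_count_status(total_shards, hot_data_nodes):
    """Classify the average shard count per hot data node."""
    if hot_data_nodes <= 0:
        raise ValueError("hot_data_nodes must be positive")
    average = total_shards / hot_data_nodes
    if average > RECOMMENDED_SHARDS_PER_NODE:
        return "limit exceeded"
    if average > WARNING_THRESHOLD:
        return "warning"
    return "ok"


# 2,850 shards on 3 hot data nodes averages 950 shards per node.
print(shard_count_status(2850, 3))
```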

Low disk space

OpenSearch Service sends this event when one or more nodes in your cluster have less than 25% of available storage space, or less than 25 GB.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Notification", "source":"aws.es", "account":"123456789012", "time":"2017-12-01T13:12:22Z", "region":"us-east-1", "resources":["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail":{ "event":"Low Disk Space", "status":"Warning", "severity":"Medium", "description":"One or more data nodes in your cluster has less than 25% of storage space or less than 25GB. Your cluster will be blocked for writes at 20% or 20GB. Please refer to the documentation for more information - https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html#troubleshooting-cluster-block" } }

EBS burst balance below 70%

OpenSearch Service sends this event when the EBS burst balance on one or more data nodes falls below 70%. EBS burst balance depletion can cause widespread cluster unavailability and throttling of I/O requests, which can lead to high latencies and timeouts on indexing and search requests. For steps to fix this issue, see Low EBS burst balance.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Notification", "source":"aws.es", "account":"123456789012", "time":"2017-12-01T13:12:22Z", "region":"us-east-1", "resources":["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail":{ "event":"EBS Burst Balance", "status":"Warning", "severity":"Medium", "description":"EBS burst balance on one or more data nodes is below 70%. Follow https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html#handling-errors-low-ebs-burst to fix this issue." } }

EBS burst balance below 20%

OpenSearch Service sends this event when the EBS burst balance on one or more data nodes falls below 20%. EBS burst balance depletion can cause widespread cluster unavailability and throttling of I/O requests, which can lead to high latencies and timeouts on indexing and search requests. For steps to fix this issue, see Low EBS burst balance.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Notification", "source":"aws.es", "account":"123456789012", "time":"2017-12-01T13:12:22Z", "region":"us-east-1", "resources":["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail":{ "event":"EBS Burst Balance", "status":"Warning", "severity":"High", "description":"EBS burst balance on one or more data nodes is below 20%. Follow https://docs.aws.amazon.com/opensearch-service/latest/developerguide/handling-errors.html#handling-errors-low-ebs-burst to fix this issue." } }
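
The two EBS burst balance events differ only in their threshold and severity. As a small sketch, the mapping in the sample events can be mirrored in your own monitoring code; the function name is hypothetical, and returning `None` above 70% is an illustrative choice.

```python
def burst_balance_severity(burst_balance_percent):
    """Return the severity the sample events use for a burst balance value.

    Below 20% corresponds to "High" and below 70% to "Medium". At 70% or
    above, no event is described on this page, so None is returned.
    """
    if burst_balance_percent < 20:
        return "High"
    if burst_balance_percent < 70:
        return "Medium"
    return None


for value in (85, 65, 15):
    print(value, burst_balance_severity(value))
```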

Disk throughput throttle

OpenSearch Service sends this event when read and write requests to your domain are being throttled due to the throughput limitations of your EBS volumes. If you receive this notification, consider scaling up your instances following AWS recommended best practices.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Notification", "source":"aws.es", "account":"123456789012", "time":"2017-12-01T13:12:22Z", "region":"us-east-1", "resources":["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail":{ "event":"Disk Throughput Throttle", "status":"Warning", "severity":"Medium", "description":"Your domain is experiencing throttling as you have hit disk throughput limits. Please consider scaling your domain to suit your throughput needs. Please refer to the documentation for more information." } }

VPC endpoint events

OpenSearch Service sends certain events to EventBridge related to AWS PrivateLink interface endpoints.

VPC endpoint creation failed

OpenSearch Service sends this event when it's unable to create a requested VPC endpoint. This error might occur because you've reached the limit on the number of VPC endpoints allowed within a Region. You will also see this error if a specified subnet or security group doesn't exist.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service VPC Endpoint Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":[ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail":{ "event":"VPC Endpoint Create Validation", "status":"Failed", "severity":"High", "description":"Unable to create VPC endpoint aos-0d4c74c0342343 for domain arn:aws:es:eu-south-1:123456789012:domain/my-domain due to the following validation failures: You've reached the limit on the number of VPC endpoints that you can create in the AWS Region." } }

VPC endpoint update failed

OpenSearch Service sends this event when it's unable to update a requested VPC endpoint.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service VPC Endpoint Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":[ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail":{ "event":"VPC Endpoint Update Validation", "status":"Failed", "severity":"High", "description":"Unable to update VPC endpoint aos-0d4c74c0342343 for domain arn:aws:es:eu-south-1:123456789012:domain/my-domain due to the following validation failures: <failure message>." } }

VPC endpoint deletion failed

OpenSearch Service sends this event when it's unable to delete a requested VPC endpoint.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service VPC Endpoint Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":[ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail":{ "event":"VPC Endpoint Delete Validation", "status":"Failed", "severity":"High", "description":"Unable to delete VPC endpoint aos-0d4c74c0342343 for domain arn:aws:es:eu-south-1:123456789012:domain/my-domain due to the following validation failures: Specified subnet doesn't exist." } }

Domain error events

OpenSearch Service sends events to EventBridge when one of the following domain errors occurs.

Domain update validation failure

OpenSearch Service sends this event if it encounters one or more validation failures when attempting to update or perform a configuration change on a domain. For steps to resolve these failures, see Troubleshooting validation errors.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Amazon OpenSearch Service Domain Update Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":[ "arn:aws:es:us-east-1:123456789012:domain/test-domain" ], "detail":{ "event":"Domain Update Validation", "status":"Failed", "severity":"High", "description":"Unable to perform updates to your domain due to the following validation failures: <failures> Please see the documentation for more information https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-configuration-changes.html#validation" } }

KMS key inaccessible

OpenSearch Service sends this event when it can't access your AWS KMS key.

Example

The following is an example event of this type:

{ "version":"0", "id":"01234567-0123-0123-0123-012345678901", "detail-type":"Domain Error Notification", "source":"aws.es", "account":"123456789012", "time":"2016-11-01T13:12:22Z", "region":"us-east-1", "resources":["arn:aws:es:us-east-1:123456789012:domain/test-domain"], "detail":{ "event":"KMS Key Inaccessible", "status":"Error", "severity":"High", "description":"The KMS key associated with this domain is inaccessible. You are at risk of losing access to your domain. For more information, please refer https://docs.aws.amazon.com/opensearch-service/latest/developerguide/encryption-at-rest.html#disabled-key." } }
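
A single target that receives all of the events on this page might route them by `detail-type`. The sketch below maps the `detail-type` values from the example events to the section they appear under. The category labels and the function name are hypothetical, and the list is limited to the values shown in the samples. Note that the KMS key example uses `Domain Error Notification` without the `Amazon OpenSearch Service` prefix.

```python
# detail-type values copied from the example events on this page.
DETAIL_TYPE_CATEGORIES = {
    "Amazon OpenSearch Service Software Update Notification": "service software update",
    "Amazon OpenSearch Service Auto-Tune Notification": "Auto-Tune",
    "Amazon OpenSearch Service Cluster Status Notification": "cluster health",
    "Amazon OpenSearch Service Notification": "cluster health",
    "Amazon OpenSearch Service VPC Endpoint Notification": "VPC endpoint",
    "Amazon OpenSearch Service Domain Update Notification": "domain error",
    "Domain Error Notification": "domain error",
}


def categorize(event):
    """Return a category label for an event, or None if it is not from aws.es."""
    if event.get("source") != "aws.es":
        return None
    return DETAIL_TYPE_CATEGORIES.get(event.get("detail-type"), "unknown")


kms_event = {"source": "aws.es", "detail-type": "Domain Error Notification"}
print(categorize(kms_event))
```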