OR1 storage for Amazon OpenSearch Service
OR1 is an instance family for Amazon OpenSearch Service that provides a cost-effective way to store large
amounts of data. A domain with OR1 instances uses Amazon Elastic Block Store (Amazon EBS) gp3 or io1
volumes for primary storage, with data copied synchronously to Amazon S3 as
it arrives. This storage structure provides increased indexing throughput with high
durability. The OR1 instance family also supports automatic data recovery in the event of
failure. For information about OR1 instance type options, see Current generation instance types.
If you're running indexing-heavy operational analytics workloads such as log analytics, observability, or security analytics, you can benefit from the improved performance and compute efficiency of OR1 instances. In addition, the automatic data recovery offered by OR1 instances improves the overall reliability of your domain.
OpenSearch Service sends storage-related OR1 metrics to Amazon CloudWatch. For a list of available metrics, see OR1 metrics.
OR1 instances are available on-demand or with Reserved Instance pricing, with an hourly rate for the instances and storage provisioned in Amazon EBS and Amazon S3.
Limitations
Consider the following limitations when using OR1 instances for your domain.
-
Newly created domains must be running OpenSearch version 2.11 or higher.
-
Existing domains must be running OpenSearch version 2.15 or higher.
-
Your domain must have encryption at rest enabled. For more information, see Encryption of data at rest for Amazon OpenSearch Service.
-
If your domain uses dedicated master nodes, they must use Graviton instances. For more information about dedicated master nodes, see Dedicated master nodes in Amazon OpenSearch Service.
-
The refresh interval for indexes on OR1 instances must be 10 seconds or higher. The default refresh interval for OR1 instances is 10 seconds.
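As a minimal sketch, the version, encryption, and refresh interval requirements above could be checked before you create or migrate a domain. The function names, parameters, and the parsing of version strings in the form OpenSearch_X.Y are assumptions made for illustration and are not part of any AWS API. The Graviton requirement for dedicated master nodes is not checked here.

```python
import re

# Minimum engine versions and refresh interval taken from the limitations above.
MIN_VERSION_NEW_DOMAIN = (2, 11)
MIN_VERSION_EXISTING_DOMAIN = (2, 15)
MIN_REFRESH_INTERVAL_SECONDS = 10


def parse_engine_version(engine_version):
    """Parse a string such as 'OpenSearch_2.11' into a (major, minor) tuple."""
    match = re.fullmatch(r"OpenSearch_(\d+)\.(\d+)", engine_version)
    if match is None:
        raise ValueError(f"unrecognized engine version: {engine_version!r}")
    return int(match.group(1)), int(match.group(2))


def check_or1_limitations(engine_version, is_new_domain, encryption_at_rest,
                          refresh_interval_seconds=MIN_REFRESH_INTERVAL_SECONDS):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    minimum = MIN_VERSION_NEW_DOMAIN if is_new_domain else MIN_VERSION_EXISTING_DOMAIN
    # Tuples compare element by element, so (2, 9) < (2, 11) holds as expected.
    if parse_engine_version(engine_version) < minimum:
        problems.append(
            f"engine version must be OpenSearch {minimum[0]}.{minimum[1]} or higher"
        )
    if not encryption_at_rest:
        problems.append("encryption at rest must be enabled")
    if refresh_interval_seconds < MIN_REFRESH_INTERVAL_SECONDS:
        problems.append("refresh interval must be 10 seconds or higher")
    return problems
```

For example, check_or1_limitations("OpenSearch_2.11", is_new_domain=False, encryption_at_rest=True) reports a version problem, because existing domains need version 2.15 or higher.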
Tuning for better ingestion throughput
To get the best indexing throughput from your OR1 instances, we recommend that you do the following:
-
Use large bulk sizes to improve buffer utilization. The recommended size is 10 MB.
-
Use multiple clients to improve parallel processing performance.
-
Set your number of active primary shards to match the number of data nodes to maximize resource utilization.
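The bulk size recommendation above can be sketched as a helper that groups documents into newline-delimited bulk request bodies of at most about 10 MB each. The function name and parameters are hypothetical. The sketch only builds the request bodies and leaves sending them, for example from several clients in parallel, to your existing tooling.

```python
import json

# Target payload size from the recommendation above (10 MB).
TARGET_BULK_BYTES = 10 * 1024 * 1024


def build_bulk_payloads(index_name, documents, max_bytes=TARGET_BULK_BYTES):
    """Yield newline-delimited _bulk request bodies no larger than max_bytes.

    A single document larger than max_bytes is still emitted, alone,
    in its own payload.
    """
    action_line = json.dumps({"index": {"_index": index_name}})
    lines = []
    size = 0
    for document in documents:
        # Each entry is an action line followed by the document source line.
        entry = f"{action_line}\n{json.dumps(document)}\n"
        entry_size = len(entry.encode("utf-8"))
        # Flush the current batch before it would exceed the size limit.
        if lines and size + entry_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(entry)
        size += entry_size
    if lines:
        yield "".join(lines)
```

Each yielded string can be sent as the body of a _bulk request. Measuring the encoded size in bytes, instead of counting documents, keeps batches close to the target even when document sizes vary.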
How OpenSearch optimized instances differ from non-OpenSearch optimized instances
OpenSearch optimized instances differ from non-OpenSearch optimized instances in the following ways:
-
For OpenSearch optimized instances, indexing is only performed on primary shards.
-
If OpenSearch optimized instances are configured with replicas, the indexing rate may appear lower than it actually is. For example, with 1 primary shard and 1 replica shard, the indexing rate may show as 1000 when the actual indexing rate is 2000.
-
OpenSearch optimized instances perform buffer operations before sending data to a remote source. This results in higher ingestion latencies.
Note
The IndexingLatency metric is not affected, as it doesn't include the time to sync the translog.
-
Replica shards can be a few seconds behind primary shards. You can see the time lag in the ReplicationLagMaxTime metric.
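To watch the replication lag mentioned above, one option is to query the ReplicationLagMaxTime metric in Amazon CloudWatch. The sketch below only builds the request parameters for the CloudWatch GetMetricStatistics operation. The function name is hypothetical, and the AWS/ES namespace, the DomainName and ClientId dimensions, the one-minute period, and the Maximum statistic are assumptions to verify against the OR1 metrics list for your domain.

```python
from datetime import datetime, timedelta, timezone


def replication_lag_query(domain_name, account_id, minutes=60, now=None):
    """Build keyword arguments for a CloudWatch GetMetricStatistics call."""
    end_time = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ES",  # assumed namespace for domain metrics
        "MetricName": "ReplicationLagMaxTime",
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},  # AWS account ID
        ],
        "StartTime": end_time - timedelta(minutes=minutes),
        "EndTime": end_time,
        "Period": 60,  # assumed one-minute granularity
        "Statistics": ["Maximum"],  # assumed statistic of interest
    }
```

With the AWS SDK for Python installed and credentials configured, the result could be passed to boto3.client("cloudwatch").get_metric_statistics(**replication_lag_query("test-domain", "123456789012")).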
How OR1 differs from UltraWarm storage
OpenSearch Service provides UltraWarm instances that are a cost-effective way to store large amounts of read-only data. Both OR1 and UltraWarm instances store data locally in Amazon EBS and remotely in Amazon S3. However, OR1 and UltraWarm instances differ in several important ways:
-
OR1 instances keep a copy of data in both your local and remote store. In UltraWarm instances, data is kept primarily in remote store to reduce storage costs. Depending on your usage patterns, data can be moved to local storage.
-
OR1 instances are active and can accept read and write operations, whereas the data on UltraWarm instances is read-only until you manually move it back to hot storage.
-
UltraWarm relies on index snapshots for data durability. OR1 instances, by comparison, perform replication and recovery behind the scenes. In the event of a red index, OR1 instances will automatically restore missing shards from your remote storage in Amazon S3. The recovery time varies depending on the volume of data to be recovered.
For more information about UltraWarm storage, see UltraWarm storage for Amazon OpenSearch Service.
Using OR1 instances
You can select OR1 instances for your data nodes when you create a new domain with the AWS Management Console, the AWS Command Line Interface (AWS CLI), or the AWS SDK. You can then index and query the data using your existing tools.
-
Navigate to the Amazon OpenSearch Service console at https://console.aws.amazon.com/aos/.
-
In the left navigation pane, choose Domains.
-
Choose Create domain.
-
Enter a name for your domain along with your other preferred options. Under Instance family, choose OR1. Choose Create to start the domain creation process.
-
Navigate to your AWS CLI terminal. If you need to install the AWS CLI, see Install or update the latest version of the AWS CLI.
-
To use OR1 storage, you must provide the specific OR1 instance type and size in the InstanceType field when you create a domain. You must also enable encryption at rest. The following example creates a domain with OR1 instances of size 2xlarge.
aws opensearch create-domain \
  --domain-name test-domain \
  --engine-version OpenSearch_2.11 \
  --cluster-config "InstanceType=or1.2xlarge.search,InstanceCount=3,DedicatedMasterEnabled=true,DedicatedMasterType=r6g.large.search,DedicatedMasterCount=3" \
  --ebs-options "EBSEnabled=true,VolumeType=gp3,VolumeSize=200" \
  --encryption-at-rest-options Enabled=true \
  --advanced-security-options "Enabled=true,InternalUserDatabaseEnabled=true,MasterUserOptions={MasterUserName=test-user,MasterUserPassword=test-password}" \
  --node-to-node-encryption-options Enabled=true \
  --domain-endpoint-options EnforceHTTPS=true \
  --access-policies '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":"*"},"Action":"es:*","Resource":"arn:aws:es:us-east-1:account-id:domain/test-domain/*"}]}'
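If you prefer the AWS SDK, the CLI command above could be mirrored with the AWS SDK for Python (Boto3). The sketch below only assembles the request parameters. The function name and its arguments are hypothetical, and the values simply repeat those in the CLI example.

```python
import json


def or1_domain_request(domain_name, region, account_id, master_user, master_password):
    """Build keyword arguments that mirror the CLI example for create_domain."""
    access_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "es:*",
            "Resource": f"arn:aws:es:{region}:{account_id}:domain/{domain_name}/*",
        }],
    }
    return {
        "DomainName": domain_name,
        "EngineVersion": "OpenSearch_2.11",
        "ClusterConfig": {
            "InstanceType": "or1.2xlarge.search",  # the OR1 instance type and size
            "InstanceCount": 3,
            "DedicatedMasterEnabled": True,
            "DedicatedMasterType": "r6g.large.search",  # Graviton master nodes
            "DedicatedMasterCount": 3,
        },
        "EBSOptions": {"EBSEnabled": True, "VolumeType": "gp3", "VolumeSize": 200},
        "EncryptionAtRestOptions": {"Enabled": True},  # required for OR1
        "NodeToNodeEncryptionOptions": {"Enabled": True},
        "DomainEndpointOptions": {"EnforceHTTPS": True},
        "AdvancedSecurityOptions": {
            "Enabled": True,
            "InternalUserDatabaseEnabled": True,
            "MasterUserOptions": {
                "MasterUserName": master_user,
                "MasterUserPassword": master_password,
            },
        },
        # The access policy is passed as a JSON string, as in the CLI example.
        "AccessPolicies": json.dumps(access_policy),
    }
```

With Boto3 installed and credentials configured, the result could be passed to boto3.client("opensearch").create_domain(**or1_domain_request(...)). Keeping the parameters in a plain function makes them easy to review before any domain is created.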