This whitepaper is for historical reference only. Some content might be outdated and some links might not be available.
Amazon OpenSearch Service
It’s easy to get started with OpenSearch Service. You can set up and configure your OpenSearch Service domain in minutes from the AWS Management Console.
OpenSearch Service enables you to scale your cluster through a single API call or a few clicks in the console. With OpenSearch Service, you get direct access to the open-source OpenSearch API, so code and applications that you already use with your existing OpenSearch environments will work seamlessly.
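The single-call scaling described above can be sketched as follows. This is a minimal illustration, assuming the boto3 `opensearch` client and its `update_domain_config` operation; the function name `scale_domain`, the domain name, and the instance count are hypothetical. The client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
from typing import Any, Dict


def scale_domain(client: Any, domain_name: str, instance_count: int) -> Dict[str, Any]:
    """Request a new data-node count for a domain with one API call.

    `client` is assumed to behave like boto3.client("opensearch").
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    # Only the instance count is changed; other cluster settings are left alone.
    return client.update_domain_config(
        DomainName=domain_name,
        ClusterConfig={"InstanceCount": instance_count},
    )


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   scale_domain(boto3.client("opensearch"), "my-logs-domain", 4)
```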
In addition to x86-based instances, Amazon OpenSearch Service offers Graviton2-based instances.
Ideal usage patterns
OpenSearch Service is ideal for querying and searching large amounts of data. Organizations can use OpenSearch Service to do the following:
- Analyze activity logs, such as logs for customer-facing applications or websites
- Analyze CloudWatch Logs with OpenSearch Service
- Analyze product usage data coming from various services and systems
- Analyze social media sentiments and CRM data, and find trends for your brand and products
- Analyze data stream updates from other AWS services, such as Amazon Kinesis Data Streams and Amazon DynamoDB
- Provide a rich search and navigation experience
- Monitor usage for mobile applications
Cost model
With Amazon OpenSearch Service, you pay only for what you use. There are no minimum fees or upfront commitments. You are charged for OpenSearch Service instance hours, Amazon EBS storage (if you choose this option), and standard data transfer fees.
You can get started with the free tier, which provides free usage of up to 750 hours per month of a single-AZ t2.micro.elasticsearch or t2.small.elasticsearch instance and 10 GB per month of optional Amazon EBS storage (Magnetic or General Purpose). With OpenSearch Service Reserved Instances, you can reserve instances for a one- or three-year term and get significant savings on usage costs compared to On-Demand Instances.
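The free-tier arithmetic above can be sketched as a small estimator. The 750-hour and 10 GB allowances come from the text; the hourly and per-GB rates are placeholder parameters, not real prices, and the sketch assumes all instances are free-tier-eligible types. Data transfer and manual snapshot storage are left out.

```python
FREE_TIER_INSTANCE_HOURS = 750  # free instance hours per month (from the text)
FREE_TIER_EBS_GB = 10           # free EBS storage per month in GB (from the text)


def billable_instance_hours(instance_count: int, hours_each: float) -> float:
    """Instance hours left to pay for after the free-tier allowance."""
    return max(0.0, instance_count * hours_each - FREE_TIER_INSTANCE_HOURS)


def billable_ebs_gb(provisioned_gb: float) -> float:
    """EBS storage left to pay for after the free-tier allowance."""
    return max(0.0, provisioned_gb - FREE_TIER_EBS_GB)


def monthly_estimate(instance_count: int, hours_each: float, provisioned_gb: float,
                     hourly_rate: float, ebs_rate_per_gb: float) -> float:
    """Rough monthly charge; both rates are hypothetical inputs supplied by the caller."""
    return (billable_instance_hours(instance_count, hours_each) * hourly_rate
            + billable_ebs_gb(provisioned_gb) * ebs_rate_per_gb)
```

A 31-day month has 744 hours, so one eligible instance running continuously stays within the 750-hour allowance, while a second instance running for the same month does not.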
OpenSearch Service enables you to add data durability through automated and manual snapshots of your cluster. OpenSearch Service provides storage space for automated snapshots free of charge for each Amazon OpenSearch Service domain. Automated snapshots are retained for a period of 14 days. Manual snapshots are charged according to Amazon S3 storage rates. Data transfer for using the snapshots is free of charge. At the time of this writing, with Graviton2 instances for OpenSearch Service, you can realize up to 44% price/performance improvement over previous generation instances.
For more information, see Amazon OpenSearch Service Pricing.
UltraWarm is a tier for OpenSearch Service that provides a cost-effective way to store large amounts of read-only data on OpenSearch Service. Rather than attached storage, UltraWarm nodes use S3 and a sophisticated caching solution to improve performance. For indexes that you are not actively writing to, query less frequently, and don't need the same performance from, UltraWarm offers significantly lower costs per GiB of data. Because warm indices are read-only unless you return them to hot storage, UltraWarm is best-suited to immutable data, such as logs.
Performance
Performance of OpenSearch Service depends on multiple factors including instance type, workload, index, number of shards used, read replicas, and storage configurations – instance storage or EBS storage (general purpose SSD). Indexes are made up of shards of data which can be distributed on different instances in multiple Availability Zones.
Read replicas of the shards are maintained by OpenSearch Service in a different Availability Zone if zone awareness is enabled. OpenSearch Service can use either fast SSD instance storage or multiple EBS volumes for storing indexes. A search engine makes heavy use of storage devices, so faster disks result in faster query and search performance. Graviton2 instances can also improve indexing throughput, reduce indexing latency, and improve query performance in comparison with the corresponding x86-based instances from the current generation.
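To make the relationship between shards and replicas concrete, here is a minimal sketch that builds an index-creation body using the standard `number_of_shards` and `number_of_replicas` index settings and counts the shard copies the cluster has to place. The function names and the shard counts in the usage note are hypothetical.

```python
from typing import Any, Dict


def index_settings(primary_shards: int, replicas_per_primary: int) -> Dict[str, Any]:
    """Build the request body for creating an index with an explicit shard layout."""
    if primary_shards < 1 or replicas_per_primary < 0:
        raise ValueError("need at least one primary shard and zero or more replicas")
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas_per_primary,
            }
        }
    }


def total_shard_copies(primary_shards: int, replicas_per_primary: int) -> int:
    """Every primary shard plus each of its replicas must be placed on some instance."""
    return primary_shards * (1 + replicas_per_primary)
```

For example, an index with 3 primary shards and 1 replica per primary has 6 shard copies to distribute across the instances in the domain.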
Durability and availability
You can configure your OpenSearch Service domains for high availability by enabling the Zone Awareness option either at domain creation time or by modifying a live domain. When Zone Awareness is enabled, OpenSearch Service distributes the instances that support the domain across two different Availability Zones. Then, if you enable replicas in OpenSearch Service, the instances are automatically distributed in such a way as to deliver cross-zone replication. You can build data durability for your OpenSearch Service domain through automated and manual snapshots.
You can use snapshots to recover your domain with preloaded data or to create a new domain with preloaded data. Snapshots are stored in Amazon S3, which is secure, durable, highly scalable object storage. By default, OpenSearch Service automatically creates daily snapshots of each domain. In addition, you can use the OpenSearch snapshot APIs to create additional manual snapshots, which are also stored in S3. Manual snapshots can be used for cross-Region disaster recovery and to provide additional durability.
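A manual snapshot is taken with the `_snapshot` API (`PUT /_snapshot/<repository>/<snapshot>`). The sketch below only builds the method and URL for that call. It assumes a snapshot repository has already been registered on the domain; the endpoint, repository, and snapshot names are hypothetical, and the request signing or authentication that a real domain requires is omitted.

```python
from typing import Tuple
from urllib.parse import quote


def manual_snapshot_request(endpoint: str, repository: str, snapshot: str) -> Tuple[str, str]:
    """Return the HTTP method and URL that start a manual snapshot."""
    # Percent-encode the names so unusual characters cannot alter the path.
    path = f"/_snapshot/{quote(repository, safe='')}/{quote(snapshot, safe='')}"
    return "PUT", endpoint.rstrip("/") + path


if __name__ == "__main__":
    # Hypothetical domain endpoint and names, for illustration only.
    print(manual_snapshot_request("https://search-example.example.com", "my-repo", "snap-2024-01-01"))
```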
Scalability and elasticity
You can add or remove instances, and easily modify Amazon EBS volumes to accommodate data growth. You can write a few lines of code that monitor the state of your domain through Amazon CloudWatch metrics and call the OpenSearch Service API to scale your domain up or down based on thresholds you set. The service performs the scaling without any downtime. OpenSearch Service supports one EBS volume (maximum size of 1.5 TB) per instance associated with a domain. With the default maximum of 20 data nodes allowed per OpenSearch Service domain, you can allocate about 30 TB of EBS storage to a single domain. You can request a service limit increase up to 100 instances per domain by creating a case with the AWS Support Center.
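The threshold-based scaling and the storage arithmetic above can be sketched as follows. The 1.5 TB per-instance volume limit and the default of 20 data nodes come from the text; the free-storage percentage input, the threshold values, and the function names are hypothetical, and the sketch only decides on a target node count rather than calling any API.

```python
MAX_EBS_TB_PER_INSTANCE = 1.5  # one EBS volume of up to 1.5 TB per instance (from the text)
DEFAULT_MAX_DATA_NODES = 20    # default per-domain limit (from the text)


def max_ebs_storage_tb(data_nodes: int = DEFAULT_MAX_DATA_NODES) -> float:
    """Upper bound on EBS storage for a domain with the given number of data nodes."""
    return data_nodes * MAX_EBS_TB_PER_INSTANCE


def desired_instance_count(current: int, free_storage_pct: float,
                           scale_up_below: float = 20.0,
                           scale_down_above: float = 60.0,
                           max_nodes: int = DEFAULT_MAX_DATA_NODES) -> int:
    """Pick a target node count from a free-storage percentage (hypothetical thresholds)."""
    if free_storage_pct < scale_up_below:
        target = current + 1      # running low on space: add a node
    elif free_storage_pct > scale_down_above:
        target = current - 1      # plenty of headroom: remove a node
    else:
        target = current          # within the band: leave the domain alone
    # Never go below one node or above the per-domain limit.
    return max(1, min(max_nodes, target))
```

With the default limit, `max_ebs_storage_tb()` gives the 30 TB figure quoted above (20 × 1.5 TB); at the raised limit of 100 instances the same arithmetic gives 150 TB.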
UltraWarm enables you to dramatically extend your data retention period and reduce costs by up to 90% over hot storage. Best of all, the interactive analytics experience remains. Query your warm indexes just like any other index, or use them to build OpenSearch Dashboards. UltraWarm uses a combination of S3 and nodes powered by the AWS Nitro System.
Cold storage lets you store any amount of infrequently accessed or historical data on your OpenSearch Service domain and analyze it on demand, at a lower cost than other storage tiers. Cold storage is appropriate if you need to do periodic research or forensic analysis on your older data. Practical examples of data suitable for cold storage include infrequently accessed logs, data that must be preserved to meet compliance requirements, or logs that have historical value.
Similar to UltraWarm storage, cold storage is backed by S3. When you need to query cold data, you can selectively attach it to existing UltraWarm nodes. You can manage the migration and lifecycle of your cold data manually or with Index State Management policies.
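As an illustration of lifecycle management with Index State Management, the sketch below builds a policy document that moves an index from hot storage to UltraWarm and then to cold storage by age. The `warm_migration` and `cold_migration` action names and the `timestamp_field` parameter are written here as assumptions to be checked against the current ISM documentation; the function name, index ages, and the `@timestamp` field are hypothetical.

```python
import json
from typing import Any, Dict


def tiering_policy(warm_after: str, cold_after: str) -> Dict[str, Any]:
    """Build an ISM policy body with hot, warm, and cold states (illustrative only)."""
    return {
        "policy": {
            "description": "Move indexes from hot to UltraWarm to cold storage by age.",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {"state_name": "warm", "conditions": {"min_index_age": warm_after}}
                    ],
                },
                {
                    "name": "warm",
                    # Assumed action name for migrating an index to UltraWarm.
                    "actions": [{"warm_migration": {}}],
                    "transitions": [
                        {"state_name": "cold", "conditions": {"min_index_age": cold_after}}
                    ],
                },
                {
                    "name": "cold",
                    # Assumed action name and parameter for migrating to cold storage.
                    "actions": [{"cold_migration": {"timestamp_field": "@timestamp"}}],
                    "transitions": [],
                },
            ],
        }
    }


if __name__ == "__main__":
    print(json.dumps(tiering_policy("7d", "30d"), indent=2))
```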
Interfaces
OpenSearch Service supports many of the commonly used OpenSearch APIs, so code, applications, and popular tools that you're already using with your current OpenSearch environments will work seamlessly. For a full list of supported OpenSearch operations, see the OpenSearch Service documentation.
The AWS CLI, API, or the AWS Management Console can be used for creating and managing your domains as well.
OpenSearch Service supports integration with several AWS services, including streaming data from S3 buckets, Amazon Kinesis Data Streams, and DynamoDB Streams. The Amazon S3 and Kinesis Data Streams integrations each use a Lambda function as an event handler in the cloud that responds to new data by processing it and streaming it to your OpenSearch Service domain. OpenSearch Service also integrates with Amazon CloudWatch for monitoring OpenSearch Service domain metrics and with CloudTrail for auditing configuration API calls to OpenSearch Service domains.
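The processing step inside such a Lambda function can be sketched as follows: decode the records from a Kinesis Data Streams event and format them as a `_bulk` API request body (one action line followed by one document line per record, newline-terminated). The sketch assumes each record's payload is a JSON object; the function names and index name are hypothetical, and sending the signed HTTP request to the domain is omitted.

```python
import base64
import json
from typing import Any, Dict, Iterable, Iterator


def documents_from_kinesis_event(event: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one document per Kinesis record; payloads arrive base64-encoded."""
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        yield json.loads(payload)  # assumption: each payload is a JSON object


def to_bulk_body(index: str, documents: Iterable[Dict[str, Any]]) -> str:
    """Format documents as newline-delimited JSON for the _bulk API."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                            # document line
    if not lines:
        return ""
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Batching records into one `_bulk` request, rather than indexing them one at a time, keeps the number of calls to the domain proportional to the number of Lambda invocations instead of the number of records.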
OpenSearch Service includes built-in integration with OpenSearch Dashboards, an open-source analytics and visualization platform, and supports integration with Logstash, an open-source data pipeline that helps you process logs and other event data. You can set up your OpenSearch Service domain as the backend store for all logs coming through your Logstash implementation to easily ingest structured and unstructured data from a variety of sources.
Fine-grained access control
Fine-grained access control offers additional ways of controlling access to your data on OpenSearch Service. For example, depending on who makes the request, you might want a search to return results from only one index. You might want to hide certain fields in your documents or exclude certain documents altogether. Fine-grained access control offers the following benefits:
- Role-based access control
- Security at the index, document, and field level
- OpenSearch Dashboards multi-tenancy
- HTTP basic authentication for OpenSearch Service and OpenSearch Dashboards
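Index-, document-, and field-level security come together in a role definition. The sketch below builds a read-only role body in the format used by the OpenSearch security plugin, where `dls` holds a query (as a JSON string) that filters documents and `fls` entries prefixed with `~` hide fields. The function name, index pattern, field name, and query are hypothetical.

```python
import json
from typing import Any, Dict, List


def read_only_role(index_pattern: str, hidden_fields: List[str],
                   dls_query: Dict[str, Any]) -> Dict[str, Any]:
    """Build a role that can only read matching documents, minus the hidden fields."""
    return {
        "cluster_permissions": [],
        "index_permissions": [
            {
                "index_patterns": [index_pattern],            # index-level security
                "dls": json.dumps(dls_query),                 # document-level security
                "fls": [f"~{name}" for name in hidden_fields],  # field-level security
                "allowed_actions": ["read"],
            }
        ],
    }


if __name__ == "__main__":
    role = read_only_role("logs-*", ["customer_email"], {"term": {"team": "sales"}})
    print(json.dumps(role, indent=2))
```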
SAML authentication for OpenSearch Dashboards lets you use your existing identity provider to offer single sign-on (SSO) for OpenSearch Dashboards on OpenSearch Service domains running OpenSearch or Elasticsearch 6.7 or later. To use SAML authentication, you must enable fine-grained access control.
Rather than authenticating through Amazon Cognito or the internal user database, SAML authentication for OpenSearch Dashboards lets you use third-party identity providers to log in to OpenSearch Dashboards, manage fine-grained access control, search your data, and build visualizations. OpenSearch Service supports providers that use the SAML 2.0 standard, such as Okta, Keycloak, Active Directory Federation Services (AD FS), and Auth0. Requests from OpenSearch Service to third-party providers aren't explicitly encrypted with a service provider certificate.
SAML authentication for OpenSearch Dashboards is only for accessing OpenSearch Dashboards through a web browser. Your SAML credentials do not let you make direct HTTP requests to the OpenSearch Service or OpenSearch Dashboards APIs.
Anti-patterns
Amazon OpenSearch Service has the following anti-patterns:
- Online transaction processing (OLTP) - OpenSearch Service is a real-time, distributed search and analytics engine. There is no support for transactions or processing on data manipulation. If you need a fast transactional system, a traditional relational database system built on Amazon RDS, or a NoSQL database such as DynamoDB, is a better choice.
- Ad hoc data querying - While OpenSearch Service takes care of the operational overhead of building a highly scalable OpenSearch cluster, if your use case is running ad hoc or one-off queries against your data set, Amazon Athena is a better choice. Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL, without provisioning servers.