Architecture details - Migration Assistant for Amazon OpenSearch Service

Architecture details

This section describes the components and AWS services that make up this solution and how they work together. The components support the following scenarios:

  • Metadata migration - Migrating cluster metadata, such as index settings, aliases, and templates.

  • Backfill migration - Migrating existing or historical data from a source to a target cluster.

  • Live traffic migration - Replicating live, ongoing traffic from a source cluster to a target cluster.

  • Comparative tooling - Comparing the performance and behaviors of an existing cluster with a prospective new one.

In this guide, we focus on the first three scenarios, guiding you through a backfill from a source cluster while concurrently handling live production traffic, which will be captured and replayed to a target cluster.

Important

Migration strategies aren’t universally applicable. This guide provides instructions based on engineering best practices.

AWS services in this solution

AWS service Description

AWS CloudFormation

Core. Infrastructure as Code (IaC) templates used to deploy and configure Migration Assistant.

Amazon OpenSearch Service (AOS)

Core. A search, logging, and analytics engine that users can upgrade to, migrate to, and use to compare the results of a source and target cluster.

Amazon Managed Streaming for Apache Kafka (Amazon MSK)

Core. A fully managed Apache Kafka service. It durably stores captured HTTP traffic so that the traffic can be reused.

Amazon Elastic Container Service (ECS)

Core. Runs highly secure, reliable, and scalable containers. The Migration Management Console and Traffic Replayer run in Amazon ECS.

Amazon Elastic File System

Core. Scalable persistent storage used to retain the request and response data from both the source and target clusters.

Amazon S3

Core. Storage allocated for Historical Backfill tasks, which involve exporting a snapshot from the source to be restored by the target cluster. S3 is also used to store IaC content.

AWS Systems Manager

Supporting. Provides you visibility and control of your infrastructure on AWS. Systems Manager provides a unified user interface so you can view operational data from multiple AWS services and enables you to automate operational tasks across your AWS resources.

AWS Secrets Manager

Supporting. Securely stores sensitive data, such as cluster credentials, that Migration Assistant requires.

Amazon EC2

Supporting. Provides networking and security infrastructure for Migration Assistant, including security groups and virtual private networks.

AWS Lambda

Supporting. Runs the serverless functions that Migration Assistant uses to operate its suite of tools.

Amazon CloudWatch

Optional. Observe and monitor resources and applications on AWS or in the local Docker solution.

Self-managed Elasticsearch/OpenSearch Source Cluster

The source cluster for this solution is based on Elasticsearch or OpenSearch and runs on EC2 instances or alternative computing infrastructure. Configure a capture proxy to interface with the source cluster, positioning the proxy in front of, or on each of, the cluster's coordinating nodes.

Migration Management Console

The Migration Management Console is a containerized portal that runs on AWS Fargate within Amazon Elastic Container Service (Amazon ECS). Its primary role is to run the Migration Assistant for Amazon OpenSearch Service solution. It provides a migration-specific CLI and a suite of tools that streamline the migration process. Everything needed to complete a migration, other than cleaning up the migration resources, can be done through this console.

Metadata Migration Tool

The Metadata Migration Tool is integrated into the Migration Management Console CLI. Use it to migrate cluster metadata, including index mappings, index configuration settings, templates, component templates, and aliases.
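As a rough illustration of what "cluster metadata" covers, the following Python sketch copies only the metadata categories named above and leaves documents behind. It is not the Metadata Migration Tool and does not use its CLI: `migrate_metadata`, `METADATA_KINDS`, and the dictionary layout of the two clusters are hypothetical names and shapes assumed for this example.

```python
from typing import Dict

# Metadata categories named in the text; these dictionary keys are hypothetical.
METADATA_KINDS = (
    "index_mappings",
    "index_settings",
    "templates",
    "component_templates",
    "aliases",
)


def migrate_metadata(source: Dict[str, dict], target: Dict[str, dict]) -> Dict[str, int]:
    """Copy each metadata category from source to target and report item counts.

    Anything outside METADATA_KINDS (for example, documents) is not copied.
    """
    copied = {}
    for kind in METADATA_KINDS:
        items = source.get(kind, {})
        target.setdefault(kind, {}).update(items)
        copied[kind] = len(items)
    return copied


# In-memory stand-ins for a source and a target cluster (assumed shapes).
source_cluster = {
    "index_mappings": {"logs": {"properties": {"msg": {"type": "text"}}}},
    "index_settings": {"logs": {"number_of_shards": 1}},
    "aliases": {"logs-current": {"index": "logs"}},
    "documents": {"logs": [{"msg": "not metadata"}]},
}
target_cluster: Dict[str, dict] = {}
summary = migrate_metadata(source_cluster, target_cluster)
```

In this sketch, the document data stays on the source, which mirrors the split between the metadata migration and backfill scenarios.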

Capture Proxy

Capture Proxy is designed for HTTP RESTful traffic. It relays traffic to the source cluster and, at the same time, replicates that traffic into a durable Kafka stream for later replay.
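The relay-and-replicate pattern can be pictured with a minimal Python sketch. It is illustrative only and is not the Capture Proxy implementation: `CapturedExchange`, `capture_and_relay`, `fake_source`, and the in-memory `stream` list (standing in for a Kafka topic) are hypothetical names assumed for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CapturedExchange:
    """One request and the source cluster's response status (hypothetical record shape)."""
    method: str
    path: str
    body: str
    source_status: int


def capture_and_relay(
    method: str,
    path: str,
    body: str,
    send_to_source: Callable[[str, str, str], int],
    stream: List[CapturedExchange],
) -> int:
    """Relay a request to the source cluster and record the exchange.

    `stream` is a plain list standing in for a durable Kafka topic.
    """
    status = send_to_source(method, path, body)  # client traffic still reaches the source
    stream.append(CapturedExchange(method, path, body, status))  # copy kept for replay
    return status  # the caller sees the source cluster's response


def fake_source(method: str, path: str, body: str) -> int:
    """Stand-in source cluster: POST returns 201, everything else returns 200."""
    return 201 if method == "POST" else 200


stream: List[CapturedExchange] = []
capture_and_relay("POST", "/logs/_doc", '{"msg": "hello"}', fake_source, stream)
capture_and_relay("GET", "/logs/_search", "", fake_source, stream)
```

The point of the sketch is that the client-facing path is unchanged: the caller still receives the source cluster's response, and the recorded copy is a side effect.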

Traffic Replayer

Traffic Replayer is a network traffic utility that reproduces real-world workloads by retrieving recorded request traffic and sending it to a designated target cluster. It associates each original request and its source response with the corresponding response from the target cluster, so that you can compare the correlated results.
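A minimal Python sketch of this replay-and-correlate idea follows. It is an assumption-laden illustration and not the Traffic Replayer's code: `Recorded`, `Comparison`, `replay`, and `fake_target` are hypothetical names, and comparing only HTTP status codes is a simplification chosen for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Recorded:
    """A captured request with the source cluster's response status (hypothetical shape)."""
    method: str
    path: str
    source_status: int


@dataclass
class Comparison:
    """A source response paired with the target response for the same request."""
    path: str
    source_status: int
    target_status: int

    @property
    def matches(self) -> bool:
        return self.source_status == self.target_status


def replay(recorded: List[Recorded], send_to_target: Callable[[str, str], int]) -> List[Comparison]:
    """Send each recorded request to the target, in capture order, and pair the results."""
    return [
        Comparison(r.path, r.source_status, send_to_target(r.method, r.path))
        for r in recorded
    ]


def fake_target(method: str, path: str) -> int:
    """Stand-in target cluster that is missing the `legacy` index."""
    return 404 if path.startswith("/legacy") else 200


recorded = [
    Recorded("GET", "/logs/_search", 200),
    Recorded("GET", "/legacy/_search", 200),
]
results = replay(recorded, fake_target)
mismatches = [c.path for c in results if not c.matches]
```

Here `mismatches` collects the requests whose target response differs from the source response, which is the kind of correlated data a comparison would start from.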

Reindex-from-Snapshot Container

The Reindex-from-Snapshot (RFS) Container runs as Amazon ECS tasks that coordinate the migration of documents from an existing snapshot, reindexing the documents in parallel to a target cluster.
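The parallel reindexing idea can be sketched in Python as follows. This is a simplified illustration, not how RFS is implemented: threads stand in for ECS tasks, the `snapshot` and `target_index` dictionaries stand in for a real snapshot and target cluster, and treating one shard as the unit of work is an assumption made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Hypothetical snapshot layout: shard name -> documents stored in that shard.
snapshot: Dict[str, List[dict]] = {
    "logs-shard-0": [{"id": "a", "msg": "one"}, {"id": "b", "msg": "two"}],
    "logs-shard-1": [{"id": "c", "msg": "three"}],
}

# A dict standing in for the target cluster's index (document id -> document).
target_index: Dict[str, dict] = {}


def reindex_shard(shard: str) -> int:
    """Read one shard's documents from the snapshot and write them to the target."""
    docs = snapshot[shard]
    for doc in docs:
        target_index[doc["id"]] = doc  # each id is written by exactly one worker
    return len(docs)


# Each worker handles whole shards, so separate shards are reindexed in parallel.
with ThreadPoolExecutor(max_workers=2) as pool:
    migrated = sum(pool.map(reindex_shard, snapshot))
```

Because each worker reads from the snapshot and not from the live source cluster, this style of backfill does not add query load to the source while documents are being copied.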

Target Cluster

The Target Cluster is the destination OpenSearch cluster for migration or comparison in an A/B test. This component must exist prior to deploying this solution.