Using AWS CloudFormation to set up remote inference for semantic search - Amazon OpenSearch Service

Starting with OpenSearch version 2.9, you can use remote inference with semantic search to host your own machine learning (ML) models. Remote inference uses the ML Commons plugin so that you can host model inference remotely on ML services, such as Amazon SageMaker and Amazon Bedrock, and connect those services to Amazon OpenSearch Service with ML connectors.

To ease the setup of remote inference, Amazon OpenSearch Service provides an AWS CloudFormation template in the console. CloudFormation is an AWS service that lets you model, provision, and manage AWS and third-party resources by treating infrastructure as code.

The OpenSearch CloudFormation template automates the model provisioning process for you, so that you can easily create a model in your OpenSearch Service domain and then use the model ID to ingest data and run neural search queries.
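As a rough illustration of what "use the model ID" involves, the following Python sketch builds the two request bodies that are typically needed: an ingest pipeline with a `text_embedding` processor and a `neural` search query. The field names (`text`, `text_embedding`), the sample query text, and the model ID placeholder are assumptions for illustration only; the template does not create them for you.

```python
import json


def build_ingest_pipeline(model_id: str, source_field: str, vector_field: str) -> dict:
    """Body for PUT /_ingest/pipeline/<name>: embeds source_field into vector_field."""
    return {
        "description": "Generate embeddings with a remote model",
        "processors": [
            {
                "text_embedding": {
                    "model_id": model_id,
                    "field_map": {source_field: vector_field},
                }
            }
        ],
    }


def build_neural_query(model_id: str, vector_field: str, query_text: str, k: int = 10) -> dict:
    """Body for GET /<index>/_search: neural query against vector_field."""
    return {
        "size": k,
        "query": {
            "neural": {
                vector_field: {
                    "query_text": query_text,
                    "model_id": model_id,
                    "k": k,
                }
            }
        },
    }


# Hypothetical values: replace with the model ID that the template returns.
MODEL_ID = "<model-id-from-template>"
pipeline = build_ingest_pipeline(MODEL_ID, "text", "text_embedding")
query = build_neural_query(MODEL_ID, "text_embedding", "wild west", k=5)
print(json.dumps(query, indent=2))
```

The target index also needs a `knn_vector` field whose dimension matches the output of the embedding model.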

When you use neural sparse encoders with OpenSearch Service version 2.12 and onwards, we recommend that you use the tokenizer model locally instead of deploying remotely. For more information, see Sparse encoding models in the OpenSearch documentation.

Prerequisites

To use a CloudFormation template with OpenSearch Service, complete the following prerequisites.

Set up an OpenSearch Service domain

Before you can use a CloudFormation template, you must set up an Amazon OpenSearch Service domain with version 2.9 or later and fine-grained access control enabled. Create an OpenSearch Service backend role to give the ML Commons plugin permission to create your connector for you.

The CloudFormation template creates a Lambda IAM role for you with the default name LambdaInvokeOpenSearchMLCommonsRole, which you can override if you want to choose a different name. After the template creates this IAM role, you need to give the Lambda function permission to call your OpenSearch Service domain. To do so, map the role named ml_full_access to your OpenSearch Service backend role with the following steps:

  1. Navigate to the OpenSearch Dashboards plugin for your OpenSearch Service domain. You can find the Dashboards endpoint on your domain dashboard on the OpenSearch Service console.

  2. From the main menu, choose Security, Roles, and then select the ml_full_access role.

  3. Choose Mapped users, Manage mapping.

  4. Under Backend roles, add the ARN of the Lambda role that needs permission to call your domain.

    arn:aws:iam::account-id:role/role-name

  5. Select Map and confirm the user or role shows up under Mapped users.
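If you prefer to script the mapping instead of using Dashboards, the same result can be sketched with the security plugin's role mapping REST API (`PUT _plugins/_security/api/rolesmapping/ml_full_access`). The helper names and the account ID below are hypothetical. Because a PUT replaces the whole mapping, this sketch merges the new ARN into the existing mapping rather than overwriting it.

```python
def lambda_role_arn(account_id: str,
                    role_name: str = "LambdaInvokeOpenSearchMLCommonsRole") -> str:
    """Build the IAM role ARN in the arn:aws:iam::account-id:role/role-name form."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def add_backend_role(existing_mapping: dict, role_arn: str) -> dict:
    """Return a PUT body for the ml_full_access role mapping.

    existing_mapping is the current mapping (for example, from a GET on the
    same endpoint). Existing backend roles, hosts, and users are preserved.
    """
    backend_roles = list(existing_mapping.get("backend_roles", []))
    if role_arn not in backend_roles:
        backend_roles.append(role_arn)
    return {
        "backend_roles": backend_roles,
        "hosts": list(existing_mapping.get("hosts", [])),
        "users": list(existing_mapping.get("users", [])),
    }


# Hypothetical account ID and existing mapping, for illustration only.
arn = lambda_role_arn("123456789012")
body = add_backend_role({"backend_roles": [], "users": ["admin"]}, arn)
print(body)
```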

After you've mapped the role, navigate to the security configuration of your domain and add the Lambda IAM role to your OpenSearch Service access policy.
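A minimal sketch of the access policy change follows. It adds an `Allow` statement for the Lambda role to a domain access policy document. The account ID, Region, and domain name are placeholders, and the broad `es:ESHttp*` action is an assumption; scope it down to what your setup needs.

```python
import copy
import json


def allow_role_statement(role_arn: str, domain_arn: str) -> dict:
    """Statement that lets role_arn send HTTP requests to the domain."""
    return {
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": "es:ESHttp*",  # assumption: narrow this where possible
        "Resource": f"{domain_arn}/*",
    }


def add_statement(policy: dict, statement: dict) -> dict:
    """Return a copy of policy with statement appended (no duplicates)."""
    updated = copy.deepcopy(policy)
    statements = updated.setdefault("Statement", [])
    if statement not in statements:
        statements.append(statement)
    return updated


# Placeholder ARNs, for illustration only.
policy = {"Version": "2012-10-17", "Statement": []}
statement = allow_role_statement(
    "arn:aws:iam::123456789012:role/LambdaInvokeOpenSearchMLCommonsRole",
    "arn:aws:es:us-east-1:123456789012:domain/my-domain",
)
print(json.dumps(add_statement(policy, statement), indent=2))
```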

Enable permissions on your AWS account

Your AWS account must have permission to access CloudFormation and Lambda, along with whichever AWS service you choose for your template, either SageMaker Runtime or Amazon Bedrock.

If you're using Amazon Bedrock, you must also register your model. See Model access in the Amazon Bedrock User Guide to register your model.

If you're using your own Amazon S3 bucket to provide model artifacts, you must add the CloudFormation IAM role to your S3 access policy. For more information, see Adding and removing IAM identity permissions in the IAM User Guide.

Amazon SageMaker templates

The Amazon SageMaker CloudFormation templates define multiple AWS resources in order to set up the neural plugin and semantic search for you.

First, use the Integration with text embedding models through Amazon SageMaker template to deploy a text embedding model in SageMaker Runtime as a server. If you don't provide a model endpoint, CloudFormation creates an IAM role that allows SageMaker Runtime to download model artifacts from Amazon S3 and deploy them to the server. If you provide an endpoint, CloudFormation creates an IAM role that allows the Lambda function to access the OpenSearch Service domain or, if the role already exists, updates and reuses the role. The endpoint serves the remote model that is used for the ML connector with the ML Commons plugin.

Next, use the Integration with Sparse Encoders through Amazon Sagemaker template to create a Lambda function that has your domain set up remote inference connectors. After the connector is created in OpenSearch Service, remote inference can run semantic search using the remote model in SageMaker Runtime. The template returns the model ID in your domain to you so that you can start searching.

To use the Amazon SageMaker CloudFormation templates
  1. Open the Amazon OpenSearch Service console at https://console.aws.amazon.com/aos/home.

  2. In the left navigation, choose Integrations.

  3. Under each of the Amazon SageMaker templates, choose Configure domain, Configure public domain.

  4. Follow the prompt in the CloudFormation console to provision your stack and set up a model.

Note

OpenSearch Service also provides a separate template to configure a VPC domain. If you use this template, you need to provide the VPC ID for the Lambda function.
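After the stack finishes, one way to retrieve the model ID programmatically is to read the stack outputs. The sketch below parses a response shaped like the one returned by the boto3 `describe_stacks` call. The output key `ModelId` and the stack name are assumptions, not names confirmed by the template; check the Outputs tab of your stack for the actual key.

```python
def stack_output(describe_stacks_response: dict, output_key: str) -> str:
    """Return the value of output_key from a describe_stacks-style response."""
    stacks = describe_stacks_response.get("Stacks", [])
    if not stacks:
        raise LookupError("response contains no stacks")
    for output in stacks[0].get("Outputs", []):
        if output.get("OutputKey") == output_key:
            return output["OutputValue"]
    raise KeyError(output_key)


# With boto3 installed and credentials configured, the response would come from:
#   import boto3
#   response = boto3.client("cloudformation").describe_stacks(StackName="my-stack")
# The sample below is hand-written and hypothetical.
sample_response = {
    "Stacks": [
        {
            "StackName": "my-stack",
            "Outputs": [{"OutputKey": "ModelId", "OutputValue": "abc123"}],
        }
    ]
}
print(stack_output(sample_response, "ModelId"))
```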

Amazon Bedrock templates

Similar to the Amazon SageMaker CloudFormation templates, the Amazon Bedrock CloudFormation template provisions the AWS resources needed to create connectors between OpenSearch Service and Amazon Bedrock.

First, the template creates an IAM role that allows the future Lambda function to access your OpenSearch Service domain. The template then creates the Lambda function, which has the domain create a connector using the ML Commons plugin. After OpenSearch Service creates the connector, the remote inference setup is finished and you can run semantic searches using the Amazon Bedrock API operations.

Because Amazon Bedrock hosts its own ML models, you don't need to deploy a model to SageMaker Runtime. Instead, the template uses a predetermined endpoint for Amazon Bedrock and skips the endpoint provisioning steps.
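For orientation, the following sketch shows roughly what a connector request body for Amazon Titan Text Embeddings can look like. It is modeled on the public ML Commons connector blueprint and is not necessarily the exact body that the template's Lambda function sends. The function name, the role ARN, and the Region are hypothetical.

```python
import json


def bedrock_titan_connector(role_arn: str, region: str,
                            model: str = "amazon.titan-embed-text-v1") -> dict:
    """Body for POST /_plugins/_ml/connectors/_create (illustrative sketch)."""
    return {
        "name": "Amazon Bedrock connector: Titan text embeddings",
        "description": "Connector to the Titan embedding model on Amazon Bedrock",
        "version": 1,
        "protocol": "aws_sigv4",
        "credential": {"roleArn": role_arn},
        "parameters": {"region": region, "service_name": "bedrock"},
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "headers": {"content-type": "application/json"},
                # Predetermined Bedrock runtime endpoint: no model deployment step.
                "url": f"https://bedrock-runtime.{region}.amazonaws.com/model/{model}/invoke",
                "request_body": '{ "inputText": "${parameters.inputText}" }',
                "pre_process_function": "connector.pre_process.bedrock.embedding",
                "post_process_function": "connector.post_process.bedrock.embedding",
            }
        ],
    }


# Placeholder role ARN and Region, for illustration only.
body = bedrock_titan_connector(
    "arn:aws:iam::123456789012:role/my-bedrock-connector-role", "us-east-1"
)
print(json.dumps(body, indent=2))
```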

To use the Amazon Bedrock CloudFormation template
  1. Open the Amazon OpenSearch Service console at https://console.aws.amazon.com/aos/home.

  2. In the left navigation, choose Integrations.

  3. Under Integrate with Amazon Titan Text Embeddings model through Amazon Bedrock, choose Configure domain, Configure public domain.

  4. Follow the prompt to set up your model.

Note

OpenSearch Service also provides a separate template to configure a VPC domain. If you use this template, you need to provide the VPC ID for the Lambda function.

In addition, OpenSearch Service provides the following Amazon Bedrock templates to connect to the Cohere model and the Amazon Titan multimodal embeddings model:

  • Integration with Cohere Embed through Amazon Bedrock

  • Integrate with Amazon Bedrock Titan Multi-modal