Build a multi-tenant serverless architecture in Amazon OpenSearch Service - AWS Prescriptive Guidance

Created by Tabby Ward (AWS) and Nisha Gambhir (AWS)

Environment: PoC or pilot

Technologies: Modernization; SaaS; Serverless

Workload: Open-source

AWS services: Amazon OpenSearch Service; AWS Lambda; Amazon S3; Amazon API Gateway

Summary

Amazon OpenSearch Service is a managed service that makes it easy to deploy, operate, and scale Elasticsearch, which is a popular open-source search and analytics engine. Amazon OpenSearch Service provides free-text search as well as near real-time ingestion and dashboarding for streaming data such as logs and metrics.

Software as a service (SaaS) providers frequently use Amazon OpenSearch Service to address a broad range of use cases, such as gaining customer insights in a scalable and secure way while reducing complexity and downtime.

Using Amazon OpenSearch Service in a multi-tenant environment introduces a series of considerations that affect partitioning, isolation, deployment, and management of your SaaS solution. SaaS providers have to consider how to effectively scale their Elasticsearch clusters with continually shifting workloads. They also need to consider how tiering and noisy neighbor conditions could impact their partitioning model.

This pattern reviews the models that are used to represent and isolate tenant data with Elasticsearch constructs. In addition, the pattern focuses on a simple serverless reference architecture as an example to demonstrate indexing and searching using Amazon OpenSearch Service in a multi-tenant environment. It implements the pool data partitioning model, which shares the same index among all tenants while maintaining each tenant's data isolation. This pattern uses the following Amazon Web Services (AWS) services: Amazon API Gateway, AWS Lambda, Amazon Simple Storage Service (Amazon S3), and Amazon OpenSearch Service.

For more information about the pool model and other data partitioning models, see the Additional information section.

Prerequisites and limitations

Prerequisites

  • An active AWS account

  • AWS Command Line Interface (AWS CLI) version 2.x, installed and configured on macOS, Linux, or Windows

  • Python version 3.7

  • pip3 – The Python source code is provided as a .zip file to be deployed in a Lambda function. If you want to use the code locally or customize it, follow these steps to develop and recompile the source code:

    1. Generate the requirements.txt file by running the following command in the same directory as the Python scripts: pip3 freeze > requirements.txt

    2. Install the dependencies: pip3 install -r requirements.txt

Limitations

  • This code runs in Python and doesn’t currently support other programming languages.

  • The sample application doesn’t include AWS cross-Region or disaster recovery (DR) support. 

  • This pattern is intended for demonstration purposes only. It is not intended to be used in a production environment.

Architecture

The following diagram illustrates the high-level architecture of this pattern. The architecture includes the following:

  • AWS Lambda to index and query the content 

  • Amazon OpenSearch Service to perform search 

  • Amazon API Gateway to provide an API interaction with the user

  • Amazon S3 to store raw (non-indexed) data

  • Amazon CloudWatch to monitor logs

  • AWS Identity and Access Management (IAM) to create tenant roles and policies

High-level multi-tenant serverless architecture

Automation and scale

For simplicity, the pattern uses AWS CLI to provision the infrastructure and to deploy the sample code. You can create an AWS CloudFormation template or AWS Cloud Development Kit (AWS CDK) scripts to automate the pattern.

Tools

AWS services

  • AWS CLI – AWS Command Line Interface (AWS CLI) is a unified tool for managing AWS services and resources by using commands in your command-line shell.

  • AWS Lambda – AWS Lambda is a compute service that lets you run code without provisioning or managing servers. Lambda runs your code only when needed and scales automatically, from a few requests per day to thousands per second.

  • Amazon API Gateway – Amazon API Gateway is an AWS service for creating, publishing, maintaining, monitoring, and securing REST, HTTP, and WebSocket APIs at any scale.

  • Amazon S3 – Amazon Simple Storage Service (Amazon S3) is an object storage service that lets you store and retrieve any amount of information at any time, from anywhere on the web.

  • Amazon OpenSearch Service – Amazon OpenSearch Service is a fully managed service that makes it easy for you to deploy, secure, and run Elasticsearch cost-effectively at scale.

Code

The attachment provides sample files for this pattern. These include:

  • index_lambda_package.zip – The Lambda function for indexing data in Amazon OpenSearch Service by using the pool model.

  • search_lambda_package.zip – The Lambda function for searching for data in Amazon OpenSearch Service.

  • Tenant-1-data – Sample raw (non-indexed) data for Tenant-1.

  • Tenant-2-data – Sample raw (non-indexed) data for Tenant-2.

Important: The stories in this pattern include CLI command examples that are formatted for Unix, Linux, and macOS. For Windows, replace the backslash (\) Unix continuation character at the end of each line with a caret (^).

Epics

Task | Description | Skills required

Create an S3 bucket.

Create an S3 bucket in your AWS Region. This bucket will hold the non-indexed tenant data for the sample application. Make sure that the S3 bucket's name is globally unique, because the namespace is shared by all AWS accounts.

To create an S3 bucket, you can use the AWS CLI create-bucket command as follows:

aws s3api create-bucket \
  --bucket tenantrawdata \
  --region <your-AWS-Region>

where tenantrawdata is the S3 bucket name. (You can use any unique name that follows the bucket naming guidelines.)

Cloud architect, Cloud administrator
Task | Description | Skills required

Create an Amazon OpenSearch Service domain.

Run the AWS CLI create-elasticsearch-domain command to create an Amazon OpenSearch Service domain:

aws es create-elasticsearch-domain \
  --domain-name vpc-cli-example \
  --elasticsearch-version 7.10 \
  --elasticsearch-cluster-config InstanceType=t3.medium.elasticsearch,InstanceCount=1 \
  --ebs-options EBSEnabled=true,VolumeType=gp2,VolumeSize=10 \
  --domain-endpoint-options "{\"EnforceHTTPS\": true}" \
  --encryption-at-rest-options "{\"Enabled\": true}" \
  --node-to-node-encryption-options "{\"Enabled\": true}" \
  --advanced-security-options "{\"Enabled\": true, \"InternalUserDatabaseEnabled\": true, \"MasterUserOptions\": {\"MasterUserName\": \"KibanaUser\", \"MasterUserPassword\": \"NewKibanaPassword@123\"}}" \
  --vpc-options "{\"SubnetIds\": [\"<subnet-id>\"], \"SecurityGroupIds\": [\"<sg-id>\"]}" \
  --access-policies "{\"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Principal\": {\"AWS\": \"*\" }, \"Action\":\"es:*\", \"Resource\": \"arn:aws:es:region:account-id:domain/vpc-cli-example/*\" } ] }"

The instance count is set to 1 because the domain is for testing purposes. You must enable fine-grained access control at creation time by using the advanced-security-options parameter, because these settings cannot be changed after the domain has been created.

This command creates a master user name (KibanaUser) and a password that you can use to log in to the Kibana console.

Because the domain is part of a virtual private cloud (VPC), you have to make sure that you can reach the Elasticsearch instance by specifying the access policy to use.

For more information, see Launching your Amazon OpenSearch Service domains using a VPC in the AWS documentation.

Cloud architect, Cloud administrator

Set up a bastion host.

Set up an Amazon Elastic Compute Cloud (Amazon EC2) Windows instance as a bastion host to access the Kibana console. The Elasticsearch security group must allow traffic from the Amazon EC2 security group. For instructions, see the blog post Controlling Network Access to EC2 Instances Using a Bastion Server.

When the bastion host has been set up, and you have the security group that is associated with the instance available, use the AWS CLI authorize-security-group-ingress command to add permission to the Elasticsearch security group to allow port 443 from the Amazon EC2 (bastion host) security group.

aws ec2 authorize-security-group-ingress \
  --group-id <ElasticsearchSecurityGroupId> \
  --protocol tcp \
  --port 443 \
  --source-group <BastionHostSecurityGroupId>
Cloud architect, Cloud administrator
Task | Description | Skills required

Create the Lambda execution role.

Run the AWS CLI create-role command to grant the Lambda index function access to AWS services and resources:

aws iam create-role \
  --role-name index-lambda-role \
  --assume-role-policy-document file://lambda_assume_role.json

where lambda_assume_role.json is a JSON document in the current folder that grants AssumeRole permissions to the Lambda function, as follows:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "lambda.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
Cloud architect, Cloud administrator

Attach managed policies to the Lambda role.

Run the AWS CLI attach-role-policy command to attach managed policies to the role created in the previous step. These two policies give the role permissions to create an elastic network interface and to write logs to CloudWatch Logs.

aws iam attach-role-policy \
  --role-name index-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

aws iam attach-role-policy \
  --role-name index-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
Cloud architect, Cloud administrator

Create a policy to give the Lambda index function permission to read the S3 objects.

Run the AWS CLI create-policy command to give the Lambda index function s3:GetObject permission to read the objects in the S3 bucket:

aws iam create-policy \
  --policy-name s3-permission-policy \
  --policy-document file://s3-policy.json

The file s3-policy.json is a JSON document in the current folder that grants s3:GetObject permissions to allow read access to S3 objects. If you used a different name when you created the S3 bucket, provide the correct bucket name in the Resource section in the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::tenantrawdata/*"
        }
    ]
}
Cloud architect, Cloud administrator

Attach the Amazon S3 permission policy to the Lambda execution role.

Run the AWS CLI attach-role-policy command to attach the Amazon S3 permission policy you created in the previous step to the Lambda execution role:

aws iam attach-role-policy \
  --role-name index-lambda-role \
  --policy-arn <PolicyARN>

where PolicyARN is the Amazon Resource Name (ARN) of the Amazon S3 permission policy. You can get this value from the output of the previous command.

Cloud architect, Cloud administrator

Create the Lambda index function.

Run the AWS CLI create-function command to create the Lambda index function, which will access Amazon OpenSearch Service:

aws lambda create-function \
  --function-name index-lambda-function \
  --zip-file fileb://index_lambda_package.zip \
  --handler lambda_index.lambda_handler \
  --runtime python3.7 \
  --role "arn:aws:iam::account-id:role/index-lambda-role" \
  --timeout 30 \
  --vpc-config "{\"SubnetIds\": [\"<subnet-id1>\", \"<subnet-id2>\"], \"SecurityGroupIds\": [\"<sg-1>\"]}"
Cloud architect, Cloud administrator

Allow Amazon S3 to call the Lambda index function.

Run the AWS CLI add-permission command to give Amazon S3 the permission to call the Lambda index function:

aws lambda add-permission \
  --function-name index-lambda-function \
  --statement-id s3-permissions \
  --action lambda:InvokeFunction \
  --principal s3.amazonaws.com \
  --source-arn "arn:aws:s3:::tenantrawdata" \
  --source-account "<account-id>"
Cloud architect, Cloud administrator

Add a Lambda trigger for the Amazon S3 event.

Run the AWS CLI put-bucket-notification-configuration command to send notifications to the Lambda index function when the Amazon S3 ObjectCreated event is detected. The index function runs whenever an object is uploaded to the S3 bucket.

aws s3api put-bucket-notification-configuration \
  --bucket tenantrawdata \
  --notification-configuration file://s3-trigger.json

The file s3-trigger.json is a JSON document in the current folder that configures the bucket to invoke the Lambda index function when the Amazon S3 ObjectCreated event occurs.
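The contents of s3-trigger.json are not reproduced in this pattern. As a sketch only (the configuration ID and ARN placeholders are illustrative, not taken from the attachment), an S3 notification configuration for this purpose might look like the following:

```json
{
    "LambdaFunctionConfigurations": [
        {
            "Id": "tenant-data-index-trigger",
            "LambdaFunctionArn": "arn:aws:lambda:<region>:<account-id>:function:index-lambda-function",
            "Events": ["s3:ObjectCreated:*"]
        }
    ]
}
```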

Cloud architect, Cloud administrator
Task | Description | Skills required

Create the Lambda execution role.

Run the AWS CLI create-role command to grant the Lambda search function access to AWS services and resources:

aws iam create-role \
  --role-name search-lambda-role \
  --assume-role-policy-document file://lambda_assume_role.json

where lambda_assume_role.json is a JSON document in the current folder that grants AssumeRole permissions to the Lambda function, as follows:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "lambda.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
Cloud architect, Cloud administrator

Attach managed policies to the Lambda role.

Run the AWS CLI attach-role-policy command to attach managed policies to the role created in the previous step. These two policies give the role permissions to create an elastic network interface and to write logs to CloudWatch Logs.

aws iam attach-role-policy \
  --role-name search-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

aws iam attach-role-policy \
  --role-name search-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
Cloud architect, Cloud administrator

Create the Lambda search function.

Run the AWS CLI create-function command to create the Lambda search function, which will access Amazon OpenSearch Service:

aws lambda create-function \
  --function-name search-lambda-function \
  --zip-file fileb://search_lambda_package.zip \
  --handler lambda_search.lambda_handler \
  --runtime python3.7 \
  --role "arn:aws:iam::account-id:role/search-lambda-role" \
  --timeout 30 \
  --vpc-config "{\"SubnetIds\": [\"<subnet-id1>\", \"<subnet-id2>\"], \"SecurityGroupIds\": [\"<sg-1>\"]}"
Cloud architect, Cloud administrator
Task | Description | Skills required

Create tenant IAM roles.

Run the AWS CLI create-role command to create two tenant roles that will be used to test the search functionality:

aws iam create-role \
  --role-name Tenant-1-role \
  --assume-role-policy-document file://assume-role-policy.json

aws iam create-role \
  --role-name Tenant-2-role \
  --assume-role-policy-document file://assume-role-policy.json

The file assume-role-policy.json is a JSON document in the current folder that grants AssumeRole permissions to the Lambda execution role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "<Lambda execution role for index function>",
                    "<Lambda execution role for search function>"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
Cloud architect, Cloud administrator

Create a tenant IAM policy.

Run the AWS CLI create-policy command to create a tenant policy that grants access to Elasticsearch operations:

aws iam create-policy \
  --policy-name tenant-policy \
  --policy-document file://policy.json

The file policy.json is a JSON document in the current folder that grants permissions on Elasticsearch:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "es:ESHttpDelete",
                "es:ESHttpGet",
                "es:ESHttpHead",
                "es:ESHttpPost",
                "es:ESHttpPut",
                "es:ESHttpPatch"
            ],
            "Resource": [
                "<ARN of Elasticsearch domain created earlier>"
            ]
        }
    ]
}
Cloud architect, Cloud administrator

Attach the tenant IAM policy to the tenant roles.

Run the AWS CLI attach-role-policy command to attach the tenant IAM policy to the two tenant roles you created in the earlier step:

aws iam attach-role-policy \
  --policy-arn arn:aws:iam::account-id:policy/tenant-policy \
  --role-name Tenant-1-role

aws iam attach-role-policy \
  --policy-arn arn:aws:iam::account-id:policy/tenant-policy \
  --role-name Tenant-2-role

The policy ARN is from the output of the previous step.

Cloud architect, Cloud administrator

Create an IAM policy to give Lambda permissions to assume role.

Run the AWS CLI create-policy command to create a policy for Lambda to assume the tenant role:

aws iam create-policy \
  --policy-name assume-tenant-role-policy \
  --policy-document file://lambda_policy.json

The file lambda_policy.json is a JSON document in the current folder that grants permissions to AssumeRole:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "<ARN of tenant role created earlier>"
        }
    ]
}

For Resource, you can use a wildcard character to avoid creating a new policy for each tenant.

Cloud architect, Cloud administrator

Create an IAM policy to give the Lambda index role permission to access Amazon S3.

Run the AWS CLI create-policy command to give the Lambda index role permission to access the objects in the S3 bucket:

aws iam create-policy \
  --policy-name s3-permission-policy \
  --policy-document file://s3_lambda_policy.json

The file s3_lambda_policy.json is the following JSON policy document in the current folder:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::tenantrawdata/*"
        }
    ]
}
Cloud architect, Cloud administrator

Attach the policy to the Lambda execution role.

Run the AWS CLI attach-role-policy command to attach the policy created in the previous step to the Lambda index and search execution roles you created earlier:

aws iam attach-role-policy \
  --policy-arn arn:aws:iam::account-id:policy/assume-tenant-role-policy \
  --role-name index-lambda-role

aws iam attach-role-policy \
  --policy-arn arn:aws:iam::account-id:policy/assume-tenant-role-policy \
  --role-name search-lambda-role

aws iam attach-role-policy \
  --policy-arn arn:aws:iam::account-id:policy/s3-permission-policy \
  --role-name index-lambda-role

The policy ARN is from the output of the previous step.

Cloud architect, Cloud administrator
Task | Description | Skills required

Create a REST API in API Gateway.

Run the CLI create-rest-api command to create a REST API resource:

aws apigateway create-rest-api \
  --name Test-Api \
  --endpoint-configuration "{ \"types\": [\"REGIONAL\"] }"

For the endpoint configuration type, you can specify EDGE instead of REGIONAL to use edge locations instead of a particular AWS Region.

Note the value of the id field from the command output. This is the API ID that you will use in subsequent commands.

Cloud architect, Cloud administrator

Create a resource for the search API.

The search API resource invokes the Lambda search function and uses the resource name search. (You don’t have to create an API for the Lambda index function, because it runs automatically when objects are uploaded to the S3 bucket.)

  1. Run the AWS CLI get-resources command to get the parent ID for the root path:

    aws apigateway get-resources \
      --rest-api-id <API-ID>

    Note the value of the ID field. You will use this parent ID in the next command.

    {
        "items": [
            {
                "id": "zpsri964ck",
                "path": "/"
            }
        ]
    }
  2. Run the AWS CLI create-resource command to create a resource for the search API. For parent-id, specify the ID from the previous command.

    aws apigateway create-resource \
      --rest-api-id <API-ID> \
      --parent-id <Parent-ID> \
      --path-part search
Cloud architect, Cloud administrator

Create a GET method for the search API.

Run the AWS CLI put-method command to create a GET method for the search API:

aws apigateway put-method \
  --rest-api-id <API-ID> \
  --resource-id <ID from the previous command output> \
  --http-method GET \
  --authorization-type "NONE" \
  --no-api-key-required

For resource-id, specify the ID from the output of the create-resource command.

Cloud architect, Cloud administrator

Create a method response for the search API.

Run the AWS CLI put-method-response command to add a method response for the search API:

aws apigateway put-method-response \
  --rest-api-id <API-ID> \
  --resource-id <ID from the create-resource command output> \
  --http-method GET \
  --status-code 200 \
  --response-models "{\"application/json\": \"Empty\"}"

For resource-id, specify the ID from the output of the earlier create-resource command.

Cloud architect, Cloud administrator

Set up a proxy Lambda integration for the search API.

Run the AWS CLI put-integration command to set up an integration with the Lambda search function:

aws apigateway put-integration \
  --rest-api-id <API-ID> \
  --resource-id <ID from the create-resource command output> \
  --http-method GET \
  --type AWS_PROXY \
  --integration-http-method POST \
  --uri arn:aws:apigateway:<region>:lambda:path/2015-03-31/functions/arn:aws:lambda:<region>:<account-id>:function:<function-name>/invocations

Note that Lambda integrations always use POST as the integration HTTP method, even though the method request uses GET.

For resource-id, specify the ID from the earlier create-resource command.

Cloud architect, Cloud administrator

Grant API Gateway permission to call the Lambda search function.

Run the AWS CLI add-permission command to give API Gateway permission to use the search function:

aws lambda add-permission \
  --function-name <function-name> \
  --statement-id apigateway-get \
  --action lambda:InvokeFunction \
  --principal apigateway.amazonaws.com \
  --source-arn "arn:aws:execute-api:<region>:<account-id>:<API-ID>/*/GET/search"

Change the source-arn path if you used a different API resource name instead of search.

Cloud architect, Cloud administrator

Deploy the search API.

Run the AWS CLI create-deployment command to create a stage resource named dev:

aws apigateway create-deployment \
  --rest-api-id <API-ID> \
  --stage-name dev

If you update the API, you can use the same CLI command to redeploy it to the same stage.

Cloud architect, Cloud administrator
Task | Description | Skills required

Log in to the Kibana console.

  1. Find the link to Kibana on your domain dashboard on the Amazon OpenSearch Service console. The URL is in the form: <domain-endpoint>/_plugin/kibana/.

  2. Use the bastion host you configured in the first epic to access the Kibana console.

  3. Log in to the Kibana console by using the master user name and password from the earlier step, when you created the Amazon OpenSearch Service domain.

  4. When prompted to select a tenant, choose Private.

Cloud architect, Cloud administrator

Create and configure Kibana roles.

To provide data isolation and to make sure that one tenant cannot retrieve the data of another tenant, you need to use document security, which allows tenants to access only documents that contain their tenant ID.

  1. On the Kibana console, in the navigation pane, choose Security, Roles.

  2. Create a new tenant role.

  3. Set cluster permissions to indices_all, which gives create, read, update, and delete (CRUD) permissions on the Amazon OpenSearch Service index. 

  4. Restrict index permissions to the tenant-data index. (The index name should match the name in the Lambda search and index functions.) 

  5. Set index permissions to indices_all, to enable users to perform all index-related operations. (You can restrict operations for more granular access, depending on your requirements.)

  6. For document-level security, use the following policy to filter documents by tenant ID, to provide data isolation for tenants in a shared index:

    {
      "bool": {
        "must": {
          "match": {
            "TenantId": "Tenant-1"
          }
        }
      }
    }

    The index name, properties, and values are case-sensitive.

Cloud architect, Cloud administrator

Map users to roles.

  1. Choose the Mapped users tab for the role, and then choose Map users.

  2. In the Backend roles section, specify the ARN of the IAM tenant role that you created earlier, and then choose Map. This maps the IAM tenant role to the Kibana role so that tenant-specific search returns data for that tenant only. For example, if the IAM role name for Tenant-1 is Tenant-1-Role, specify the ARN for Tenant-1-Role (from the Create and configure tenant roles epic) in the Backend roles box for the Tenant-1 Kibana role.

  3. Repeat steps 1 and 2 for Tenant-2.

We recommend that you automate the creation of the tenant and Kibana roles at the time of tenant onboarding.
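Role creation can be scripted at onboarding time. The following Python sketch (illustrative, not part of the attachment) builds the document-level security query shown above and a role body in the shape accepted by the Open Distro Security roles REST API (_opendistro/_security/api/roles/<role-name>); verify the endpoint path and field names against the security plugin version on your domain.

```python
import json

def dls_query(tenant_id: str) -> str:
    """Build the document-level security query that restricts a role
    to documents whose TenantId matches the given tenant (case-sensitive)."""
    return json.dumps({"bool": {"must": {"match": {"TenantId": tenant_id}}}})

def tenant_role_payload(tenant_id: str) -> dict:
    """Sketch of a role body for the Open Distro Security roles API.

    Permission names mirror the manual steps in this epic; adjust them
    for more granular access if your requirements demand it."""
    return {
        "cluster_permissions": ["indices_all"],
        "index_permissions": [{
            "index_patterns": ["tenant-data"],   # must match the index name used by the Lambda functions
            "dls": dls_query(tenant_id),         # per-tenant document filter
            "allowed_actions": ["indices_all"],
        }],
    }
```

You could POST this payload once per tenant as part of your onboarding automation, alongside the IAM tenant role creation.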

Cloud architect, Cloud administrator

Create the tenant-data index.

In the navigation pane, under Management, choose Dev Tools, and then run the following command. This command creates the tenant-data index to define the mapping for the TenantId property.

PUT /tenant-data
{
  "mappings": {
    "properties": {
      "TenantId": { "type": "keyword" }
    }
  }
}
Cloud architect, Cloud administrator
Task | Description | Skills required

Create a VPC endpoint for Amazon S3.

Run the AWS CLI create-vpc-endpoint command to create a VPC endpoint for Amazon S3. The endpoint enables the Lambda index function in the VPC to access the Amazon S3 service.

aws ec2 create-vpc-endpoint \
  --vpc-id <VPC-ID> \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids <route-table-ID>

For vpc-id, specify the VPC that you’re using for the Lambda index function. For service-name, use the correct URL for the Amazon S3 endpoint. For route-table-ids, specify the route table that’s associated with the VPC endpoint.

Cloud architect, Cloud administrator

Create a VPC endpoint for AWS STS.

Run the AWS CLI create-vpc-endpoint command to create a VPC endpoint for AWS Security Token Service (AWS STS). The endpoint enables the Lambda index and search functions in the VPC to access the AWS STS service. The functions use AWS STS when they assume the IAM role.

aws ec2 create-vpc-endpoint \
  --vpc-id <VPC-ID> \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.sts \
  --subnet-id <subnet-ID> \
  --security-group-id <security-group-ID>

For vpc-id, specify the VPC that you’re using for the Lambda index and search functions. For subnet-id, provide the subnet in which this endpoint should be created. For security-group-id, specify the security group to associate this endpoint with. (It could be the same as the security group Lambda uses.)

Cloud architect, Cloud administrator
Task | Description | Skills required

Update the Python files for the index and search functions.

  1. In the index_lambda_package.zip file, edit the lamba_index.py file to update the AWS account ID, AWS Region, and Elasticsearch endpoint information.

  2. In the search_lambda_package.zip file, edit the lambda_search.py file to update the AWS account ID, AWS Region, and Elasticsearch endpoint information.

You can get the Elasticsearch endpoint from the Overview tab of the Amazon OpenSearch Service console. The endpoint ends with <AWS-Region>.es.amazonaws.com.

Cloud architect, App developer

Update the Lambda code.

Use the AWS CLI update-function-code command to update the Lambda code with the changes you made to the Python files:

aws lambda update-function-code \
  --function-name index-lambda-function \
  --zip-file fileb://index_lambda_package.zip

aws lambda update-function-code \
  --function-name search-lambda-function \
  --zip-file fileb://search_lambda_package.zip
Cloud architect, App developer

Upload raw data to the S3 bucket.

Use the AWS CLI cp command to upload data for the Tenant-1 and Tenant-2 objects to the tenantrawdata bucket (specify the name of the S3 bucket you created for this purpose):

aws s3 cp tenant-1-data s3://tenantrawdata
aws s3 cp tenant-2-data s3://tenantrawdata

The S3 bucket is set up to run the Lambda index function whenever data is uploaded so that the document is indexed in Elasticsearch.
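As a sketch of what happens when the trigger fires, the following Python (illustrative; the packaged lambda_index.py may use different names and logic) pulls the bucket and key from the S3 event and derives the tenant from the object-key convention used in this pattern (tenant-1-data, tenant-2-data):

```python
def tenant_from_key(key: str) -> str:
    """Infer the tenant from an uploaded object's key.

    Assumes the naming convention tenant-<n>-data used by the sample
    data files; the attached Lambda code may apply a different rule."""
    prefix = key.split("-data")[0]       # e.g. "tenant-1"
    number = prefix.rsplit("-", 1)[1]    # e.g. "1"
    return f"Tenant-{number}"            # matches the case-sensitive TenantId value

def records_from_event(event: dict):
    """Yield (bucket, key) pairs from an S3 ObjectCreated event payload."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        yield s3["bucket"]["name"], s3["object"]["key"]
```

The index function would then read each object, stamp the derived TenantId onto the document, and index it into tenant-data.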

Cloud architect, Cloud administrator

Search data from the Kibana console.

On the Kibana console, run the following query:

GET tenant-data/_search

This query displays all the documents indexed in Elasticsearch. In this case, you should see two separate documents, one for Tenant-1 and one for Tenant-2.

Cloud architect, Cloud administrator

Test the search API from API Gateway.

  1. In the API Gateway console, open the search API, choose the GET method inside the search resource, and then choose Test.

  2. In the test window, provide the following query string (case-sensitive) for the tenant ID, and then choose Test.

    TenantId=Tenant-1

    The Lambda function sends a query to Amazon OpenSearch Service that filters the tenant document based on the document-level security. The method returns the document that belongs to Tenant-1.

  3. Change the query string to:

    TenantId=Tenant-2

    This query returns the document that belongs to Tenant-2.

For screen illustrations, see the Additional information section.
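To make the flow concrete, this Python sketch (illustrative; the packaged lambda_search.py may differ) shows how the search function could read the TenantId query string from the API Gateway proxy event and build the ARN of the tenant role to assume with AWS STS, after which document-level security filters the results:

```python
def tenant_from_event(event: dict) -> str:
    """Read the TenantId query-string parameter from the API Gateway
    proxy event. The value is case-sensitive (e.g. "Tenant-1")."""
    return event["queryStringParameters"]["TenantId"]

def tenant_role_arn(account_id: str, tenant_id: str) -> str:
    # Follows the role-naming convention used earlier (e.g. Tenant-1-role).
    # The search function assumes this role before querying, so the
    # Kibana backend-role mapping and DLS filter apply to the results.
    return f"arn:aws:iam::{account_id}:role/{tenant_id}-role"
```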

Cloud architect, App developer

Related resources

Additional information

Data partitioning models

There are three common data partitioning models used in multi-tenant systems: silo, pool, and hybrid. The model you choose depends on the compliance, noisy neighbor, operations, and isolation needs of your environment.

Silo model

In the silo model, each tenant’s data is stored in a distinct storage area where there is no commingling of tenant data. You can use two approaches to implement the silo model with Amazon OpenSearch Service: domain per tenant and index per tenant.

  • Domain per tenant – You can use a separate Amazon OpenSearch Service domain (synonymous with an Elasticsearch cluster) per tenant. Placing each tenant in its own domain provides all the benefits associated with having data in a standalone construct. However, this approach introduces management and agility challenges. Its distributed nature makes it harder to aggregate and assess the operational health and activity of tenants. This is a costly option that requires each Amazon OpenSearch Service domain to have three master nodes and two data nodes for production workloads at the minimum.

Domain per tenant silo model for multi-tenant serverless architectures
  • Index per tenant – You can place tenant data in separate indexes within an Amazon OpenSearch Service cluster. With this approach, you use a tenant identifier when you create and name the index, by prepending the tenant identifier to the index name. The index per tenant approach helps you achieve your silo goals without introducing a completely separate cluster for each tenant. However, you might encounter memory pressure if the number of indexes grows, because this approach requires more shards, and the master node has to handle more allocation and rebalancing.

Index per tenant silo model for multi-tenant serverless architectures

Isolation in the silo model – In the silo model, you use IAM policies to isolate the domains or indexes that hold each tenant’s data. These policies prevent one tenant from accessing another tenant’s data. To implement your silo isolation model, you can create a resource-based policy that controls access to your tenant resource. This is often a domain access policy that specifies which actions a principal can perform on the domain’s sub-resources, including Elasticsearch indexes and APIs. With IAM identity-based policies, you can specify allowed or denied actions on the domain, indexes, or APIs within Amazon OpenSearch Service. The Action element of an IAM policy describes the specific action or actions that are allowed or denied by the policy, and the Principal element specifies the affected accounts, users, or roles.

The following sample policy grants Tenant-1 full access (as specified by es:*) to the sub-resources on the tenant-1 domain only. The trailing /* in the Resource element indicates that this policy applies to the domain’s sub-resources, not to the domain itself. When this policy is in effect, tenants are not allowed to create a new domain or modify settings on an existing domain.

{
   "Version": "2012-10-17",
   "Statement": [
      {
         "Effect": "Allow",
         "Principal": {
            "AWS": "arn:aws:iam::aws-account-id:user/Tenant-1"
         },
         "Action": "es:*",
         "Resource": "arn:aws:es:Region:account-id:domain/tenant-1/*"
      }
   ]
}

To implement the index per tenant silo model, you would need to modify this sample policy to further restrict Tenant-1 to the specified index or indexes, by specifying the index name. The following sample policy restricts Tenant-1 to the tenant-index-1 index.

{
   "Version": "2012-10-17",
   "Statement": [
      {
         "Effect": "Allow",
         "Principal": {
            "AWS": "arn:aws:iam::123456789012:user/Tenant-1"
         },
         "Action": "es:*",
         "Resource": "arn:aws:es:Region:account-id:domain/test-domain/tenant-index-1/*"
      }
   ]
}

Pool model

In the pool model, all tenant data is stored in an index within the same domain. The tenant identifier is included in the data (document) and used as the partition key, so you can determine which data belongs to which tenant. This model reduces the management overhead. Operating and managing the pooled index is easier and more efficient than managing multiple indexes. However, because tenant data is commingled within the same index, you lose the natural tenant isolation that the silo model provides. This approach might also degrade performance because of the noisy neighbor effect.
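Because the tenant identifier travels with each document in the pool model, it helps to attach it in one place before indexing. The following sketch shows this; the `tenantId` field name matches the FGAC role example later in this pattern, while the payload shape is an illustrative assumption.

```python
# Sketch: in the pool model, every document carries the tenant identifier so
# that queries (and FGAC document-level security) can filter per tenant.
# The payload fields are illustrative assumptions.

import json

def build_pooled_document(tenant_id: str, payload: dict) -> dict:
    """Return a copy of the payload with the tenant identifier attached."""
    doc = dict(payload)          # avoid mutating the caller's dict
    doc["tenantId"] = tenant_id  # partition key used for tenant isolation
    return doc

doc = build_pooled_document("Tenant-1", {"orderId": 42, "status": "shipped"})
print(json.dumps(doc))
```

Stamping the tenant ID in a single helper (for example, in the Lambda function that indexes documents) reduces the risk of a document landing in the pooled index without its partition key.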

Pool model for multi-tenant serverless architectures

Tenant isolation in the pool model – In general, tenant isolation is challenging to implement in the pool model. The IAM mechanism used with the silo model doesn’t allow you to describe isolation based on the tenant ID stored in your document.

An alternative approach is to use the fine-grained access control (FGAC) support provided by Open Distro for Elasticsearch. FGAC allows you to control permissions at the index, document, or field level. With each request, FGAC evaluates the user credentials and either authenticates the user or denies access. If FGAC authenticates the user, it fetches all roles mapped to that user and uses the complete set of permissions to determine how to handle the request.

To achieve the required isolation in the pool model, you can use document-level security, which lets you restrict a role to a subset of documents in an index. The following sample role restricts queries to Tenant-1. By applying this role to Tenant-1, you can achieve the necessary isolation.

{
   "bool": {
      "must": {
         "match": {
            "tenantId": "Tenant-1"
         }
      }
   }
}
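A role that embeds this document-level security (DLS) query can be created through the Open Distro security REST API. The following sketch builds such a role definition; the role name, index pattern, and endpoint path in the comment are assumptions to adapt to your domain, and note that Open Distro expects the DLS query as a JSON-encoded string.

```python
# Sketch: building an Open Distro FGAC role definition whose document-level
# security (DLS) query restricts reads to a single tenant's documents.
# The index pattern and the endpoint path below are illustrative assumptions.

import json

def tenant_role_definition(tenant_id: str, index_pattern: str = "pooled-index*") -> dict:
    dls_query = {"bool": {"must": {"match": {"tenantId": tenant_id}}}}
    return {
        "index_permissions": [
            {
                "index_patterns": [index_pattern],
                "dls": json.dumps(dls_query),  # DLS is passed as a JSON string
                "allowed_actions": ["read"],
            }
        ]
    }

# Apply with, for example:
#   PUT _opendistro/_security/api/roles/tenant-1-role
# using the dict above as the request body, then map the role to the tenant's user.
print(json.dumps(tenant_role_definition("Tenant-1"), indent=2))
```

After creating the role, you map it to the tenant's user or backend role so that every search that user runs is silently filtered to its own documents.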

Hybrid model

The hybrid model uses a combination of the silo and pool models in the same environment to offer unique experiences to each tenant tier (such as free, standard, and premium tiers). Each tier follows the same security profile that was used in the pool model.

Hybrid model for multi-tenant serverless architectures

Tenant isolation in the hybrid model – In the hybrid model, you follow the same security profile as in the pool model, where the FGAC security model at the document level provides tenant isolation. Although this strategy simplifies cluster management and offers agility, it complicates other aspects of the architecture. For example, your code requires additional logic to determine which model is associated with each tenant. You also have to ensure that single-tenant queries don’t saturate the entire domain and degrade the experience for other tenants.
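The per-tenant routing logic mentioned above can be sketched as a small resolver that maps a tenant's tier to its partitioning model. The tier names, the premium-gets-a-silo rule, and the index names are illustrative assumptions, not part of this pattern's reference implementation.

```python
# Sketch: routing a request to the right partitioning model in the hybrid
# approach. Tier names and the tier-to-model mapping are illustrative
# assumptions; a real system would look the tier up in a tenant registry.

def resolve_target(tenant_id: str, tier: str) -> dict:
    """Return the index a tenant's request should target and whether it is pooled."""
    if tier == "premium":
        # Silo: dedicated index (or domain) per premium tenant.
        return {"index": f"{tenant_id.lower()}-index", "pooled": False}
    # Free and standard tiers share the pooled index; FGAC document-level
    # security provides the isolation there.
    return {"index": "pooled-index", "pooled": True}

print(resolve_target("Tenant-1", "premium"))
print(resolve_target("Tenant-2", "free"))
```

Keeping this decision in one function means the rest of the indexing and search path stays model-agnostic, which limits the extra complexity the hybrid model introduces.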

Testing in API Gateway

Test window for Tenant-1 query

Test window for Tenant-2 query

Attachments

To access additional content that is associated with this document, unzip the following file: attachment.zip