Granting Amazon OpenSearch Ingestion pipelines access to collections
An Amazon OpenSearch Ingestion pipeline can write to an OpenSearch Serverless public collection or VPC collection. To provide access to the collection, you configure an AWS Identity and Access Management (IAM) pipeline role with a permissions policy that grants access to the collection. Before you specify the role in your pipeline configuration, you must configure it with an appropriate trust relationship, and then grant it data access permissions through a data access policy.
During pipeline creation, OpenSearch Ingestion creates an AWS PrivateLink connection between the pipeline and the OpenSearch Serverless collection. All traffic from the pipeline goes through this VPC endpoint and is routed to the collection. In order to reach the collection, the endpoint must be granted access to the collection through a network access policy.
Providing network access to pipelines
Each collection that you create in OpenSearch Serverless has at least one network access policy associated with it. Network access policies determine whether the collection is accessible over the internet from public networks, or whether it must be accessed privately. For more information about network policies, see Network access for Amazon OpenSearch Serverless.
Within a network access policy, you can only specify OpenSearch Serverless-managed VPC endpoints. For more information, see Access Amazon OpenSearch Serverless using an interface endpoint (AWS PrivateLink). However, in order for the pipeline to write to the collection, the policy must also grant access to the VPC endpoint that OpenSearch Ingestion automatically creates between the pipeline and the collection. Therefore, when you create a pipeline that has an OpenSearch Serverless collection sink, you must provide the name of the associated network policy using the network_policy_name option.
For example:
    ...
    sink:
      - opensearch:
          hosts: [ "https://collection-id.region.aoss.amazonaws.com" ]
          index: "my-index"
          aws:
            serverless: true
            serverless_options:
              network_policy_name: "network-policy-name"
During pipeline creation, OpenSearch Ingestion checks for the existence of the specified network policy. If it doesn't exist, OpenSearch Ingestion creates it. If it does exist, OpenSearch Ingestion updates it by adding a new rule to it. The rule grants access to the VPC endpoint that connects the pipeline and the collection.
For example:
    {
      "Rules":[
        {
          "Resource":[
            "collection/my-collection"
          ],
          "ResourceType":"collection"
        }
      ],
      "SourceVPCEs":[
        "vpce-0c510712627e27269" # The ID of the VPC endpoint that OpenSearch Ingestion creates between the pipeline and collection
      ],
      "Description":"Created by Data Prepper"
    }
In the console, any rules that OpenSearch Ingestion adds to your network policies are named Created by Data Prepper.
Note
In general, a rule that specifies public access for a collection overrides a rule that specifies private access. Therefore, if the policy already had public access configured, this new rule that OpenSearch Ingestion adds doesn't actually change the behavior of the policy. For more information, see Policy precedence.
If you stop or delete the pipeline, OpenSearch Ingestion deletes the VPC endpoint between the pipeline and the collection. It also modifies the network policy to remove the VPC endpoint from the list of allowed endpoints. If you restart the pipeline, it recreates the VPC endpoint and re-updates the network policy with the endpoint ID.
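The start and stop behavior described in this section can be modeled with a short Python sketch. It is an illustration only, not how OpenSearch Ingestion is implemented. The function names add_pipeline_rule and remove_pipeline_endpoint are hypothetical, the policy is treated as a plain list of rule dictionaries, and dropping a rule once it has no endpoints left is an assumption rather than documented behavior.

```python
import copy

DESCRIPTION = "Created by Data Prepper"


def add_pipeline_rule(policy, collection_name, vpce_id):
    """Return a copy of the policy with a rule that allows the given endpoint."""
    updated = copy.deepcopy(policy)
    updated.append({
        "Rules": [{
            "Resource": [f"collection/{collection_name}"],
            "ResourceType": "collection",
        }],
        "SourceVPCEs": [vpce_id],
        "Description": DESCRIPTION,
    })
    return updated


def remove_pipeline_endpoint(policy, vpce_id):
    """Return a copy of the policy with the endpoint removed from every rule."""
    updated = []
    for rule in copy.deepcopy(policy):
        if "SourceVPCEs" in rule:
            rule["SourceVPCEs"] = [e for e in rule["SourceVPCEs"] if e != vpce_id]
            # Assumption: a rule left with no endpoints is dropped entirely.
            if not rule["SourceVPCEs"]:
                continue
        updated.append(rule)
    return updated


# An existing policy with one OpenSearch Serverless-managed endpoint,
# using the sample IDs that appear elsewhere on this page.
existing = [{
    "Description": "Rule 1",
    "Rules": [{"ResourceType": "collection", "Resource": ["collection/my-collection"]}],
    "AllowFromPublic": False,
    "SourceVPCEs": ["vpce-050f79086ee71ac05"],
}]

started = add_pipeline_rule(existing, "my-collection", "vpce-0c510712627e27269")
stopped = remove_pipeline_endpoint(started, "vpce-0c510712627e27269")
print(len(started), len(stopped))
```

In this model, starting the pipeline adds a second rule next to the existing one, and stopping it returns the policy to its original content.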
Step 1: Create a pipeline role
The role that you specify in the sts_role_arn parameter of a pipeline configuration must have an attached permissions policy that allows it to send data to the collection sink. It must also have a trust relationship that allows OpenSearch Ingestion to assume the role. For instructions on how to attach a policy to a role, see Adding IAM identity permissions in the IAM User Guide.
The following sample policy shows the minimum permissions that the sts_role_arn role in a pipeline configuration needs in order to write to collections:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Statement1",
          "Effect": "Allow",
          "Action": [
            "aoss:APIAccessAll",
            "aoss:BatchGetCollection",
            "aoss:CreateSecurityPolicy",
            "aoss:GetSecurityPolicy",
            "aoss:UpdateSecurityPolicy"
          ],
          "Resource": "*"
        }
      ]
    }
The role must have the following trust relationship, which allows OpenSearch Ingestion to assume it:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "osis-pipelines.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
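If you prefer to generate these two documents from a script, the following Python sketch builds them as dictionaries and serializes them to JSON. It uses only the standard library and makes no AWS calls. The variable names are hypothetical, and how you pass the resulting JSON to IAM (console, CLI, or an SDK) is left to you.

```python
import json

# Permissions policy from this step: the actions the pipeline role needs.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "Statement1",
        "Effect": "Allow",
        "Action": [
            "aoss:APIAccessAll",
            "aoss:BatchGetCollection",
            "aoss:CreateSecurityPolicy",
            "aoss:GetSecurityPolicy",
            "aoss:UpdateSecurityPolicy",
        ],
        "Resource": "*",
    }],
}

# Trust relationship from this step: lets OpenSearch Ingestion assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "osis-pipelines.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Serialize both documents; the strings can be saved to files or pasted
# into the IAM console's JSON editor.
permissions_json = json.dumps(permissions_policy, indent=2)
trust_json = json.dumps(trust_policy, indent=2)
print(trust_json)
```

Keeping the two documents separate mirrors how IAM treats them: the trust relationship controls who can assume the role, and the permissions policy controls what the role can do.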
Step 2: Create a collection
Create an OpenSearch Serverless collection with the following settings. For instructions on creating a collection, see Creating collections.
Data access policy
Create a data access policy for the collection that grants the required permissions to the pipeline role. For example:
    [
      {
        "Rules": [
          {
            "Resource": [
              "index/collection-name/*"
            ],
            "Permission": [
              "aoss:CreateIndex",
              "aoss:UpdateIndex",
              "aoss:DescribeIndex",
              "aoss:WriteDocument"
            ],
            "ResourceType": "index"
          }
        ],
        "Principal": [
          "arn:aws:iam::account-id:role/pipeline-role"
        ],
        "Description": "Pipeline role access"
      }
    ]
Note
In the Principal element, specify the Amazon Resource Name (ARN) of the pipeline role that you created in the previous step.
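As a minimal sketch, the data access policy can also be generated from its three variable parts. The helper name build_data_access_policy and the sample values (my-collection, 123456789012, pipeline-role) are hypothetical and stand in for your own collection name, account ID, and role name.

```python
import json


def build_data_access_policy(collection_name, account_id, role_name):
    """Build a data access policy that lets the pipeline role write to the collection."""
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    return [{
        "Rules": [{
            # All indexes in the collection.
            "Resource": [f"index/{collection_name}/*"],
            "Permission": [
                "aoss:CreateIndex",
                "aoss:UpdateIndex",
                "aoss:DescribeIndex",
                "aoss:WriteDocument",
            ],
            "ResourceType": "index",
        }],
        # The pipeline role created in step 1.
        "Principal": [role_arn],
        "Description": "Pipeline role access",
    }]


# Hypothetical sample values for illustration only.
policy = build_data_access_policy("my-collection", "123456789012", "pipeline-role")
print(json.dumps(policy, indent=2))
```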
Network access policy
Create a network access policy for the collection. You can ingest data into a public collection or a VPC collection. For example, the following policy provides access to a single OpenSearch Serverless-managed VPC endpoint:
    [
      {
        "Description":"Rule 1",
        "Rules":[
          {
            "ResourceType":"collection",
            "Resource":[
              "collection/collection-name"
            ]
          }
        ],
        "AllowFromPublic": false,
        "SourceVPCEs":[
          "vpce-050f79086ee71ac05"
        ]
      }
    ]
Important
You must specify the name of the network policy within the network_policy_name option in the pipeline configuration. At the time of pipeline creation, OpenSearch Ingestion updates this network policy to allow access to the VPC endpoint that it automatically creates between the pipeline and the collection. See step 3 for an example pipeline configuration. For more information, see Providing network access to pipelines.
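The network access policy can be sketched the same way, covering both the public and the VPC case. The helper name build_network_policy is hypothetical, and omitting SourceVPCEs for a public policy and rejecting a policy that allows neither public access nor any endpoint are assumptions made for this example.

```python
import json


def build_network_policy(collection_name, vpce_ids=None, allow_from_public=False):
    """Build a network access policy for a public or VPC collection."""
    vpce_ids = list(vpce_ids or [])
    if not allow_from_public and not vpce_ids:
        # Assumption: a private policy is only useful with at least one endpoint.
        raise ValueError("provide at least one VPC endpoint or allow public access")
    rule = {
        "Description": "Rule 1",
        "Rules": [{
            "ResourceType": "collection",
            "Resource": [f"collection/{collection_name}"],
        }],
        "AllowFromPublic": allow_from_public,
    }
    if vpce_ids:
        rule["SourceVPCEs"] = vpce_ids
    return [rule]


# A VPC collection reachable through one OpenSearch Serverless-managed endpoint,
# using the sample endpoint ID from the policy shown earlier in this step.
private_policy = build_network_policy("my-collection", ["vpce-050f79086ee71ac05"])

# A public collection.
public_policy = build_network_policy("my-collection", allow_from_public=True)

print(json.dumps(private_policy, indent=2))
```

Whichever variant you create, the policy name is what you later pass in the network_policy_name option.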
Step 3: Create a pipeline
Finally, create a pipeline in which you specify the pipeline role and collection details. The pipeline assumes this role in order to sign requests to the OpenSearch Serverless collection sink.
Make sure to do the following:
- For the hosts option, specify the endpoint of the collection that you created in step 2.
- For the sts_role_arn option, specify the Amazon Resource Name (ARN) of the pipeline role that you created in step 1.
- Set the serverless option to true.
- Set the network_policy_name option to the name of the network policy attached to the collection. OpenSearch Ingestion automatically updates this network policy to allow access from the VPC endpoint that it creates between the pipeline and the collection. For more information, see Providing network access to pipelines.
    version: "2"
    log-pipeline:
      source:
        http:
          path: "/log/ingest"
      processor:
        - date:
            from_time_received: true
            destination: "@timestamp"
      sink:
        - opensearch:
            hosts: [ "https://collection-id.region.aoss.amazonaws.com" ]
            index: "my-index"
            aws:
              serverless: true
              serverless_options:
                network_policy_name: "network-policy-name" # If the policy doesn't exist, a new policy is created.
              region: "us-east-1"
              sts_role_arn: "arn:aws:iam::account-id:role/pipeline-role"
For a full reference of required and unsupported parameters, see Supported plugins and options for Amazon OpenSearch Ingestion pipelines.
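The checklist in this step can be expressed as a small pre-flight check. The sketch below is not part of OpenSearch Ingestion or Data Prepper: the function check_serverless_sink is hypothetical, and it assumes that the opensearch sink has already been loaded into a Python dictionary with the same nesting as the example configuration.

```python
def check_serverless_sink(sink):
    """Return a list of problems found in an opensearch sink configuration."""
    problems = []
    aws = sink.get("aws", {})
    if not sink.get("hosts"):
        problems.append("hosts must contain the collection endpoint")
    if not aws.get("sts_role_arn"):
        problems.append("aws.sts_role_arn must be set to the pipeline role ARN")
    if aws.get("serverless") is not True:
        problems.append("aws.serverless must be true")
    if not aws.get("serverless_options", {}).get("network_policy_name"):
        problems.append("aws.serverless_options.network_policy_name must be set")
    return problems


# The sink from the example configuration, with its placeholders left in place.
good_sink = {
    "hosts": ["https://collection-id.region.aoss.amazonaws.com"],
    "index": "my-index",
    "aws": {
        "serverless": True,
        "serverless_options": {"network_policy_name": "network-policy-name"},
        "region": "us-east-1",
        "sts_role_arn": "arn:aws:iam::account-id:role/pipeline-role",
    },
}

# A sink that omits the serverless settings.
bad_sink = {"hosts": ["https://collection-id.region.aoss.amazonaws.com"], "aws": {}}

print(check_serverless_sink(good_sink))
print(check_serverless_sink(bad_sink))
```

The check only confirms that the four options are present. It does not verify that the role, the collection, or the network policy exist.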