Configuring VPC access for Amazon OpenSearch Ingestion pipelines
You can access your Amazon OpenSearch Ingestion pipelines using an interface VPC endpoint. A VPC is a virtual network that's dedicated to your AWS account. It's logically isolated from other virtual networks in the AWS Cloud. Accessing a pipeline through a VPC endpoint enables secure communication between OpenSearch Ingestion and other services within the VPC without the need for an internet gateway, NAT device, or VPN connection. All traffic remains securely within the AWS Cloud.
OpenSearch Ingestion establishes this private connection by creating an interface endpoint, powered by AWS PrivateLink. We create an endpoint network interface in each subnet that you specify during pipeline creation. These are requester-managed network interfaces that serve as the entry point for traffic destined for the OpenSearch Ingestion pipeline. You can also choose to create and manage the interface endpoints yourself.
Using a VPC allows you to enforce data flow through your OpenSearch Ingestion pipelines within the boundaries of the VPC, rather than over the public internet. Pipelines that aren't within a VPC send and receive data over public-facing endpoints and the internet.
A pipeline with VPC access can write to public or VPC OpenSearch Service domains, and to public or VPC OpenSearch Serverless collections.
Considerations
Consider the following when you configure VPC access for a pipeline.
- A pipeline doesn't need to be in the same VPC as its sink. You also don't need to establish a connection between the two VPCs. OpenSearch Ingestion takes care of connecting them for you.
- You can only specify one VPC for your pipeline.
- Unlike with public pipelines, a VPC pipeline must be in the same AWS Region as the domain or collection sink that it's writing to.
- You can choose to deploy a pipeline into one, two, or three subnets of your VPC. The subnets are distributed across the same Availability Zones that your Ingestion OpenSearch Compute Units (OCUs) are deployed in.
- If you only deploy a pipeline in one subnet and the Availability Zone goes down, you won't be able to ingest data. To ensure high availability, we recommend that you configure pipelines with two or three subnets.
- Specifying a security group is optional. If you don't provide a security group, OpenSearch Ingestion uses the default security group that is specified in the VPC.
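The subnet guidance above can be checked before you create a pipeline. The following is a minimal sketch, not part of any AWS SDK: the function name `review_subnet_plan` and its input shape (a mapping of subnet ID to Availability Zone name) are assumptions made for illustration.

```python
def review_subnet_plan(subnet_azs):
    """Return warnings for a proposed pipeline subnet layout.

    subnet_azs is a hypothetical input: a dict of subnet ID -> Availability Zone.
    """
    # A pipeline can be deployed into one, two, or three subnets.
    if not 1 <= len(subnet_azs) <= 3:
        raise ValueError("specify one, two, or three subnets")

    warnings = []
    zones = set(subnet_azs.values())
    # With a single Availability Zone, ingestion stops if that zone goes down.
    if len(zones) < 2:
        warnings.append(
            "all subnets are in one Availability Zone; "
            "use two or three subnets in different zones for high availability"
        )
    return warnings


if __name__ == "__main__":
    plan = {"subnet-aaaa1111": "us-east-1a", "subnet-bbbb2222": "us-east-1b"}
    print(review_subnet_plan(plan))
```

The check only mirrors the recommendations in this section. It doesn't call AWS, so the zone names must come from your own subnet inventory.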
Limitations
Pipelines with VPC access have the following limitations.
- You can't change a pipeline's network configuration after you create it. If you launch a pipeline within a VPC, you can't later change it to a public endpoint, and vice versa.
- You can either launch your pipeline with an interface VPC endpoint or a public endpoint, but you can't do both. You must choose one or the other when you create a pipeline.
- After you provision a pipeline with VPC access, you can't move it to a different VPC, and you can't change its subnets or security group settings.
- If your pipeline writes to a domain or collection sink that uses VPC access, you can't change the sink (to either a VPC or a public sink) after the pipeline is created. You must delete and recreate the pipeline with the new sink. You can still switch from a public sink to a sink with VPC access.
- You can't provide cross-account ingestion access to VPC pipelines.
Prerequisites
Before you can provision a pipeline with VPC access, you must do the following:
- Create a VPC
  To create your VPC, you can use the Amazon VPC console, the AWS CLI, or one of the AWS SDKs. For more information, see Working with VPCs in the Amazon VPC User Guide. If you already have a VPC, you can skip this step.
- Reserve IP addresses
  OpenSearch Ingestion places an elastic network interface in each subnet that you specify during pipeline creation. Each network interface is associated with an IP address. You must reserve one IP address per subnet for the network interfaces.
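One way to confirm that each subnet has an address available for the network interface is to inspect the `AvailableIpAddressCount` field that the Amazon EC2 `DescribeSubnets` API returns. The sketch below assumes boto3 and configured credentials for the live call; the helper names are hypothetical.

```python
def subnets_without_free_ip(subnets, needed_per_subnet=1):
    """Return the IDs of subnets with too few free IP addresses.

    subnets is a list of dicts shaped like the "Subnets" entries
    returned by the EC2 DescribeSubnets API.
    """
    return [
        subnet["SubnetId"]
        for subnet in subnets
        if subnet["AvailableIpAddressCount"] < needed_per_subnet
    ]


def fetch_subnets(subnet_ids):
    """Fetch subnet descriptions from EC2 (requires boto3 and credentials)."""
    import boto3  # imported lazily so the check above works without boto3

    ec2 = boto3.client("ec2")
    return ec2.describe_subnets(SubnetIds=list(subnet_ids))["Subnets"]


if __name__ == "__main__":
    sample = [
        {"SubnetId": "subnet-aaaa1111", "AvailableIpAddressCount": 12},
        {"SubnetId": "subnet-bbbb2222", "AvailableIpAddressCount": 0},
    ]
    print(subnets_without_free_ip(sample))
```

A free address at the time of the check is not a reservation. Other resources can still consume it before the pipeline is created.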
Configuring VPC access for a pipeline
You can enable VPC access for a pipeline within the OpenSearch Service console or using the AWS CLI.
You configure VPC access during pipeline creation. In the console, under Network, choose VPC access and configure the following settings:
| Setting | Description |
|---|---|
| Endpoint management | Choose whether you want to create your VPC endpoints yourself, or have OpenSearch Ingestion create them for you. |
| VPC | Choose the ID of the virtual private cloud (VPC) that you want to use. The VPC and pipeline must be in the same AWS Region. |
| Subnets | Choose one or more subnets. OpenSearch Service will place a VPC endpoint and elastic network interfaces in the subnets. |
| Security groups | Choose one or more VPC security groups that allow your required application to reach the OpenSearch Ingestion pipeline on the ports (80 or 443) and protocols (HTTP or HTTPS) exposed by the pipeline. |
| VPC attachment options | If your source is a self-managed endpoint, attach your pipeline to a VPC. Choose one of the default CIDR options provided, or use a custom CIDR. |
To configure VPC access using the AWS CLI, specify the --vpc-options parameter:

    aws osis create-pipeline \
      --pipeline-name vpc-pipeline \
      --min-units 4 \
      --max-units 10 \
      --vpc-options SecurityGroupIds={sg-12345678,sg-9012345},SubnetIds=subnet-1212234567834asdf \
      --pipeline-configuration-body "file://pipeline-config.yaml"
Self-managed VPC endpoints
When you create a pipeline, you can use endpoint management to create a pipeline with self-managed endpoints or service-managed endpoints. Endpoint management is optional, and defaults to endpoints managed by OpenSearch Ingestion.
To create a pipeline with a self-managed VPC endpoint in the AWS Management Console, see Creating pipelines with the OpenSearch Service console. To create a pipeline with a self-managed VPC endpoint in the AWS CLI, you can use the --vpc-options parameter in the create-pipeline command:
--vpc-options SubnetIds=subnet-abcdef01234567890,VpcEndpointManagement=CUSTOMER
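If you create pipelines from code rather than the CLI, the `--vpc-options` values correspond to a `VpcOptions` structure in the `CreatePipeline` request. The following is a rough sketch that assumes boto3; `build_vpc_options` and `create_vpc_pipeline` are hypothetical helper names, and the `SERVICE` value for service-managed endpoints is stated from the API's enumeration rather than from this page.

```python
def build_vpc_options(subnet_ids, security_group_ids=None, endpoint_management=None):
    """Build the VpcOptions structure for a CreatePipeline request."""
    if not 1 <= len(subnet_ids) <= 3:
        raise ValueError("specify one, two, or three subnets")

    options = {"SubnetIds": list(subnet_ids)}
    # Security groups are optional; omit the key to use the VPC default.
    if security_group_ids:
        options["SecurityGroupIds"] = list(security_group_ids)
    # Endpoint management is optional and defaults to service-managed.
    if endpoint_management is not None:
        if endpoint_management not in ("SERVICE", "CUSTOMER"):
            raise ValueError("endpoint_management must be SERVICE or CUSTOMER")
        options["VpcEndpointManagement"] = endpoint_management
    return options


def create_vpc_pipeline(name, config_body, vpc_options, min_units=4, max_units=10):
    """Create the pipeline (requires boto3 and credentials).

    config_body is the pipeline configuration text itself, not a file:// path.
    """
    import boto3  # imported lazily so build_vpc_options works without boto3

    osis = boto3.client("osis")
    return osis.create_pipeline(
        PipelineName=name,
        MinUnits=min_units,
        MaxUnits=max_units,
        PipelineConfigurationBody=config_body,
        VpcOptions=vpc_options,
    )


if __name__ == "__main__":
    print(build_vpc_options(["subnet-abcdef01234567890"], endpoint_management="CUSTOMER"))
```

Because the network configuration can't be changed after creation, validating the options before the call is cheaper than deleting and recreating the pipeline.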
To create an endpoint to your pipeline yourself, you specify the pipeline's endpoint service. To find your endpoint service, use the get-pipeline command, which returns a response similar to the following:

    "vpcEndpointService" : "com.amazonaws.osis.us-east-1.pipeline-id-1234567890abcdef1234567890",
    "vpcEndpoints" : [
      {
        "vpcId" : "vpc-1234567890abcdef0",
        "vpcOptions" : {
          "subnetIds" : [
            "subnet-abcdef01234567890",
            "subnet-021345abcdef6789"
          ],
          "vpcEndpointManagement" : "CUSTOMER"
        }
      }
    ]
Use the vpcEndpointService value from the response to create a VPC endpoint with the AWS Management Console or AWS CLI.
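As an illustration, the fields in the response fragment can be turned into the parameters of an Amazon EC2 `CreateVpcEndpoint` request. This is a sketch under several assumptions: the helper names are hypothetical, the endpoint reuses the pipeline's own subnets, and the key lookup accepts both the camelCase spelling shown above and the PascalCase spelling that the SDKs return.

```python
def _get(mapping, key):
    """Look up key, falling back to its PascalCase spelling."""
    if key in mapping:
        return mapping[key]
    return mapping[key[0].upper() + key[1:]]


def endpoint_request_from_pipeline(pipeline, security_group_ids=None):
    """Build CreateVpcEndpoint keyword arguments from a pipeline description."""
    endpoint = _get(pipeline, "vpcEndpoints")[0]
    request = {
        "VpcEndpointType": "Interface",
        "ServiceName": _get(pipeline, "vpcEndpointService"),
        "VpcId": _get(endpoint, "vpcId"),
        # Assumption: place the endpoint in the same subnets as the pipeline.
        "SubnetIds": list(_get(_get(endpoint, "vpcOptions"), "subnetIds")),
    }
    if security_group_ids:
        request["SecurityGroupIds"] = list(security_group_ids)
    return request


if __name__ == "__main__":
    sample = {
        "vpcEndpointService": "com.amazonaws.osis.us-east-1.pipeline-id-1234567890abcdef1234567890",
        "vpcEndpoints": [
            {
                "vpcId": "vpc-1234567890abcdef0",
                "vpcOptions": {
                    "subnetIds": ["subnet-abcdef01234567890", "subnet-021345abcdef6789"],
                    "vpcEndpointManagement": "CUSTOMER",
                },
            }
        ],
    }
    print(endpoint_request_from_pipeline(sample))
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("ec2").create_vpc_endpoint(**request)`.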
If you use self-managed VPC endpoints, you must enable the DNS attributes enableDnsSupport and enableDnsHostnames in your VPC. Note that if you stop and restart a pipeline that has a self-managed endpoint, you must recreate the VPC endpoint in your account.
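The two DNS attributes can be verified programmatically before you create the endpoint. The sketch below assumes boto3 for the live lookup and uses the Amazon EC2 `DescribeVpcAttribute` API, which reports one attribute per call; the helper names are hypothetical.

```python
REQUIRED_DNS_ATTRIBUTES = ("enableDnsSupport", "enableDnsHostnames")


def missing_dns_attributes(attribute_values):
    """Return the required DNS attributes that are not enabled.

    attribute_values maps attribute name -> bool.
    """
    return [
        name
        for name in REQUIRED_DNS_ATTRIBUTES
        if not attribute_values.get(name, False)
    ]


def read_dns_attributes(vpc_id):
    """Read both attributes from EC2 (requires boto3 and credentials)."""
    import boto3  # imported lazily so the check above works without boto3

    ec2 = boto3.client("ec2")
    values = {}
    for name in REQUIRED_DNS_ATTRIBUTES:
        response = ec2.describe_vpc_attribute(VpcId=vpc_id, Attribute=name)
        # The response key is the attribute name with a capital first letter.
        values[name] = response[name[0].upper() + name[1:]]["Value"]
    return values


if __name__ == "__main__":
    print(missing_dns_attributes({"enableDnsSupport": True, "enableDnsHostnames": False}))
```

A missing attribute can be enabled with the EC2 `ModifyVpcAttribute` API, for example `modify_vpc_attribute(VpcId=vpc_id, EnableDnsHostnames={"Value": True})`, which accepts one attribute per call.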
Service-linked role for VPC access
A service-linked role is a unique type of IAM role that delegates permissions to a service so that it can create and manage resources on your behalf. If you choose a service-managed VPC endpoint, OpenSearch Ingestion requires a service-linked role called AWSServiceRoleForAmazonOpenSearchIngestionService to access your VPC, create the pipeline endpoint, and place network interfaces in a subnet of your VPC.
If you choose a self-managed VPC endpoint, OpenSearch Ingestion requires a service-linked role called AWSServiceRoleForOpensearchIngestionSelfManagedVpce. For more information on these roles, their permissions, and how to delete them, see Using service-linked roles to create OpenSearch Ingestion pipelines.
OpenSearch Ingestion automatically creates the role when you create an ingestion pipeline. For this automatic creation to succeed, the user creating the first pipeline in an account must have permissions for the iam:CreateServiceLinkedRole action. To learn more, see Service-linked role permissions in the IAM User Guide. You can view the role in the AWS Identity and Access Management (IAM) console after it's created.
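A policy statement that grants the `iam:CreateServiceLinkedRole` action might look like the output of the following sketch. The service principal `osis.amazonaws.com` used in the condition is an assumption, as is the broad `Resource` value; confirm both against the service-linked role documentation before you attach the policy.

```python
import json


def service_linked_role_statement(service_name="osis.amazonaws.com"):
    """Build a statement allowing creation of a service-linked role.

    service_name is an assumed service principal for OpenSearch Ingestion.
    """
    return {
        "Effect": "Allow",
        "Action": "iam:CreateServiceLinkedRole",
        "Resource": "*",
        # Restrict the permission to roles linked to one service.
        "Condition": {"StringEquals": {"iam:AWSServiceName": service_name}},
    }


policy = {
    "Version": "2012-10-17",
    "Statement": [service_linked_role_statement()],
}

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
```

The `iam:AWSServiceName` condition key keeps the user from creating service-linked roles for unrelated services.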