Setting up a custom domain for the Apache Airflow web server
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) lets you set up a custom domain for the managed Apache Airflow web server. Using a custom domain, you can access your environment's Amazon MWAA managed Apache Airflow web server using the Apache Airflow UI, the Apache Airflow CLI, or the Apache Airflow web server.
Note
You can use a custom domain only with a private web server without internet access.
Use cases for a custom domain on Amazon MWAA
-
Share the web server domain across your cloud application on AWS: Using a custom domain lets you define a user-friendly URL for accessing the web server, instead of the generated service domain name. You can store this custom domain and share it as an environment variable in your applications.
-
Access a private web server: If you want to configure access for a web server in a VPC with no internet access, using a custom domain simplifies the URL redirection workflow.
Configure the custom domain
To configure the custom domain feature, provide the custom domain value through the webserver.base_url Apache Airflow configuration option when creating or updating your Amazon MWAA environment. The following constraints apply to your custom domain name:
-
The value should be a fully qualified domain name (FQDN) without any protocol or path. For example, your-custom-domain.com.
-
Amazon MWAA does not allow a path in the URL. For example, your-custom-domain.com/dags/ is not a valid custom domain name.
-
The URL length is limited to 255 ASCII characters.
-
If you provide an empty string, the environment is created with the default web server URL generated by Amazon MWAA.
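The constraints above can be checked locally before calling the AWS CLI. The following is a minimal sketch, not an Amazon MWAA tool: the function name is_valid_custom_domain is hypothetical, and the character and dot checks are assumptions about what counts as an FQDN that are somewhat stricter than the text (for example, a trailing dot is rejected, and per-label length limits are not checked).

```shell
# Hypothetical helper (not part of the AWS CLI): succeeds when the value meets
# the constraints listed above for webserver.base_url.
is_valid_custom_domain() (
  LC_ALL=C    # keep the character ranges below ASCII-only
  domain=$1

  # An empty string is accepted: Amazon MWAA then uses its generated URL.
  if [ -z "$domain" ]; then
    return 0
  fi

  # Reject values longer than 255 characters.
  if [ "${#domain}" -gt 255 ]; then
    return 1
  fi

  case $domain in
    *://* | */*)       return 1 ;;  # protocol or path present
    *[!A-Za-z0-9.-]*)  return 1 ;;  # assumption: only letters, digits, dot, hyphen
    .* | *. | *..*)    return 1 ;;  # assumption: no leading, trailing, or doubled dot
    *.*)               return 0 ;;  # at least two labels
    *)                 return 1 ;;  # single label, not fully qualified
  esac
)
```

A script could call the helper as `is_valid_custom_domain "my-custom-domain.com" || exit 1` before passing the value to create-environment or update-environment.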
The following example shows using the AWS CLI to create an environment with a custom web server domain name.
$ aws mwaa create-environment \
    --name my-mwaa-env \
    --source-bucket-arn arn:aws:s3:::my-bucket \
    --dag-s3-path dags \
    --airflow-configuration-options '{"webserver.base_url":"my-custom-domain.com"}' \
    --network-configuration '{"SubnetIds":["subnet-0123456789abcdef","subnet-fedcba9876543210"]}' \
    --execution-role-arn arn:aws:iam::123456789012:role/my-execution-role
After the environment is created or updated, you need to set up the networking infrastructure in your AWS account to access the private web server via the custom domain.
To revert to the default service-generated URL, update your private environment and remove the webserver.base_url configuration option.
Set up the networking infrastructure
Use the following steps to set up the required networking infrastructure to use with your custom domain in your AWS account.
-
Get the IP addresses for the Amazon VPC endpoint network interfaces (ENIs). To do this, first use get-environment to find the WebserverVpcEndpointService for your environment.
$ aws mwaa get-environment --name your-environment-name
If successful, you'll see output similar to the following.
{
    "Environment": {
        "AirflowConfigurationOptions": {},
        "AirflowVersion": "latest-version",
        "Arn": "environment-arn",
        "CreatedAt": "2024-06-01T01:00:00-00:00",
        "DagS3Path": "dags",
        .
        .
        .
        "WebserverVpcEndpointService": "web-server-vpc-endpoint-service",
        "WeeklyMaintenanceWindowStart": "TUE:21:30"
    }
}
Note the WebserverVpcEndpointService value and use it for web-server-vpc-endpoint-service in the --filters Name=service-name,Values=web-server-vpc-endpoint-service option of the following Amazon EC2 describe-vpc-endpoints command.
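Rather than reading the value out of the full JSON, the AWS CLI's global --query and --output options can print it directly. The sketch below wraps that call in a function; the function name get_webserver_endpoint_service is hypothetical, and the JMESPath expression assumes the output shape shown above.

```shell
# Sketch: print only the WebserverVpcEndpointService value for an environment.
# $1 = environment name. The function name is hypothetical; --query and
# --output are global AWS CLI options.
get_webserver_endpoint_service() {
  aws mwaa get-environment \
    --name "$1" \
    --query 'Environment.WebserverVpcEndpointService' \
    --output text
}
```

For example, `SERVICE_NAME=$(get_webserver_endpoint_service your-environment-name)` stores the value for use in the describe-vpc-endpoints filter.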
-
Retrieve the Amazon VPC endpoint details. This command fetches details about Amazon VPC endpoints that match a specific service name, returning the endpoint ID and associated network interface IDs in a text format.
$ aws ec2 describe-vpc-endpoints \
    --filters Name=service-name,Values=web-server-vpc-endpoint-service \
    --query 'VpcEndpoints[*].{EndpointId:VpcEndpointId,NetworkInterfaceIds:NetworkInterfaceIds}' \
    --output text
-
Get the network interface details. This command retrieves private IP addresses for each network interface associated with the Amazon VPC endpoints identified in the previous step.
$ for eni_id in $(
    aws ec2 describe-vpc-endpoints \
      --filters Name=service-name,Values=web-server-vpc-endpoint-service \
      --query 'VpcEndpoints[*].NetworkInterfaceIds' \
      --output text
  ); do
    aws ec2 describe-network-interfaces \
      --network-interface-ids "$eni_id" \
      --query 'NetworkInterfaces[*].PrivateIpAddresses[*].PrivateIpAddress' \
      --output text
  done
-
Use create-target-group to create a new target group. You will use this target group to register the IP addresses for your web server Amazon VPC endpoints.
$ aws elbv2 create-target-group \
    --name new-target-group-name \
    --protocol HTTPS \
    --port 443 \
    --vpc-id web-server-vpc-id \
    --target-type ip \
    --health-check-protocol HTTPS \
    --health-check-port 443 \
    --health-check-path / \
    --health-check-enabled \
    --matcher 'HttpCode="200,302"'
Register the IP addresses using the register-targets command.
$ aws elbv2 register-targets \
    --target-group-arn target-group-arn \
    --targets Id=ip-address-1 Id=ip-address-2
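When the private IP addresses come from a script rather than being typed by hand, a small helper can turn them into the Id=... form that --targets expects. This is only a sketch: the function name build_target_args and the IP addresses in the comment are hypothetical.

```shell
# Hypothetical helper: print one Id=<ip> argument per line for each IP address given.
build_target_args() {
  if [ "$#" -gt 0 ]; then
    printf 'Id=%s\n' "$@"
  fi
}

# Example use (IP addresses are placeholders). The command substitution is left
# unquoted on purpose so that each Id=<ip> becomes a separate argument:
#   aws elbv2 register-targets \
#     --target-group-arn target-group-arn \
#     --targets $(build_target_args 10.0.1.15 10.0.2.27)
```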
-
Request an ACM certificate. Skip this step if you are using an existing certificate.
$ aws acm request-certificate \
    --domain-name my-custom-domain.com \
    --validation-method DNS
-
Configure an Application Load Balancer. First, create the load balancer, then create a listener for the load balancer. Specify the ACM certificate you created in the previous step.
$ aws elbv2 create-load-balancer \
    --name my-mwaa-lb \
    --type application \
    --subnets subnet-id-1 subnet-id-2
$ aws elbv2 create-listener \
    --load-balancer-arn load-balancer-arn \
    --protocol HTTPS \
    --port 443 \
    --ssl-policy ELBSecurityPolicy-2016-08 \
    --certificates CertificateArn=acm-certificate-arn \
    --default-actions Type=forward,TargetGroupArn=target-group-arn
If you use a Network Load Balancer in a private subnet, set up a bastion host or AWS VPN tunnel to access the web server.
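The alias record in the Route 53 step needs the load balancer's hosted zone ID and DNS name. One way to look them up, sketched here with a hypothetical function name get_lb_alias_target, is to query describe-load-balancers for the CanonicalHostedZoneId and DNSName fields.

```shell
# Sketch: print the load balancer's hosted zone ID and DNS name, tab-separated.
# $1 = load balancer name. The function name is hypothetical.
get_lb_alias_target() {
  aws elbv2 describe-load-balancers \
    --names "$1" \
    --query 'LoadBalancers[0].[CanonicalHostedZoneId,DNSName]' \
    --output text
}
```

Calling `get_lb_alias_target my-mwaa-lb` prints the two values to substitute for load-balancer-hosted-zone-id and load-balancer-dns-name.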
-
Create a hosted zone using Route 53 for the domain.
$ aws route53 create-hosted-zone \
    --name my-custom-domain.com \
    --caller-reference 1
Create an A record for the domain. To do this using the AWS CLI, get the hosted zone ID using list-hosted-zones-by-name, then apply the record with change-resource-record-sets.
$ HOSTED_ZONE_ID=$(aws route53 list-hosted-zones-by-name \
    --dns-name my-custom-domain.com \
    --query 'HostedZones[0].Id' --output text)
$ aws route53 change-resource-record-sets \
    --hosted-zone-id "$HOSTED_ZONE_ID" \
    --change-batch '{
      "Changes": [
        {
          "Action": "CREATE",
          "ResourceRecordSet": {
            "Name": "my-custom-domain.com",
            "Type": "A",
            "AliasTarget": {
              "HostedZoneId": "load-balancer-hosted-zone-id",
              "DNSName": "load-balancer-dns-name",
              "EvaluateTargetHealth": true
            }
          }
        }
      ]
    }'
-
Update the security group rules for the web server Amazon VPC endpoint to follow the principle of least privilege by allowing HTTPS traffic only from the public subnets where the Application Load Balancer is located. Save the following JSON locally. For example, as sg-ingress-ip-permissions.json.
[
  {
    "IpProtocol": "tcp",
    "FromPort": 443,
    "ToPort": 443,
    "UserIdGroupPairs": [
      { "GroupId": "load-balancer-security-group-id" }
    ],
    "IpRanges": [
      { "CidrIp": "public-subnet-1-cidr" },
      { "CidrIp": "public-subnet-2-cidr" }
    ]
  }
]
Run the following Amazon EC2 command to update your ingress security group rules. Specify the JSON file for --ip-permissions.
$ aws ec2 authorize-security-group-ingress \
    --group-id <security-group-id> \
    --ip-permissions file://sg-ingress-ip-permissions.json
Run the following Amazon EC2 command to update your egress rules.
$ aws ec2 authorize-security-group-egress \
    --group-id webserver-vpc-endpoint-security-group-id \
    --protocol tcp \
    --port 443 \
    --source-group load-balancer-security-group-id
Open the Amazon MWAA console and navigate to the Apache Airflow UI. If you are setting up a Network Load Balancer in a private subnet instead of the Application Load Balancer used here, you must access the web server with one of the following options.