Using the Apache Airflow REST API
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) supports interacting with your Apache Airflow environments directly using the Apache Airflow REST API for environments running Apache Airflow v2.4.3 and later. This lets you access and manage your Amazon MWAA environments programmatically, providing a standardized way to invoke data orchestration workflows, manage your DAGs, and monitor the status of various Apache Airflow components such as the metadata database, triggerer, and scheduler.
To support scalability while using the Apache Airflow REST API, Amazon MWAA provides you with the option to horizontally scale webserver capacity to handle increased demand, whether from REST API requests, command line interface (CLI) usage, or more concurrent Apache Airflow user interface (UI) users. For more information about how Amazon MWAA scales webservers, refer to Configuring Amazon MWAA webserver automatic scaling.
You can use the Apache Airflow REST API to implement the following use cases for your environments:
- Programmatic access – You can start Apache Airflow DAG runs, manage datasets, and retrieve the status of various components such as the metadata database, triggerers, and schedulers without relying on the Apache Airflow UI or CLI.
- Integrate with external applications and microservices – REST API support lets you build custom solutions that integrate your Amazon MWAA environments with other systems. For example, you can start workflows in response to events from external systems, such as completed database jobs or new user sign-ups.
- Centralized monitoring – You can build monitoring dashboards that aggregate the status of your DAGs across multiple Amazon MWAA environments.
For more information about the Apache Airflow REST API, refer to the Apache Airflow REST API Reference.
By using InvokeRestApi, you can access the Apache Airflow REST API using AWS credentials. Alternatively, you can obtain a webserver access token and use that token to call the REST API.
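As a rough illustration of the monitoring use case, the following sketch reads component status through InvokeRestApi. It assumes that `client` is a boto3 MWAA client (`boto3.client("mwaa")`), that the environment name is one you supply, and that the Apache Airflow `/health` endpoint returns one object per component with a `status` field, as described in the Apache Airflow REST API Reference. The helper name `get_component_health` is hypothetical.

```python
def get_component_health(client, env_name):
    """Return a {component: status} mapping from the Airflow /health endpoint.

    `client` is assumed to be a boto3 MWAA client; `env_name` is the name of
    your Amazon MWAA environment.
    """
    response = client.invoke_rest_api(
        Name=env_name,
        Path="/health",
        Method="GET",
    )
    health = response["RestApiResponse"]
    # Assumption: each component entry is an object with a "status" key.
    return {
        component: details.get("status")
        for component, details in health.items()
        if isinstance(details, dict)
    }


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   print(get_component_health(boto3.client("mwaa"), "MyAirflowEnvironment"))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before you point it at a real environment.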
If you encounter an error with the message "Update your environment to use InvokeRestApi" while using the InvokeRestApi operation, your Amazon MWAA environment is not compatible with the latest changes related to the InvokeRestApi feature. To resolve this issue, update your Amazon MWAA environment to incorporate the necessary changes.
The InvokeRestApi operation has a default timeout of 10 seconds. If the operation does not complete within 10 seconds, it is automatically terminated and an error is raised. Design your REST API calls to complete within this timeout period.
Important
The response payload size cannot exceed 6 MB. The RestApi call fails if this limit is exceeded.
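One way to keep each response well under the payload limit is to request list endpoints in pages. The following sketch assumes that `client` is a boto3 MWAA client and that the Apache Airflow `/dags` endpoint accepts `limit` and `offset` query parameters and returns `dags` and `total_entries` fields, as described in the Apache Airflow REST API Reference. The helper name `list_all_dag_ids` and the default page size are illustrative choices, not Amazon MWAA requirements.

```python
def list_all_dag_ids(client, env_name, page_size=100):
    """Collect every DAG ID by paging through /dags with limit and offset."""
    dag_ids = []
    offset = 0
    while True:
        response = client.invoke_rest_api(
            Name=env_name,
            Path="/dags",
            Method="GET",
            QueryParameters={"limit": page_size, "offset": offset},
        )
        body = response["RestApiResponse"]
        page = body.get("dags", [])
        dag_ids.extend(dag["dag_id"] for dag in page)
        # Advance by the number of items actually returned, in case the
        # server caps the page size below the requested limit.
        offset += len(page)
        if not page or offset >= body.get("total_entries", 0):
            break
    return dag_ids
```

Each call is a separate InvokeRestApi request, so smaller pages also help every individual call finish within the operation's timeout.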
Use the following examples to make API calls to the Apache Airflow REST API and start a new DAG run:
Granting access to the Apache Airflow REST API: airflow:InvokeRestApi
To access the Apache Airflow REST API using AWS credentials, you must grant the airflow:InvokeRestApi permission in your IAM policy. In the following policy sample, specify the Admin, Op, User, Viewer, or Public role in {airflow-role} to customize the level of user access. For more information, refer to Default Roles.
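The following sketch builds such a policy document in Python so that the role can be validated before the policy is created. The resource ARN layout (`arn:aws:airflow:<region>:<account-id>:role/<environment-name>/<airflow-role>`), the statement ID, and the helper name `build_invoke_rest_api_policy` are assumptions for illustration. Verify the ARN format against the Amazon MWAA IAM documentation before using it.

```python
import json

ALLOWED_AIRFLOW_ROLES = {"Admin", "Op", "User", "Viewer", "Public"}


def build_invoke_rest_api_policy(region, account_id, env_name, airflow_role):
    """Return an IAM policy document that grants airflow:InvokeRestApi."""
    if airflow_role not in ALLOWED_AIRFLOW_ROLES:
        raise ValueError(f"Unsupported Apache Airflow role: {airflow_role}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowMwaaRestApiAccess",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": "airflow:InvokeRestApi",
                # Assumed ARN layout; confirm it for your account and Region.
                "Resource": [
                    f"arn:aws:airflow:{region}:{account_id}:role/{env_name}/{airflow_role}"
                ],
            }
        ],
    }


# Example usage with placeholder values:
#   policy = build_invoke_rest_api_policy(
#       "us-east-1", "111122223333", "MyAirflowEnvironment", "Viewer")
#   print(json.dumps(policy, indent=4))
```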
Note
When you configure a private webserver, the InvokeRestApi action cannot be invoked from outside of a Virtual Private Cloud (VPC). You can use the aws:SourceVpc key to apply more granular access control for this operation. For more information, refer to aws:SourceVpc.
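As a sketch of how the aws:SourceVpc key might be applied, the following helper adds a `StringEquals` condition to each statement of a policy document. The example policy, its placeholder resource ARN, the VPC ID, and the helper name `restrict_to_vpc` are all hypothetical. Whether a `StringEquals` match on a single VPC suits your setup is a design choice to review against the IAM condition key documentation.

```python
import copy

# Placeholder policy for illustration; replace the braces with real values.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "airflow:InvokeRestApi",
            "Resource": [
                "arn:aws:airflow:{your-region}:{your-account-id}:role/{your-environment-name}/{airflow-role}"
            ],
        }
    ],
}


def restrict_to_vpc(policy, vpc_id):
    """Return a copy of `policy` whose statements require aws:SourceVpc == vpc_id."""
    restricted = copy.deepcopy(policy)  # leave the caller's policy untouched
    for statement in restricted["Statement"]:
        condition = statement.setdefault("Condition", {})
        condition.setdefault("StringEquals", {})["aws:SourceVpc"] = vpc_id
    return restricted


# Example usage with a made-up VPC ID:
#   vpc_policy = restrict_to_vpc(EXAMPLE_POLICY, "vpc-0123456789abcdef0")
```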
Calling the Apache Airflow REST API
The following sample script shows how to use the Apache Airflow REST API to list the available DAGs in your environment and how to create an Apache Airflow variable:
import boto3

env_name = "MyAirflowEnvironment"


def list_dags(client):
    # List the DAGs in the environment that are not paused.
    request_params = {
        "Name": env_name,
        "Path": "/dags",
        "Method": "GET",
        "QueryParameters": {
            "paused": False
        }
    }
    response = client.invoke_rest_api(**request_params)
    print("Airflow REST API response: ", response['RestApiResponse'])


def create_variable(client):
    # Create an Apache Airflow variable with a key, value, and description.
    request_params = {
        "Name": env_name,
        "Path": "/variables",
        "Method": "POST",
        "Body": {
            "key": "test-restapi-key",
            "value": "test-restapi-value",
            "description": "Test variable created by MWAA InvokeRestApi API",
        }
    }
    response = client.invoke_rest_api(**request_params)
    print("Airflow REST API response: ", response['RestApiResponse'])


if __name__ == "__main__":
    client = boto3.client("mwaa")
    list_dags(client)
    create_variable(client)
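Starting a DAG run can follow the same pattern. The sketch below assumes an Apache Airflow v2 environment, where a POST to `/dags/{dag_id}/dagRuns` accepts a JSON body with an optional `conf` object. The helper name `trigger_dag_run` and the DAG ID in the usage comment are hypothetical, and other Apache Airflow versions may require additional body fields.

```python
def trigger_dag_run(client, env_name, dag_id, conf=None):
    """Start a DAG run through InvokeRestApi and return the API response body.

    `client` is assumed to be a boto3 MWAA client.
    """
    response = client.invoke_rest_api(
        Name=env_name,
        Path=f"/dags/{dag_id}/dagRuns",
        Method="POST",
        # Assumption: the Airflow v2 request body; `conf` is passed to the DAG run.
        Body={"conf": conf or {}},
    )
    return response["RestApiResponse"]


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   run = trigger_dag_run(boto3.client("mwaa"), "MyAirflowEnvironment",
#                         "my_example_dag", conf={"source": "rest-api"})
#   print(run)
```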
Creating a webserver session token and calling the Apache Airflow REST API
To create a webserver access token, use the following Python function. This function first calls the Amazon MWAA API to obtain a web login token. The web login token, which expires after 60 seconds, is then exchanged for a web session token, which lets you access the webserver and use the Apache Airflow REST API. If you require more than 10 transactions per second (TPS) of throttling capacity, you can use this method to access the Apache Airflow REST API.
The session token expires after 12 hours.
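A minimal sketch of this token exchange is shown below. It assumes an Apache Airflow v2 environment, a login path of `/aws_mwaa/login`, and a session cookie named `session`; treat both names as assumptions to confirm for your environment. The `http_post` parameter is a hypothetical hook that defaults to `requests.post` from the third-party `requests` package.

```python
def get_session_info(mwaa_client, env_name, http_post=None):
    """Exchange a web login token for a session token.

    Returns a (webserver_hostname, session_token) tuple. `mwaa_client` is
    assumed to be a boto3 MWAA client.
    """
    if http_post is None:
        import requests  # third-party; imported lazily so stubs can be injected
        http_post = requests.post

    # Step 1: obtain the short-lived web login token from the Amazon MWAA API.
    token_response = mwaa_client.create_web_login_token(Name=env_name)
    hostname = token_response["WebServerHostname"]
    web_token = token_response["WebToken"]

    # Step 2: exchange the login token for a session token on the webserver.
    login_response = http_post(
        f"https://{hostname}/aws_mwaa/login",  # assumed Airflow v2 login path
        data={"token": web_token},
        timeout=10,
    )
    if login_response.status_code != 200:
        raise RuntimeError(f"Login failed with status {login_response.status_code}")

    # Assumption: the session token is returned in a cookie named "session".
    return hostname, login_response.cookies["session"]
```

Because the web login token is short-lived, the sketch performs the exchange immediately after requesting the token rather than caching it.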
Tip
Key changes in the following code samples from Apache Airflow v2 to v3 are:
- REST API path changed from /api/v1 to /api/v2.
- Login path changed from /aws_mwaa/login to /pluginsv2/aws_mwaa/login.
- The login response cookie response.cookies["_token"] contains the token information that you must use for subsequent API calls.
- For a REST API call, you must pass the jwt_token information in headers as: headers = { "Authorization": f"Bearer {jwt_token}", "Content-Type": "application/json" }
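The version differences listed above can be captured in two small helpers, sketched here. The v3 values come from the list. The v2 login path `/aws_mwaa/login` and the v2 cookie name `session` are assumptions. The helper names `login_url` and `build_request` are hypothetical, and the returned keyword arguments are shaped for the third-party `requests` library.

```python
def login_url(hostname, airflow_major=3):
    """Return the webserver login URL for the given Apache Airflow major version."""
    path = "/pluginsv2/aws_mwaa/login" if airflow_major >= 3 else "/aws_mwaa/login"
    return f"https://{hostname}{path}"


def build_request(hostname, token, endpoint, airflow_major=3):
    """Return (url, kwargs) for an authenticated REST API call.

    The kwargs can be passed to requests.get or requests.post.
    """
    endpoint = endpoint.lstrip("/")
    if airflow_major >= 3:
        # v3: /api/v2 path, token sent as a bearer token in the headers.
        url = f"https://{hostname}/api/v2/{endpoint}"
        kwargs = {
            "headers": {
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            }
        }
    else:
        # v2: /api/v1 path; assumption: token sent as the "session" cookie.
        url = f"https://{hostname}/api/v1/{endpoint}"
        kwargs = {
            "headers": {"Content-Type": "application/json"},
            "cookies": {"session": token},
        }
    return url, kwargs


# Example usage (requires the requests package and a valid token):
#   url, kwargs = build_request(hostname, jwt_token, "dags")
#   response = requests.get(url, timeout=10, **kwargs)
```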
After authentication is complete, you have the credentials to start sending requests to the API endpoints. In the example in the following section, use the endpoint dags/{dag_name}/dagRuns.