Installing Python dependencies
An "extra package" is a Python subpackage that is not included in the Apache Airflow
base install on your Amazon Managed Workflows for Apache Airflow (MWAA)
environment. It is referred to throughout this page as a Python dependency. This page
describes the steps to install Apache Airflow extra packages on your environment using a requirements.txt file.
Prerequisites
How it works
Amazon MWAA runs pip3 install -r requirements.txt against the requirements file that you specify for your environment, on each of the Apache Airflow scheduler and workers.
To install Python dependencies on your environment, you must do two things (a scripted equivalent is sketched after this list):
- Upload a requirements.txt file to your storage bucket on the Amazon S3 console.
- Specify the location and the version of this file in the Requirements file field on the Amazon MWAA console.
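Both steps can also be scripted with the AWS SDK for Python (Boto3). The following is a minimal sketch rather than part of the console procedure on this page; the bucket name, environment name, and object key are placeholders that you would replace with your own values.

import boto3

# Placeholder names -- substitute your own bucket and environment.
BUCKET = "my-mwaa-bucket"
ENVIRONMENT = "MyAirflowEnvironment"

s3 = boto3.client("s3")
mwaa = boto3.client("mwaa")

# Step 1: upload requirements.txt to the environment's storage bucket.
s3.upload_file("requirements.txt", BUCKET, "requirements.txt")

# Step 2: point the environment at the file. If the bucket is versioned,
# pin the environment to the object version that was just uploaded.
head = s3.head_object(Bucket=BUCKET, Key="requirements.txt")
update_args = {
    "Name": ENVIRONMENT,
    "RequirementsS3Path": "requirements.txt",
}
if head.get("VersionId"):
    update_args["RequirementsS3ObjectVersion"] = head["VersionId"]

mwaa.update_environment(**update_args)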
Syntax
For information about the syntax for pip install, see pip install in the pip documentation.
Creating a requirements.txt
If your Apache Airflow pipeline uses extra packages, you must list them in a requirements.txt file.
Your requirements.txt
file may look like this:
apache-airflow[hive]
apache-airflow[postgres]==1.10.12
boto >= 2.49.0
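Before you upload the file, you can check that pip is able to resolve every entry with your local Python interpreter. This is only a rough local sanity check, since your local Python version and platform may differ from the Amazon MWAA workers; the sketch below uses only the standard library and installs into a throwaway directory.

import subprocess
import sys
import tempfile

# Resolve and download everything in requirements.txt into a temporary
# directory. This does not reproduce the Amazon MWAA build environment;
# it only confirms that the file itself is resolvable by pip.
with tempfile.TemporaryDirectory() as target:
    result = subprocess.run(
        [sys.executable, "-m", "pip", "install",
         "-r", "requirements.txt", "--target", target],
        capture_output=True,
        text=True,
    )

if result.returncode != 0:
    print(result.stderr)
    raise SystemExit("requirements.txt failed to resolve locally")
print("requirements.txt resolved successfully")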
Adding or updating a requirements.txt file
To install Python dependencies on your environment, copy the requirements.txt file to your Amazon S3 storage bucket, then select this file on the Amazon MWAA console. You may need to change the Region in the dropdown list; only environments in the current Region are displayed.
Each time you update the file in your Amazon S3 bucket, a new version is created, and you may need to choose that version for your file on the Amazon MWAA console.
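If you need a specific object version, you can list the stored versions of the file with Boto3 and note the version ID to select on the console. A short sketch, using a placeholder bucket name:

import boto3

BUCKET = "my-mwaa-bucket"  # placeholder bucket name

s3 = boto3.client("s3")

# List every stored version of requirements.txt, most recent first.
response = s3.list_object_versions(Bucket=BUCKET, Prefix="requirements.txt")
for version in response.get("Versions", []):
    print(
        version["VersionId"],
        version["LastModified"].isoformat(),
        "latest" if version["IsLatest"] else "",
    )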
Uploading to your S3 storage bucket
- Open the Environments page on the Amazon MWAA console.
- Choose the environment where you want to install Python dependencies.
- Select the S3 bucket link in the DAG code in S3 pane to open your storage bucket on the Amazon S3 console.
- Choose Upload.
- Choose Add file.
- Select the local copy of your requirements.txt file, then choose Upload.
Installing Python dependencies on your environment
- Open the Environments page on the Amazon MWAA console.
- Choose the environment where you want to install Python dependencies.
- Choose Edit.
- On the DAG code in Amazon S3 pane, choose Browse S3 next to the Requirements file - optional field.
- Select the requirements.txt file in your storage bucket.
- Choose Choose.
- (Optional) Choose a requirements.txt version in the dropdown list.
- Choose Next, then Update environment.
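After you start the update, you can confirm from a script that the environment picked up the file and version that you selected. A sketch using the Boto3 GetEnvironment call, with a placeholder environment name:

import boto3

ENVIRONMENT = "MyAirflowEnvironment"  # placeholder environment name

mwaa = boto3.client("mwaa")
env = mwaa.get_environment(Name=ENVIRONMENT)["Environment"]

# The environment reports an UPDATING status until the scheduler and
# workers have been recreated with the new requirements file.
print("Status:", env["Status"])
print("Requirements file:", env.get("RequirementsS3Path"))
print("Object version:", env.get("RequirementsS3ObjectVersion"))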
Viewing changes on your Apache Airflow UI
- Open the Environments page on the Amazon MWAA console.
- Choose your environment name.
- Choose Open Airflow UI to view the changes in your Apache Airflow UI.
You may need to ask your account administrator to add AmazonMWAAWebServerAccess
permissions for your account to view your Apache Airflow UI. For more information,
see Managing access.
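If you open the Apache Airflow UI from a script instead of the console button, the same AmazonMWAAWebServerAccess permission applies. The sketch below uses the Boto3 CreateWebLoginToken call; the sign-in URL format is the one used in the Amazon MWAA web login token examples and should be verified against the current documentation.

import boto3

ENVIRONMENT = "MyAirflowEnvironment"  # placeholder environment name

mwaa = boto3.client("mwaa")
token = mwaa.create_web_login_token(Name=ENVIRONMENT)

# Combine the web server hostname and the short-lived token into a
# sign-in URL (format taken from the Amazon MWAA web token examples).
print(
    "https://{0}/aws_mwaa/aws-console-sso?login=true#{1}".format(
        token["WebServerHostname"], token["WebToken"]
    )
)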