Installing Python dependencies

An "extra package" is a Python subpackage that is not included in the Apache Airflow base install on your Amazon Managed Workflows for Apache Airflow (MWAA) environment. It is referred to throughout this page as a Python dependency. This page describes the steps to install Apache Airflow extra packages on your Amazon MWAA environment using a requirements.txt file.

How it works

Amazon MWAA runs pip3 install -r requirements.txt on the Apache Airflow scheduler and on each worker in your environment, using the requirements file that you specify.

To install Python dependencies on your environment, you must do two things:

  1. Upload a requirements.txt file to your storage bucket on the Amazon S3 console.

  2. Specify the location (and, if needed, the version) of this file in the Requirements file field on the Amazon MWAA console.
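If you prefer to script these two steps, the same actions can be performed with the AWS CLI. This is a sketch, assuming a hypothetical bucket name (my-mwaa-bucket) and environment name (MyAirflowEnvironment); substitute your own values.

```shell
# Step 1: upload the requirements file to the environment's storage bucket
aws s3 cp requirements.txt s3://my-mwaa-bucket/requirements.txt

# Step 2: point the environment at the uploaded file
# (the path is relative to the bucket root)
aws mwaa update-environment \
    --name MyAirflowEnvironment \
    --requirements-s3-path requirements.txt
```

The update-environment call triggers an environment update, so the new dependencies are installed the next time the scheduler and workers restart.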

Syntax

For information about pip install syntax, see the pip install documentation.
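A requirements.txt file uses standard pip requirement specifiers, one per line. The following fragment is illustrative only; the package names and version pins are placeholders, not recommendations.

```
# Lines starting with "#" are comments
some-package                # latest available version
some-package==1.2.3         # exact version pin
some-package>=1.0,<2.0      # bounded version range
```

Pinning exact versions makes environment updates reproducible, since an unpinned entry may resolve to a different version on each update.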

Creating a requirements.txt

If your Apache Airflow pipeline uses extra packages and no additional Python libraries, specify the names of the Python dependencies in your requirements.txt file.

Your requirements.txt file may look like this:

  apache-airflow[hive]
  apache-airflow[postgres]==1.10.12
  boto >= 2.49.0
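Before uploading the file, it can help to confirm that the pinned versions resolve and install cleanly. The following sketch uses a throwaway virtual environment on a local machine; it assumes Python 3 and pip are available locally, and it does not reproduce the exact MWAA runtime, so treat it as a sanity check rather than a guarantee.

```shell
# Create an isolated environment so the check does not touch system packages
python3 -m venv /tmp/mwaa-reqs-check
. /tmp/mwaa-reqs-check/bin/activate

# Run the same style of install that Amazon MWAA performs
pip3 install -r requirements.txt

# Clean up
deactivate
rm -rf /tmp/mwaa-reqs-check
```

If the install fails locally, it is likely to fail on the environment as well, and fixing the file before uploading avoids a lengthy environment update cycle.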

Adding or updating a requirements.txt file

To install Python dependencies on your environment, copy the requirements.txt file to your Amazon S3 storage bucket, then select this file on the Amazon MWAA console. The console displays only the environments in the current Region, so you may need to change the Region in the dropdown list. Each time you update the file in your Amazon S3 bucket, a new version is created, and you may need to choose that version for your file on the Amazon MWAA console.
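Because each upload creates a new object version, you can also list the versions and pin the environment to a specific one from the AWS CLI. This sketch uses hypothetical bucket, environment, and version ID values.

```shell
# List the stored versions of the requirements file
aws s3api list-object-versions \
    --bucket my-mwaa-bucket \
    --prefix requirements.txt

# Pin the environment to one specific object version
# (replace the version ID with one from the listing above)
aws mwaa update-environment \
    --name MyAirflowEnvironment \
    --requirements-s3-path requirements.txt \
    --requirements-s3-object-version "EXAMPLE_VERSION_ID"
```

Pinning a version ensures the environment keeps using a known-good file even if a newer version is uploaded later.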

Uploading to your S3 storage bucket

  1. Open the Environments page on the Amazon MWAA console.

  2. Choose the environment where you want to install Python dependencies.

  3. Select the S3 bucket link in the DAG code in Amazon S3 pane to open your storage bucket on the Amazon S3 console.

  4. Choose Upload.

  5. Choose Add file.

  6. Select the local copy of your requirements.txt, then choose Upload.

Installing Python dependencies on your environment

  1. Open the Environments page on the Amazon MWAA console.

  2. Choose the environment where you want to install Python dependencies.

  3. Choose Edit.

  4. On the DAG code in Amazon S3 pane, choose Browse S3 next to the Requirements file - optional field.

  5. Select the requirements.txt file on your storage bucket.

  6. Choose Choose.

  7. (Optional) Choose a requirements.txt version in the dropdown list.

  8. Choose Next, then Update environment.

Viewing changes on your Apache Airflow UI

  1. Open the Environments page on the Amazon MWAA console.

  2. Choose your environment name.

  3. Choose Open Airflow UI to view the changes in your Apache Airflow UI.

Note

You may need to ask your account administrator to add AmazonMWAAWebServerAccess permissions for your account to view your Apache Airflow UI. For more information, see Managing access.