為 Apache 氣流創建一個自定義插件PythonVirtualenvOperator - Amazon Managed Workflows for Apache Airflow

本文為英文版的機器翻譯版本,如內容有任何歧義或不一致之處,概以英文版為準。

為 Apache 氣流創建一個自定義插件PythonVirtualenvOperator

以下範例顯示如何修補 Apache 氣流PythonVirtualenvOperator在亞馬遜的 Apache 氣流管理工作流程上使用自定義插件。

版本

  • 此頁面上的範例程式碼可搭配使用阿帕奇氣流 V1蟒蛇 3.7

  • 您可以使用此頁面上的程式碼範例阿帕奇氣流 v2 及以上蟒蛇

先決條件

若要使用此頁面上的範例程式碼,您需要下列項目:

許可

  • 使用此頁面上的程式碼範例不需要其他權限。

請求

若要使用此頁面上的範例程式碼,請將下列相依性新增至requirements.txt。如需進一步了解,請參閱 安裝 Python 賴項

virtualenv

自定義插件示例代碼

阿帕奇氣流將在啟動時執行插件文件夾中的 Python 文件的內容。這個插件將修補內置PythonVirtualenvOperator在該啟動過程中,以使其與亞馬遜 MWAA 兼容。下列步驟顯示自訂外掛程式的範例程式碼。

Apache Airflow v2
  1. 在命令提示字元中,導覽至plugins上面的目錄。例如:

    cd plugins
  2. 複製下列程式碼範例的內容,並在本機儲存為virtual_python_plugin.py

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ from airflow.plugins_manager import AirflowPlugin import airflow.utils.python_virtualenv from typing import List def _generate_virtualenv_cmd(tmp_dir: str, python_bin: str, system_site_packages: bool) -> List[str]: cmd = ['python3','/usr/local/airflow/.local/lib/python3.7/site-packages/virtualenv', tmp_dir] if system_site_packages: cmd.append('--system-site-packages') if python_bin is not None: cmd.append(f'--python={python_bin}') return cmd airflow.utils.python_virtualenv._generate_virtualenv_cmd=_generate_virtualenv_cmd class VirtualPythonPlugin(AirflowPlugin): name = 'virtual_python_plugin'
Apache Airflow v1
  1. 在命令提示字元中,導覽至plugins上面的目錄。例如:

    cd plugins
  2. 複製下列程式碼範例的內容,並在本機儲存為virtual_python_plugin.py

    from airflow.plugins_manager import AirflowPlugin from airflow.operators.python_operator import PythonVirtualenvOperator def _generate_virtualenv_cmd(self, tmp_dir): cmd = ['python3','/usr/local/airflow/.local/lib/python3.7/site-packages/virtualenv', tmp_dir] if self.system_site_packages: cmd.append('--system-site-packages') if self.python_version is not None: cmd.append('--python=python{}'.format(self.python_version)) return cmd PythonVirtualenvOperator._generate_virtualenv_cmd=_generate_virtualenv_cmd class EnvVarPlugin(AirflowPlugin): name = 'virtual_python_plugin'

Plugins.zip

下列步驟說明如何建立plugins.zip

  1. 在命令提示符中,導航到包含的目錄virtual_python_plugin.py以上。例如:

    cd plugins
  2. 壓縮您的內容plugins資料夾。

    zip plugins.zip virtual_python_plugin.py

程式碼範例

下列步驟說明如何建立自訂外掛程式的 DAG 程式碼。

Apache Airflow v2
  1. 在命令提示字元中,瀏覽至儲存 DAG 程式碼的目錄。例如:

    cd dags
  2. 複製下列程式碼範例的內容,並在本機儲存為virtualenv_test.py

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ from airflow import DAG from airflow.operators.python import PythonVirtualenvOperator from airflow.utils.dates import days_ago import os os.environ["PATH"] = os.getenv("PATH") + ":/usr/local/airflow/.local/bin" def virtualenv_fn(): import boto3 print("boto3 version ",boto3.__version__) with DAG(dag_id="virtualenv_test", schedule_interval=None, catchup=False, start_date=days_ago(1)) as dag: virtualenv_task = PythonVirtualenvOperator( task_id="virtualenv_task", python_callable=virtualenv_fn, requirements=["boto3>=1.17.43"], system_site_packages=False, dag=dag, )
Apache Airflow v1
  1. 在命令提示字元中,瀏覽至儲存 DAG 程式碼的目錄。例如:

    cd dags
  2. 複製下列程式碼範例的內容,並在本機儲存為virtualenv_test.py

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ from airflow import DAG from airflow.operators.python_operator import PythonVirtualenvOperator from airflow.utils.dates import days_ago import os os.environ["PATH"] = os.getenv("PATH") + ":/usr/local/airflow/.local/bin" def virtualenv_fn(): import boto3 print("boto3 version ",boto3.__version__) with DAG(dag_id="virtualenv_test", schedule_interval=None, catchup=False, start_date=days_ago(1)) as dag: virtualenv_task = PythonVirtualenvOperator( task_id="virtualenv_task", python_callable=virtualenv_fn, requirements=["boto3>=1.17.43"], system_site_packages=False, dag=dag, )

氣流組態選項

如果您使用的是阿帕奇氣流 V2,添加core.lazy_load_plugins : False作為一個 Apache 氣流配置選項。若要深入瞭解,請參閱使用配置選項在 2 中加載插件

後續步驟?

  • 了解如何上傳requirements.txt在這個例子中將文件文件到您的亞馬遜 S3 存儲桶安裝 Python 賴項

  • 了解如何將此範例中的 DAG 程式碼上傳至dags您的亞馬遜 S3 存儲桶中的文件夾新增或更新 DAG

  • 進一步了解如何上傳plugins.zip在這個例子中將文件文件到您的亞馬遜 S3 存儲桶安裝自定義插件