Using Amazon MWAA with Amazon RDS for Microsoft SQL Server - Amazon Managed Workflows for Apache Airflow


Using Amazon MWAA with Amazon RDS for Microsoft SQL Server

You can use Amazon Managed Workflows for Apache Airflow to connect to an RDS for SQL Server database. The following sample code uses DAGs in an Amazon Managed Workflows for Apache Airflow environment to connect to and execute queries on Amazon RDS for Microsoft SQL Server.

Version

  • The sample code on this page can be used with Apache Airflow v1 in Python 3.7.

Prerequisites

To use the sample code on this page, you'll need the following:

  • An Amazon MWAA environment.

  • Amazon MWAA and RDS for SQL Server running in the same Amazon VPC.

  • The VPC security groups of Amazon MWAA and the server configured with either of the following connections:

    • An inbound rule for port 1433 open to Amazon RDS in Amazon MWAA's security group

    • Or an outbound rule for port 1433 open from Amazon MWAA to RDS

  • An Apache Airflow connection for RDS for SQL Server that reflects the hostname, port, username, and password of the Amazon RDS SQL Server database created in the preceding process.
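The port-1433 security-group rule above can be sketched with boto3. This is only a sketch of the rule shape, not a definitive setup script: the two group IDs below are placeholders for your own MWAA and RDS security groups, and the actual API call is left commented out.

```python
# Placeholder security group IDs -- substitute your own.
MWAA_SG = "sg-mwaa-placeholder"  # hypothetical Amazon MWAA security group
RDS_SG = "sg-rds-placeholder"    # hypothetical Amazon RDS security group

# Inbound rule: allow TCP 1433 from the MWAA security group.
ingress_rule = {
    "IpProtocol": "tcp",
    "FromPort": 1433,
    "ToPort": 1433,
    "UserIdGroupPairs": [{"GroupId": MWAA_SG}],
}

# To apply it, you would attach the rule to the RDS security group:
# import boto3
# ec2 = boto3.client("ec2")
# ec2.authorize_security_group_ingress(GroupId=RDS_SG,
#                                      IpPermissions=[ingress_rule])
```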

Dependencies

To use the sample code in this section, add the following dependencies to your requirements.txt. To learn more, see Installing Python dependencies.

Apache Airflow v2

    apache-airflow-providers-microsoft-mssql==1.0.1
    apache-airflow-providers-odbc==1.0.1
    pymssql==2.2.1

Apache Airflow v1

    apache-airflow[mssql]==1.10.12

Apache Airflow v2 connection

If you're using a connection in Apache Airflow v2, ensure the Airflow connection object includes the following key-value pairs:

  1. Conn Id: mssql_default

  2. Conn Type: Amazon Web Services

  3. Host: YOUR_DB_HOST

  4. Schema:

  5. Login: admin

  6. Password:

  7. Port: 1433

  8. Extra:
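As an alternative to entering these values in the Airflow UI, Airflow can also read a connection from an environment variable named AIRFLOW_CONN_<CONN_ID uppercased> that holds a connection URI. A minimal sketch of assembling such a URI from the values above, assuming the mssql connection type and placeholder credentials:

```python
from urllib.parse import quote

# Placeholder values mirroring the key-value pairs above; replace with your own.
host = "YOUR_DB_HOST"
login = "admin"
password = "yoursupersecretpassword"  # placeholder, never hard-code real secrets
port = 1433

# URI form: <conn-type>://<login>:<password>@<host>:<port>
# quote() percent-encodes any special characters in the credentials.
uri = f"mssql://{quote(login, safe='')}:{quote(password, safe='')}@{host}:{port}"

# Usage (shell): export AIRFLOW_CONN_MSSQL_DEFAULT="$uri"
```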

Code sample

  1. In your command prompt, navigate to the directory where your DAG code is stored. For example:

    cd dags
  2. Copy the contents of the following code sample and save it locally as sql-server.py.

    """ Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. """ import pymssql import logging import sys from airflow import DAG from datetime import datetime from airflow.operators.mssql_operator import MsSqlOperator from airflow.operators.python_operator import PythonOperator default_args = { 'owner': 'aws', 'depends_on_past': False, 'start_date': datetime(2019, 2, 20), 'provide_context': True } dag = DAG( 'mssql_conn_example', default_args=default_args, schedule_interval=None) drop_db = MsSqlOperator( task_id="drop_db", sql="DROP DATABASE IF EXISTS testdb;", mssql_conn_id="mssql_default", autocommit=True, dag=dag ) create_db = MsSqlOperator( task_id="create_db", sql="create database testdb;", mssql_conn_id="mssql_default", autocommit=True, dag=dag ) create_table = MsSqlOperator( task_id="create_table", sql="CREATE TABLE testdb.dbo.pet (name VARCHAR(20), owner VARCHAR(20));", mssql_conn_id="mssql_default", autocommit=True, dag=dag ) insert_into_table = MsSqlOperator( task_id="insert_into_table", sql="INSERT INTO testdb.dbo.pet VALUES ('Olaf', 'Disney');", mssql_conn_id="mssql_default", autocommit=True, 
dag=dag ) def select_pet(**kwargs): try: conn = pymssql.connect( server='sampledb.<xxxxxx>.<region>.rds.amazonaws.com', user='admin', password='<yoursupersecretpassword>', database='testdb' ) # Create a cursor from the connection cursor = conn.cursor() cursor.execute("SELECT * from testdb.dbo.pet") row = cursor.fetchone() if row: print(row) except: logging.error("Error when creating pymssql database connection: %s", sys.exc_info()[0]) select_query = PythonOperator( task_id='select_query', python_callable=select_pet, dag=dag, ) drop_db >> create_db >> create_table >> insert_into_table >> select_query
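Before copying sql-server.py to your environment's DAGs folder, you can run a quick local syntax check without installing Airflow. A minimal sketch using only the standard library (the demo writes a stand-in file; point the function at your real dags/sql-server.py instead):

```python
import ast
import tempfile

def dag_file_parses(path):
    """Return True if the Python file at `path` has valid syntax.

    This only checks syntax; it does not import Airflow or execute the DAG.
    """
    try:
        with open(path) as f:
            ast.parse(f.read(), filename=path)
        return True
    except SyntaxError:
        return False

# Demo with a temporary stand-in file.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("x = 1\n")
    demo_path = f.name

print(dag_file_parses(demo_path))  # prints True for valid syntax
```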

What's next?