Amazon Redshift
Database Developer Guide (API Version 2012-12-01)

Step 2: Create an Amazon EMR Cluster

The COPY command loads data from files on the Amazon EMR Hadoop Distributed File System (HDFS). When you create the Amazon EMR cluster, you must configure it to output data files to the cluster's HDFS.

To create an Amazon EMR cluster

  1. Create an Amazon EMR cluster in the same AWS region as the Amazon Redshift cluster.

    If the Amazon Redshift cluster is in a VPC, the Amazon EMR cluster must be in the same VPC group. If the Amazon Redshift cluster uses EC2-Classic mode (that is, it is not in a VPC), the Amazon EMR cluster must also use EC2-Classic mode. For more information, see Managing Clusters in Virtual Private Cloud (VPC) in the Amazon Redshift Cluster Management Guide.

  2. Configure the cluster to output data files to the cluster's HDFS.

    Important

    The HDFS file names must not include asterisks (*) or question marks (?).

  3. Specify No for the Auto-terminate option in the Amazon EMR cluster configuration so that the cluster remains available while the COPY command executes.

    Important

    If any of the data files are changed or deleted before the COPY completes, you might have unexpected results, or the COPY operation might fail.

  4. Note the cluster ID and the master public DNS (the endpoint for the Amazon EC2 instance that hosts the cluster). You will use that information in later steps.
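
The procedure above can be sketched in Python as follows. This is a minimal illustration, not the guide's own method: it assumes the boto3 SDK is installed and that AWS credentials are configured. The helper names, instance types, instance count, and default role names are hypothetical placeholders. The sketch also assumes that setting `KeepJobFlowAliveWhenNoSteps` to `True` is the API counterpart of specifying No for the Auto-terminate option, and that an EC2-Classic cluster is requested by omitting the subnet.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

# Characters that step 2 forbids in HDFS file names.
FORBIDDEN_CHARS = ("*", "?")


def invalid_hdfs_names(paths: Iterable[str]) -> List[str]:
    """Return the paths that contain an asterisk or a question mark."""
    return [p for p in paths if any(c in p for c in FORBIDDEN_CHARS)]


def build_run_job_flow_args(
    name: str, release_label: str, subnet_id: Optional[str] = None
) -> Dict[str, Any]:
    """Build keyword arguments for the EMR run_job_flow call.

    Instance types, instance count, and role names are placeholder
    assumptions; replace them with values that suit your account.
    """
    instances: Dict[str, Any] = {
        "MasterInstanceType": "m5.xlarge",  # assumed value
        "SlaveInstanceType": "m5.xlarge",   # assumed value
        "InstanceCount": 3,                 # assumed value
        # Step 3: keep the cluster alive so it stays available during COPY.
        "KeepJobFlowAliveWhenNoSteps": True,
    }
    if subnet_id is not None:
        # Step 1: a subnet in the same VPC as the Amazon Redshift cluster.
        instances["Ec2SubnetId"] = subnet_id
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Instances": instances,
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default role name
        "ServiceRole": "EMR_DefaultRole",      # assumed default role name
    }


def cluster_connection_info(describe_cluster_response: Dict[str, Any]) -> Tuple[str, str]:
    """Step 4: extract the cluster ID and master public DNS name."""
    cluster = describe_cluster_response["Cluster"]
    return cluster["Id"], cluster["MasterPublicDnsName"]


def create_cluster(region: str, args: Dict[str, Any]) -> Tuple[str, str]:
    """Create the cluster in the given region and return (ID, master DNS).

    boto3 is imported here so the helpers above work without it.
    """
    import boto3

    emr = boto3.client("emr", region_name=region)  # same region as Redshift
    cluster_id = emr.run_job_flow(**args)["JobFlowId"]
    emr.get_waiter("cluster_running").wait(ClusterId=cluster_id)
    return cluster_connection_info(emr.describe_cluster(ClusterId=cluster_id))
```

A typical use would be to call `invalid_hdfs_names` on the planned output paths first, then pass the result of `build_run_job_flow_args` to `create_cluster` together with the region of the Amazon Redshift cluster. Keeping the argument builder separate from the network call lets you inspect the configuration before any resources are created.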