Amazon Redshift
Database Developer Guide (API Version 2012-12-01)

Loading Data from Amazon EMR

You can use the COPY command to load data in parallel from an Amazon EMR cluster configured to write text files to the cluster's Hadoop Distributed File System (HDFS) in the form of fixed-width files, character-delimited files, CSV files, or JSON-formatted files.
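As a rough illustration, a COPY command that loads tab-delimited output from an Amazon EMR cluster might look like the following. This is a minimal sketch, not a definitive command: the table name `sales`, the cluster ID `j-EXAMPLECLUSTERID`, the HDFS path `myoutput/part-*`, and the credential placeholders are all assumptions made for the example and must be replaced with your own values.

```sql
-- Hypothetical example: table name, cluster ID, HDFS path, and credentials
-- are placeholders, not values taken from this guide.
copy sales
from 'emr://j-EXAMPLECLUSTERID/myoutput/part-*'  -- emr://<cluster-id>/<hdfs-file-path>
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
delimiter '\t';  -- assumes the cluster wrote tab-delimited text files
```

The wildcard in the path is assumed to match several output files, which is what allows COPY to read them in parallel. For fixed-width, CSV, or JSON output, the `delimiter` option would be replaced by the corresponding data format option.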

Amazon EMR provides a bootstrap action for output to Amazon Redshift that performs much of the preparation work for you. The bootstrap action must be specified when the Amazon EMR cluster is created. The Amazon Redshift bootstrap action is not available for Amazon EMR clusters created using the following AMI versions: 2.1.4, 2.2.4, 2.3.6.

The procedure for loading data from an Amazon EMR cluster depends on whether you use the Amazon Redshift bootstrap action. Follow the steps in one of the following sections.