Launching a Spark application using the Amazon Redshift integration for Apache Spark

For Amazon EMR releases 6.4 through 6.9, you must use the --jars or --packages option to specify which of the following JAR files you want to use. The --jars option specifies dependencies stored locally, in HDFS, or accessible over HTTP or HTTPS. For other file locations that the --jars option supports, see Advanced Dependency Management in the Spark documentation. The --packages option specifies dependencies hosted in the public Maven repository.

  • spark-redshift.jar

  • spark-avro.jar

  • RedshiftJDBC.jar

  • minimal-json.jar

Amazon EMR releases 6.10.0 and higher don't require the minimal-json.jar dependency and automatically install the other dependencies on each cluster by default. The following examples show how to launch a Spark application with the Amazon Redshift integration for Apache Spark.

Amazon EMR 6.10.0+

The following example shows how to launch a Spark application with the spark-redshift connector on Amazon EMR releases 6.10.0 and higher.

spark-submit my_script.py
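
For reference, my_script.py could be a PySpark job like the following minimal sketch, which reads a Redshift table through the connector. The cluster endpoint, table name, S3 tempdir, and IAM role ARN are placeholders that you would replace with your own values, and the data source name and options shown assume the community spark-redshift connector that these releases package.

# A minimal PySpark sketch, assuming the community connector's
# data source name and options. The endpoint, table, tempdir, and
# IAM role below are placeholders for your own values.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("redshift-read-example").getOrCreate()

df = (
    spark.read.format("io.github.spark_redshift_community.spark.redshift")
    # JDBC URL of your Redshift cluster (placeholder)
    .option("url", "jdbc:redshift://examplecluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev")
    # Table to read (placeholder)
    .option("dbtable", "public.example_table")
    # S3 staging location the connector uses for UNLOAD/COPY (placeholder)
    .option("tempdir", "s3://amzn-s3-demo-bucket/temp/")
    # IAM role that Redshift assumes to access the S3 tempdir (placeholder)
    .option("aws_iam_role", "arn:aws:iam::123456789012:role/ExampleRedshiftRole")
    .load()
)

df.show()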
Amazon EMR 6.4.0 - 6.9.x

To launch a Spark application with the spark-redshift connector on Amazon EMR releases 6.4 through 6.9, use the --jars or --packages option, as the following examples show. The paths listed with the --jars option are the default installation paths for the JAR files.

spark-submit \
  --jars /usr/share/aws/redshift/jdbc/RedshiftJDBC.jar,/usr/share/aws/redshift/spark-redshift/lib/spark-redshift.jar,/usr/share/aws/redshift/spark-redshift/lib/spark-avro.jar,/usr/share/aws/redshift/spark-redshift/lib/minimal-json.jar \
  my_script.py
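
Alternatively, the --packages option can pull equivalent dependencies from the public Maven repository instead of the cluster's local paths. The following command is a sketch; the Maven coordinates and versions shown are illustrative and must match the Spark and Scala versions on your cluster.

# Illustrative --packages form; adjust coordinates and versions to
# match your cluster's Spark and Scala versions.
spark-submit \
  --packages io.github.spark-redshift-community:spark-redshift_2.12:5.1.0,org.apache.spark:spark-avro_2.12:3.3.2,com.amazon.redshift:redshift-jdbc42:2.1.0.9 \
  my_script.py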