Amazon Elastic MapReduce
Developer Guide (API Version 2009-03-31)

Process Data Using Cascading

This documentation is for AMI versions 2.x and 3.x of Amazon EMR. See the Amazon EMR Release Guide for information about Amazon EMR releases 4.0.0 and above. For information about managing the Amazon EMR service in 4.x releases, see the Amazon EMR Management Guide.

Cascading is an open-source Java library that provides a query API, a query planner, and a job scheduler for creating and running Hadoop MapReduce applications. Applications developed with Cascading are compiled and packaged into standard Hadoop-compatible JAR files, like other native Hadoop applications. In the Amazon EMR console, you submit a Cascading step as a custom JAR. For more information about Cascading, see http://www.cascading.org.
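
Because a Cascading application is packaged as an ordinary JAR, the step definition looks like any other custom JAR step. The following Python sketch builds a dictionary shaped like the StepConfig structure of the Amazon EMR API (Name, ActionOnFailure, and HadoopJarStep with Jar and Args). The helper name custom_jar_step, the bucket name, the JAR path, and the argument list are all hypothetical placeholders; what arguments your application expects depends on how its main class parses them.

```python
import json


def custom_jar_step(name, jar_uri, args, main_class=None,
                    action_on_failure="CONTINUE"):
    """Build a StepConfig-shaped dict for a custom JAR step.

    This is an illustrative helper, not part of any AWS SDK.
    """
    hadoop_jar_step = {"Jar": jar_uri, "Args": list(args)}
    # MainClass is optional; omit it when the JAR manifest names a main class.
    if main_class is not None:
        hadoop_jar_step["MainClass"] = main_class
    return {
        "Name": name,
        "ActionOnFailure": action_on_failure,
        "HadoopJarStep": hadoop_jar_step,
    }


# Hypothetical S3 locations; replace them with your own bucket and paths.
step = custom_jar_step(
    name="Cascading example step",
    jar_uri="s3://my-bucket/jars/my-cascading-app.jar",
    args=["s3://my-bucket/input/", "s3://my-bucket/output/"],
)

print(json.dumps(step, indent=2))
```

A dictionary of this shape could then be passed in the Steps list of an AddJobFlowSteps request, for example through the boto3 EMR client's add_job_flow_steps call, assuming a running cluster and configured credentials.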