Amazon Elastic MapReduce (Amazon EMR) supports Apache Pig, a platform you can use to analyze large data sets. For more information about Pig, go to http://pig.apache.org/. Amazon EMR supports several versions of Pig. The following sections describe how to configure Pig on Amazon EMR.
Pig is an open-source Apache library that runs on top of Hadoop. The library takes SQL-like commands written in a language called Pig Latin and converts those commands into MapReduce jobs. Pig enables you to write database-style queries using familiar SQL-like commands and syntax, so you do not have to write complex MapReduce algorithms in a lower-level language, such as Java. Although you can execute one Pig Latin command at a time, it is far more common to write a script of Pig Latin commands that accomplishes a complete task. Amazon EMR can run these scripts after you upload them to Amazon S3.
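To make the relationship between Pig Latin and MapReduce more concrete, the following is a minimal sketch in Python. The `PIG_SCRIPT` string holds a short Pig Latin script that counts records per user. The functions below it show, conceptually, the map and reduce steps such a script stands in for. The bucket name `example-bucket`, the two-column tab-separated input layout, and all Python names are assumptions made for illustration. The Python code is not how Pig compiles or runs scripts; it only mirrors the logic on a small in-memory input.

```python
from collections import defaultdict

# A short Pig Latin script (illustrative only; the S3 paths are hypothetical).
# It loads tab-separated (user, url) records, groups them by user,
# and stores a per-user record count.
PIG_SCRIPT = r"""
logs    = LOAD 's3://example-bucket/input/' USING PigStorage('\t')
          AS (user:chararray, url:chararray);
grouped = GROUP logs BY user;
counts  = FOREACH grouped GENERATE group AS user, COUNT(logs) AS hits;
STORE counts INTO 's3://example-bucket/output/';
"""


def map_phase(lines):
    """Emit a (user, 1) pair for each tab-separated input line."""
    for line in lines:
        user, _url = line.rstrip("\n").split("\t")
        yield user, 1


def reduce_phase(pairs):
    """Sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_hits(lines):
    """Conceptual equivalent of the GROUP ... COUNT steps in PIG_SCRIPT."""
    return reduce_phase(map_phase(lines))
```

Calling `count_hits` with a handful of tab-separated lines returns a dictionary that maps each user to a record count. The four Pig Latin statements express the same grouping and counting without any hand-written map or reduce code.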