Amazon EMR
Amazon EMR Release Guide

Apache Mahout

Amazon EMR supports Apache Mahout, a machine learning framework for Apache Hadoop. For more information about Mahout, go to

Mahout is a machine learning library with tools for clustering, classification, and several types of recommenders, including tools to calculate most-similar items or build item recommendations for users. Mahout employs the Hadoop framework to distribute calculations across a cluster, and now includes additional work distribution methods, including Spark.
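
To make "most-similar items" and "item recommendations for users" concrete, the following is a minimal single-machine sketch in plain Python. It does not use Mahout or its API. The function names (`item_similarities`, `most_similar`, `recommend`) and the choice of cosine similarity over shared users are illustrative assumptions, not Mahout's implementation. Mahout distributes this kind of calculation across a cluster.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(interactions):
    """Cosine similarity between items, based on the users they share.

    `interactions` is an iterable of (user, item) pairs. This is a
    hypothetical helper for illustration, not part of Mahout.
    """
    users_by_item = defaultdict(set)
    for user, item in interactions:
        users_by_item[item].add(user)

    sims = defaultdict(dict)
    items = sorted(users_by_item)
    for idx, a in enumerate(items):
        for b in items[idx + 1:]:
            shared = len(users_by_item[a] & users_by_item[b])
            if shared:
                score = shared / sqrt(len(users_by_item[a]) * len(users_by_item[b]))
                sims[a][b] = score
                sims[b][a] = score
    return sims


def most_similar(sims, item, k=3):
    """Return up to k (item, score) pairs most similar to `item`."""
    ranked = sorted(sims.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


def recommend(interactions, sims, user, k=3):
    """Score unseen items by summing similarity to the user's seen items."""
    seen = {item for u, item in interactions if u == user}
    scores = defaultdict(float)
    for item in seen:
        for other, score in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += score
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    data = [
        ("alice", "A"), ("alice", "B"),
        ("bob", "A"), ("bob", "B"), ("bob", "C"),
        ("carol", "B"), ("carol", "C"),
    ]
    sims = item_similarities(data)
    print(most_similar(sims, "A"))
    print(recommend(data, sims, "alice"))
```

The pairwise loop grows quadratically with the number of items, which is the kind of workload that benefits from being distributed over Hadoop or Spark.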

For more information and an example of how to use Mahout with Amazon EMR, see the Building a Recommender with Apache Mahout on Amazon EMR post on the AWS Big Data blog.


Only Mahout versions 0.13.0 and later are compatible with Spark 2.x in Amazon EMR versions 5.0 and later.

Mahout Release Information for This Release of Amazon EMR

Application: Mahout 0.13.0
Amazon EMR Release Label:
Components installed with this application: emrfs, emr-ddb, emr-goodies, emr-kinesis, emr-s3-dist-cp, hadoop-client, hadoop-mapred, hadoop-hdfs-datanode, hadoop-hdfs-library, hadoop-hdfs-namenode, hadoop-httpfs-server, hadoop-kms-server, hadoop-yarn-nodemanager, hadoop-yarn-resourcemanager, hadoop-yarn-timeline-server, mahout-client