
Apache Mahout

Amazon EMR supports Apache Mahout, a machine learning framework for Hadoop. For more information about Mahout, see http://mahout.apache.org/.

Mahout is a machine learning library with tools for clustering, classification, and several types of recommenders, including tools to calculate most-similar items or build item recommendations for users. Mahout uses the Hadoop framework to distribute calculations across a cluster and now also supports other work distribution methods, such as Spark.

For more information and an example of how to use Mahout with Amazon EMR, see the post "Building a Recommender with Apache Mahout on Amazon EMR" on the AWS Big Data blog.

Note

Only Mahout versions 0.13.0 and later are compatible with Spark version 2.x in Amazon EMR versions 5.0 and later.
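The compatibility rule in this note can be expressed as a simple version comparison. The following is a minimal sketch in plain Python, not part of Mahout or Amazon EMR; the function names `parse_version` and `mahout_supports_spark2` are hypothetical, and the sketch assumes release labels have the form `emr-X.Y.Z`.

```python
def parse_version(text):
    """Turn a dotted version string such as '0.13.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def mahout_supports_spark2(mahout_version, emr_release_label):
    """Return True when the note's rule is satisfied.

    Assumption: the release label looks like 'emr-5.8.0'.
    Rule from the note: Mahout >= 0.13.0 and Amazon EMR >= 5.0.
    """
    prefix = "emr-"
    if not emr_release_label.startswith(prefix):
        raise ValueError("expected a label such as 'emr-5.8.0'")
    emr_version = parse_version(emr_release_label[len(prefix):])
    return parse_version(mahout_version) >= (0, 13, 0) and emr_version >= (5, 0)
```

For example, `mahout_supports_spark2("0.13.0", "emr-5.8.0")` evaluates the combination listed under Release Information.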

Release Information

Application: Mahout 0.13.0

Amazon EMR Release Label: emr-5.8.0

Components installed with this application: emrfs, emr-ddb, emr-goodies, emr-kinesis, emr-s3-dist-cp, hadoop-client, hadoop-mapred, hadoop-hdfs-datanode, hadoop-hdfs-library, hadoop-hdfs-namenode, hadoop-httpfs-server, hadoop-kms-server, hadoop-yarn-nodemanager, hadoop-yarn-resourcemanager, hadoop-yarn-timeline-server, mahout-client
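The release label and application name above map directly onto cluster-creation parameters. The following is a minimal sketch that builds a request for the boto3 EMR client's `run_job_flow` call. The cluster name, instance type, instance count, and region are hypothetical placeholders, and the role names assume the default EMR roles already exist in the account.

```python
def mahout_cluster_request(name="mahout-demo",
                           release_label="emr-5.8.0",
                           instance_type="m4.large",
                           instance_count=3):
    """Build run_job_flow keyword arguments for a cluster with Mahout installed.

    All default values are illustrative assumptions, not recommendations.
    """
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": "Mahout"}],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            # Keep the cluster running after steps finish so Mahout jobs
            # can be submitted interactively.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumed: the default EMR roles have been created in this account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(request, region="us-east-1"):
    """Submit the request; requires boto3 and valid AWS credentials."""
    import boto3  # imported here so building the request needs no dependencies
    client = boto3.client("emr", region_name=region)
    return client.run_job_flow(**request)["JobFlowId"]
```

Calling `launch(mahout_cluster_request())` would start a billable cluster, so it is worth inspecting the request dictionary first.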