Apache Zeppelin

Use Apache Zeppelin as a notebook for interactive data exploration. For more information about Zeppelin, see the Apache Zeppelin project website.

To access the Zeppelin web interface, set up an SSH tunnel to the master node and a proxy connection. For more information, see View Web Interfaces Hosted on Amazon EMR Clusters.
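As a rough illustration of the tunneling step, the following Python sketch assembles an ssh command line. It uses local port forwarding, a simpler variant that does not need the proxy connection described in the linked topic. The function name `build_tunnel_command`, the key file name, and the host name are hypothetical placeholders. The sketch assumes the `hadoop` login user that Amazon EMR clusters normally use and the Zeppelin port 8890 stated in this guide.

```python
import shlex

# Port on which the Zeppelin server listens on the master node (from this guide).
ZEPPELIN_PORT = 8890


def build_tunnel_command(key_file, master_dns, local_port=ZEPPELIN_PORT):
    """Return an ssh argument list that forwards local_port to Zeppelin.

    key_file and master_dns are supplied by the caller; "hadoop" is
    assumed to be the login user on the master node.
    """
    return [
        "ssh",
        "-i", key_file,   # private key of the cluster's EC2 key pair
        "-N",             # do not run a remote command, only forward ports
        "-L", f"{local_port}:localhost:{ZEPPELIN_PORT}",
        f"hadoop@{master_dns}",
    ]


if __name__ == "__main__":
    # Hypothetical key file and host name; substitute your own values.
    cmd = build_tunnel_command("mykeypair.pem", "master-public-dns-name")
    print(shlex.join(cmd))
```

While a command built this way is running, the Zeppelin web interface should be reachable in a local browser at http://localhost:8890.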

Zeppelin Release Information for This Release of Amazon EMR

Application: Zeppelin 0.7.3

Amazon EMR Release Label:

Components installed with this application: aws-hm-client, aws-sagemaker-spark-sdk, emrfs, emr-goodies, hadoop-client, hadoop-hdfs-datanode, hadoop-hdfs-library, hadoop-hdfs-namenode, hadoop-httpfs-server, hadoop-kms-server, hadoop-yarn-nodemanager, hadoop-yarn-resourcemanager, hadoop-yarn-timeline-server, spark-client, spark-history-server, spark-on-yarn, spark-yarn-slave, zeppelin-server

Considerations When Using Zeppelin on Amazon EMR

  • Connect to Zeppelin using the same SSH tunneling method that you use to connect to other web servers on the master node. The Zeppelin server listens on port 8890.

  • Zeppelin on Amazon EMR release 5.0.0 and later supports Shiro authentication.

  • Zeppelin on Amazon EMR release 5.8.0 and later supports using AWS Glue Data Catalog as the metastore for Spark SQL. For more information, see Using AWS Glue Data Catalog as the Metastore for Spark SQL.

  • Zeppelin does not use some of the settings defined in your cluster’s spark-defaults.conf configuration file, although it does instruct YARN to allocate executors dynamically if you have enabled that setting. To apply executor settings such as memory and cores, set them on the Interpreter tab and then restart the interpreter.

  • Zeppelin on Amazon EMR does not support the SparkR interpreter.
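
Because Zeppelin does not pick up executor settings from spark-defaults.conf, it can help to list the values that you may want to re-enter on the Interpreter tab. The following Python sketch shows one possible way to do that. The function name `executor_settings` and the sample file contents are hypothetical, and the parser assumes the common layout in which each line holds a property name and a value separated by white space. It does not change any Zeppelin setting.

```python
def executor_settings(conf_text):
    """Return the spark.executor.* properties found in spark-defaults.conf text.

    Assumes one "name value" pair per line, separated by white space.
    Blank lines and lines that start with "#" are ignored.
    """
    settings = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of white space
        if len(parts) != 2:
            continue  # property name without a value
        name, value = parts[0], parts[1].strip()
        if name.startswith("spark.executor."):
            settings[name] = value
    return settings


# Hypothetical excerpt; real values depend on your cluster configuration.
SAMPLE = """\
# sample spark-defaults.conf excerpt
spark.executor.memory            4g
spark.executor.cores             2
spark.dynamicAllocation.enabled  true
spark.driver.memory              2g
"""

if __name__ == "__main__":
    for name, value in executor_settings(SAMPLE).items():
        print(f"{name} = {value}")
```

To use the sketch against a real cluster, read the contents of the cluster's spark-defaults.conf file and pass them to `executor_settings`, then enter the reported values on the Interpreter tab and restart the interpreter.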