Use Amazon EMR clusters from Studio Classic notebooks
This section explains how to discover, connect to, or terminate an Amazon EMR cluster from SageMaker Studio Classic notebooks.
- If you are an administrator, see Configure the discoverability of Amazon EMR clusters (for administrators) to configure the discoverability of Amazon EMR clusters from SageMaker Studio Classic notebooks.
- If you are a data scientist or data engineer looking to discover Amazon EMR clusters from your Studio Classic notebooks, see Discover Amazon EMR clusters from SageMaker Studio Classic.
- If you are a data scientist or data engineer looking to connect to existing Amazon EMR clusters from your Studio Classic notebooks, see Connect to an Amazon EMR cluster from SageMaker Studio Classic.
When connecting to your Amazon EMR cluster from SageMaker Studio Classic, you can authenticate to your cluster with Kerberos, Lightweight Directory Access Protocol (LDAP), or runtime IAM role authentication. Your authentication method depends on your cluster configuration. For an example, see Access Apache Livy using a Network Load Balancer on a Kerberos-enabled Amazon EMR cluster.
For the list of available connection commands per authentication method, see Enter the connection command to an Amazon EMR cluster manually.
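As a rough illustration of how the authentication method shapes the connection command, the following Python sketch assembles the notebook magic as a string. The helper name `build_connect_command`, the placeholder cluster ID and role ARN, and the exact flag names and `--auth-type` values are assumptions recalled from the connection-command page, not taken from this section. Verify them against Enter the connection command to an Amazon EMR cluster manually before relying on them.

```python
# Hypothetical helper: builds the text of a connection command for a notebook cell.
# ASSUMPTION: flag names and --auth-type values below should be verified against
# the "Enter the connection command to an Amazon EMR cluster manually" page.
AUTH_TYPE_FLAG = {
    "kerberos": "Kerberos",
    "ldap": "Basic_Access",
    "runtime_role": "Basic_Access",  # assumed to be combined with an execution role ARN
}


def build_connect_command(cluster_id, auth_method, language="python",
                          execution_role_arn=None):
    """Return the magic command string for the given authentication method."""
    method = auth_method.lower()
    if method not in AUTH_TYPE_FLAG:
        raise ValueError(f"Unsupported authentication method: {auth_method}")
    if method == "runtime_role" and not execution_role_arn:
        raise ValueError("runtime_role requires execution_role_arn")

    parts = [
        "%sm_analytics", "emr", "connect",
        "--verify-certificate", "False",
        "--cluster-id", cluster_id,
        "--auth-type", AUTH_TYPE_FLAG[method],
        "--language", language,
    ]
    if method == "runtime_role":
        parts += ["--emr-execution-role-arn", execution_role_arn]
    return " ".join(parts)


if __name__ == "__main__":
    # Placeholder cluster ID; replace with the ID of your own cluster.
    print(build_connect_command("j-EXAMPLE", "kerberos"))
```

In a notebook, the resulting line would be pasted into a cell after loading the extension. The function only formats text and does not contact any cluster.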
Supported images and kernels to connect to an Amazon EMR cluster from SageMaker Studio Classic
SageMaker Studio Classic provides built-in support to connect to Amazon EMR clusters in the following images and kernels:
- DataScience – Python 3 kernel
- DataScience 2.0 – Python 3 kernel
- DataScience 3.0 – Python 3 kernel
- SparkAnalytics 1.0 – SparkMagic and PySpark kernels
- SparkAnalytics 2.0 – SparkMagic and PySpark kernels
- SparkMagic – SparkMagic and PySpark kernels
- PyTorch 1.8 – Python 3 kernel
- TensorFlow 2.6 – Python 3 kernel
- TensorFlow 2.11 – Python 3 kernel
These images and kernels come with sagemaker-studio-analytics-extension preinstalled.
To connect to Amazon EMR clusters using another built-in image or your own image, follow the instructions in Bring your own image.
Bring your own image
To bring your own image in SageMaker Studio Classic and allow your notebooks to connect to Amazon EMR clusters, install the following packages, including sagemaker-studio-analytics-extension, in your image:
pip install sparkmagic
pip install sagemaker-studio-sparkmagic-lib
pip install sagemaker-studio-analytics-extension
Additionally, to connect to Amazon EMR with Kerberos authentication, you must install the kinit client. The command to install the kinit client varies by operating system. For an Ubuntu (Debian-based) image, use the apt-get install -y -qq krb5-user command.
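One way to sanity-check a custom image before using it is to confirm that the packages and the kinit client described above are present. The following Python sketch is an illustration only: the helper names `missing_packages` and `has_kinit` are hypothetical. It checks only that the pip distributions are installed and that a `kinit` executable is on the PATH, not that a connection to a cluster would succeed.

```python
import shutil
from importlib import metadata

# Distribution names taken from the pip install commands in this section.
REQUIRED_PACKAGES = (
    "sparkmagic",
    "sagemaker-studio-sparkmagic-lib",
    "sagemaker-studio-analytics-extension",
)


def missing_packages(required=REQUIRED_PACKAGES):
    """Return the required distributions that are not installed."""
    missing = []
    for name in required:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def has_kinit():
    """Return True if a kinit executable is found on the PATH."""
    return shutil.which("kinit") is not None


if __name__ == "__main__":
    print("Missing packages:", missing_packages())
    # kinit is only needed for Kerberos authentication.
    print("kinit available:", has_kinit())
```

Running the script inside the image prints the packages that still need to be installed and whether kinit was found.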
For more information on bringing your own image in SageMaker Studio Classic, see Bring your own SageMaker image.