Getting started with AWS Glue interactive sessions

These sections describe how to run AWS Glue interactive sessions locally.

Prerequisites for setting up interactive sessions locally

The following are prerequisites for installing interactive sessions:

  • Supported Python versions are 3.6 - 3.9.

  • See sections below for MacOS/Linux and Windows instructions.
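Before installing, the supported range stated above can be checked from Python itself. The following is a minimal sketch; the function name is_supported is illustrative and not part of any AWS tooling.

```python
import sys

# Supported range taken from the prerequisites above (3.6 through 3.9).
MIN_SUPPORTED = (3, 6)
MAX_SUPPORTED = (3, 9)


def is_supported(version_info=sys.version_info):
    """Return True when the major.minor version falls in the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


# Report on the interpreter that is running this script.
print("Python %d.%d supported: %s" % (
    sys.version_info[0], sys.version_info[1], is_supported()))
```

Run this with the same interpreter that pip3 installs into, since a machine can have several Python versions side by side.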

MacOS/Linux instructions

Installing Jupyter and AWS Glue interactive sessions Jupyter kernels

  1. Install jupyter, boto3, and aws-glue-sessions with pip. Jupyter Lab is also compatible and can be installed instead.

    pip3 install --upgrade jupyter boto3 aws-glue-sessions
  2. The following commands use pip to find the installation location of aws-glue-sessions, and then install the Jupyter kernels from that location with jupyter kernelspec install.

    SITE_PACKAGES=$(pip3 show aws-glue-sessions | grep Location | awk '{print $2}')
    jupyter kernelspec install $SITE_PACKAGES/aws_glue_interactive_sessions_kernel/glue_pyspark
    jupyter kernelspec install $SITE_PACKAGES/aws_glue_interactive_sessions_kernel/glue_spark
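The lookup performed by the shell pipeline above can also be sketched in Python, which may help when debugging a path problem. This is an illustration only: the helper names (parse_location, kernel_dirs, locate_kernels) are hypothetical, and the sketch only computes the two kernel source directories without installing anything.

```python
import os
import subprocess
import sys

# Directory and kernel names as they appear in the commands above.
KERNEL_PACKAGE_DIR = "aws_glue_interactive_sessions_kernel"
KERNEL_NAMES = ("glue_pyspark", "glue_spark")


def parse_location(pip_show_output):
    """Extract the value of the 'Location:' line from `pip show` output."""
    for line in pip_show_output.splitlines():
        if line.startswith("Location:"):
            # Split only once so Windows drive letters (C:\...) stay intact.
            return line.split(":", 1)[1].strip()
    return None


def kernel_dirs(site_packages):
    """Build the kernel source directories under a site-packages path."""
    return [os.path.join(site_packages, KERNEL_PACKAGE_DIR, name)
            for name in KERNEL_NAMES]


def locate_kernels():
    """Ask pip where aws-glue-sessions lives and return the kernel directories."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "show", "aws-glue-sessions"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        universal_newlines=True)
    location = parse_location(result.stdout)
    return kernel_dirs(location) if location else []
```

Calling locate_kernels() returns an empty list when the package is not installed for the current interpreter, which is a quick way to spot a pip and Python mismatch.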

Configuring session credentials and region

AWS Glue interactive sessions require the same IAM permissions as AWS Glue Jobs and Dev Endpoints. Specify the role used with interactive sessions in one of two ways:

  1. With the %iam_role and %region magics

  2. With an additional line in ~/.aws/config

Configuring a session role with magic

In the first cell you execute, type %iam_role <YourGlueServiceRole>.
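To make the shape of that first cell concrete, the following sketch builds its text from a role and a region. It rests on assumptions: that the role is given as a full IAM role ARN, that the ARN follows the usual arn:aws:iam::<account-id>:role/<name> shape, and that the account ID and role name shown are placeholders. The helper first_cell is hypothetical and exists only for illustration.

```python
import re

# Assumed shape of an IAM role ARN:
# arn:<partition>:iam::<12-digit account ID>:role/<role name or path>
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


def first_cell(role_arn, region):
    """Build the text of the first notebook cell from a role ARN and a region."""
    if not ROLE_ARN_PATTERN.match(role_arn):
        raise ValueError("not an IAM role ARN: {!r}".format(role_arn))
    # One magic per line: %iam_role first, then %region.
    return "%iam_role {}\n%region {}".format(role_arn, region)


# Placeholder account ID and role name; substitute your own values.
print(first_cell(
    "arn:aws:iam::123456789012:role/ExampleGlueServiceRole", "us-east-1"))
```

The printed lines are what you would paste into the first cell of a notebook that uses one of the AWS Glue kernels.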

Configuring a session role with ~/.aws/config

The AWS Glue service role for interactive sessions can be specified either in the notebook itself or alongside the AWS CLI configuration. If you already have a role that you use with AWS Glue jobs, use that role. If you do not, follow the guide Configuring IAM permissions for AWS Glue to set one up.

To set this role as the default role for interactive sessions:

  1. With a text editor, open ~/.aws/config.

  2. Look for the profile you use for AWS Glue. If you don't use a profile, use the [default] profile.

  3. Add a line to the profile for the role you intend to use, such as glue_role_arn=<AWSGlueServiceRole>.

  4. [Optional]: If your profile does not have a default region set, we recommend adding one with region=us-east-1, replacing us-east-1 with your desired region.

  5. Save the file.
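The edit described in the steps above can also be scripted. The following is a minimal sketch that applies it to the text of the file with Python's configparser. The function name set_glue_defaults and the role ARN are illustrative placeholders, and the sketch assumes the profile appears as a section header such as [default] (named profiles in the AWS CLI config file are usually written as [profile name]).

```python
import configparser
import io


def set_glue_defaults(config_text, role_arn, region=None, profile="default"):
    """Return config text with glue_role_arn (and optionally region) set."""
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(profile):
        parser.add_section(profile)
    # Always point the profile at the requested role.
    parser.set(profile, "glue_role_arn", role_arn)
    # Only add a region when the profile does not already define one.
    if region and not parser.has_option(profile, "region"):
        parser.set(profile, "region", region)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Placeholder role ARN; substitute your own AWS Glue service role.
print(set_glue_defaults(
    "[default]\n",
    "arn:aws:iam::123456789012:role/ExampleGlueServiceRole",
    region="us-east-1"))
```

Note that configparser drops comments when it rewrites a file, so for a hand-maintained config a text editor, as in the steps above, may be the safer route.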

For more information, see Interactive sessions with IAM.

Running Jupyter Notebook

To run Jupyter Notebook, complete the following steps.

  1. Run the following command to launch Jupyter Notebook.

    jupyter notebook
  2. Choose New, and then choose one of the AWS Glue kernels to begin coding against AWS Glue.

Windows instructions

Installing Jupyter and AWS Glue interactive sessions kernels

  1. Use pip to install jupyter, boto3, and aws-glue-sessions. Jupyter Lab is also compatible and can be installed instead.

    pip3 install --upgrade jupyter boto3 aws-glue-sessions
  2. (Optional) Run the following command to list the installed packages. If jupyter and aws-glue-sessions were successfully installed, you should see a long list of packages, including jupyter 1.0.0 (or later).

    pip3 list
  3. Install the sessions kernels into Jupyter by running the following commands. These commands look up the installation location of aws-glue-sessions from pip and install the Jupyter kernels found there.

    1. Change the directory to the aws-glue-sessions install directory within Python's site-packages directory.

      Windows PowerShell:

      cd ((pip3 show aws-glue-sessions | Select-String Location | % {$_ -replace("Location: ","")})+"\aws_glue_interactive_sessions_kernel")
    2. Install the AWS Glue PySpark and AWS Glue Scala kernels.

      jupyter-kernelspec install glue_pyspark
      jupyter-kernelspec install glue_spark
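After the install commands finish, it can be useful to confirm that both kernels are registered. The sketch below is one way to do that from Python on any platform. The helper names are hypothetical; KernelSpecManager.find_kernel_specs comes from the jupyter_client package that is installed with Jupyter, and the kernel names are assumed to match the directory names used in the install commands.

```python
# Kernel names assumed to match the directories installed above.
EXPECTED_KERNELS = ("glue_pyspark", "glue_spark")


def missing_kernels(installed_names):
    """Return the expected AWS Glue kernels that are not in installed_names."""
    installed = set(installed_names)
    return [name for name in EXPECTED_KERNELS if name not in installed]


def installed_kernel_names():
    """List the kernel names that Jupyter currently knows about."""
    # Imported lazily so missing_kernels works even without Jupyter installed.
    from jupyter_client.kernelspec import KernelSpecManager
    return list(KernelSpecManager().find_kernel_specs())


# Example use once Jupyter is installed:
#   print(missing_kernels(installed_kernel_names()))
```

An empty list means both kernels should appear under New in the notebook interface.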

Configuring session credentials and region

AWS Glue interactive sessions require the same IAM permissions as AWS Glue Jobs and Dev Endpoints. Specify the role used with interactive sessions in one of two ways:

  1. With the %iam_role and %region magics

  2. With an additional line in ~/.aws/config

Configuring a session role with magic

In the first cell you execute, type %iam_role <YourGlueServiceRole>.

Configuring a session role with ~/.aws/config

The AWS Glue service role for interactive sessions can be specified either in the notebook itself or alongside the AWS CLI configuration. If you already have a role that you use with AWS Glue jobs, use that role. If you do not, follow the guide Setting up IAM permissions for AWS Glue to set one up.

To set this role as the default role for interactive sessions:

  1. With a text editor, open ~/.aws/config.

  2. Look for the profile you use for AWS Glue. If you don't use a profile, use the [default] profile.

  3. Add a line to the profile for the role you intend to use, such as glue_role_arn=<AWSGlueServiceRole>.

  4. [Optional]: If your profile does not have a default region set, we recommend adding one with region=us-east-1, replacing us-east-1 with your desired region.

  5. Save the file.

For more information, see Interactive sessions with IAM.

Running Jupyter

To run Jupyter Notebook, complete the following steps.

  1. Run the following command to launch Jupyter Notebook.

    jupyter notebook
  2. Choose New, and then choose one of the AWS Glue kernels to begin coding against AWS Glue.

Upgrading from the interactive sessions preview

The kernels were given new names when they were released with version 0.27. To clean up preview versions of the kernels, run the following commands from a terminal or PowerShell.

Note

If you are a part of any other AWS Glue preview that requires a custom service model, removing the kernel will remove the custom service model.

    # Remove Old Glue Kernels
    jupyter kernelspec remove glue_python_kernel
    jupyter kernelspec remove glue_scala_kernel

    # Remove Custom Model
    cd ~/.aws/models
    rm -rf glue/
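The final step above removes the glue directory under ~/.aws/models. Because the rm -rf form may not be accepted by every shell, the same removal can be sketched portably in Python. The function name remove_custom_glue_model is hypothetical, and the default path simply mirrors the commands above.

```python
import shutil
from pathlib import Path


def remove_custom_glue_model(models_dir=None):
    """Delete the 'glue' directory under models_dir; return True if removed."""
    # Default mirrors the commands above: ~/.aws/models
    base = Path(models_dir) if models_dir else Path.home() / ".aws" / "models"
    target = base / "glue"
    if target.is_dir():
        shutil.rmtree(str(target))
        return True
    # Nothing to remove, so leave the file system untouched.
    return False
```

Calling remove_custom_glue_model() with no arguments acts on your real AWS configuration directory, so the note above about other previews that rely on a custom service model applies here as well.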