Learn Workspace basics - Amazon EMR

When you use an EMR Studio, you can create and configure different Workspaces to organize and run notebooks. This section covers creating and working with Workspaces. For a conceptual overview, see Workspaces on the How Amazon EMR Studio works page.

Create an EMR Studio Workspace

You can create EMR Studio Workspaces to run notebook code using the EMR Studio interface.

To create a Workspace in an EMR Studio
  1. Log in to your EMR Studio.

  2. Choose Create a Workspace.

  3. Enter a Workspace name and a Description. Naming a Workspace helps you identify it on the Workspaces page.

  4. If you want to work with other Studio users in this Workspace in real time, enable Workspace collaboration. You can configure collaborators after you launch the Workspace.

  5. If you want to attach a cluster to a Workspace, expand the Advanced configuration section. You can attach a cluster later, if you prefer. For more information, see Attach a compute to an EMR Studio Workspace.

    Note

    To provision a new cluster, you need access permissions from your administrator.

    Choose one of the cluster options for the Workspace and attach the cluster. For more information about provisioning a cluster when you create a Workspace, see Create and attach a new EMR cluster to an EMR Studio Workspace.

  6. Choose Create a Workspace in the lower right of the page.

After you create a Workspace, EMR Studio opens the Workspaces page. A green success banner appears at the top of the page, and the newly created Workspace appears in the list.

By default, a Workspace is shared and can be seen by all Studio users. However, only one user can open and work in a Workspace at a time. To work simultaneously with other users, see Configure Workspace collaboration.

Launch a Workspace

To start working with notebook files, launch a Workspace to access the notebook editor. The Workspaces page in a Studio lists all of the Workspaces that you have access to with details including Name, Status, Creation time, and Last modified.

Note

If you had EMR notebooks in the old Amazon EMR console, you can find them in the new console as EMR Studio Workspaces. EMR Notebooks users need additional IAM role permissions to access or create Workspaces. If you recently created a notebook in the old console, you might need to refresh the Workspaces list to see it in the new console. For more information about the transition, see Amazon EMR Notebooks are available as Amazon EMR Studio Workspaces in the console and Amazon EMR console.

To launch a Workspace for editing and running notebooks
  1. On the Workspaces page of your Studio, find the Workspace. You can filter the list by keyword or by column value.

  2. Choose the Workspace name to launch the Workspace in a new browser tab. It may take a few minutes for the Workspace to open if it's Idle. Alternatively, select the row for the Workspace and then select Launch Workspace. You can choose from the following launch options:

    • Quick launch – Quickly launch your Workspace with default options. Choose Quick launch if you want to attach clusters to the Workspace in JupyterLab.

    • Launch with options – Launch your Workspace with custom options. You can choose to launch in either Jupyter or JupyterLab, attach your Workspace to an EMR cluster, and select your security groups.

    Note

    Only one user can open and work in a Workspace at a time. If you select a Workspace that is already in use, EMR Studio displays a notification when you try to open it. The User column on the Workspaces page shows the user working in the Workspace.

Understand the Workspace user interface

The EMR Studio Workspace user interface is based on the JupyterLab interface with icon-denoted tabs on the left sidebar. When you pause over an icon, you can see a tooltip that shows the name of the tab. Choose tabs from the left sidebar to access the following panels.

  • File Browser – Displays the files and directories in the Workspace, as well as the files and directories of linked Git repositories.

  • Running Kernels and Terminals – Lists all of the kernels and terminals running in the Workspace. For more information, see Managing kernels and terminals in the official JupyterLab documentation.

  • Git – Provides a graphical user interface for performing commands in the Git repositories attached to the Workspace. This panel is a JupyterLab extension called jupyterlab-git. For more information, see jupyterlab-git.

  • EMR clusters – Lets you attach a cluster to or detach a cluster from the Workspace to run notebook code. The EMR cluster configuration panel also provides advanced configuration options to help you create and attach a new cluster to the Workspace. For more information, see Create and attach a new EMR cluster to an EMR Studio Workspace.

  • Amazon EMR Git Repository – Helps you link the Workspace with up to three Git repositories. For details and instructions, see Link Git-based repositories to an EMR Studio Workspace.

  • Notebook Examples – Provides a list of notebook examples that you can save to the Workspace. You can also access the examples by choosing Notebook Examples on the Launcher page of the Workspace.

  • Commands – Offers a keyboard-driven way to search for and run JupyterLab commands. For more information, see the Command palette page in the JupyterLab documentation.

  • Notebook Tools – Lets you select and set options such as cell slide type and metadata. The Notebook Tools option appears in the left sidebar after you open a notebook file.

  • Open Tabs – Lists the open documents and activities in the main work area so that you can jump to an open tab. For more information, see the Tabs and single-document mode page in the JupyterLab documentation.

  • Collaboration – Lets you enable or disable Workspace collaboration, and manage collaborators. To see the Collaboration panel, you must have the necessary permissions. For more information, see Set ownership for Workspace collaboration.

Explore notebook examples

Every EMR Studio Workspace includes a set of notebook examples that you can use to explore EMR Studio features. To edit or run a notebook example, you can save it to the Workspace.

To save a notebook example to a Workspace
  1. From the left sidebar, choose the Notebook Examples tab to open the Notebook Examples panel. You can also access the examples by choosing Notebook Examples on the Launcher page of the Workspace.

  2. Choose a notebook example to preview it in the main work area. The example is read-only.

  3. To save the notebook example to the Workspace, choose Save to Workspace. EMR Studio saves the example in your home directory. After you save a notebook example to the Workspace, you can rename, edit, and run it.

For more information about the notebook examples, see the EMR Studio Notebook examples GitHub repository.

Save Workspace content

When you work in the notebook editor of a Workspace, EMR Studio saves the content of notebook cells and output for you in the Amazon S3 location associated with the Studio. This backup process preserves work between sessions.

You can also save a notebook by pressing CTRL+S in the open notebook tab or by using one of the save options under File.

Another way to back up the notebook files in a Workspace is to associate the Workspace with a Git-based repository and sync your changes with the remote repository. Doing so also lets you save and share notebooks with team members who use a different Workspace or Studio. For instructions, see Link Git-based repositories to an EMR Studio Workspace.

Delete a Workspace and notebook files

When you delete a notebook file from an EMR Studio Workspace, you delete the file from the File browser, and EMR Studio removes its backup copy in Amazon S3. You do not have to take any further steps to avoid storage charges when you delete a file from a Workspace.

When you delete an entire Workspace, its notebook files and folders remain in the Amazon S3 storage location and continue to accrue storage charges. To avoid these charges, remove all backed-up files and folders that are associated with the deleted Workspace from Amazon S3.

To delete a notebook file from an EMR Studio Workspace
  1. Select the File browser panel from the left sidebar in the Workspace.

  2. Select the file or folder you want to delete. Right-click your selection and choose Delete. The file disappears from the list. EMR Studio removes the file or folder from Amazon S3 for you.

From the Workspace UI

To delete a Workspace and its associated backup files from EMR Studio
  1. Log in to your EMR Studio with your Studio access URL and choose Workspaces from the left navigation.

  2. Find your Workspace in the list, then select the check box next to its name. You can select multiple Workspaces to delete at the same time.

  3. Choose Delete in the upper right of the Workspaces list, and then choose Delete again to confirm that you want to delete the selected Workspaces.

  4. If you want to remove the notebook files that were associated with the deleted Workspace from Amazon S3, follow the instructions for Deleting objects in the Amazon Simple Storage Service Console User Guide. If you did not create the Studio, consult your Studio administrator to determine the Amazon S3 backup location for the deleted Workspace.

From the Workspaces list

To delete a Workspace and its associated backup files from the Workspaces list
  1. Navigate to the Workspaces list in the console.

  2. Select the Workspace that you want to delete from the list and then choose Actions.

  3. Choose Delete.

  4. If you want to remove the notebook files that were associated with the deleted Workspace from Amazon S3, follow the instructions for Deleting objects in the Amazon Simple Storage Service Console User Guide. If you did not create the Studio, consult your Studio administrator to determine the Amazon S3 backup location for the deleted Workspace.
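
If you prefer to script the Amazon S3 cleanup described in step 4, the following is a minimal Python sketch rather than an official procedure. It assumes that you already know the bucket and the key prefix that hold the deleted Workspace's backup files (the names in the usage note are hypothetical placeholders) and that you pass in a boto3 S3 client. The helper names `iter_keys` and `delete_prefix` are illustrative. The sketch defaults to a dry run so that you can review the keys before anything is removed.

```python
from typing import Any, Iterator, List


def iter_keys(s3_client: Any, bucket: str, prefix: str) -> Iterator[str]:
    """Yield every object key under the prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix have no "Contents" entry.
        for obj in page.get("Contents", []):
            yield obj["Key"]


def delete_prefix(
    s3_client: Any, bucket: str, prefix: str, dry_run: bool = True
) -> List[str]:
    """Return the keys under the prefix; delete them unless dry_run is True."""
    # Guard against an empty or partial prefix that would match too much.
    if not prefix or not prefix.endswith("/"):
        raise ValueError("prefix must be non-empty and end with '/'")
    keys = list(iter_keys(s3_client, bucket, prefix))
    if dry_run:
        return keys
    # The DeleteObjects operation accepts at most 1,000 keys per request.
    for start in range(0, len(keys), 1000):
        batch = keys[start:start + 1000]
        s3_client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in batch], "Quiet": True},
        )
    return keys
```

A possible usage, with placeholder values: create the client with `boto3.client("s3")`, call `delete_prefix(client, "amzn-s3-demo-bucket", "studio-backups/my-workspace-folder/")` to list the keys, and repeat the call with `dry_run=False` after you confirm the list. In a versioned bucket, this call only adds delete markers, so earlier object versions continue to accrue storage charges until you remove them separately.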

Understand Workspace status

After you create an EMR Studio Workspace, it appears as a row in the Workspaces list in your Studio with its name, status, creation time, and last modified timestamp. The following table describes Workspace statuses.

Status – Description

Starting – The Workspace is being prepared, but is not yet ready to use. You can't open a Workspace when its status is Starting.

Ready – You can open the Workspace to use the notebook editor, but you must attach the Workspace to an EMR cluster before you can run notebook code.

Attaching – The Workspace is being attached to a cluster.

Attached – The Workspace is attached to an EMR cluster and ready for you to write and run notebook code. If a Workspace's status is not Attached, you must attach it to a cluster before you can run notebook code.

Idle – The Workspace has stopped. To reactivate an idle Workspace, select it from the Workspaces list. The status changes from Idle to Starting to Ready when you select the Workspace.

Stopping – The Workspace is shutting down and will be set to Idle. When you stop a Workspace, it terminates any corresponding notebook kernels. EMR Studio stops notebooks that have been inactive for a long time.

Deleting – When you delete a Workspace, EMR Studio marks it for deletion and starts the deletion process. After the deletion process completes, the Workspace disappears from the list. When you delete a Workspace, its notebook files will remain in the Amazon S3 storage location.
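
The statuses above can be summarized as a small lookup. This is an illustrative Python sketch, not an EMR Studio API. The enum and function names are hypothetical, and the next-step strings paraphrase the table. The suggestions to wait during Starting, Attaching, and Stopping are inferred from the transitions that the table describes.

```python
from enum import Enum


class WorkspaceStatus(Enum):
    STARTING = "Starting"
    READY = "Ready"
    ATTACHING = "Attaching"
    ATTACHED = "Attached"
    IDLE = "Idle"
    STOPPING = "Stopping"
    DELETING = "Deleting"


def can_run_notebook_code(status: WorkspaceStatus) -> bool:
    """Per the table, only an Attached Workspace can run notebook code."""
    return status is WorkspaceStatus.ATTACHED


# Suggested next step for each status (paraphrased, partly inferred).
NEXT_STEP = {
    WorkspaceStatus.STARTING: "Wait until the status is Ready.",
    WorkspaceStatus.READY: "Attach an EMR cluster to run notebook code.",
    WorkspaceStatus.ATTACHING: "Wait until the status is Attached.",
    WorkspaceStatus.ATTACHED: "Write and run notebook code.",
    WorkspaceStatus.IDLE: "Select the Workspace in the list to reactivate it.",
    WorkspaceStatus.STOPPING: "Wait until the status is Idle.",
    WorkspaceStatus.DELETING: "No action; the Workspace leaves the list when deletion completes.",
}


def describe(status_text: str) -> str:
    """Map a status string from the Workspaces list to a next step."""
    status = WorkspaceStatus(status_text)  # raises ValueError if unknown
    return NEXT_STEP[status]
```

For example, `describe("Ready")` returns the reminder to attach a cluster, and `can_run_notebook_code(WorkspaceStatus("Ready"))` is `False`.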

Resolve Workspace connectivity issues

To resolve Workspace connectivity issues, you can stop and restart a Workspace. When you restart a Workspace, EMR Studio launches the Workspace in a different Availability Zone or a different subnet that is associated with your Studio.

To stop and restart an EMR Studio Workspace
  1. Close the Workspace in your browser.

  2. Navigate to the Workspaces list in the console.

  3. Select your Workspace from the list and choose Actions.

  4. Choose Stop and wait for the Workspace status to change from Stopping to Idle.

  5. Choose Actions again, and then choose Start to restart the Workspace.

  6. Wait for the Workspace status to change from Starting to Ready, then choose the Workspace name to reopen it in a new browser tab.
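
Steps 4 and 6 both wait for a status transition. If you automate that wait, a polling loop along the following lines is one option. This is a sketch under stated assumptions: `get_status` is a hypothetical callable that you supply to return the current Workspace status as a string, because this page does not describe a programmatic way to read it. The timeout and polling interval are arbitrary defaults.

```python
import time
from typing import Callable, Collection


def wait_for_status(
    get_status: Callable[[], str],
    target: str,
    transitional: Collection[str],
    timeout_s: float = 600.0,
    poll_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until the status equals target; return the number of polls.

    Raises RuntimeError if a status outside `transitional` appears, and
    TimeoutError if the target is not reached within timeout_s seconds.
    """
    deadline = clock() + timeout_s
    polls = 0
    while True:
        status = get_status()
        polls += 1
        if status == target:
            return polls
        if status not in transitional:
            raise RuntimeError(
                f"unexpected status {status!r} while waiting for {target!r}"
            )
        if clock() >= deadline:
            raise TimeoutError(f"status still {status!r} after {timeout_s} seconds")
        sleep(poll_s)
```

After choosing Stop, you might call `wait_for_status(get_status, "Idle", {"Stopping"})`. After choosing Start, you might call `wait_for_status(get_status, "Ready", {"Starting"})`. Failing fast on an unexpected status, rather than polling until the timeout, surfaces problems such as a Workspace that someone else deleted in the meantime.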