Link Git-based repositories to an EMR Studio Workspace - Amazon EMR

Link Git-based repositories to an EMR Studio Workspace

About Git repositories for EMR Studio

You can associate a maximum of three Git repositories with an EMR Studio Workspace. By default, each Workspace lets you choose from a list of Git repositories that are associated with the same AWS account as your Studio. You can also create a new Git repository as a resource for your Workspace.

You can run Git commands manually, like the following, from a terminal while connected to the master node of a cluster.

!git pull origin <branch-name>
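If you would rather script the same pull from Python, for example in a notebook cell, the following is a minimal sketch that uses only the standard library. The function names `build_git_pull_command` and `pull_branch` and the `repo_dir` parameter are hypothetical, chosen for illustration. The sketch assumes that `git` is on the `PATH` and that `repo_dir` points to a repository that is already cloned.

```python
import subprocess


def build_git_pull_command(branch: str, remote: str = "origin") -> list[str]:
    """Return the argument list for `git pull <remote> <branch>`."""
    if not branch or branch.startswith("-"):
        # Reject empty names and anything git could mistake for an option.
        raise ValueError(f"invalid branch name: {branch!r}")
    return ["git", "pull", remote, branch]


def pull_branch(repo_dir: str, branch: str) -> str:
    """Run the pull inside repo_dir and return git's standard output."""
    result = subprocess.run(
        build_git_pull_command(branch),
        cwd=repo_dir,          # hypothetical path to the local clone
        check=True,            # raise CalledProcessError if git exits non-zero
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the command as an argument list instead of a single shell string avoids shell quoting problems when a branch name contains unusual characters.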

Alternatively, you can use the jupyterlab-git extension, which is installed and available to use in each Workspace. Open it from the left sidebar by choosing the Git icon. For information about the jupyterlab-git extension for JupyterLab, see jupyterlab-git.

Prerequisites

To link an associated Git repository to your Workspace

  1. Open the Workspace that you want to link to a repository from the Workspaces list in your Studio.

  2. In the left sidebar, choose the EMR Git Repository icon to open the Git repository tool panel.

  3. Under Git repositories, expand the dropdown list and select up to three repositories to link to your Workspace. EMR Studio automatically registers your selection and begins linking each repository.

It might take some time for the linking process to complete. You can see the status for each repository that you selected in the Git repository tool panel. After EMR Studio links a repository to your Workspace, you should see the files that belong to that repository appear in the File browser panel.

To add a new Git repository to your Workspace as a resource

  1. Open the Workspace that you want to link to a repository from the Workspaces list in your Studio.

  2. In the left sidebar, choose the EMR Git Repository icon to open the Git repository tool panel.

  3. Choose Add new Git repository.

  4. For Repository name, enter a descriptive name for the repository in EMR Studio. Names can contain only alphanumeric characters, hyphens, and underscores.

  5. For Git repository URL, enter the URL for the repository. When you use a CodeCommit repository, this is the URL that is copied when you choose Clone URL and then Clone HTTPS. For example, https://git-codecommit.us-west-2.amazonaws.com/v1/repos/[MyCodeCommitRepoName].

  6. For Branch, enter the name of an existing branch that you want to check out.

  7. For Git credentials, choose an option according to the following guidelines. EMR Studio accesses your Git credentials using secrets stored in Secrets Manager.

    Note

    If you use a GitHub repository, we recommend that you use a personal access token (PAT) to authenticate. Since August 13, 2021, GitHub requires token-based authentication and no longer accepts passwords when authenticating Git operations. For more information, see the Token authentication requirements for Git operations post in The GitHub Blog.

    Option Description
    Create a new secret

    Choose this option to associate existing Git credentials with a new secret that will be created in AWS Secrets Manager for you. Do one of the following based on the Git credentials that you use for the repository.

    If you use a Git user name and password to access the repository, select Username and password, enter the Secret name to use in Secrets Manager, and then enter the Username and Password to associate with the secret.

    –OR–

    If you use a personal access token to access the repository, select Personal access token (PAT), enter the Secret name to use in Secrets Manager, and then enter your personal access token. For more information, see Creating a personal access token for the command line for GitHub and Personal access tokens for Bitbucket. CodeCommit repositories do not support this option.

    Use a public repository without credentials

    Choose this option to access a public repository.

    Use an existing AWS secret

    Choose this option if you already saved your credentials as a secret in Secrets Manager, and then select the secret name from the list.

    If you select a secret associated with a Git user name and password, the secret must be in the format {"gitUsername": "MyUserName", "gitPassword": "MyPassword"}.

  8. Choose Add repository to create the new repository. After EMR Studio successfully creates the new repository, you will see a success message. The new repository appears in the dropdown list under Git repositories.

  9. To link your new repository to your Workspace, choose it from the dropdown list under Git repositories.

It might take some time for the linking process to complete. After EMR Studio links your new repository to your Workspace, you should see a new folder with the same name as your repository appear in the File browser panel.
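Before you fill in the form, it can help to check the values for steps 4 and 5 locally. The following is a minimal sketch, not an EMR Studio API. The helper names `is_valid_repository_name` and `codecommit_https_url` are hypothetical. The pattern covers only the naming rule stated in step 4 and treats "alphanumeric" as ASCII letters and digits, which is an assumption. The URL builder assumes that the pattern from the step 5 example applies to your Region.

```python
import re

# Assumption: this mirrors only the rule in step 4 (ASCII letters, digits,
# hyphens, underscores). EMR Studio may enforce other limits, such as length.
_REPO_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_repository_name(name: str) -> bool:
    """Return True if name uses only alphanumerics, hyphens, and underscores."""
    return _REPO_NAME_PATTERN.fullmatch(name) is not None


def codecommit_https_url(region: str, repo_name: str) -> str:
    """Build a CodeCommit HTTPS clone URL in the form shown in step 5."""
    return f"https://git-codecommit.{region}.amazonaws.com/v1/repos/{repo_name}"


if __name__ == "__main__":
    print(is_valid_repository_name("team-notebooks_01"))
    print(codecommit_https_url("us-west-2", "MyCodeCommitRepoName"))
```

The URL that you copy from Clone URL and then Clone HTTPS in the CodeCommit console remains the authoritative value. The builder is only a convenience for scripted checks.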

To open a different linked repository, navigate to its folder in the File browser.
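If you plan to use the existing-secret option from step 7 with a Git user name and password, the secret value must be a JSON object with the `gitUsername` and `gitPassword` keys. The following minimal sketch produces that string with correct quoting and escaping. The function name `git_credentials_secret_string` is hypothetical, and the sketch only builds the value. It does not contact Secrets Manager.

```python
import json


def git_credentials_secret_string(username: str, password: str) -> str:
    """Serialize Git credentials using the gitUsername/gitPassword keys."""
    if not username or not password:
        raise ValueError("username and password must both be non-empty")
    # json.dumps quotes and escapes the values, so special characters
    # in a password cannot break the JSON structure.
    return json.dumps({"gitUsername": username, "gitPassword": password})
```

You could then save the returned string as the secret value, for example in the Secrets Manager console. Avoid printing or logging the string, because it contains the password in plain text.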