Add a Git-based repository to Amazon EMR

Note

EMR Notebooks are available as EMR Studio Workspaces in the console. The Create Workspace button in the console lets you create new notebooks. To access or create Workspaces, EMR Notebooks users need additional IAM role permissions. For more information, see Amazon EMR Notebooks are Amazon EMR Studio Workspaces in the console and Amazon EMR console.

Refer to the following sections for information on how to add a Git-based repository to an EMR notebook in the old console, or to an EMR Studio Workspace in the new console.

New console

Because EMR Notebooks are EMR Studio Workspaces in the new console, you can follow the instructions in Link Git-based repositories to an EMR Studio Workspace to associate up to three Git repositories with your Workspace.

Alternatively, you can use the JupyterLab Git extension. To access the extension, choose the Git icon in the left sidebar of your JupyterLab notebook. For information about the extension, see the jupyterlab-git GitHub repo.

To associate a Git repository with a Workspace, your Studio administrator must take steps to configure the Studio to allow Git repository linking. For more information, see Establish access and permissions for Git-based repositories.

Old console
To add a Git-based repository as a resource in your Amazon EMR account with the old console
  1. Open the old Amazon EMR console at https://console.aws.amazon.com/elasticmapreduce.

  2. Choose Git repositories, and then choose Add repository.

  3. For Repository name, enter a name to use for the repository in Amazon EMR.

    Names can contain only alphanumeric characters, hyphens (-), and underscores (_).

  4. For Git repository URL, enter the URL for the repository. When using a CodeCommit repository, this is the URL that is copied when you choose Clone URL and then Clone HTTPS, for example, https://git-codecommit.us-west-2.amazonaws.com/v1/repos/MyCodeCommitRepoName.

  5. For Branch, enter a branch name.

  6. For Git credentials, choose options according to the following guidelines. You can use a Git user name and password or a personal access token (PAT) to authenticate to your repository. EMR Notebooks accesses your Git credentials using secrets stored in Secrets Manager.

    Note

    If you use a GitHub repository, we recommend that you use a personal access token (PAT) to authenticate. As of August 13, 2021, GitHub no longer accepts account passwords when authenticating Git operations. For more information, see the Token authentication requirements for Git operations post in The GitHub Blog.

    Option Description

    Use an existing AWS secret

    Choose this option if you already saved your credentials as a secret in Secrets Manager, and then select the secret name from the list.

    If you select a secret associated with a Git user name and password, the secret must be in the format {"gitUsername": "MyUserName", "gitPassword": "MyPassword"}.

    Create a new secret

    Choose this option to associate existing Git credentials with a new secret that you create in Secrets Manager. Do one of the following based on the Git credentials that you use for the repository.

    If you use a Git user name and password to access the repository, select Username and password, enter the Secret name to use in Secrets Manager, and then enter the Username and Password to associate with the secret.

    –OR–

    If you use a personal access token to access the repository, select Personal access token (PAT), enter the Secret name to use in Secrets Manager, and then enter your personal access token.

    For more information, see Creating a personal access token for the command line for GitHub and Personal access tokens for Bitbucket. CodeCommit repositories do not support this option.

    Use a public repository without credentials

    Choose this option to access a public repository.

  7. Choose Add repository.
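
The values entered in steps 3, 4, and 6 can be checked or prepared ahead of time. The following Python sketch is illustrative only and is not part of the console workflow. The helper names are hypothetical, the name check assumes that "alphanumeric" means ASCII letters and digits, and `store_git_credentials` assumes that the caller passes in a boto3 Secrets Manager client (for example, one created with `boto3.client("secretsmanager")`). Only the user name and password secret format is shown, because that is the only format this procedure specifies.

```python
import json
import re

# Step 3: names may contain alphanumeric characters, hyphens, or underscores.
# Assumption: "alphanumeric" is interpreted here as ASCII letters and digits.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_repository_name(name: str) -> bool:
    """Return True if the name uses only the characters allowed in step 3."""
    return _NAME_PATTERN.fullmatch(name) is not None


def codecommit_https_url(region: str, repo_name: str) -> str:
    """Build a CodeCommit HTTPS clone URL in the shape shown in step 4."""
    return f"https://git-codecommit.{region}.amazonaws.com/v1/repos/{repo_name}"


def git_credentials_secret_string(username: str, password: str) -> str:
    """Serialize a Git user name and password in the JSON format from step 6."""
    return json.dumps({"gitUsername": username, "gitPassword": password})


def store_git_credentials(client, secret_name: str, username: str, password: str):
    """Create the secret with a Secrets Manager client supplied by the caller.

    Assumption: `client` is a boto3 Secrets Manager client; nothing is
    contacted unless the caller provides a real client.
    """
    return client.create_secret(
        Name=secret_name,
        SecretString=git_credentials_secret_string(username, password),
    )
```

For example, `is_valid_repository_name("my repo")` returns `False` because of the space, and `codecommit_https_url("us-west-2", "MyCodeCommitRepoName")` reproduces the example URL from step 4. A secret created this way could then be selected under Use an existing AWS secret.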