Link Git-based repositories to an EMR Studio Workspace
Associate up to three Git-based repositories with an Amazon EMR Studio Workspace to save and share notebook files.
About Git repositories for EMR Studio
You can associate a maximum of three Git repositories with an EMR Studio Workspace. By default, each Workspace lets you choose from a list of Git repositories that are associated with the same AWS account as the Studio. You can also create a new Git repository as a resource for a Workspace.
You can run Git commands like the following using a terminal command while connected to the primary node of a cluster.
!git pull origin <branch-name>
Alternatively, you can use the jupyterlab-git extension. Open it from the left sidebar by choosing the Git icon. For information about the jupyterlab-git extension for JupyterLab, see jupyterlab-git.
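The pull command shown earlier can also be issued from Python code in a notebook cell instead of through the ! shell escape. The following is a minimal sketch, not an EMR Studio API: it assumes that git is available on the PATH, and the helper names build_pull_command and pull_branch, as well as the repo_dir argument (the folder of a linked repository), are hypothetical.

```python
import subprocess
from typing import List


def build_pull_command(repo_dir: str, branch: str, remote: str = "origin") -> List[str]:
    """Return the argument list for `git pull <remote> <branch>` run inside repo_dir."""
    if not branch or branch.startswith("-"):
        # Reject empty names and anything git could mistake for an option.
        raise ValueError(f"invalid branch name: {branch!r}")
    # `git -C <dir>` runs the command as if git had been started in <dir>.
    return ["git", "-C", repo_dir, "pull", remote, branch]


def pull_branch(repo_dir: str, branch: str) -> str:
    """Run the pull and return git's standard output.

    Raises subprocess.CalledProcessError if git exits with a nonzero status.
    """
    result = subprocess.run(
        build_pull_command(repo_dir, branch),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the arguments as a list, rather than as one shell string, keeps a branch name that contains spaces or shell metacharacters from being interpreted by a shell.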
Prerequisites
-
To associate a Git repository with a Workspace, the Studio must be configured to allow Git repository linking. Your Studio administrator should follow the steps in Establish access and permissions for Git-based repositories.
-
If you use a CodeCommit repository, you must use Git credentials and HTTPS. SSH keys and HTTPS with the AWS Command Line Interface credential helper are not supported. CodeCommit also does not support personal access tokens (PATs). For more information, see Using IAM with CodeCommit in the IAM User Guide and Setup for HTTPS users using Git credentials in the AWS CodeCommit User Guide.
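Because CodeCommit repositories must be reached over HTTPS, it can help to check a clone URL before you enter it. The following sketch assumes the standard CodeCommit HTTPS clone URL form, https://git-codecommit.<region>.amazonaws.com/v1/repos/<name>; it does not cover other endpoint variants, and the function name is_codecommit_https_url is hypothetical rather than part of any AWS library.

```python
import re
from urllib.parse import urlparse

# Assumed host form for a standard CodeCommit endpoint. The region segment is
# only checked for shape, not against a list of real AWS Regions.
_CODECOMMIT_HOST = re.compile(r"git-codecommit\.[a-z0-9-]+\.amazonaws\.com")
_REPO_PATH_PREFIX = "/v1/repos/"


def is_codecommit_https_url(url: str) -> bool:
    """Return True if url looks like a CodeCommit HTTPS clone URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return (
        parsed.scheme == "https"  # SSH clone URLs are not supported
        and _CODECOMMIT_HOST.fullmatch(host) is not None
        and parsed.path.startswith(_REPO_PATH_PREFIX)
        and len(parsed.path) > len(_REPO_PATH_PREFIX)  # a repository name is present
    )
```

A URL that starts with ssh:// fails this check, which matches the restriction that SSH keys are not supported.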
Instructions
To link an associated Git repository to a Workspace
-
From the Workspaces list in the Studio, open the Workspace that you want to link to a repository.
-
In the left sidebar, choose the Amazon EMR Git Repository icon to open the Git repository tool panel.
-
Under Git repositories, expand the dropdown list and select a maximum of three repositories to link to the Workspace. EMR Studio registers your selection and begins linking each repository.
It might take some time for the linking process to complete. You can see the status for each repository that you selected in the Git repository tool panel. After EMR Studio links a repository to a Workspace, you should see the files that belong to that repository appear in the File browser panel.
To add a new Git repository to a Workspace as a resource
-
From the Workspaces list in your Studio, open the Workspace that you want to link to a repository.
-
In the left sidebar, choose the Amazon EMR Git Repository icon to open the Git repository tool panel.
-
Choose Add new Git repository.
-
For Repository name, enter a descriptive name for the repository in EMR Studio. Names may only contain alphanumeric characters, hyphens, and underscores.
-
For Git repository URL, enter the URL for the repository. When you use a CodeCommit repository, this is the URL that is copied when you choose Clone URL and then Clone HTTPS. For example, https://git-codecommit.us-west-2.amazonaws.com/v1/repos/[MyCodeCommitRepoName].
-
For Branch, enter the name of an existing branch that you want to check out.
-
For Git credentials, choose an option according to the following guidelines. EMR Studio accesses your Git credentials using secrets stored in Secrets Manager.
Note
If you use a GitHub repository, we recommend that you use a personal access token (PAT) to authenticate. Since August 13, 2021, GitHub requires token-based authentication and no longer accepts passwords when authenticating Git operations. For more information, see the Token authentication requirements for Git operations post in The GitHub Blog.
Option: Create a new secret
Description: Choose this option to associate existing Git credentials with a new secret that will be created in AWS Secrets Manager for you. Do one of the following based on the Git credentials that you use for the repository.
If you use a Git user name and password to access the repository, select Username and password, enter the Secret name to use in Secrets Manager, and then enter the Username and Password to associate with the secret.
–OR–
If you use a personal access token to access the repository, select Personal access token (PAT), enter the Secret name to use in Secrets Manager, and then enter your personal access token. For more information, see Creating a personal access token for the command line for GitHub and Personal access tokens for Bitbucket. CodeCommit repositories do not support this option.
Option: Use a public repository without credentials
Description: Choose this option to access a public repository.
Option: Use an existing AWS secret
Description: Choose this option if you already saved your credentials as a secret in Secrets Manager, and then select the secret name from the list.
If you select a secret associated with a Git user name and password, the secret must be in the format {"gitUsername": "MyUserName", "gitPassword": "MyPassword"}.
-
Choose Add repository to create the new repository. After EMR Studio creates the new repository, you will see a success message. The new repository appears in the dropdown list under Git repositories.
-
To link the new repository to your Workspace, choose it from the dropdown list under Git repositories.
It might take some time for the linking process to complete. After EMR Studio links the new repository to the Workspace, you should see a new folder with the same name as your repository appear in the File browser panel.
To open a different linked repository, navigate to its folder in the File browser.
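Two of the inputs in the procedure for adding a new repository follow a fixed format: the repository name (alphanumeric characters, hyphens, and underscores only) and the user name and password secret (a JSON object with gitUsername and gitPassword keys). The following sketch shows one way to check and build these values before you enter them. The function names are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and no length limit is enforced because the procedure does not state one.

```python
import json
import re

# Letters, digits, hyphens, and underscores; at least one character.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_repository_name(name: str) -> bool:
    """Return True if name uses only the characters allowed for a repository name."""
    return _NAME_PATTERN.fullmatch(name) is not None


def build_git_secret(username: str, password: str) -> str:
    """Return the secret string in the gitUsername/gitPassword JSON format."""
    # json.dumps escapes quotes and backslashes in the credentials, which
    # hand-built strings can easily get wrong.
    return json.dumps({"gitUsername": username, "gitPassword": password})
```

For example, build_git_secret("MyUserName", "MyPassword") produces a string that you can store as the secret value in Secrets Manager and then select with the Use an existing AWS secret option.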