Install External Libraries and Kernels in Notebook Instances - Amazon SageMaker

Amazon SageMaker notebook instances come with multiple environments already installed. These environments contain Jupyter kernels and Python packages, including scikit-learn, pandas, NumPy, TensorFlow, and MXNet. These environments, along with all files in the sample-notebooks folder, are refreshed when you stop and start a notebook instance. You can also install your own environments that contain your choice of packages and kernels.

The different Jupyter kernels in Amazon SageMaker notebook instances are separate conda environments. For information about conda environments, see Managing environments in the Conda documentation.

Install custom environments and kernels on the notebook instance's Amazon EBS volume. This ensures that they persist when you stop and restart the notebook instance, and that any external libraries you install are not updated by Amazon SageMaker. To do this, use a lifecycle configuration that includes both a script that runs when you create the notebook instance (on-create) and a script that runs each time you restart the notebook instance (on-start). For more information about using notebook instance lifecycle configurations, see Customize a Notebook Instance Using a Lifecycle Configuration Script. Sample lifecycle configuration scripts are available on GitHub in the SageMaker Notebook Instance Lifecycle Config Samples repository.
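
As a rough illustration of pairing an on-create script with an on-start script, the following sketch registers two local script files as one lifecycle configuration through the AWS CLI. The configuration name, the file names, and the DRY_RUN switch are hypothetical and introduced only for this example. It assumes the AWS CLI is installed and configured, and that the script content is passed base64-encoded.

```shell
#!/bin/bash
# Sketch only: the configuration name and script file names are hypothetical.
set -eu

# Base64-encode a script file on a single line.
encode_script() {
    base64 < "$1" | tr -d '\n'
}

# Print a command instead of running it when DRY_RUN=1 (useful for review).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Register an on-create script and an on-start script as one lifecycle
# configuration. Arguments: <config-name> <on-create-file> <on-start-file>
register_lifecycle_config() {
    local name="$1"
    local on_create="$2"
    local on_start="$3"
    run aws sagemaker create-notebook-instance-lifecycle-config \
        --notebook-instance-lifecycle-config-name "$name" \
        --on-create "Content=$(encode_script "$on_create")" \
        --on-start "Content=$(encode_script "$on_start")"
}

# Do nothing unless all three arguments are supplied.
if [ "$#" -eq 3 ]; then
    register_lifecycle_config "$1" "$2" "$3"
fi
```

Saved under a hypothetical name such as register-config.sh, this could be invoked as `DRY_RUN=1 bash register-config.sh my-config on-create.sh on-start.sh` to inspect the call before running it for real.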

The examples at https://github.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples/blob/master/scripts/persistent-conda-ebs/on-create.sh and https://github.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples/blob/master/scripts/persistent-conda-ebs/on-start.sh show the best practice for installing environments and kernels on a notebook instance. The on-create script installs the ipykernel library so that you can use custom environments as Jupyter kernels, and then uses pip install and conda install to install libraries. You can adapt the script to create custom environments and install the libraries that you want. Amazon SageMaker does not update these libraries when you stop and restart the notebook instance, so you can ensure that your custom environment has the specific library versions that you want. The on-start script installs any custom environments that you create as Jupyter kernels, so that they appear in the dropdown list in the Jupyter New menu.
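
The two steps described above can be sketched as follows. This is a simplified illustration, not a copy of the linked sample scripts. It assumes that a conda installation already exists at a hypothetical path under /home/ec2-user/SageMaker (the directory on the EBS volume). The environment name, the Python and library versions, and the DRY_RUN switch are likewise illustrative. Lifecycle scripts run as root, so in practice these commands may need to run as the notebook user (for example, through sudo -u ec2-user), which this sketch omits.

```shell
#!/bin/bash
# Sketch only: paths, names, and versions below are assumptions.
set -eu

# Assumed location of a conda installation on the persistent EBS volume.
CONDA_DIR="${CONDA_DIR:-/home/ec2-user/SageMaker/custom-miniconda/miniconda}"
ENV_NAME="${ENV_NAME:-custom_python}"   # hypothetical environment name

# Print a command instead of running it when DRY_RUN=1 (useful for review).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# on-create step: build the environment once, install ipykernel so the
# environment can act as a Jupyter kernel, then install pinned libraries.
create_environment() {
    run "$CONDA_DIR/bin/conda" create --yes --name "$ENV_NAME" python=3.10
    run "$CONDA_DIR/envs/$ENV_NAME/bin/pip" install --quiet ipykernel
    # Example libraries with illustrative versions; adapt as needed.
    run "$CONDA_DIR/bin/conda" install --yes --name "$ENV_NAME" numpy=1.26
    run "$CONDA_DIR/envs/$ENV_NAME/bin/pip" install --quiet boto3
}

# on-start step: register every environment directory under $1 as a
# Jupyter kernel so that it appears in the Jupyter New menu.
register_kernels() {
    local envs_dir="$1"
    local env
    local name
    for env in "$envs_dir"/*; do
        [ -d "$env" ] || continue   # skip when the glob matches nothing
        name="$(basename "$env")"
        run "$env/bin/python" -m ipykernel install --user \
            --name "$name" --display-name "Custom ($name)"
    done
}

# Choose the step from the first argument; with no argument, do nothing.
case "${1:-}" in
    on-create) create_environment ;;
    on-start)  register_kernels "$CONDA_DIR/envs" ;;
esac
```

Keeping environment creation in the on-create step and kernel registration in the on-start step mirrors the split described above: the environment lives on the EBS volume and survives restarts, while the kernel registration has to be repeated each time the instance starts.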