Bring your own container (BYOC)

Amazon Braket Hybrid Jobs provides three pre-built containers for running code in different environments. If one of these containers supports your use case, you only have to provide your algorithm script when you create a hybrid job. Minor missing dependencies can be added from your algorithm script or from a requirements.txt file using pip.

If none of these containers support your use case, or if you wish to expand on them, Braket Hybrid Jobs supports running hybrid jobs with your own custom Docker container image, or bring your own container (BYOC). But before we dive in, let’s make sure it’s actually the right feature for your use case.

When is bringing my own container the right decision?

Bringing your own container (BYOC) to Braket Hybrid Jobs offers the flexibility to use your own software by installing it in a packaged environment. Depending on your specific needs, there may be ways to achieve the same flexibility without having to go through the full BYOC Docker build - Amazon ECR upload - custom image URI cycle.

Note

BYOC may not be the right choice if you only want to add a small number of additional Python packages (generally fewer than 10) that are publicly available, for example on PyPI.

In this case, you can use one of the pre-built Braket images and include a requirements.txt file in your source directory when you submit the job. The file is read automatically, and pip installs the packages at the specified versions as usual. Note that installing a large number of packages can substantially increase the runtime of your jobs. Also check the Python version and, if applicable, the CUDA version of the pre-built container you want to use, to confirm that your software will work with it.
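For example, a requirements.txt might look like the following (the package names and versions here are purely illustrative):

pennylane==0.35.1
scipy==1.11.4
networkx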

BYOC is necessary when you want to use a non-Python language (like C++ or Rust) for your job script, or if you want to use a Python version not available through the Braket pre-built containers. It’s also a good choice if:

  • You're using software with a license key, and you need to authenticate that key against a licensing server to run the software. With BYOC, you can embed the license key in your Docker image and include code to authenticate it.

  • You're using software that isn’t publicly available. For example, the software is hosted on a private GitLab or GitHub repository that you need a particular SSH key to access.

  • You need to install a large suite of software that isn’t packaged in the Braket provided containers. BYOC lets you avoid the long container startup times that installing this software at runtime would add to your hybrid jobs.

BYOC also enables you to make a custom SDK or algorithm available to others by building a Docker container with your software and granting your users access to it. You can do this by setting appropriate permissions in Amazon ECR.

Note

You must comply with all applicable software licenses.

Recipe for bringing your own container

In this section, we provide a step-by-step guide to what you’ll need to bring your own container (BYOC) to Braket Hybrid Jobs: the scripts, files, and steps to combine them in order to get up and running with your custom Docker images. We provide recipes for two common cases:

  1. Install additional software in a Docker image and use only Python algorithm scripts in your jobs.

  2. Use algorithm scripts written in a non-Python language with Hybrid Jobs, or target a CPU architecture other than x86.

Defining the container entry script is more complex for case 2.

When Braket runs your hybrid job, it launches the requested number and type of Amazon EC2 instances, and then runs on them the Docker image specified by the image URI you supplied at job creation. With the BYOC feature, you specify the URI of an image hosted in a private Amazon ECR repository that you have read access to, and Braket Hybrid Jobs uses that custom image to run the job.

The following sections describe the specific components you need to build a Docker image that can be used with Hybrid Jobs. If you’re unfamiliar with writing and building Dockerfiles, we suggest you refer to the Dockerfile documentation and the Amazon ECR CLI documentation as needed while you read these instructions.

A base image for your Dockerfile

If you are using Python and want to install software on top of what’s provided in the Braket pre-built containers, one option for a base image is one of the Braket container images, hosted in our GitHub repo and on Amazon ECR. You will need to authenticate to Amazon ECR to pull the image and build on top of it. For example, the first line of your BYOC Dockerfile could be: FROM [IMAGE_URI_HERE]

Next, fill out the rest of the Dockerfile to install and set up the software that you want to add to the container. The pre-built Braket images will already contain the appropriate container entry point script, so you don’t need to worry about including that.
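As a minimal sketch, a Dockerfile along the following lines may be all you need; the image URI placeholder and the package names are illustrative, not real values:

# Substitute the URI of the Braket image you authenticated to pull
FROM [BRAKET_IMAGE_URI]

# Illustrative extras: add the Python and system packages your algorithm needs
RUN pip install --no-cache-dir mypackage==1.2.3
RUN apt-get update && apt-get install -y --no-install-recommends graphviz \
    && rm -rf /var/lib/apt/lists/*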

If you want to use a non-Python language, such as C++, Rust, or Julia, or if you want to build an image for a non-x86 CPU architecture, like ARM, you may need to build on top of a barebones public image. You can find many such images at the Amazon Elastic Container Registry Public Gallery. Make sure you choose one that is appropriate for the CPU architecture, and if necessary, the GPU you want to use.
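As a sketch, the first line of such a Dockerfile could pull a slim Python base image from the public gallery; the tag shown is illustrative, so check the gallery for current tags and architecture support:

FROM public.ecr.aws/docker/library/python:3.11-slim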

(Optional) A modified container entry point script

Note

If you're only adding additional software to a pre-built Braket image, you can skip this section.

To run non-Python code as part of your hybrid job, you’ll need to modify the Python script that defines the container entry point: for example, the braket_container.py script on the Amazon Braket GitHub. This is the script that the images pre-built by Braket use to launch your algorithm script and set appropriate environment variables. The container entry point script itself must be in Python, but it can launch non-Python scripts. In the pre-built example, you can see that Python algorithm scripts are launched either as a Python subprocess or as a fully new process. By modifying this logic, you can enable the entry point script to launch non-Python algorithm scripts. For example, you could modify the kick_off_customer_script() function to launch Rust processes depending on the file extension, as in the sketch below.
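The following is a minimal sketch of that kind of change, assuming rustc is installed in the image; the function name comes from the pre-built script, but the body here is hypothetical and simplified:

import subprocess

def kick_off_customer_script(entry_point: str):
    # Dispatch on file extension so non-Python entry points can run too
    if entry_point.endswith(".py"):
        return subprocess.Popen(["python", entry_point])
    if entry_point.endswith(".rs"):
        # Compile the Rust source, then launch the resulting binary
        subprocess.run(["rustc", entry_point, "-o", "/tmp/algorithm"], check=True)
        return subprocess.Popen(["/tmp/algorithm"])
    raise ValueError(f"Unsupported entry point: {entry_point}")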

You can also choose to write a completely new braket_container.py. It should copy input data, source archives, and other necessary files from Amazon S3 into the container, and define the appropriate environment variables.
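The sketch below illustrates the shape of such a script for a Julia job. The AMZN_BRAKET_SCRIPT_S3_URI environment variable and the paths are assumptions based on the pre-built script, and a production version would need more setup and error handling:

import os
import subprocess

import boto3

def main():
    # Download and unpack the source archive that Braket staged in S3
    # (env var name assumed from the pre-built braket_container.py)
    s3_uri = os.environ["AMZN_BRAKET_SCRIPT_S3_URI"]
    bucket, key = s3_uri.removeprefix("s3://").split("/", 1)
    boto3.client("s3").download_file(bucket, key, "/opt/ml/code/source.tar.gz")
    subprocess.run(["tar", "-xzf", "/opt/ml/code/source.tar.gz", "-C", "/opt/ml/code"], check=True)
    # Launch the (non-Python) algorithm script
    subprocess.run(["julia", "/opt/ml/code/algorithm_script.jl"], check=True)

if __name__ == "__main__":
    main()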

A Dockerfile that installs any necessary software and includes the container script

Note

If you use a pre-built Braket image as your Docker base image, the container script is already present.

If you created a modified container script in the previous step, you'll need to copy it into the container and set the environment variable SAGEMAKER_PROGRAM to braket_container.py, or to whatever you have named your new container entry point script.

The following is an example of a Dockerfile that allows you to use Julia on GPU-accelerated Jobs instances:

FROM nvidia/cuda:12.2.0-devel-ubuntu22.04

ARG DEBIAN_FRONTEND=noninteractive

ARG JULIA_RELEASE=1.8
ARG JULIA_VERSION=1.8.3

ARG PYTHON=python3.11
ARG PYTHON_PIP=python3-pip
ARG PIP=pip

ARG JULIA_URL=https://julialang-s3.julialang.org/bin/linux/x64/${JULIA_RELEASE}/
ARG TAR_NAME=julia-${JULIA_VERSION}-linux-x86_64.tar.gz
# list your Python packages and versions here
ARG PYTHON_PKGS=

RUN curl -s -L ${JULIA_URL}/${TAR_NAME} | tar -C /usr/local -x -z --strip-components=1 -f -

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        build-essential \
        tzdata \
        openssh-client \
        openssh-server \
        ca-certificates \
        curl \
        git \
        libtemplate-perl \
        libssl1.1 \
        openssl \
        unzip \
        wget \
        zlib1g-dev \
        ${PYTHON_PIP} \
        ${PYTHON}-dev

RUN ${PIP} install --no-cache --upgrade ${PYTHON_PKGS}
RUN ${PIP} install --no-cache --upgrade sagemaker-training==4.1.3

# Add EFA and SMDDP to LD library path
ENV LD_LIBRARY_PATH="/opt/conda/lib/python${PYTHON_SHORT_VERSION}/site-packages/smdistributed/dataparallel/lib:$LD_LIBRARY_PATH"
ENV LD_LIBRARY_PATH=/opt/amazon/efa/lib/:$LD_LIBRARY_PATH

# Julia specific installation instructions
COPY Project.toml /usr/local/share/julia/environments/v${JULIA_RELEASE}/

RUN JULIA_DEPOT_PATH=/usr/local/share/julia \
    julia -e 'using Pkg; Pkg.instantiate(); Pkg.API.precompile()'

# generate the device runtime library for all known and supported devices
RUN JULIA_DEPOT_PATH=/usr/local/share/julia \
    julia -e 'using CUDA; CUDA.precompile_runtime()'

# Open source compliance scripts
RUN HOME_DIR=/root \
    && curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip \
    && unzip ${HOME_DIR}/oss_compliance.zip -d ${HOME_DIR}/ \
    && cp ${HOME_DIR}/oss_compliance/test/testOSSCompliance /usr/local/bin/testOSSCompliance \
    && chmod +x /usr/local/bin/testOSSCompliance \
    && chmod +x ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh \
    && ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} ${PYTHON} \
    && rm -rf ${HOME_DIR}/oss_compliance*

# Copy the container entry point script
COPY braket_container.py /opt/ml/code/braket_container.py
ENV SAGEMAKER_PROGRAM braket_container.py

This example downloads and runs scripts provided by AWS to ensure compliance with all relevant open-source licenses, for example, by properly attributing any installed code governed by an MIT license.

If you need to include non-public code, for instance code hosted in a private GitHub or GitLab repository, do not embed SSH keys in the Docker image to access it. Instead, use Docker Compose at build time so that Docker can use the SSH agent of the host machine it is built on. For more information, see the Securely using SSH keys in Docker to access private Github repositories guide.
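As one illustration of the general idea, Docker BuildKit's SSH mounts (a related technique to the Compose-based setup in the linked guide) forward the host's SSH agent into a single build step, so the key itself never lands in an image layer; the base image and repository name here are placeholders:

# syntax=docker/dockerfile:1
FROM public.ecr.aws/docker/library/python:3.11-slim
RUN apt-get update && apt-get install -y --no-install-recommends git openssh-client \
    && mkdir -p -m 0700 ~/.ssh && ssh-keyscan github.com >> ~/.ssh/known_hosts
# The host's SSH agent is available only for the duration of this step
RUN --mount=type=ssh git clone git@github.com:example-org/private-repo.git /opt/private-repo

An image like this would be built with, for example, docker build --ssh default .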

Building and uploading your Docker image

With a properly defined Dockerfile, you are ready to create a private Amazon ECR repository, if one does not already exist, and then build, tag, and upload your container image to it. See the Docker build documentation for a full explanation of the options to docker build and some examples.

For the sample file defined above, you could run:

aws ecr get-login-password --region ${your_region} | docker login --username AWS --password-stdin ${aws_account_id}.dkr.ecr.${your_region}.amazonaws.com
docker build -t braket-julia .
docker tag braket-julia:latest ${aws_account_id}.dkr.ecr.${your_region}.amazonaws.com/braket-julia:latest
docker push ${aws_account_id}.dkr.ecr.${your_region}.amazonaws.com/braket-julia:latest

Assigning appropriate Amazon ECR permissions

Braket Hybrid Jobs Docker images must be hosted in private Amazon ECR repositories. By default, a private Amazon ECR repo does not provide read access to the Braket Hybrid Jobs IAM role or to any other users that want to use your image, such as a collaborator or student. You must set a repository policy in order to grant the appropriate permissions. In general, only give permission to those specific users and IAM roles you want to access your images, rather than allowing anyone with the image URI to pull them.
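As a sketch, a repository policy along the following lines grants pull-only access to a single role; the account ID and role name are placeholders:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowPullForSpecificRole",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:role/MyCollaboratorRole" },
      "Action": [
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}

You can apply a policy like this with aws ecr set-repository-policy --repository-name braket-julia --policy-text file://policy.json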

Running Braket hybrid jobs in your own container

To create a hybrid job with your own container, call AwsQuantumJob.create() with the argument image_uri specified. You can use a QPU, an on-demand simulator, or run your code locally on the classical processor available with Braket Hybrid Jobs. We recommend testing your code out on a simulator like SV1, DM1, or TN1 before running on a real QPU.

To run your code on the classical processor, specify the instanceType and instanceCount you want by updating the InstanceConfig. Note that if you specify instanceCount > 1, you need to make sure that your code can run across multiple hosts. The upper limit for the number of instances you can choose is 5. For example:

job = AwsQuantumJob.create(
    source_module="source_dir",
    entry_point="source_dir.algorithm_script:start_here",
    image_uri="111122223333.dkr.ecr.us-west-2.amazonaws.com/my-byoc-container:latest",
    instance_config=InstanceConfig(instanceType="ml.p3.8xlarge", instanceCount=3),
    device="local:braket/braket.local.qubit",
    # ...
)
Note

Use the device ARN to track the simulator you used as hybrid job metadata. Acceptable values must follow the format device = "local:<provider>/<simulator_name>", where <provider> and <simulator_name> consist only of letters, numbers, underscores, hyphens, and periods. The string is limited to 256 characters.

If you plan to use BYOC and you're not using the Braket SDK to create quantum tasks, you should pass the value of the environment variable AMZN_BRAKET_JOB_TOKEN to the jobToken parameter in the CreateQuantumTask request. If you don't, the quantum tasks don't get priority and are billed as regular standalone quantum tasks.
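As a sketch with boto3 (the device ARN shown is SV1; the program payload and bucket name are placeholders), passing the token might look like:

import os
import boto3

braket = boto3.client("braket")

response = braket.create_quantum_task(
    deviceArn="arn:aws:braket:::device/quantum-simulator/amazon/sv1",
    action=program_json,  # your serialized program (placeholder)
    shots=100,
    outputS3Bucket="amzn-braket-example-bucket",
    outputS3KeyPrefix="tasks",
    # Pass the job token so the task is prioritized and billed as part of the hybrid job
    jobToken=os.environ["AMZN_BRAKET_JOB_TOKEN"],
)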