Amazon SageMaker Processing runs your processing container image in a way similar to the following command, where AppSpecification.ImageUri is the Amazon ECR image URI that you specify in a CreateProcessingJob operation.
docker run [AppSpecification.ImageUri]
This command runs the ENTRYPOINT command configured in your Docker image.
You can also override the entrypoint command in the image, or pass command-line arguments to your entrypoint command, by using the AppSpecification.ContainerEntrypoint and AppSpecification.ContainerArguments parameters in your CreateProcessingJob request. When you specify these parameters, Amazon SageMaker Processing runs the container in a way similar to the following command.
docker run --entrypoint [AppSpecification.ContainerEntrypoint] [AppSpecification.ImageUri] [AppSpecification.ContainerArguments]
For example, if you set ContainerEntrypoint to [python3, -v, /processing_script.py] and ContainerArguments to [data-format, csv] in your CreateProcessingJob request, Amazon SageMaker Processing runs your container with the following command.
python3 -v /processing_script.py data-format csv
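As a rough illustration of the example above, the following Python sketch builds the AppSpecification portion of a CreateProcessingJob request and prints the command line that the entrypoint and arguments would produce. The helper names (build_app_specification, equivalent_command) and the image URI are hypothetical and exist only for this sketch. The code does not call any AWS API.

```python
import shlex


def build_app_specification(image_uri, entrypoint=None, arguments=None):
    """Build the AppSpecification dict for a CreateProcessingJob request.

    ContainerEntrypoint and ContainerArguments are optional, so they are
    added only when the caller supplies them.
    """
    spec = {"ImageUri": image_uri}
    if entrypoint:
        spec["ContainerEntrypoint"] = list(entrypoint)
    if arguments:
        spec["ContainerArguments"] = list(arguments)
    return spec


def equivalent_command(spec):
    """Join the entrypoint and arguments into a single shell-style command line."""
    parts = spec.get("ContainerEntrypoint", []) + spec.get("ContainerArguments", [])
    return shlex.join(parts)


app_spec = build_app_specification(
    # Hypothetical ECR image URI, used only for illustration.
    image_uri="123456789012.dkr.ecr.us-west-2.amazonaws.com/my-processing-image:latest",
    entrypoint=["python3", "-v", "/processing_script.py"],
    arguments=["data-format", "csv"],
)

print(equivalent_command(app_spec))
```

If you use the AWS SDK for Python, a dict of this shape could be passed as the AppSpecification argument of the SageMaker client's create_processing_job call, together with the other parameters that a real request requires (such as the job name, role ARN, and processing resources).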
When building your processing container, consider the following details:
-
Amazon SageMaker Processing decides whether the job completes or fails depending on the exit code of the command run. A processing job completes if all of the processing containers exit successfully with an exit code of 0, and fails if any of the containers exits with a non-zero exit code.
-
Amazon SageMaker Processing lets you override the processing container's entrypoint and set command-line arguments, just as you can with the Docker API. Docker images can also configure the entrypoint and command-line arguments using the ENTRYPOINT and CMD instructions. The way the ContainerEntrypoint and ContainerArguments parameters of CreateProcessingJob configure a Docker image's entrypoint and arguments mirrors how Docker overrides the entrypoint and arguments through the Docker API:
  - If neither ContainerEntrypoint nor ContainerArguments is provided, Processing uses the default ENTRYPOINT or CMD in the image.
  - If ContainerEntrypoint is provided, but not ContainerArguments, Processing runs the image with the given entrypoint and ignores the ENTRYPOINT and CMD in the image.
  - If ContainerArguments is provided, but not ContainerEntrypoint, Processing runs the image with the default ENTRYPOINT in the image and with the provided arguments.
  - If both ContainerEntrypoint and ContainerArguments are provided, Processing runs the image with the given entrypoint and arguments and ignores the ENTRYPOINT and CMD in the image.
-
You must use the exec form of the ENTRYPOINT instruction in your Dockerfile (ENTRYPOINT ["executable", "param1", "param2"]) instead of the shell form (ENTRYPOINT command param1 param2). With the shell form, Docker starts the command as a child of /bin/sh -c, which does not pass signals on to it. The exec form lets your processing container receive SIGINT and SIGKILL signals, which Processing uses to stop processing jobs with the StopProcessingJob API.
-
/opt/ml and all its subdirectories are reserved by SageMaker AI. When building your Processing Docker image, don't place any data required by your processing container in these directories.
-
If you plan to use GPU devices, make sure that your containers are nvidia-docker compatible. Include only the CUDA toolkit in containers. Don't bundle NVIDIA drivers with the image. For more information about nvidia-docker, see NVIDIA/nvidia-docker.
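Two of the considerations above, the entrypoint and argument override rules and the exit-code rule, can be sketched in Python as follows. This is only a model of the behavior described in the list, not SageMaker's implementation. The function names are hypothetical. For the case where neither parameter is provided, the sketch assumes the image's ENTRYPOINT and CMD are concatenated, as Docker does.

```python
def resolve_command(image_entrypoint, image_cmd,
                    container_entrypoint=None, container_arguments=None):
    """Return the command a container would run under the override rules.

    image_entrypoint / image_cmd: lists from the image's ENTRYPOINT and CMD.
    container_entrypoint / container_arguments: optional request overrides.
    """
    if container_entrypoint:
        # A provided entrypoint replaces the image's ENTRYPOINT and CMD;
        # provided arguments (if any) are appended to it.
        return list(container_entrypoint) + list(container_arguments or [])
    if container_arguments:
        # Only arguments provided: keep the image's ENTRYPOINT and use the
        # provided arguments in place of CMD.
        return list(image_entrypoint) + list(container_arguments)
    # Neither provided: fall back to the image defaults (assumed here to be
    # ENTRYPOINT followed by CMD, as in Docker).
    return list(image_entrypoint) + list(image_cmd)


def job_succeeded(exit_codes):
    """A job completes only if every processing container exited with code 0."""
    return all(code == 0 for code in exit_codes)


# Hypothetical image defaults, used only for illustration.
image_entrypoint = ["python3", "/default_script.py"]
image_cmd = ["--mode", "default"]

print(resolve_command(image_entrypoint, image_cmd))
print(resolve_command(image_entrypoint, image_cmd,
                      container_entrypoint=["python3", "-v", "/processing_script.py"]))
print(resolve_command(image_entrypoint, image_cmd,
                      container_arguments=["data-format", "csv"]))
print(resolve_command(image_entrypoint, image_cmd,
                      container_entrypoint=["python3", "-v", "/processing_script.py"],
                      container_arguments=["data-format", "csv"]))
```

Writing the four cases as one function makes it easier to check which parts of the image's defaults survive a given combination of request parameters.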