AWS Tools for Windows PowerShell
Command Reference

AWS services or capabilities described in AWS Documentation may vary by region/location. Click Getting Started with Amazon AWS to see specific differences applicable to the China (Beijing) Region.

Synopsis

Calls the Amazon SageMaker Service CreateTrainingJob API operation.

Syntax

New-SMTrainingJob
-TrainingJobName <String>
-AlgorithmSpecification <AlgorithmSpecification>
-EnableInterContainerTrafficEncryption <Boolean>
-EnableManagedSpotTraining <Boolean>
-EnableNetworkIsolation <Boolean>
-HyperParameter <Hashtable>
-InputDataConfig <Channel[]>
-CheckpointConfig_LocalPath <String>
-StoppingCondition_MaxRuntimeInSecond <Int32>
-StoppingCondition_MaxWaitTimeInSecond <Int32>
-OutputDataConfig <OutputDataConfig>
-ResourceConfig <ResourceConfig>
-RoleArn <String>
-CheckpointConfig_S3Uri <String>
-VpcConfig_SecurityGroupId <String[]>
-VpcConfig_Subnet <String[]>
-Tag <Tag[]>
-Select <String>
-PassThru <SwitchParameter>
-Force <SwitchParameter>

Description

Starts a model training job. After training completes, Amazon SageMaker saves the resulting model artifacts to an Amazon S3 location that you specify. If you choose to host your model using Amazon SageMaker hosting services, you can use the resulting model artifacts as part of the model. You can also use the artifacts in a machine learning service other than Amazon SageMaker, provided that you know how to use them for inferences. In the request body, you provide the following:
  • AlgorithmSpecification - Identifies the training algorithm to use.
  • HyperParameters - Specify these algorithm-specific parameters to enable the estimation of model parameters during training. Hyperparameters can be tuned to optimize this learning process. For a list of hyperparameters for each training algorithm provided by Amazon SageMaker, see Algorithms.
  • InputDataConfig - Describes the training dataset and the Amazon S3, EFS, or FSx location where it is stored.
  • OutputDataConfig - Identifies the Amazon S3 bucket where you want Amazon SageMaker to save the results of model training.
  • ResourceConfig - Identifies the resources, ML compute instances, and ML storage volumes to deploy for model training. In distributed training, you specify more than one instance.
  • EnableManagedSpotTraining - Optimize the cost of training machine learning models by up to 80% by using Amazon EC2 Spot instances. For more information, see Managed Spot Training.
  • RoleARN - The Amazon Resource Number (ARN) that Amazon SageMaker assumes to perform tasks on your behalf during model training. You must grant this role the necessary permissions so that Amazon SageMaker can successfully complete model training.
  • StoppingCondition - To help cap training costs, use MaxRuntimeInSeconds to set a time limit for training. Use MaxWaitTimeInSeconds to specify how long you are willing to to wait for a managed spot training job to complete.
For more information about Amazon SageMaker, see How It Works.

Parameters

-AlgorithmSpecification <AlgorithmSpecification>
The registry path of the Docker image that contains the training algorithm and algorithm-specific metadata, including the input mode. For more information about algorithms provided by Amazon SageMaker, see Algorithms. For information about providing your own algorithms, see Using Your Own Algorithms with Amazon SageMaker.
Required?True
Position?Named
Accept pipeline input?True (ByPropertyName)
-CheckpointConfig_LocalPath <String>
(Optional) The local directory where checkpoints are written. The default directory is /opt/ml/checkpoints/.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
-CheckpointConfig_S3Uri <String>
Identifies the S3 path where you want Amazon SageMaker to store checkpoints. For example, s3://bucket-name/key-name-prefix.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
-EnableInterContainerTrafficEncryption <Boolean>
To encrypt all communications between ML compute instances in distributed training, choose True. Encryption provides greater security for distributed training, but training might take longer. How long it takes depends on the amount of communication between compute instances, especially if you use a deep learning algorithm in distributed training. For more information, see Protect Communications Between ML Compute Instances in a Distributed Training Job.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
-EnableManagedSpotTraining <Boolean>
To train models using managed spot training, choose True. Managed spot training provides a fully managed and scalable infrastructure for training machine learning models. this option is useful when training jobs can be interrupted and when there is flexibility when the training job is run. The complete and intermediate results of jobs are stored in an Amazon S3 bucket, and can be used as a starting point to train models incrementally. Amazon SageMaker provides metrics and logs in CloudWatch. They can be used to see when managed spot training jobs are running, interrupted, resumed, or completed.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
-EnableNetworkIsolation <Boolean>
Isolates the training container. No inbound or outbound network calls can be made, except for calls between peers within a training cluster for distributed training. If you enable network isolation for training jobs that are configured to use a VPC, Amazon SageMaker downloads and uploads customer data and model artifacts through the specified VPC, but the training container does not have network access.The Semantic Segmentation built-in algorithm does not support network isolation.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
This parameter overrides confirmation prompts to force the cmdlet to continue its operation. This parameter should always be used with caution.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
-HyperParameter <Hashtable>
Algorithm-specific parameters that influence the quality of the model. You set hyperparameters before you start the learning process. For a list of hyperparameters for each training algorithm provided by Amazon SageMaker, see Algorithms.You can specify a maximum of 100 hyperparameters. Each hyperparameter is a key-value pair. Each key and value is limited to 256 characters, as specified by the Length Constraint.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
AliasesHyperParameters
-InputDataConfig <Channel[]>
An array of Channel objects. Each channel is a named input source. InputDataConfig describes the input data and its location. Algorithms can accept input data from one or more channels. For example, an algorithm might have two channels of input data, training_data and validation_data. The configuration for each channel provides the S3, EFS, or FSx location where the input data is stored. It also provides information about the stored data: the MIME type, compression method, and whether the data is wrapped in RecordIO format. Depending on the input mode that the algorithm supports, Amazon SageMaker either copies input data files from an S3 bucket to a local directory in the Docker container, or makes it available as input streams. For example, if you specify an EFS location, input data files will be made available as input streams. They do not need to be downloaded.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
-OutputDataConfig <OutputDataConfig>
Specifies the path to the S3 location where you want to store model artifacts. Amazon SageMaker creates subfolders for the artifacts.
Required?True
Position?Named
Accept pipeline input?True (ByPropertyName)
-PassThru <SwitchParameter>
Changes the cmdlet behavior to return the value passed to the TrainingJobName parameter. The -PassThru parameter is deprecated, use -Select '^TrainingJobName' instead. This parameter will be removed in a future version.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
-ResourceConfig <ResourceConfig>
The resources, including the ML compute instances and ML storage volumes, to use for model training. ML storage volumes store model artifacts and incremental states. Training algorithms might also use ML storage volumes for scratch space. If you want Amazon SageMaker to use the ML storage volume to store the training data, choose File as the TrainingInputMode in the algorithm specification. For distributed training algorithms, specify an instance count greater than 1.
Required?True
Position?Named
Accept pipeline input?True (ByPropertyName)
-RoleArn <String>
The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf. During model training, Amazon SageMaker needs your permission to read input data from an S3 bucket, download a Docker image that contains training code, write model artifacts to an S3 bucket, write logs to Amazon CloudWatch Logs, and publish metrics to Amazon CloudWatch. You grant permissions for all of these tasks to an IAM role. For more information, see Amazon SageMaker Roles. To be able to pass this role to Amazon SageMaker, the caller of this API must have the iam:PassRole permission.
Required?True
Position?Named
Accept pipeline input?True (ByPropertyName)
-Select <String>
Use the -Select parameter to control the cmdlet output. The default value is 'TrainingJobArn'. Specifying -Select '*' will result in the cmdlet returning the whole service response (Amazon.SageMaker.Model.CreateTrainingJobResponse). Specifying the name of a property of type Amazon.SageMaker.Model.CreateTrainingJobResponse will result in that property being returned. Specifying -Select '^ParameterName' will result in the cmdlet returning the selected cmdlet parameter value.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
-StoppingCondition_MaxRuntimeInSecond <Int32>
The maximum length of time, in seconds, that the training or compilation job can run. If job does not complete during this time, Amazon SageMaker ends the job. If value is not specified, default value is 1 day. The maximum value is 28 days.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
AliasesStoppingCondition_MaxRuntimeInSeconds
-StoppingCondition_MaxWaitTimeInSecond <Int32>
The maximum length of time, in seconds, how long you are willing to wait for a managed spot training job to complete. It is the amount of time spent waiting for Spot capacity plus the amount of time the training job runs. It must be equal to or greater than MaxRuntimeInSeconds.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
AliasesStoppingCondition_MaxWaitTimeInSeconds
-Tag <Tag[]>
An array of key-value pairs. For more information, see Using Cost Allocation Tags in the AWS Billing and Cost Management User Guide.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
AliasesTags
-TrainingJobName <String>
The name of the training job. The name must be unique within an AWS Region in an AWS account.
Required?True
Position?1
Accept pipeline input?True (ByValue, ByPropertyName)
-VpcConfig_SecurityGroupId <String[]>
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
AliasesVpcConfig_SecurityGroupIds
-VpcConfig_Subnet <String[]>
The ID of the subnets in the VPC to which you want to connect your training job or model. Amazon EC2 P3 accelerated computing instances are not available in the c/d/e availability zones of region us-east-1. If you want to create endpoints with P3 instances in VPC mode in region us-east-1, create subnets in a/b/f availability zones instead.
Required?False
Position?Named
Accept pipeline input?True (ByPropertyName)
AliasesVpcConfig_Subnets

Common Credential and Region Parameters

-AccessKey <String>
The AWS access key for the user account. This can be a temporary access key if the corresponding session token is supplied to the -SessionToken parameter.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases AK
-Credential <AWSCredentials>
An AWSCredentials object instance containing access and secret key information, and optionally a token for session-based credentials.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ProfileLocation <String>

Used to specify the name and location of the ini-format credential file (shared with the AWS CLI and other AWS SDKs)

If this optional parameter is omitted this cmdlet will search the encrypted credential file used by the AWS SDK for .NET and AWS Toolkit for Visual Studio first. If the profile is not found then the cmdlet will search in the ini-format credential file at the default location: (user's home directory)\.aws\credentials. Note that the encrypted credential file is not supported on all platforms. It will be skipped when searching for profiles on Windows Nano Server, Mac, and Linux platforms.

If this parameter is specified then this cmdlet will only search the ini-format credential file at the location given.

As the current folder can vary in a shell or during script execution it is advised that you use specify a fully qualified path instead of a relative path.

Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases AWSProfilesLocation, ProfilesLocation
-ProfileName <String>
The user-defined name of an AWS credentials or SAML-based role profile containing credential information. The profile is expected to be found in the secure credential file shared with the AWS SDK for .NET and AWS Toolkit for Visual Studio. You can also specify the name of a profile stored in the .ini-format credential file used with the AWS CLI and other AWS SDKs.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases AWSProfileName, StoredCredentials
-NetworkCredential <PSCredential>
Used with SAML-based authentication when ProfileName references a SAML role profile. Contains the network credentials to be supplied during authentication with the configured identity provider's endpoint. This parameter is not required if the user's default network identity can or should be used during authentication.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-SecretKey <String>
The AWS secret key for the user account. This can be a temporary secret key if the corresponding session token is supplied to the -SessionToken parameter.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases SecretAccessKey, SK
-SessionToken <String>
The session token if the access and secret keys are temporary session-based credentials.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases ST
-Region <String>
The system name of the AWS region in which the operation should be invoked. For example, us-east-1, eu-west-1 etc.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases RegionToCall
-EndpointUrl <String>

The endpoint to make the call against.

Note: This parameter is primarily for internal AWS use and is not required/should not be specified for normal usage. The cmdlets normally determine which endpoint to call based on the region specified to the -Region parameter or set as default in the shell (via Set-DefaultAWSRegion). Only specify this parameter if you must direct the call to a specific custom endpoint.

Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)

Inputs

You can pipe a String object to this cmdlet for the TrainingJobName parameter.

Outputs

This cmdlet returns a System.String object. The service call response (type Amazon.SageMaker.Model.CreateTrainingJobResponse) can also be referenced from properties attached to the cmdlet entry in the $AWSHistory stack.

Supported Version

AWS Tools for PowerShell: 2.x.y.z