AWS Tools for Windows PowerShell
Command Reference


Synopsis

Calls the Amazon SageMaker Service CreateTrainingJob API operation.

Syntax

New-SMTrainingJob
-TrainingJobName <String>
-AlgorithmSpecification <AlgorithmSpecification>
-DebugHookConfig_CollectionConfiguration <CollectionConfiguration[]>
-DebugRuleConfiguration <DebugRuleConfiguration[]>
-ProfilerConfig_DisableProfiler <Boolean>
-InfraCheckConfig_EnableInfraCheck <Boolean>
-EnableInterContainerTrafficEncryption <Boolean>
-EnableManagedSpotTraining <Boolean>
-EnableNetworkIsolation <Boolean>
-RemoteDebugConfig_EnableRemoteDebug <Boolean>
-Environment <Hashtable>
-ExperimentConfig_ExperimentName <String>
-DebugHookConfig_HookParameter <Hashtable>
-HyperParameter <Hashtable>
-InputDataConfig <Channel[]>
-CheckpointConfig_LocalPath <String>
-DebugHookConfig_LocalPath <String>
-TensorBoardOutputConfig_LocalPath <String>
-RetryStrategy_MaximumRetryAttempt <Int32>
-StoppingCondition_MaxPendingTimeInSecond <Int32>
-StoppingCondition_MaxRuntimeInSecond <Int32>
-StoppingCondition_MaxWaitTimeInSecond <Int32>
-OutputDataConfig <OutputDataConfig>
-ProfilerRuleConfiguration <ProfilerRuleConfiguration[]>
-ProfilerConfig_ProfilingIntervalInMillisecond <Int64>
-ProfilerConfig_ProfilingParameter <Hashtable>
-ResourceConfig <ResourceConfig>
-RoleArn <String>
-ExperimentConfig_RunName <String>
-DebugHookConfig_S3OutputPath <String>
-ProfilerConfig_S3OutputPath <String>
-TensorBoardOutputConfig_S3OutputPath <String>
-CheckpointConfig_S3Uri <String>
-VpcConfig_SecurityGroupId <String[]>
-VpcConfig_Subnet <String[]>
-Tag <Tag[]>
-ExperimentConfig_TrialComponentDisplayName <String>
-ExperimentConfig_TrialName <String>
-Select <String>
-PassThru <SwitchParameter>
-Force <SwitchParameter>
-ClientConfig <AmazonSageMakerConfig>

Description

Starts a model training job. After training completes, SageMaker saves the resulting model artifacts to an Amazon S3 location that you specify. If you choose to host your model using SageMaker hosting services, you can use the resulting model artifacts as part of the model. You can also use the artifacts in a machine learning service other than SageMaker, provided that you know how to use them for inference. In the request body, you provide the following:
  • AlgorithmSpecification - Identifies the training algorithm to use.
  • HyperParameters - Specify these algorithm-specific parameters to enable the estimation of model parameters during training. Hyperparameters can be tuned to optimize this learning process. For a list of hyperparameters for each training algorithm provided by SageMaker, see Algorithms. Do not include any security-sensitive information, including account access IDs, secrets, or tokens, in any hyperparameter field. If the use of security-sensitive credentials is detected, SageMaker will reject your training job request and return an exception error.
  • InputDataConfig - Describes the input required by the training job and the Amazon S3, EFS, or FSx location where it is stored.
  • OutputDataConfig - Identifies the Amazon S3 bucket where you want SageMaker to save the results of model training.
  • ResourceConfig - Identifies the resources, ML compute instances, and ML storage volumes to deploy for model training. In distributed training, you specify more than one instance.
  • EnableManagedSpotTraining - Optimize the cost of training machine learning models by up to 80% by using Amazon EC2 Spot instances. For more information, see Managed Spot Training.
  • RoleArn - The Amazon Resource Name (ARN) that SageMaker assumes to perform tasks on your behalf during model training. You must grant this role the necessary permissions so that SageMaker can successfully complete model training.
  • StoppingCondition - To help cap training costs, use MaxRuntimeInSeconds to set a time limit for training. Use MaxWaitTimeInSeconds to specify how long a managed spot training job has to complete.
  • Environment - The environment variables to set in the Docker container.
  • RetryStrategy - The number of times to retry the job when the job fails due to an InternalServerError.
For more information about SageMaker, see How It Works.
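The request elements listed above can be sketched as a single call. This is a minimal illustration, not a definitive recipe: it assumes the AWS.Tools.SageMaker module is installed and credentials are configured, and the image URI, bucket, role ARN, job name, instance type, and Region are all placeholder values that do not come from this reference.

```powershell
# Assumes the AWS.Tools.SageMaker module is installed and credentials are configured.
Import-Module AWS.Tools.SageMaker

# Algorithm: a training image (placeholder URI) that reads its input in File mode.
$algorithm = New-Object Amazon.SageMaker.Model.AlgorithmSpecification
$algorithm.TrainingImage     = '123456789012.dkr.ecr.us-west-2.amazonaws.com/example-training-image:latest'
$algorithm.TrainingInputMode = 'File'

# Compute resources: one ML compute instance with a 50 GB ML storage volume (placeholder sizes).
$resources = New-Object Amazon.SageMaker.Model.ResourceConfig
$resources.InstanceType   = 'ml.m5.xlarge'
$resources.InstanceCount  = 1
$resources.VolumeSizeInGB = 50

# Where SageMaker saves the model artifacts (placeholder bucket).
$output = New-Object Amazon.SageMaker.Model.OutputDataConfig
$output.S3OutputPath = 's3://amzn-s3-demo-bucket/training-output/'

# Collect the parameters in a hashtable and splat them onto the cmdlet.
$params = @{
    TrainingJobName                      = 'example-training-job'
    AlgorithmSpecification               = $algorithm
    ResourceConfig                       = $resources
    OutputDataConfig                     = $output
    RoleArn                              = 'arn:aws:iam::123456789012:role/ExampleSageMakerRole'
    StoppingCondition_MaxRuntimeInSecond = 3600
    Region                               = 'us-west-2'
}

# By default the cmdlet returns the ARN of the new training job.
$jobArn = New-SMTrainingJob @params
$jobArn
```

Splatting keeps the long parameter list readable and lets optional groups (for example VPC or Spot settings) be merged in as separate hashtables.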

Parameters

-AlgorithmSpecification <AlgorithmSpecification>
The registry path of the Docker image that contains the training algorithm and algorithm-specific metadata, including the input mode. For more information about algorithms provided by SageMaker, see Algorithms. For information about providing your own algorithms, see Using Your Own Algorithms with Amazon SageMaker.
Required? True
Position? Named
Accept pipeline input? True (ByPropertyName)
-CheckpointConfig_LocalPath <String>
(Optional) The local directory where checkpoints are written. The default directory is /opt/ml/checkpoints/.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-CheckpointConfig_S3Uri <String>
Identifies the S3 path where you want SageMaker to store checkpoints. For example, s3://bucket-name/key-name-prefix.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ClientConfig <AmazonSageMakerConfig>
Amazon.PowerShell.Cmdlets.SM.AmazonSageMakerClientCmdlet.ClientConfig
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-DebugHookConfig_CollectionConfiguration <CollectionConfiguration[]>
Configuration information for Amazon SageMaker Debugger tensor collections. To learn more about how to configure the CollectionConfiguration parameter, see Use the SageMaker and Debugger Configuration API Operations to Create, Update, and Debug Your Training Job.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: DebugHookConfig_CollectionConfigurations
-DebugHookConfig_HookParameter <Hashtable>
Configuration information for the Amazon SageMaker Debugger hook parameters.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: DebugHookConfig_HookParameters
-DebugHookConfig_LocalPath <String>
Path to local storage location for metrics and tensors. Defaults to /opt/ml/output/tensors/.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-DebugHookConfig_S3OutputPath <String>
Path to Amazon S3 storage location for metrics and tensors.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-DebugRuleConfiguration <DebugRuleConfiguration[]>
Configuration information for Amazon SageMaker Debugger rules for debugging output tensors.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: DebugRuleConfigurations
-EnableInterContainerTrafficEncryption <Boolean>
To encrypt all communications between ML compute instances in distributed training, choose True. Encryption provides greater security for distributed training, but training might take longer. How long it takes depends on the amount of communication between compute instances, especially if you use a deep learning algorithm in distributed training. For more information, see Protect Communications Between ML Compute Instances in a Distributed Training Job.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-EnableManagedSpotTraining <Boolean>
To train models using managed spot training, choose True. Managed spot training provides a fully managed and scalable infrastructure for training machine learning models. This option is useful when training jobs can be interrupted and when there is flexibility in when the training job runs. The complete and intermediate results of jobs are stored in an Amazon S3 bucket, and can be used as a starting point to train models incrementally. Amazon SageMaker provides metrics and logs in CloudWatch. They can be used to see when managed spot training jobs are running, interrupted, resumed, or completed.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-EnableNetworkIsolation <Boolean>
Isolates the training container. No inbound or outbound network calls can be made, except for calls between peers within a training cluster for distributed training. If you enable network isolation for training jobs that are configured to use a VPC, SageMaker downloads and uploads customer data and model artifacts through the specified VPC, but the training container does not have network access.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-Environment <Hashtable>
The environment variables to set in the Docker container.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ExperimentConfig_ExperimentName <String>
The name of an existing experiment to associate with the trial component.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ExperimentConfig_RunName <String>
The name of the experiment run to associate with the trial component.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ExperimentConfig_TrialComponentDisplayName <String>
The display name for the trial component. If this key isn't specified, the display name is the trial component name.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ExperimentConfig_TrialName <String>
The name of an existing trial to associate the trial component with. If not specified, a new trial is created.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-Force <SwitchParameter>
This parameter overrides confirmation prompts to force the cmdlet to continue its operation. This parameter should always be used with caution.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-HyperParameter <Hashtable>
Algorithm-specific parameters that influence the quality of the model. You set hyperparameters before you start the learning process. For a list of hyperparameters for each training algorithm provided by SageMaker, see Algorithms. You can specify a maximum of 100 hyperparameters. Each hyperparameter is a key-value pair. Each key and value is limited to 256 characters, as specified by the Length Constraint. Do not include any security-sensitive information, including account access IDs, secrets, or tokens, in any hyperparameter field. If the use of security-sensitive credentials is detected, SageMaker will reject your training job request and return an exception error.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: HyperParameters
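The limits described for -HyperParameter (at most 100 entries, keys and values of at most 256 characters) can be checked locally before the request is sent. The following is a minimal sketch; Get-HyperParameterProblem is a hypothetical helper written for this illustration, not part of AWS Tools for PowerShell, and the hyperparameter names are placeholders.

```powershell
# Hypothetical helper: reports violations of the documented hyperparameter limits.
function Get-HyperParameterProblem {
    param(
        [Parameter(Mandatory = $true)]
        [hashtable]$HyperParameter
    )

    $problems = New-Object System.Collections.Generic.List[string]

    # At most 100 hyperparameters are allowed.
    if ($HyperParameter.Count -gt 100) {
        $problems.Add("Too many hyperparameters: $($HyperParameter.Count) (maximum is 100).")
    }

    # Each key and each value is limited to 256 characters.
    foreach ($key in $HyperParameter.Keys) {
        $name  = [string]$key
        $value = [string]$HyperParameter[$key]
        if ($name.Length -gt 256)  { $problems.Add("Key '$name' is longer than 256 characters.") }
        if ($value.Length -gt 256) { $problems.Add("Value for '$name' is longer than 256 characters.") }
    }

    # Emits one string per problem; emits nothing when the table passes.
    $problems
}

# Placeholder hyperparameters; values are given as strings.
$hyperParameters = @{ max_depth = '5'; eta = '0.2'; num_round = '100' }

$issues = @(Get-HyperParameterProblem -HyperParameter $hyperParameters)
if ($issues.Count -eq 0) {
    'Hyperparameters are within the documented limits.'
} else {
    $issues
}
```

A table that passes this check can then be supplied as -HyperParameter $hyperParameters. The check covers only the count and length limits; it does not detect security-sensitive values.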
-InfraCheckConfig_EnableInfraCheck <Boolean>
Enables an infrastructure health check.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-InputDataConfig <Channel[]>
An array of Channel objects. Each channel is a named input source. InputDataConfig describes the input data and its location. Algorithms can accept input data from one or more channels. For example, an algorithm might have two channels of input data, training_data and validation_data. The configuration for each channel provides the S3, EFS, or FSx location where the input data is stored. It also provides information about the stored data: the MIME type, compression method, and whether the data is wrapped in RecordIO format. Depending on the input mode that the algorithm supports, SageMaker either copies input data files from an S3 bucket to a local directory in the Docker container, or makes them available as input streams. For example, if you specify an EFS location, input data files are available as input streams. They do not need to be downloaded. Your input must be in the same Amazon Web Services region as your training job.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
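As an illustration of the two-channel case described for -InputDataConfig, the sketch below builds training_data and validation_data channels backed by Amazon S3. It assumes the AWS.Tools.SageMaker module is loaded so that the Amazon.SageMaker.Model types are available; New-ExampleS3Channel is a hypothetical helper, and the bucket, prefixes, and content type are placeholders.

```powershell
# Assumes the AWS.Tools.SageMaker module is installed, which loads the SDK model types.
Import-Module AWS.Tools.SageMaker

# Hypothetical helper: builds one Channel that reads every object under an S3 prefix.
function New-ExampleS3Channel {
    param(
        [Parameter(Mandatory = $true)][string]$ChannelName,
        [Parameter(Mandatory = $true)][string]$S3Uri,
        [string]$ContentType = 'text/csv'
    )

    $s3Source = New-Object Amazon.SageMaker.Model.S3DataSource
    $s3Source.S3DataType             = 'S3Prefix'         # treat S3Uri as a key prefix
    $s3Source.S3Uri                  = $S3Uri
    $s3Source.S3DataDistributionType = 'FullyReplicated'  # every instance gets the full dataset

    $dataSource = New-Object Amazon.SageMaker.Model.DataSource
    $dataSource.S3DataSource = $s3Source

    $channel = New-Object Amazon.SageMaker.Model.Channel
    $channel.ChannelName     = $ChannelName
    $channel.DataSource      = $dataSource
    $channel.ContentType     = $ContentType               # MIME type of the stored data
    $channel.CompressionType = 'None'
    $channel
}

# Placeholder locations; the data must be in the same Region as the training job.
$channels = @(
    New-ExampleS3Channel -ChannelName 'training_data'   -S3Uri 's3://amzn-s3-demo-bucket/data/train/'
    New-ExampleS3Channel -ChannelName 'validation_data' -S3Uri 's3://amzn-s3-demo-bucket/data/validation/'
)

# Pass the array as: -InputDataConfig $channels
$channels | Select-Object ChannelName, ContentType
```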
-OutputDataConfig <OutputDataConfig>
Specifies the path to the S3 location where you want to store model artifacts. SageMaker creates subfolders for the artifacts.
Required? True
Position? Named
Accept pipeline input? True (ByPropertyName)
-PassThru <SwitchParameter>
Changes the cmdlet behavior to return the value passed to the TrainingJobName parameter. The -PassThru parameter is deprecated; use -Select '^TrainingJobName' instead. This parameter will be removed in a future version.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ProfilerConfig_DisableProfiler <Boolean>
Configuration to turn off Amazon SageMaker Debugger's system monitoring and profiling functionality. To turn it off, set to True.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ProfilerConfig_ProfilingIntervalInMillisecond <Int64>
A time interval for capturing system metrics in milliseconds. Available values are 100, 200, 500, 1000 (1 second), 5000 (5 seconds), and 60000 (1 minute) milliseconds. The default value is 500 milliseconds.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: ProfilerConfig_ProfilingIntervalInMilliseconds
-ProfilerConfig_ProfilingParameter <Hashtable>
Configuration information for capturing framework metrics. Available key strings for different profiling options are DetailedProfilingConfig, PythonProfilingConfig, and DataLoaderProfilingConfig. The following codes are configuration structures for the ProfilingParameters parameter. To learn more about how to configure the ProfilingParameters parameter, see Use the SageMaker and Debugger Configuration API Operations to Create, Update, and Debug Your Training Job.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: ProfilerConfig_ProfilingParameters
-ProfilerConfig_S3OutputPath <String>
Path to Amazon S3 storage location for system and framework metrics.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ProfilerRuleConfiguration <ProfilerRuleConfiguration[]>
Configuration information for Amazon SageMaker Debugger rules for profiling system and framework metrics.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: ProfilerRuleConfigurations
-RemoteDebugConfig_EnableRemoteDebug <Boolean>
If set to True, enables remote debugging.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-ResourceConfig <ResourceConfig>
The resources, including the ML compute instances and ML storage volumes, to use for model training. ML storage volumes store model artifacts and incremental states. Training algorithms might also use ML storage volumes for scratch space. If you want SageMaker to use the ML storage volume to store the training data, choose File as the TrainingInputMode in the algorithm specification. For distributed training algorithms, specify an instance count greater than 1.
Required? True
Position? Named
Accept pipeline input? True (ByPropertyName)
-RetryStrategy_MaximumRetryAttempt <Int32>
The number of times to retry the job. When the job is retried, its SecondaryStatus is changed to STARTING.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: RetryStrategy_MaximumRetryAttempts
-RoleArn <String>
The Amazon Resource Name (ARN) of an IAM role that SageMaker can assume to perform tasks on your behalf. During model training, SageMaker needs your permission to read input data from an S3 bucket, download a Docker image that contains training code, write model artifacts to an S3 bucket, write logs to Amazon CloudWatch Logs, and publish metrics to Amazon CloudWatch. You grant permissions for all of these tasks to an IAM role. For more information, see SageMaker Roles. To be able to pass this role to SageMaker, the caller of this API must have the iam:PassRole permission.
Required? True
Position? Named
Accept pipeline input? True (ByPropertyName)
-Select <String>
Use the -Select parameter to control the cmdlet output. The default value is 'TrainingJobArn'. Specifying -Select '*' will result in the cmdlet returning the whole service response (Amazon.SageMaker.Model.CreateTrainingJobResponse). Specifying the name of a property of type Amazon.SageMaker.Model.CreateTrainingJobResponse will result in that property being returned. Specifying -Select '^ParameterName' will result in the cmdlet returning the selected cmdlet parameter value.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
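The -Select behaviors described above can be sketched as follows. The example assumes $params is a hashtable that already holds the required parameters (TrainingJobName, AlgorithmSpecification, ResourceConfig, OutputDataConfig, RoleArn); that variable is an assumption of this illustration. Because each call creates a job and job names must be unique, only one variant is executed and the others are shown as comments.

```powershell
# Assumption: $params already contains the required parameters for New-SMTrainingJob.

# Return the whole service response instead of only the job ARN.
$response = New-SMTrainingJob @params -Select '*'
$response.TrainingJobArn

# Alternatives (run instead of the call above, not in addition to it):
#   New-SMTrainingJob @params                              # default output: TrainingJobArn
#   New-SMTrainingJob @params -Select 'TrainingJobArn'     # the same property, named explicitly
#   New-SMTrainingJob @params -Select '^TrainingJobName'   # echoes the TrainingJobName parameter value
```

The '^TrainingJobName' form is the documented replacement for the deprecated -PassThru switch.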
-StoppingCondition_MaxPendingTimeInSecond <Int32>
The maximum length of time, in seconds, that a training or compilation job can be pending before it is stopped.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: StoppingCondition_MaxPendingTimeInSeconds
-StoppingCondition_MaxRuntimeInSecond <Int32>
The maximum length of time, in seconds, that a training or compilation job can run before it is stopped. For compilation jobs, if the job does not complete during this time, a TimeOut error is generated. We recommend starting with 900 seconds and increasing as necessary based on your model. For all other jobs, if the job does not complete during this time, SageMaker ends the job. When RetryStrategy is specified in the job request, MaxRuntimeInSeconds specifies the maximum time for all of the attempts in total, not each individual attempt. The default value is 1 day. The maximum value is 28 days. The maximum time that a TrainingJob can run in total, including any time spent publishing metrics or archiving and uploading models after it has been stopped, is 30 days.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: StoppingCondition_MaxRuntimeInSeconds
-StoppingCondition_MaxWaitTimeInSecond <Int32>
The maximum length of time, in seconds, that a managed Spot training job has to complete. It is the amount of time spent waiting for Spot capacity plus the amount of time the job can run. It must be equal to or greater than MaxRuntimeInSeconds. If the job does not complete during this time, SageMaker ends the job. When RetryStrategy is specified in the job request, MaxWaitTimeInSeconds specifies the maximum time for all of the attempts in total, not each individual attempt.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: StoppingCondition_MaxWaitTimeInSeconds
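The relationship between the stopping conditions for managed Spot training can be sketched as a small parameter builder that enforces the rule that the wait time must be equal to or greater than the run time. New-ExampleSpotParameter is a hypothetical helper for this illustration, and the durations and checkpoint location are placeholders.

```powershell
# Hypothetical helper: returns a hashtable of Spot-related parameters for splatting.
function New-ExampleSpotParameter {
    param(
        [int]$MaxRuntimeInSecond  = 3600,
        [int]$MaxWaitTimeInSecond = 7200,
        [string]$CheckpointS3Uri  = 's3://amzn-s3-demo-bucket/checkpoints/'
    )

    # The wait time covers waiting for Spot capacity plus the run itself,
    # so it must not be smaller than the run time.
    if ($MaxWaitTimeInSecond -lt $MaxRuntimeInSecond) {
        throw "MaxWaitTimeInSecond ($MaxWaitTimeInSecond) must be equal to or greater than MaxRuntimeInSecond ($MaxRuntimeInSecond)."
    }

    @{
        EnableManagedSpotTraining             = $true
        StoppingCondition_MaxRuntimeInSecond  = $MaxRuntimeInSecond
        StoppingCondition_MaxWaitTimeInSecond = $MaxWaitTimeInSecond
        CheckpointConfig_S3Uri                = $CheckpointS3Uri
    }
}

$spotParams = New-ExampleSpotParameter -MaxRuntimeInSecond 3600 -MaxWaitTimeInSecond 5400
$spotParams

# Combine with the required parameters when calling the cmdlet, for example:
#   New-SMTrainingJob @requiredParams @spotParams
```

Including a checkpoint location is a design choice of this sketch, on the assumption that an interrupted Spot job should be able to resume from intermediate results.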
-Tag <Tag[]>
An array of key-value pairs. You can use tags to categorize your Amazon Web Services resources in different ways, for example, by purpose, owner, or environment. For more information, see Tagging Amazon Web Services Resources.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: Tags
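A -Tag value is an array of Tag objects rather than a hashtable. The sketch below converts a small table of key-value pairs into that array; it assumes the AWS.Tools.SageMaker module is loaded, and the tag keys and values are placeholders.

```powershell
# Assumes the AWS.Tools.SageMaker module is installed, which loads the SDK model types.
Import-Module AWS.Tools.SageMaker

# Placeholder tags that categorize the job by purpose, owner, and environment.
$tagValues = [ordered]@{
    purpose     = 'experiment'
    owner       = 'ml-team'
    environment = 'dev'
}

# Build one Amazon.SageMaker.Model.Tag per entry.
$tags = foreach ($entry in $tagValues.GetEnumerator()) {
    $tag = New-Object Amazon.SageMaker.Model.Tag
    $tag.Key   = $entry.Key
    $tag.Value = $entry.Value
    $tag
}

# Pass the array as: -Tag $tags
$tags | Select-Object Key, Value
```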
-TensorBoardOutputConfig_LocalPath <String>
Path to local storage location for TensorBoard output. Defaults to /opt/ml/output/tensorboard.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-TensorBoardOutputConfig_S3OutputPath <String>
Path to Amazon S3 storage location for TensorBoard output.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-TrainingJobName <String>
The name of the training job. The name must be unique within an Amazon Web Services Region in an Amazon Web Services account.
Required? True
Position? 1
Accept pipeline input? True (ByValue, ByPropertyName)
-VpcConfig_SecurityGroupId <String[]>
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: VpcConfig_SecurityGroupIds
-VpcConfig_Subnet <String[]>
The IDs of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: VpcConfig_Subnets
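The two VpcConfig parameters are normally supplied together, optionally with -EnableNetworkIsolation. The following sketch assembles them into a hashtable for splatting; the security group and subnet IDs are placeholders, and the prefix check is only a rough sanity test added for this illustration, not a validation performed by the cmdlet.

```powershell
# Placeholder IDs; the security groups must belong to the same VPC as the subnets.
$securityGroupIds = @('sg-0123456789abcdef0')
$subnetIds        = @('subnet-0123456789abcdef0', 'subnet-0fedcba9876543210')

# Rough sanity check on the ID prefixes before building the parameters.
$badGroups  = @($securityGroupIds | Where-Object { $_ -notlike 'sg-*' })
$badSubnets = @($subnetIds        | Where-Object { $_ -notlike 'subnet-*' })
if ($badGroups.Count -gt 0 -or $badSubnets.Count -gt 0) {
    throw "Unexpected ID format: $(($badGroups + $badSubnets) -join ', ')"
}

$vpcParams = @{
    VpcConfig_SecurityGroupId = $securityGroupIds
    VpcConfig_Subnet          = $subnetIds
    EnableNetworkIsolation    = $true   # optional: the container itself gets no network access
}
$vpcParams

# Combine with the required parameters when calling the cmdlet, for example:
#   New-SMTrainingJob @requiredParams @vpcParams
```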

Common Credential and Region Parameters

-AccessKey <String>
The AWS access key for the user account. This can be a temporary access key if the corresponding session token is supplied to the -SessionToken parameter.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: AK
-Credential <AWSCredentials>
An AWSCredentials object instance containing access and secret key information, and optionally a token for session-based credentials.
Required? False
Position? Named
Accept pipeline input? True (ByValue, ByPropertyName)
-EndpointUrl <String>
The endpoint to make the call against. Note: This parameter is primarily for internal AWS use and is not required/should not be specified for normal usage. The cmdlets normally determine which endpoint to call based on the region specified to the -Region parameter or set as default in the shell (via Set-DefaultAWSRegion). Only specify this parameter if you must direct the call to a specific custom endpoint.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-NetworkCredential <PSCredential>
Used with SAML-based authentication when ProfileName references a SAML role profile. Contains the network credentials to be supplied during authentication with the configured identity provider's endpoint. This parameter is not required if the user's default network identity can or should be used during authentication.
Required? False
Position? Named
Accept pipeline input? True (ByValue, ByPropertyName)
-ProfileLocation <String>
Used to specify the name and location of the ini-format credential file (shared with the AWS CLI and other AWS SDKs). If this optional parameter is omitted, this cmdlet will search the encrypted credential file used by the AWS SDK for .NET and AWS Toolkit for Visual Studio first. If the profile is not found, then the cmdlet will search in the ini-format credential file at the default location: (user's home directory)\.aws\credentials. If this parameter is specified, then this cmdlet will only search the ini-format credential file at the location given. As the current folder can vary in a shell or during script execution, it is advised that you specify a fully qualified path instead of a relative path.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: AWSProfilesLocation, ProfilesLocation
-ProfileName <String>
The user-defined name of an AWS credentials or SAML-based role profile containing credential information. The profile is expected to be found in the secure credential file shared with the AWS SDK for .NET and AWS Toolkit for Visual Studio. You can also specify the name of a profile stored in the .ini-format credential file used with the AWS CLI and other AWS SDKs.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: StoredCredentials, AWSProfileName
-Region <Object>
The system name of an AWS region or an AWSRegion instance. This governs the endpoint that will be used when calling service operations. Note that the AWS resources referenced in a call are usually region-specific.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: RegionToCall
-SecretKey <String>
The AWS secret key for the user account. This can be a temporary secret key if the corresponding session token is supplied to the -SessionToken parameter.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: SK, SecretAccessKey
-SessionToken <String>
The session token if the access and secret keys are temporary session-based credentials.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: ST
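As a sketch of how the credential and Region parameters are typically supplied, the example below sets session defaults from a named profile and then calls the cmdlet. The profile name 'example-profile', the Region, and the $params hashtable of required parameters are assumptions of this illustration.

```powershell
# Assumes the AWS.Tools.SageMaker module is installed and that a credential profile
# named 'example-profile' (placeholder) exists in one of the credential files.
Import-Module AWS.Tools.SageMaker

# Session-wide defaults: later cmdlets use these unless overridden per call.
Set-AWSCredential -ProfileName 'example-profile'
Set-DefaultAWSRegion -Region 'us-west-2'

# Assumption: $params already contains the required parameters for New-SMTrainingJob.
New-SMTrainingJob @params

# To override the defaults for a single call instead, pass the common parameters directly:
#   New-SMTrainingJob @params -ProfileName 'other-profile' -Region 'eu-west-1'
```

Using a stored profile avoids placing -AccessKey and -SecretKey values in scripts.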

Outputs

This cmdlet returns a System.String object. The service call response (type Amazon.SageMaker.Model.CreateTrainingJobResponse) can also be referenced from properties attached to the cmdlet entry in the $AWSHistory stack.
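The following sketch shows both ways of reading the result: the string returned by the cmdlet and the full response recorded in $AWSHistory. It assumes $params holds the required parameters and that the LastServiceResponse property of $AWSHistory is available in the installed version of the tools.

```powershell
# Assumption: $params already contains the required parameters for New-SMTrainingJob.

# The cmdlet output is a System.String holding the training job ARN.
$jobArn = New-SMTrainingJob @params
$jobArn.GetType().FullName

# The full service response of the most recent AWS cmdlet call, as recorded in the shell history.
# Assumption: $AWSHistory exposes LastServiceResponse in this version of the tools.
$lastResponse = $AWSHistory.LastServiceResponse
$lastResponse.TrainingJobArn
```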

Supported Version

AWS Tools for PowerShell: 2.x.y.z