CfnDataQualityJobDefinition
- class aws_cdk.aws_sagemaker.CfnDataQualityJobDefinition(scope, id, *, data_quality_app_specification, data_quality_job_input, data_quality_job_output_config, job_resources, role_arn, data_quality_baseline_config=None, endpoint_name=None, job_definition_name=None, network_config=None, stopping_condition=None, tags=None)
Bases:
CfnResource
Creates a definition for a job that monitors data quality and drift.
For information about model monitor, see Amazon SageMaker Model Monitor.
- See:
- CloudformationResource:
AWS::SageMaker::DataQualityJobDefinition
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import CfnTag
from aws_cdk import aws_sagemaker as sagemaker

cfn_data_quality_job_definition = sagemaker.CfnDataQualityJobDefinition(self, "MyCfnDataQualityJobDefinition",
    data_quality_app_specification=sagemaker.CfnDataQualityJobDefinition.DataQualityAppSpecificationProperty(
        image_uri="imageUri",

        # the properties below are optional
        container_arguments=["containerArguments"],
        container_entrypoint=["containerEntrypoint"],
        environment={
            "environment_key": "environment"
        },
        post_analytics_processor_source_uri="postAnalyticsProcessorSourceUri",
        record_preprocessor_source_uri="recordPreprocessorSourceUri"
    ),
    data_quality_job_input=sagemaker.CfnDataQualityJobDefinition.DataQualityJobInputProperty(
        batch_transform_input=sagemaker.CfnDataQualityJobDefinition.BatchTransformInputProperty(
            data_captured_destination_s3_uri="dataCapturedDestinationS3Uri",
            dataset_format=sagemaker.CfnDataQualityJobDefinition.DatasetFormatProperty(
                csv=sagemaker.CfnDataQualityJobDefinition.CsvProperty(
                    header=False
                ),
                json=sagemaker.CfnDataQualityJobDefinition.JsonProperty(
                    line=False
                ),
                parquet=False
            ),
            local_path="localPath",

            # the properties below are optional
            exclude_features_attribute="excludeFeaturesAttribute",
            s3_data_distribution_type="s3DataDistributionType",
            s3_input_mode="s3InputMode"
        ),
        endpoint_input=sagemaker.CfnDataQualityJobDefinition.EndpointInputProperty(
            endpoint_name="endpointName",
            local_path="localPath",

            # the properties below are optional
            exclude_features_attribute="excludeFeaturesAttribute",
            s3_data_distribution_type="s3DataDistributionType",
            s3_input_mode="s3InputMode"
        )
    ),
    data_quality_job_output_config=sagemaker.CfnDataQualityJobDefinition.MonitoringOutputConfigProperty(
        monitoring_outputs=[sagemaker.CfnDataQualityJobDefinition.MonitoringOutputProperty(
            s3_output=sagemaker.CfnDataQualityJobDefinition.S3OutputProperty(
                local_path="localPath",
                s3_uri="s3Uri",

                # the properties below are optional
                s3_upload_mode="s3UploadMode"
            )
        )],

        # the properties below are optional
        kms_key_id="kmsKeyId"
    ),
    job_resources=sagemaker.CfnDataQualityJobDefinition.MonitoringResourcesProperty(
        cluster_config=sagemaker.CfnDataQualityJobDefinition.ClusterConfigProperty(
            instance_count=123,
            instance_type="instanceType",
            volume_size_in_gb=123,

            # the properties below are optional
            volume_kms_key_id="volumeKmsKeyId"
        )
    ),
    role_arn="roleArn",

    # the properties below are optional
    data_quality_baseline_config=sagemaker.CfnDataQualityJobDefinition.DataQualityBaselineConfigProperty(
        baselining_job_name="baseliningJobName",
        constraints_resource=sagemaker.CfnDataQualityJobDefinition.ConstraintsResourceProperty(
            s3_uri="s3Uri"
        ),
        statistics_resource=sagemaker.CfnDataQualityJobDefinition.StatisticsResourceProperty(
            s3_uri="s3Uri"
        )
    ),
    endpoint_name="endpointName",
    job_definition_name="jobDefinitionName",
    network_config=sagemaker.CfnDataQualityJobDefinition.NetworkConfigProperty(
        enable_inter_container_traffic_encryption=False,
        enable_network_isolation=False,
        vpc_config=sagemaker.CfnDataQualityJobDefinition.VpcConfigProperty(
            security_group_ids=["securityGroupIds"],
            subnets=["subnets"]
        )
    ),
    stopping_condition=sagemaker.CfnDataQualityJobDefinition.StoppingConditionProperty(
        max_runtime_in_seconds=123
    ),
    tags=[CfnTag(
        key="key",
        value="value"
    )]
)
- Parameters:
scope (Construct) – Scope in which this resource is defined.
id (str) – Construct identifier for this resource (unique in its scope).
data_quality_app_specification (Union[IResolvable, DataQualityAppSpecificationProperty, Dict[str, Any]]) – Specifies the container that runs the monitoring job.
data_quality_job_input (Union[IResolvable, DataQualityJobInputProperty, Dict[str, Any]]) – A list of inputs for the monitoring job. Currently endpoints are supported as monitoring inputs.
data_quality_job_output_config (Union[IResolvable, MonitoringOutputConfigProperty, Dict[str, Any]]) – The output configuration for monitoring jobs.
job_resources (Union[IResolvable, MonitoringResourcesProperty, Dict[str, Any]]) – Identifies the resources to deploy for a monitoring job.
role_arn (str) – The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.
data_quality_baseline_config (Union[IResolvable, DataQualityBaselineConfigProperty, Dict[str, Any], None]) – Configures the constraints and baselines for the monitoring job.
endpoint_name (Optional[str]) – The name of the endpoint used to run the monitoring job.
job_definition_name (Optional[str]) – The name for the monitoring job definition.
network_config (Union[IResolvable, NetworkConfigProperty, Dict[str, Any], None]) – Specifies networking configuration for the monitoring job.
stopping_condition (Union[IResolvable, StoppingConditionProperty, Dict[str, Any], None]) – A time limit for how long the monitoring job is allowed to run before stopping.
tags (Optional[Sequence[Union[CfnTag, Dict[str, Any]]]]) – An array of key-value pairs to apply to this resource. For more information, see Tag.
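The constructor signature above marks five keyword arguments as required and six as optional. As a rough illustration only (this is not part of the CDK API), the sketch below checks a plain dictionary of arguments against that split before it is passed to the construct. The helper names `missing_required` and `unknown_arguments`, and the sample values, are hypothetical.

```python
# Required and optional keyword arguments, copied from the constructor signature.
REQUIRED = (
    "data_quality_app_specification",
    "data_quality_job_input",
    "data_quality_job_output_config",
    "job_resources",
    "role_arn",
)
OPTIONAL = (
    "data_quality_baseline_config",
    "endpoint_name",
    "job_definition_name",
    "network_config",
    "stopping_condition",
    "tags",
)


def missing_required(props):
    """Return required argument names absent from props, in signature order."""
    return [name for name in REQUIRED if props.get(name) is None]


def unknown_arguments(props):
    """Return argument names the constructor does not accept, sorted."""
    return sorted(set(props) - set(REQUIRED) - set(OPTIONAL))


# Hypothetical, incomplete set of arguments.
draft = {
    "role_arn": "arn:aws:iam::123456789012:role/example-role",
    "job_resources": {"cluster_config": {"instance_count": 1}},
    "endpoint": "my-endpoint",  # typo: the argument is endpoint_name
}
print(missing_required(draft))
print(unknown_arguments(draft))
```

A check like this only catches missing or misspelled top-level names; the nested property types are still validated by the construct itself.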
Methods
- add_deletion_override(path)
Syntactic sugar for addOverride(path, undefined).
- Parameters:
path (str) – The path of the value to delete.
- Return type:
None
- add_dependency(target)
Indicates that this resource depends on another resource and cannot be provisioned unless the other resource has been successfully provisioned.
This can be used for resources across stacks (or nested stack) boundaries and the dependency will automatically be transferred to the relevant scope.
- Parameters:
target (CfnResource) –
- Return type:
None
- add_depends_on(target)
(deprecated) Indicates that this resource depends on another resource and cannot be provisioned unless the other resource has been successfully provisioned.
- Parameters:
target (CfnResource) –
- Deprecated:
use addDependency
- Stability:
deprecated
- Return type:
None
- add_metadata(key, value)
Add a value to the CloudFormation Resource Metadata.
- Parameters:
key (str) –
value (Any) –
- See:
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/metadata-section-structure.html
Note that this is a different set of metadata from CDK node metadata; this metadata ends up in the stack template under the resource, whereas CDK node metadata ends up in the Cloud Assembly.
- Return type:
None
- add_override(path, value)
Adds an override to the synthesized CloudFormation resource.
To add a property override, either use addPropertyOverride or prefix path with “Properties.” (i.e. Properties.TopicName).
If the override is nested, separate each nested level using a dot (.) in the path parameter. If there is an array as part of the nesting, specify the index in the path.
To include a literal . in the property name, prefix it with a \. In most programming languages you will need to write this as "\\." because the \ itself will need to be escaped.
For example:
cfn_resource.add_override("Properties.GlobalSecondaryIndexes.0.Projection.NonKeyAttributes", ["myattribute"])
cfn_resource.add_override("Properties.GlobalSecondaryIndexes.1.ProjectionType", "INCLUDE")
would add the overrides:
"Properties": {
  "GlobalSecondaryIndexes": [
    {
      "Projection": {
        "NonKeyAttributes": [ "myattribute" ]
        ...
      }
      ...
    },
    {
      "ProjectionType": "INCLUDE"
      ...
    },
  ]
  ...
}
The value argument to addOverride will not be processed or translated in any way. Pass raw JSON values in here with the correct capitalization for CloudFormation. If you pass CDK classes or structs, they will be rendered with lowercased key names, and CloudFormation will reject the template.
- Parameters:
path (str) – The path of the property. You can use dot notation to override values in complex types. Any intermediate keys will be created as needed.
value (Any) – The value. Could be primitive or complex.
- Return type:
None
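The path rules described for add_override (dot-separated levels, numeric array indexes, and \. for a literal dot) can be sketched in plain Python. This is an illustration of the addressing scheme only, not the CDK implementation: the function names are hypothetical, and the sketch assumes that any array named in the path already exists in the template.

```python
import re


def split_override_path(path):
    """Split on unescaped dots; a backslash-escaped dot becomes a literal dot."""
    parts = re.split(r"(?<!\\)\.", path)
    return [part.replace("\\.", ".") for part in parts]


def apply_override(template, path, value):
    """Set value at path inside nested dicts/lists, creating missing dicts."""
    keys = split_override_path(path)
    node = template
    for key in keys[:-1]:
        if isinstance(node, list):
            node = node[int(key)]  # numeric path segment indexes an existing array
        else:
            node = node.setdefault(key, {})  # intermediate keys created as needed
    last = keys[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value
    return template


resource = {"Properties": {"GlobalSecondaryIndexes": [{"Projection": {}}, {}]}}
apply_override(
    resource,
    "Properties.GlobalSecondaryIndexes.0.Projection.NonKeyAttributes",
    ["myattribute"],
)
apply_override(resource, "Properties.GlobalSecondaryIndexes.1.ProjectionType", "INCLUDE")
apply_override(resource, "Metadata.my\\.key", "value")  # literal dot in the key name
print(resource)
```

The value is stored exactly as given, which mirrors the note above that override values are not translated and must already use CloudFormation capitalization.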
- add_property_deletion_override(property_path)
Adds an override that deletes the value of a property from the resource definition.
- Parameters:
property_path (str) – The path to the property.
- Return type:
None
- add_property_override(property_path, value)
Adds an override to a resource property.
Syntactic sugar for addOverride("Properties.<...>", value).
- Parameters:
property_path (str) – The path of the property.
value (Any) – The value.
- Return type:
None
- apply_removal_policy(policy=None, *, apply_to_update_replace_policy=None, default=None)
Sets the deletion policy of the resource based on the removal policy specified.
The Removal Policy controls what happens to this resource when it stops being managed by CloudFormation, either because you’ve removed it from the CDK application or because you’ve made a change that requires the resource to be replaced.
The resource can be deleted (RemovalPolicy.DESTROY), or left in your AWS account for data recovery and cleanup later (RemovalPolicy.RETAIN). In some cases, a snapshot can be taken of the resource prior to deletion (RemovalPolicy.SNAPSHOT). A list of resources that support this policy can be found in the following link:
- Parameters:
policy (Optional[RemovalPolicy]) –
apply_to_update_replace_policy (Optional[bool]) – Apply the same deletion policy to the resource’s “UpdateReplacePolicy”. Default: true
default (Optional[RemovalPolicy]) – The default policy to apply in case the removal policy is not defined. Default: - Default value is resource specific. To determine the default value for a resource, please consult that specific resource’s documentation.
- See:
- Return type:
None
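To make the relationship between the removal policy and the resulting resource attributes concrete, here is a minimal sketch in plain Python. It assumes the usual CloudFormation attribute values (Delete, Retain, Snapshot); the `RemovalPolicy` enum is a stand-in for the CDK one, limited to the three values named above, and `resource_policies` is a hypothetical helper. The RETAIN fallback is an assumption made for the example, since the real default is resource specific.

```python
from enum import Enum


class RemovalPolicy(Enum):
    """Stand-in for the CDK enum, limited to the three values named above."""
    DESTROY = "destroy"
    RETAIN = "retain"
    SNAPSHOT = "snapshot"


# CloudFormation DeletionPolicy / UpdateReplacePolicy attribute values.
_CFN_POLICY = {
    RemovalPolicy.DESTROY: "Delete",
    RemovalPolicy.RETAIN: "Retain",
    RemovalPolicy.SNAPSHOT: "Snapshot",
}


def resource_policies(policy=None, *, apply_to_update_replace_policy=True,
                      default=RemovalPolicy.RETAIN):
    """Return the resource-level attributes implied by a removal policy.

    The RETAIN fallback is an assumption for this sketch only.
    """
    effective = policy if policy is not None else default
    attributes = {"DeletionPolicy": _CFN_POLICY[effective]}
    if apply_to_update_replace_policy:
        attributes["UpdateReplacePolicy"] = _CFN_POLICY[effective]
    return attributes


print(resource_policies(RemovalPolicy.DESTROY))
print(resource_policies(RemovalPolicy.SNAPSHOT, apply_to_update_replace_policy=False))
```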
- get_att(attribute_name, type_hint=None)
Returns a token for a runtime attribute of this resource.
Ideally, use generated attribute accessors (e.g. resource.arn), but this can be used for future compatibility in case there is no generated attribute.
- Parameters:
attribute_name (str) – The name of the attribute.
type_hint (Optional[ResolutionTypeHint]) –
- Return type:
- get_metadata(key)
Retrieve a value from the CloudFormation Resource Metadata.
- Parameters:
key (str) –
- See:
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/metadata-section-structure.html
Note that this is a different set of metadata from CDK node metadata; this metadata ends up in the stack template under the resource, whereas CDK node metadata ends up in the Cloud Assembly.
- Return type:
Any
- inspect(inspector)
Examines the CloudFormation resource and discloses attributes.
- Parameters:
inspector (TreeInspector) – tree inspector to collect and process attributes.
- Return type:
None
- obtain_dependencies()
Retrieves an array of resources this resource depends on.
This assembles dependencies on resources across stacks (including nested stacks) automatically.
- Return type:
List[Union[Stack, CfnResource]]
- obtain_resource_dependencies()
Get a shallow copy of dependencies between this resource and other resources in the same stack.
- Return type:
List[CfnResource]
- override_logical_id(new_logical_id)
Overrides the auto-generated logical ID with a specific ID.
- Parameters:
new_logical_id (str) – The new logical ID to use for this stack element.
- Return type:
None
- remove_dependency(target)
Indicates that this resource no longer depends on another resource.
This can be used for resources across stacks (including nested stacks) and the dependency will automatically be removed from the relevant scope.
- Parameters:
target (CfnResource) –
- Return type:
None
- replace_dependency(target, new_target)
Replaces one dependency with another.
- Parameters:
target (CfnResource) – The dependency to replace.
new_target (CfnResource) – The new dependency to add.
- Return type:
None
- to_string()
Returns a string representation of this construct.
- Return type:
str
- Returns:
a string representation of this resource
Attributes
- CFN_RESOURCE_TYPE_NAME = 'AWS::SageMaker::DataQualityJobDefinition'
- attr_creation_time
The time when the job definition was created.
- CloudformationAttribute:
CreationTime
- attr_job_definition_arn
The Amazon Resource Name (ARN) of the job definition.
- CloudformationAttribute:
JobDefinitionArn
- cfn_options
Options for this resource, such as condition, update policy etc.
- cfn_resource_type
AWS resource type.
- creation_stack
- Returns:
the stack trace of the point where this Resource was created from, sourced from the metadata entry typed aws:cdk:logicalId, and with the bottom-most node internal entries filtered.
- data_quality_app_specification
Specifies the container that runs the monitoring job.
- data_quality_baseline_config
Configures the constraints and baselines for the monitoring job.
- data_quality_job_input
A list of inputs for the monitoring job.
- data_quality_job_output_config
The output configuration for monitoring jobs.
- endpoint_name
The name of the endpoint used to run the monitoring job.
- job_definition_name
The name for the monitoring job definition.
- job_resources
Identifies the resources to deploy for a monitoring job.
- logical_id
The logical ID for this CloudFormation stack element.
The logical ID of the element is calculated from the path of the resource node in the construct tree.
To override this value, use overrideLogicalId(newLogicalId).
- Returns:
the logical ID as a stringified token. This value will only get resolved during synthesis.
- network_config
Specifies networking configuration for the monitoring job.
- node
The tree node.
- ref
Return a string that will be resolved to a CloudFormation { Ref } for this element.
If, by any chance, the intrinsic reference of a resource is not a string, you could coerce it to an IResolvable through Lazy.any({ produce: resource.ref }).
- role_arn
The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.
- stack
The stack in which this element is defined.
CfnElements must be defined within a stack scope (directly or indirectly).
- stopping_condition
A time limit for how long the monitoring job is allowed to run before stopping.
- tags
Tag Manager which manages the tags for this resource.
- tags_raw
An array of key-value pairs to apply to this resource.
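The ref, attr_creation_time, and attr_job_definition_arn attributes above are tokens that resolve at synthesis to CloudFormation intrinsic functions. The sketch below shows, as plain dictionaries, the template shapes those tokens are expected to resolve to. The logical ID "MyJobDefinition" and the helper names are hypothetical.

```python
import json


def ref(logical_id):
    """Template form of a resource reference."""
    return {"Ref": logical_id}


def get_att(logical_id, attribute_name):
    """Template form of a runtime attribute lookup."""
    return {"Fn::GetAtt": [logical_id, attribute_name]}


# "MyJobDefinition" is a hypothetical logical ID for this illustration.
outputs = {
    "Outputs": {
        "JobDefinitionRef": {"Value": ref("MyJobDefinition")},
        "JobDefinitionArn": {"Value": get_att("MyJobDefinition", "JobDefinitionArn")},
        "CreationTime": {"Value": get_att("MyJobDefinition", "CreationTime")},
    }
}
print(json.dumps(outputs, indent=2))
```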
Static Methods
- classmethod is_cfn_element(x)
Returns true if a construct is a stack element (i.e. part of the synthesized CloudFormation template).
Uses duck-typing instead of instanceof to allow stack elements from different versions of this library to be included in the same stack.
- Parameters:
x (Any) –
- Return type:
bool
- Returns:
The construct as a stack element or undefined if it is not a stack element.
- classmethod is_cfn_resource(x)
Check whether the given object is a CfnResource.
- Parameters:
x (Any) –
- Return type:
bool
- classmethod is_construct(x)
Checks if x is a construct.
Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.
Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof and to use this type-testing method instead.
- Parameters:
x (Any) – Any object.
- Return type:
bool
- Returns:
true if x is an object created from a class which extends Construct.
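The explanation above concerns JavaScript's instanceof, but the same failure mode can be reproduced in Python as an analogy. The sketch below simulates two independent copies of a Construct class and contrasts a class-identity check with a duck-typed marker check. The marker attribute name and the helper functions are hypothetical and are not how the CDK implements is_construct.

```python
def make_construct_class():
    """Simulate loading an independent copy of a library that defines Construct."""
    class Construct:
        # Marker attribute used for duck-typed detection (hypothetical name).
        _is_construct_marker = True
    return Construct


# Two "copies" of the same library: same source, distinct class objects.
ConstructCopyA = make_construct_class()
ConstructCopyB = make_construct_class()


def is_construct(x):
    """Duck-typed check that ignores which copy of the class created x."""
    return getattr(x, "_is_construct_marker", False) is True


obj = ConstructCopyA()
print(isinstance(obj, ConstructCopyB))  # False: class identity differs per copy
print(is_construct(obj))                # True: the marker is present on either copy
print(is_construct("not a construct"))  # False
```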
BatchTransformInputProperty
- class CfnDataQualityJobDefinition.BatchTransformInputProperty(*, data_captured_destination_s3_uri, dataset_format, local_path, exclude_features_attribute=None, s3_data_distribution_type=None, s3_input_mode=None)
Bases:
object
Input object for the batch transform job.
- Parameters:
data_captured_destination_s3_uri (str) – The Amazon S3 location being used to capture the data.
dataset_format (Union[IResolvable, DatasetFormatProperty, Dict[str, Any]]) – The dataset format for your batch transform job.
local_path (str) – Path to the filesystem where the batch transform data is available to the container.
exclude_features_attribute (Optional[str]) – The attributes of the input data to exclude from the analysis.
s3_data_distribution_type (Optional[str]) – Whether input data distributed in Amazon S3 is fully replicated or sharded by an S3 key. Defaults to FullyReplicated.
s3_input_mode (Optional[str]) – Whether Pipe or File is used as the input mode for transferring data for the monitoring job. Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

batch_transform_input_property = sagemaker.CfnDataQualityJobDefinition.BatchTransformInputProperty(
    data_captured_destination_s3_uri="dataCapturedDestinationS3Uri",
    dataset_format=sagemaker.CfnDataQualityJobDefinition.DatasetFormatProperty(
        csv=sagemaker.CfnDataQualityJobDefinition.CsvProperty(
            header=False
        ),
        json=sagemaker.CfnDataQualityJobDefinition.JsonProperty(
            line=False
        ),
        parquet=False
    ),
    local_path="localPath",

    # the properties below are optional
    exclude_features_attribute="excludeFeaturesAttribute",
    s3_data_distribution_type="s3DataDistributionType",
    s3_input_mode="s3InputMode"
)
Attributes
- data_captured_destination_s3_uri
The Amazon S3 location being used to capture the data.
- dataset_format
The dataset format for your batch transform job.
- exclude_features_attribute
The attributes of the input data to exclude from the analysis.
- local_path
Path to the filesystem where the batch transform data is available to the container.
- s3_data_distribution_type
Whether input data distributed in Amazon S3 is fully replicated or sharded by an S3 key.
Defaults to FullyReplicated.
- s3_input_mode
Whether Pipe or File is used as the input mode for transferring data for the monitoring job.
Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.
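The two optional string settings above have documented defaults (File and FullyReplicated). A minimal sketch of resolving and checking them before building the property follows. The helper name is hypothetical, and the sharded value "ShardedByS3Key" is an assumption taken from the SageMaker API naming rather than from this page.

```python
S3_INPUT_MODES = ("Pipe", "File")
# "ShardedByS3Key" is assumed here; this page only says "sharded by an S3 key".
S3_DATA_DISTRIBUTION_TYPES = ("FullyReplicated", "ShardedByS3Key")


def resolve_input_settings(s3_input_mode=None, s3_data_distribution_type=None):
    """Apply the documented defaults and reject values outside the known sets."""
    mode = s3_input_mode if s3_input_mode is not None else "File"
    distribution = (
        s3_data_distribution_type
        if s3_data_distribution_type is not None
        else "FullyReplicated"
    )
    if mode not in S3_INPUT_MODES:
        raise ValueError(f"s3_input_mode must be one of {S3_INPUT_MODES}, got {mode!r}")
    if distribution not in S3_DATA_DISTRIBUTION_TYPES:
        raise ValueError(
            f"s3_data_distribution_type must be one of {S3_DATA_DISTRIBUTION_TYPES}, "
            f"got {distribution!r}"
        )
    return {"s3_input_mode": mode, "s3_data_distribution_type": distribution}


print(resolve_input_settings())
print(resolve_input_settings(s3_input_mode="Pipe"))
```

The returned dictionary uses the same keyword names as the property constructor, so it can be unpacked alongside the required arguments.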
ClusterConfigProperty
- class CfnDataQualityJobDefinition.ClusterConfigProperty(*, instance_count, instance_type, volume_size_in_gb, volume_kms_key_id=None)
Bases:
object
The configuration for the cluster of resources used to run the processing job.
- Parameters:
instance_count (Union[int, float]) – The number of ML compute instances to use in the model monitoring job. For distributed processing jobs, specify a value greater than 1. The default value is 1.
instance_type (str) – The ML compute instance type for the processing job.
volume_size_in_gb (Union[int, float]) – The size of the ML storage volume, in gigabytes, that you want to provision. You must specify sufficient ML storage for your scenario.
volume_kms_key_id (Optional[str]) – The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s) that run the model monitoring job.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

cluster_config_property = sagemaker.CfnDataQualityJobDefinition.ClusterConfigProperty(
    instance_count=123,
    instance_type="instanceType",
    volume_size_in_gb=123,

    # the properties below are optional
    volume_kms_key_id="volumeKmsKeyId"
)
Attributes
- instance_count
The number of ML compute instances to use in the model monitoring job.
For distributed processing jobs, specify a value greater than 1. The default value is 1.
- instance_type
The ML compute instance type for the processing job.
- volume_kms_key_id
The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s) that run the model monitoring job.
- volume_size_in_gb
The size of the ML storage volume, in gigabytes, that you want to provision.
You must specify sufficient ML storage for your scenario.
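As a small illustration of the cluster settings described above, the sketch below assembles the keyword arguments for a cluster configuration and applies basic sanity checks. The helper name is hypothetical, the lower bounds (at least one instance, a positive volume size) are assumptions for the example, and "ml.m5.xlarge" is only a sample instance type.

```python
def cluster_config_kwargs(instance_type, volume_size_in_gb, instance_count=1,
                          volume_kms_key_id=None):
    """Build keyword arguments for a cluster configuration.

    instance_count falls back to 1, matching the default described above.
    The bounds checked here are assumptions, not documented limits.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    if volume_size_in_gb <= 0:
        raise ValueError("volume_size_in_gb must be positive")
    kwargs = {
        "instance_count": instance_count,
        "instance_type": instance_type,
        "volume_size_in_gb": volume_size_in_gb,
    }
    if volume_kms_key_id is not None:
        kwargs["volume_kms_key_id"] = volume_kms_key_id  # optional encryption key
    return kwargs


print(cluster_config_kwargs("ml.m5.xlarge", 20))
print(cluster_config_kwargs("ml.m5.xlarge", 50, instance_count=2))
```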
ConstraintsResourceProperty
- class CfnDataQualityJobDefinition.ConstraintsResourceProperty(*, s3_uri=None)
Bases:
object
The constraints resource for a monitoring job.
- Parameters:
s3_uri (Optional[str]) – The Amazon S3 URI for the constraints resource.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

constraints_resource_property = sagemaker.CfnDataQualityJobDefinition.ConstraintsResourceProperty(
    s3_uri="s3Uri"
)
Attributes
- s3_uri
The Amazon S3 URI for the constraints resource.
CsvProperty
- class CfnDataQualityJobDefinition.CsvProperty(*, header=None)
Bases:
object
The CSV format.
- Parameters:
header (Union[bool, IResolvable, None]) – A boolean flag indicating whether the given CSV has a header.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

csv_property = sagemaker.CfnDataQualityJobDefinition.CsvProperty(
    header=False
)
Attributes
- header
A boolean flag indicating whether the given CSV has a header.
DataQualityAppSpecificationProperty
- class CfnDataQualityJobDefinition.DataQualityAppSpecificationProperty(*, image_uri, container_arguments=None, container_entrypoint=None, environment=None, post_analytics_processor_source_uri=None, record_preprocessor_source_uri=None)
Bases:
object
Information about the container that a data quality monitoring job runs.
- Parameters:
image_uri (str) – The container image that the data quality monitoring job runs.
container_arguments (Optional[Sequence[str]]) – The arguments to send to the container that the monitoring job runs.
container_entrypoint (Optional[Sequence[str]]) – The entrypoint for a container used to run a monitoring job.
environment (Union[IResolvable, Mapping[str, str], None]) – Sets the environment variables in the container that the monitoring job runs.
post_analytics_processor_source_uri (Optional[str]) – An Amazon S3 URI to a script that is called after analysis has been performed. Applicable only for the built-in (first party) containers.
record_preprocessor_source_uri (Optional[str]) – An Amazon S3 URI to a script that is called per row prior to running analysis. It can base64 decode the payload and convert it into a flattened JSON so that the built-in container can use the converted data. Applicable only for the built-in (first party) containers.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

data_quality_app_specification_property = sagemaker.CfnDataQualityJobDefinition.DataQualityAppSpecificationProperty(
    image_uri="imageUri",

    # the properties below are optional
    container_arguments=["containerArguments"],
    container_entrypoint=["containerEntrypoint"],
    environment={
        "environment_key": "environment"
    },
    post_analytics_processor_source_uri="postAnalyticsProcessorSourceUri",
    record_preprocessor_source_uri="recordPreprocessorSourceUri"
)
Attributes
- container_arguments
The arguments to send to the container that the monitoring job runs.
- container_entrypoint
The entrypoint for a container used to run a monitoring job.
- environment
Sets the environment variables in the container that the monitoring job runs.
- image_uri
The container image that the data quality monitoring job runs.
- post_analytics_processor_source_uri
An Amazon S3 URI to a script that is called after analysis has been performed.
Applicable only for the built-in (first party) containers.
- record_preprocessor_source_uri
An Amazon S3 URI to a script that is called per row prior to running analysis.
It can base64 decode the payload and convert it into a flattened JSON so that the built-in container can use the converted data. Applicable only for the built-in (first party) containers.
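The record preprocessor described above is said to base64 decode a payload and convert it into flattened JSON. The sketch below shows that transformation in isolation, as one plausible reading of "flattened": nested keys joined with dots. The function names and the sample payload are hypothetical, and the sketch does not follow the handler signature that the built-in container expects, so consult the Model Monitor documentation for the real script contract.

```python
import base64
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a single-level dict with dotted keys."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}{index}."))
    else:
        flat[prefix[:-1]] = value  # drop the trailing dot from the prefix
    return flat


def preprocess_record(encoded_payload):
    """Decode a base64 JSON payload and return it as a flat dictionary."""
    decoded = base64.b64decode(encoded_payload).decode("utf-8")
    return flatten(json.loads(decoded))


# Hypothetical captured payload for illustration.
sample = {"user": {"age": 42, "tags": ["a", "b"]}, "score": 0.5}
payload = base64.b64encode(json.dumps(sample).encode("utf-8"))
print(preprocess_record(payload))
```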
DataQualityBaselineConfigProperty
- class CfnDataQualityJobDefinition.DataQualityBaselineConfigProperty(*, baselining_job_name=None, constraints_resource=None, statistics_resource=None)
Bases:
object
Configuration for monitoring constraints and monitoring statistics.
These baseline resources are compared against the results of the current job from the series of jobs scheduled to collect data periodically.
- Parameters:
baselining_job_name (Optional[str]) – The name of the job that performs baselining for the data quality monitoring job.
constraints_resource (Union[IResolvable, ConstraintsResourceProperty, Dict[str, Any], None]) – The constraints resource for a monitoring job.
statistics_resource (Union[IResolvable, StatisticsResourceProperty, Dict[str, Any], None]) – Configuration for monitoring constraints and monitoring statistics. These baseline resources are compared against the results of the current job from the series of jobs scheduled to collect data periodically.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

data_quality_baseline_config_property = sagemaker.CfnDataQualityJobDefinition.DataQualityBaselineConfigProperty(
    baselining_job_name="baseliningJobName",
    constraints_resource=sagemaker.CfnDataQualityJobDefinition.ConstraintsResourceProperty(
        s3_uri="s3Uri"
    ),
    statistics_resource=sagemaker.CfnDataQualityJobDefinition.StatisticsResourceProperty(
        s3_uri="s3Uri"
    )
)
Attributes
- baselining_job_name
The name of the job that performs baselining for the data quality monitoring job.
- constraints_resource
The constraints resource for a monitoring job.
- statistics_resource
Configuration for monitoring constraints and monitoring statistics.
These baseline resources are compared against the results of the current job from the series of jobs scheduled to collect data periodically.
DataQualityJobInputProperty
- class CfnDataQualityJobDefinition.DataQualityJobInputProperty(*, batch_transform_input=None, endpoint_input=None)
Bases:
object
The input for the data quality monitoring job.
Currently endpoints are supported for input.
- Parameters:
batch_transform_input (Union[IResolvable, BatchTransformInputProperty, Dict[str, Any], None]) – Input object for the batch transform job.
endpoint_input (Union[IResolvable, EndpointInputProperty, Dict[str, Any], None]) – Input object for the endpoint.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

data_quality_job_input_property = sagemaker.CfnDataQualityJobDefinition.DataQualityJobInputProperty(
    batch_transform_input=sagemaker.CfnDataQualityJobDefinition.BatchTransformInputProperty(
        data_captured_destination_s3_uri="dataCapturedDestinationS3Uri",
        dataset_format=sagemaker.CfnDataQualityJobDefinition.DatasetFormatProperty(
            csv=sagemaker.CfnDataQualityJobDefinition.CsvProperty(
                header=False
            ),
            json=sagemaker.CfnDataQualityJobDefinition.JsonProperty(
                line=False
            ),
            parquet=False
        ),
        local_path="localPath",

        # the properties below are optional
        exclude_features_attribute="excludeFeaturesAttribute",
        s3_data_distribution_type="s3DataDistributionType",
        s3_input_mode="s3InputMode"
    ),
    endpoint_input=sagemaker.CfnDataQualityJobDefinition.EndpointInputProperty(
        endpoint_name="endpointName",
        local_path="localPath",

        # the properties below are optional
        exclude_features_attribute="excludeFeaturesAttribute",
        s3_data_distribution_type="s3DataDistributionType",
        s3_input_mode="s3InputMode"
    )
)
Attributes
- batch_transform_input
Input object for the batch transform job.
DatasetFormatProperty
- class CfnDataQualityJobDefinition.DatasetFormatProperty(*, csv=None, json=None, parquet=None)
Bases:
object
The dataset format of the data to monitor.
- Parameters:
csv (Union[IResolvable, CsvProperty, Dict[str, Any], None]) – The CSV format.
json (Union[IResolvable, JsonProperty, Dict[str, Any], None]) – The JSON format.
parquet (Union[bool, IResolvable, None]) – A flag indicating whether the dataset format is Parquet.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

dataset_format_property = sagemaker.CfnDataQualityJobDefinition.DatasetFormatProperty(
    csv=sagemaker.CfnDataQualityJobDefinition.CsvProperty(
        header=False
    ),
    json=sagemaker.CfnDataQualityJobDefinition.JsonProperty(
        line=False
    ),
    parquet=False
)
Attributes
- csv
The CSV format.
- json
The JSON format.
- parquet
A flag indicating whether the dataset format is Parquet.
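The generated example sets csv, json, and parquet together because it lists every field with placeholder values. In practice a dataset has one format, so a caller would likely set only one of them. The sketch below builds the dictionary form of the dataset format for a single chosen format. The helper name is hypothetical, and the one-format-at-a-time rule is an assumption of this sketch rather than something stated on this page.

```python
def dataset_format_kwargs(kind, *, header=None, line=None):
    """Return a dataset-format dict with exactly one of csv, json, or parquet set.

    The one-format rule is an assumption made for this sketch.
    """
    if kind == "csv":
        return {"csv": {} if header is None else {"header": header}}
    if kind == "json":
        return {"json": {} if line is None else {"line": line}}
    if kind == "parquet":
        return {"parquet": True}
    raise ValueError(f"unknown dataset format: {kind!r}")


print(dataset_format_kwargs("csv", header=True))
print(dataset_format_kwargs("json", line=True))
print(dataset_format_kwargs("parquet"))
```

The keys match the keyword names of DatasetFormatProperty, CsvProperty, and JsonProperty shown above.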
EndpointInputProperty
- class CfnDataQualityJobDefinition.EndpointInputProperty(*, endpoint_name, local_path, exclude_features_attribute=None, s3_data_distribution_type=None, s3_input_mode=None)
Bases:
object
Input object for the endpoint.
- Parameters:
endpoint_name (str) – An endpoint in the customer’s account that has DataCaptureConfig enabled.
local_path (str) – Path to the filesystem where the endpoint data is available to the container.
exclude_features_attribute (Optional[str]) – The attributes of the input data to exclude from the analysis.
s3_data_distribution_type (Optional[str]) – Whether input data distributed in Amazon S3 is fully replicated or sharded by an Amazon S3 key. Defaults to FullyReplicated.
s3_input_mode (Optional[str]) – Whether Pipe or File is used as the input mode for transferring data for the monitoring job. Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

endpoint_input_property = sagemaker.CfnDataQualityJobDefinition.EndpointInputProperty(
    endpoint_name="endpointName",
    local_path="localPath",

    # the properties below are optional
    exclude_features_attribute="excludeFeaturesAttribute",
    s3_data_distribution_type="s3DataDistributionType",
    s3_input_mode="s3InputMode"
)
Attributes
- endpoint_name
An endpoint in the customer’s account that has DataCaptureConfig enabled.
- exclude_features_attribute
The attributes of the input data to exclude from the analysis.
- local_path
Path to the filesystem where the endpoint data is available to the container.
- s3_data_distribution_type
Whether input data distributed in Amazon S3 is fully replicated or sharded by an Amazon S3 key.
Defaults to FullyReplicated.
- s3_input_mode
Whether Pipe or File is used as the input mode for transferring data for the monitoring job.
Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.
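The optional string parameters above accept only a few meaningful values, so it can help to check them before synthesis. The following is a minimal sketch, not part of the CDK API: `endpoint_input_kwargs` is a hypothetical helper, and the value `ShardedByS3Key` is an assumption based on the SageMaker API (only `Pipe`, `File`, and `FullyReplicated` are named in the text above).

```python
from typing import Any, Dict, Optional

# Assumed sets of accepted values. "ShardedByS3Key" is not named in the
# text above and is an assumption taken from the SageMaker API.
INPUT_MODES = {"Pipe", "File"}
DISTRIBUTION_TYPES = {"FullyReplicated", "ShardedByS3Key"}


def endpoint_input_kwargs(
    endpoint_name: str,
    local_path: str,
    s3_input_mode: Optional[str] = None,
    s3_data_distribution_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for EndpointInputProperty."""
    if s3_input_mode is not None and s3_input_mode not in INPUT_MODES:
        raise ValueError(f"s3_input_mode must be one of {sorted(INPUT_MODES)}")
    if (
        s3_data_distribution_type is not None
        and s3_data_distribution_type not in DISTRIBUTION_TYPES
    ):
        raise ValueError(
            f"s3_data_distribution_type must be one of {sorted(DISTRIBUTION_TYPES)}"
        )

    kwargs: Dict[str, Any] = {"endpoint_name": endpoint_name, "local_path": local_path}
    # Omit unset optional values so the documented defaults apply.
    if s3_input_mode is not None:
        kwargs["s3_input_mode"] = s3_input_mode
    if s3_data_distribution_type is not None:
        kwargs["s3_data_distribution_type"] = s3_data_distribution_type
    return kwargs


kwargs = endpoint_input_kwargs(
    "my-endpoint", "/opt/ml/processing/input", s3_input_mode="Pipe"
)
print(kwargs)
```

In a CDK app, the resulting dictionary could be unpacked into the constructor as `sagemaker.CfnDataQualityJobDefinition.EndpointInputProperty(**kwargs)`, since all of its parameters are keyword-only.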
JsonProperty
- class CfnDataQualityJobDefinition.JsonProperty(*, line=None)
Bases:
object
The JSON format.
- Parameters:
line (Union[bool, IResolvable, None]) – A boolean flag indicating whether the data is in JSON Lines format.
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

json_property = sagemaker.CfnDataQualityJobDefinition.JsonProperty(
    line=False
)
Attributes
- line
A boolean flag indicating whether the data is in JSON Lines format.
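To illustrate what the flag distinguishes, the sketch below parses the same two records stored once as JSON Lines (one JSON value per line) and once as a single JSON document. The sample records are made up for illustration, and how SageMaker itself reads each layout is not described here.

```python
import json

# Hypothetical sample data: the same records in two layouts.
json_lines_text = '{"feature": 1.0}\n{"feature": 2.5}\n'
json_document_text = '[{"feature": 1.0}, {"feature": 2.5}]'

# JSON Lines: each non-empty line is parsed as its own JSON value.
records_from_lines = [
    json.loads(line) for line in json_lines_text.splitlines() if line.strip()
]

# Single JSON document: the whole text is parsed in one call.
records_from_document = json.loads(json_document_text)

print(records_from_lines == records_from_document)
```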
MonitoringOutputConfigProperty
- class CfnDataQualityJobDefinition.MonitoringOutputConfigProperty(*, monitoring_outputs, kms_key_id=None)
Bases:
object
The output configuration for monitoring jobs.
- Parameters:
monitoring_outputs (Union[IResolvable, Sequence[Union[IResolvable, MonitoringOutputProperty, Dict[str, Any]]]]) – Monitoring outputs for monitoring jobs. This is where the output of the periodic monitoring jobs is uploaded.
kms_key_id (Optional[str]) – The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt the model artifacts at rest using Amazon S3 server-side encryption.
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

monitoring_output_config_property = sagemaker.CfnDataQualityJobDefinition.MonitoringOutputConfigProperty(
    monitoring_outputs=[sagemaker.CfnDataQualityJobDefinition.MonitoringOutputProperty(
        s3_output=sagemaker.CfnDataQualityJobDefinition.S3OutputProperty(
            local_path="localPath",
            s3_uri="s3Uri",

            # the properties below are optional
            s3_upload_mode="s3UploadMode"
        )
    )],

    # the properties below are optional
    kms_key_id="kmsKeyId"
)
Attributes
- kms_key_id
The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt the model artifacts at rest using Amazon S3 server-side encryption.
- monitoring_outputs
Monitoring outputs for monitoring jobs.
This is where the output of the periodic monitoring jobs is uploaded.
MonitoringOutputProperty
- class CfnDataQualityJobDefinition.MonitoringOutputProperty(*, s3_output)
Bases:
object
The output object for a monitoring job.
- Parameters:
s3_output (Union[IResolvable, S3OutputProperty, Dict[str, Any]]) – The Amazon S3 storage location where the results of a monitoring job are saved.
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

monitoring_output_property = sagemaker.CfnDataQualityJobDefinition.MonitoringOutputProperty(
    s3_output=sagemaker.CfnDataQualityJobDefinition.S3OutputProperty(
        local_path="localPath",
        s3_uri="s3Uri",

        # the properties below are optional
        s3_upload_mode="s3UploadMode"
    )
)
Attributes
- s3_output
The Amazon S3 storage location where the results of a monitoring job are saved.
MonitoringResourcesProperty
- class CfnDataQualityJobDefinition.MonitoringResourcesProperty(*, cluster_config)
Bases:
object
Identifies the resources to deploy for a monitoring job.
- Parameters:
cluster_config (Union[IResolvable, ClusterConfigProperty, Dict[str, Any]]) – The configuration for the cluster resources used to run the processing job.
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

monitoring_resources_property = sagemaker.CfnDataQualityJobDefinition.MonitoringResourcesProperty(
    cluster_config=sagemaker.CfnDataQualityJobDefinition.ClusterConfigProperty(
        instance_count=123,
        instance_type="instanceType",
        volume_size_in_gb=123,

        # the properties below are optional
        volume_kms_key_id="volumeKmsKeyId"
    )
)
Attributes
- cluster_config
The configuration for the cluster resources used to run the processing job.
NetworkConfigProperty
- class CfnDataQualityJobDefinition.NetworkConfigProperty(*, enable_inter_container_traffic_encryption=None, enable_network_isolation=None, vpc_config=None)
Bases:
object
Networking options for a job, such as network traffic encryption between containers, whether to allow inbound and outbound network calls to and from containers, and the VPC subnets and security groups to use for VPC-enabled jobs.
- Parameters:
enable_inter_container_traffic_encryption (Union[bool, IResolvable, None]) – Whether to encrypt all communications between distributed processing jobs. Choose True to encrypt communications. Encryption provides greater security for distributed processing jobs, but the processing might take longer.
enable_network_isolation (Union[bool, IResolvable, None]) – Whether to allow inbound and outbound network calls to and from the containers used for the processing job.
vpc_config (Union[IResolvable, VpcConfigProperty, Dict[str, Any], None]) – Specifies a VPC that your training jobs and hosted models have access to. Control access to and from your training and model containers by configuring the VPC.
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

network_config_property = sagemaker.CfnDataQualityJobDefinition.NetworkConfigProperty(
    enable_inter_container_traffic_encryption=False,
    enable_network_isolation=False,
    vpc_config=sagemaker.CfnDataQualityJobDefinition.VpcConfigProperty(
        security_group_ids=["securityGroupIds"],
        subnets=["subnets"]
    )
)
Attributes
- enable_inter_container_traffic_encryption
Whether to encrypt all communications between distributed processing jobs.
Choose True to encrypt communications. Encryption provides greater security for distributed processing jobs, but the processing might take longer.
- enable_network_isolation
Whether to allow inbound and outbound network calls to and from the containers used for the processing job.
- vpc_config
Specifies a VPC that your training jobs and hosted models have access to.
Control access to and from your training and model containers by configuring the VPC.
S3OutputProperty
- class CfnDataQualityJobDefinition.S3OutputProperty(*, local_path, s3_uri, s3_upload_mode=None)
Bases:
object
The Amazon S3 storage location where the results of a monitoring job are saved.
- Parameters:
local_path (str) – The local path to the Amazon S3 storage location where Amazon SageMaker saves the results of a monitoring job. LocalPath is an absolute path for the output data.
s3_uri (str) – A URI that identifies the Amazon S3 storage location where Amazon SageMaker saves the results of a monitoring job.
s3_upload_mode (Optional[str]) – Whether to upload the results of the monitoring job continuously or after the job completes.
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

s3_output_property = sagemaker.CfnDataQualityJobDefinition.S3OutputProperty(
    local_path="localPath",
    s3_uri="s3Uri",

    # the properties below are optional
    s3_upload_mode="s3UploadMode"
)
Attributes
- local_path
The local path to the Amazon S3 storage location where Amazon SageMaker saves the results of a monitoring job.
LocalPath is an absolute path for the output data.
- s3_upload_mode
Whether to upload the results of the monitoring job continuously or after the job completes.
- s3_uri
A URI that identifies the Amazon S3 storage location where Amazon SageMaker saves the results of a monitoring job.
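The constraints described above (an absolute local path, an Amazon S3 URI, and a choice between continuous and end-of-job upload) can be checked before the property is constructed. This is a minimal sketch: `s3_output_kwargs` is a hypothetical helper, and the mode names `Continuous` and `EndOfJob` are assumptions based on the SageMaker API, since the text above does not list the literal values.

```python
import posixpath
from typing import Any, Dict, Optional

# Assumed literal values for s3_upload_mode (not listed in the text above).
UPLOAD_MODES = {"Continuous", "EndOfJob"}


def s3_output_kwargs(
    local_path: str, s3_uri: str, s3_upload_mode: Optional[str] = None
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for S3OutputProperty."""
    # The container path is documented as an absolute path.
    if not posixpath.isabs(local_path):
        raise ValueError("local_path must be an absolute path")
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with 's3://'")
    if s3_upload_mode is not None and s3_upload_mode not in UPLOAD_MODES:
        raise ValueError(f"s3_upload_mode must be one of {sorted(UPLOAD_MODES)}")

    kwargs: Dict[str, Any] = {"local_path": local_path, "s3_uri": s3_uri}
    if s3_upload_mode is not None:
        kwargs["s3_upload_mode"] = s3_upload_mode
    return kwargs


output_kwargs = s3_output_kwargs(
    "/opt/ml/processing/output", "s3://my-bucket/monitoring", "EndOfJob"
)
print(output_kwargs)
```

The result could then be passed as `sagemaker.CfnDataQualityJobDefinition.S3OutputProperty(**output_kwargs)`. The bucket name and container path shown are placeholders.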
StatisticsResourceProperty
- class CfnDataQualityJobDefinition.StatisticsResourceProperty(*, s3_uri=None)
Bases:
object
The statistics resource for a monitoring job.
- Parameters:
s3_uri (Optional[str]) – The Amazon S3 URI for the statistics resource.
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

statistics_resource_property = sagemaker.CfnDataQualityJobDefinition.StatisticsResourceProperty(
    s3_uri="s3Uri"
)
Attributes
- s3_uri
The Amazon S3 URI for the statistics resource.
StoppingConditionProperty
- class CfnDataQualityJobDefinition.StoppingConditionProperty(*, max_runtime_in_seconds)
Bases:
object
Specifies a limit to how long a job can run.
When the job reaches the time limit, SageMaker ends the job. Use this API to cap costs.
To stop a training job, SageMaker sends the algorithm the SIGTERM signal, which delays job termination for 120 seconds. Algorithms can use this 120-second window to save the model artifacts, so the results of training are not lost.
The training algorithms provided by SageMaker automatically save the intermediate results of a model training job when possible. This attempt to save artifacts is best-effort only, because the model might not be in a state from which it can be saved. For example, if training has just started, the model might not be ready to save. When saved, this intermediate data is a valid model artifact. You can use it to create a model with CreateModel.
Note: The Neural Topic Model (NTM) currently does not support saving intermediate model artifacts. When training NTMs, make sure that the maximum runtime is sufficient for the training job to complete.
- Parameters:
max_runtime_in_seconds (Union[int, float]) – The maximum length of time, in seconds, that a training or compilation job can run before it is stopped. For compilation jobs, if the job does not complete during this time, a TimeOut error is generated. We recommend starting with 900 seconds and increasing as necessary based on your model. For all other jobs, if the job does not complete during this time, SageMaker ends the job. When RetryStrategy is specified in the job request, MaxRuntimeInSeconds specifies the maximum time for all of the attempts in total, not each individual attempt. The default value is 1 day. The maximum value is 28 days. The maximum time that a TrainingJob can run in total, including any time spent publishing metrics or archiving and uploading models after it has been stopped, is 30 days.
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

stopping_condition_property = sagemaker.CfnDataQualityJobDefinition.StoppingConditionProperty(
    max_runtime_in_seconds=123
)
Attributes
- max_runtime_in_seconds
The maximum length of time, in seconds, that a training or compilation job can run before it is stopped.
For compilation jobs, if the job does not complete during this time, a TimeOut error is generated. We recommend starting with 900 seconds and increasing as necessary based on your model.
For all other jobs, if the job does not complete during this time, SageMaker ends the job. When RetryStrategy is specified in the job request, MaxRuntimeInSeconds specifies the maximum time for all of the attempts in total, not each individual attempt. The default value is 1 day. The maximum value is 28 days.
The maximum time that a TrainingJob can run in total, including any time spent publishing metrics or archiving and uploading models after it has been stopped, is 30 days.
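Because the limit is expressed in seconds, it is easy to get the arithmetic wrong. The sketch below converts a `timedelta` into a value for `max_runtime_in_seconds` and enforces the 1-day default and 28-day maximum quoted above. The helper name is hypothetical, and the service may apply different limits to monitoring jobs than the general ones quoted here.

```python
from datetime import timedelta

# Limits as quoted in the description above; the limits that actually
# apply to monitoring jobs may differ (assumption).
DEFAULT_RUNTIME = timedelta(days=1)
MAX_RUNTIME = timedelta(days=28)


def max_runtime_in_seconds(limit: timedelta = DEFAULT_RUNTIME) -> int:
    """Hypothetical helper: convert a time limit to whole seconds."""
    if limit <= timedelta(0):
        raise ValueError("the runtime limit must be positive")
    if limit > MAX_RUNTIME:
        raise ValueError("the runtime limit must not exceed 28 days")
    return int(limit.total_seconds())


print(max_runtime_in_seconds(timedelta(hours=2)))  # 7200
print(max_runtime_in_seconds())  # 86400
```

The returned integer could be passed as `StoppingConditionProperty(max_runtime_in_seconds=...)`.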
VpcConfigProperty
- class CfnDataQualityJobDefinition.VpcConfigProperty(*, security_group_ids, subnets)
Bases:
object
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to.
You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.
- Parameters:
security_group_ids (Sequence[str]) – The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
subnets (Sequence[str]) – The IDs of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_sagemaker as sagemaker

vpc_config_property = sagemaker.CfnDataQualityJobDefinition.VpcConfigProperty(
    security_group_ids=["securityGroupIds"],
    subnets=["subnets"]
)
Attributes
- security_group_ids
The VPC security group IDs, in the form sg-xxxxxxxx.
Specify the security groups for the VPC that is specified in the Subnets field.
- subnets
The IDs of the subnets in the VPC to which you want to connect your training job or model.
For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
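A quick format check on the security group IDs can catch placeholder values before deployment. The sketch below is illustrative only: `vpc_config_kwargs` is a hypothetical helper, and the pattern assumes that `sg-xxxxxxxx` means `sg-` followed by lowercase hexadecimal characters. No format is assumed for subnet IDs, because none is given above.

```python
import re
from typing import Any, Dict, Sequence

# Assumed reading of the documented form "sg-xxxxxxxx".
SG_ID_PATTERN = re.compile(r"^sg-[0-9a-f]+$")


def vpc_config_kwargs(
    security_group_ids: Sequence[str], subnets: Sequence[str]
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for VpcConfigProperty."""
    if not security_group_ids or not subnets:
        raise ValueError("security_group_ids and subnets must both be non-empty")
    malformed = [sg for sg in security_group_ids if not SG_ID_PATTERN.match(sg)]
    if malformed:
        raise ValueError(f"malformed security group IDs: {malformed}")
    return {"security_group_ids": list(security_group_ids), "subnets": list(subnets)}


vpc_kwargs = vpc_config_kwargs(["sg-0123abcd"], ["subnet-0123abcd"])
print(vpc_kwargs)
```

The dictionary could be unpacked as `sagemaker.CfnDataQualityJobDefinition.VpcConfigProperty(**vpc_kwargs)`. The IDs shown are placeholders.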