CfnJobProps

class aws_cdk.aws_databrew.CfnJobProps(*, name, role_arn, type, database_outputs=None, data_catalog_outputs=None, dataset_name=None, encryption_key_arn=None, encryption_mode=None, job_sample=None, log_subscription=None, max_capacity=None, max_retries=None, output_location=None, outputs=None, profile_configuration=None, project_name=None, recipe=None, tags=None, timeout=None, validation_configurations=None)

Bases: object

Properties for defining a CfnJob.

Parameters:
  • name (str) – The unique name of the job.

  • role_arn (str) – The Amazon Resource Name (ARN) of the role to be assumed for this job.

  • type (str) – The job type, which must be one of the following: - PROFILE - A job to analyze a dataset, to determine its size, data types, data distribution, and more. - RECIPE - A job to apply one or more transformations to a dataset.

  • database_outputs (Union[IResolvable, Sequence[Union[IResolvable, DatabaseOutputProperty, Dict[str, Any]]], None]) – Represents a list of JDBC database output objects that define the output destinations that a DataBrew recipe job writes to.

  • data_catalog_outputs (Union[IResolvable, Sequence[Union[IResolvable, DataCatalogOutputProperty, Dict[str, Any]]], None]) – One or more artifacts that represent the AWS Glue Data Catalog output from running the job.

  • dataset_name (Optional[str]) – A dataset that the job is to process.

  • encryption_key_arn (Optional[str]) – The Amazon Resource Name (ARN) of an encryption key that is used to protect the job output. For more information, see Encrypting data written by DataBrew jobs.

  • encryption_mode (Optional[str]) – The encryption mode for the job, which can be one of the following: - SSE-KMS - Server-side encryption with keys managed by AWS KMS. - SSE-S3 - Server-side encryption with keys managed by Amazon S3.

  • job_sample (Union[IResolvable, JobSampleProperty, Dict[str, Any], None]) – A sample configuration for profile jobs only, which determines the number of rows on which the profile job is run. If a JobSample value isn’t provided, the default value is used. The default value is CUSTOM_ROWS for the mode parameter and 20,000 for the size parameter.

  • log_subscription (Optional[str]) – The current status of Amazon CloudWatch logging for the job.

  • max_capacity (Union[int, float, None]) – The maximum number of nodes that can be consumed when the job processes data.

  • max_retries (Union[int, float, None]) – The maximum number of times to retry the job after a job run fails.

  • output_location (Union[IResolvable, OutputLocationProperty, Dict[str, Any], None]) – The Amazon S3 location where the job writes its output.

  • outputs (Union[IResolvable, Sequence[Union[IResolvable, OutputProperty, Dict[str, Any]]], None]) – One or more artifacts that represent output from running the job.

  • profile_configuration (Union[IResolvable, ProfileConfigurationProperty, Dict[str, Any], None]) – Configuration for profile jobs. Configuration can be used to select columns, do evaluations, and override default parameters of evaluations. When configuration is undefined, the profile job will apply default settings to all supported columns.

  • project_name (Optional[str]) – The name of the project that the job is associated with.

  • recipe (Union[IResolvable, RecipeProperty, Dict[str, Any], None]) – A series of data transformation steps that the job runs.

  • tags (Optional[Sequence[Union[CfnTag, Dict[str, Any]]]]) – Metadata tags that have been applied to the job.

  • timeout (Union[int, float, None]) – The job’s timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.

  • validation_configurations (Union[IResolvable, Sequence[Union[IResolvable, ValidationConfigurationProperty, Dict[str, Any]]], None]) – List of validation configurations that are applied to the profile job.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html

ExampleMetadata:

fixture=_generated

Example:

# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import CfnTag
import aws_cdk.aws_databrew as databrew

cfn_job_props = databrew.CfnJobProps(
    name="name",
    role_arn="roleArn",
    type="type",

    # the properties below are optional
    database_outputs=[databrew.CfnJob.DatabaseOutputProperty(
        database_options=databrew.CfnJob.DatabaseTableOutputOptionsProperty(
            table_name="tableName",

            # the properties below are optional
            temp_directory=databrew.CfnJob.S3LocationProperty(
                bucket="bucket",

                # the properties below are optional
                bucket_owner="bucketOwner",
                key="key"
            )
        ),
        glue_connection_name="glueConnectionName",

        # the properties below are optional
        database_output_mode="databaseOutputMode"
    )],
    data_catalog_outputs=[databrew.CfnJob.DataCatalogOutputProperty(
        database_name="databaseName",
        table_name="tableName",

        # the properties below are optional
        catalog_id="catalogId",
        database_options=databrew.CfnJob.DatabaseTableOutputOptionsProperty(
            table_name="tableName",

            # the properties below are optional
            temp_directory=databrew.CfnJob.S3LocationProperty(
                bucket="bucket",

                # the properties below are optional
                bucket_owner="bucketOwner",
                key="key"
            )
        ),
        overwrite=False,
        s3_options=databrew.CfnJob.S3TableOutputOptionsProperty(
            location=databrew.CfnJob.S3LocationProperty(
                bucket="bucket",

                # the properties below are optional
                bucket_owner="bucketOwner",
                key="key"
            )
        )
    )],
    dataset_name="datasetName",
    encryption_key_arn="encryptionKeyArn",
    encryption_mode="encryptionMode",
    job_sample=databrew.CfnJob.JobSampleProperty(
        mode="mode",
        size=123
    ),
    log_subscription="logSubscription",
    max_capacity=123,
    max_retries=123,
    output_location=databrew.CfnJob.OutputLocationProperty(
        bucket="bucket",

        # the properties below are optional
        bucket_owner="bucketOwner",
        key="key"
    ),
    outputs=[databrew.CfnJob.OutputProperty(
        location=databrew.CfnJob.S3LocationProperty(
            bucket="bucket",

            # the properties below are optional
            bucket_owner="bucketOwner",
            key="key"
        ),

        # the properties below are optional
        compression_format="compressionFormat",
        format="format",
        format_options=databrew.CfnJob.OutputFormatOptionsProperty(
            csv=databrew.CfnJob.CsvOutputOptionsProperty(
                delimiter="delimiter"
            )
        ),
        max_output_files=123,
        overwrite=False,
        partition_columns=["partitionColumns"]
    )],
    profile_configuration=databrew.CfnJob.ProfileConfigurationProperty(
        column_statistics_configurations=[databrew.CfnJob.ColumnStatisticsConfigurationProperty(
            statistics=databrew.CfnJob.StatisticsConfigurationProperty(
                included_statistics=["includedStatistics"],
                overrides=[databrew.CfnJob.StatisticOverrideProperty(
                    parameters={
                        "parameters_key": "parameters"
                    },
                    statistic="statistic"
                )]
            ),

            # the properties below are optional
            selectors=[databrew.CfnJob.ColumnSelectorProperty(
                name="name",
                regex="regex"
            )]
        )],
        dataset_statistics_configuration=databrew.CfnJob.StatisticsConfigurationProperty(
            included_statistics=["includedStatistics"],
            overrides=[databrew.CfnJob.StatisticOverrideProperty(
                parameters={
                    "parameters_key": "parameters"
                },
                statistic="statistic"
            )]
        ),
        entity_detector_configuration=databrew.CfnJob.EntityDetectorConfigurationProperty(
            entity_types=["entityTypes"],

            # the properties below are optional
            allowed_statistics=databrew.CfnJob.AllowedStatisticsProperty(
                statistics=["statistics"]
            )
        ),
        profile_columns=[databrew.CfnJob.ColumnSelectorProperty(
            name="name",
            regex="regex"
        )]
    ),
    project_name="projectName",
    recipe=databrew.CfnJob.RecipeProperty(
        name="name",

        # the properties below are optional
        version="version"
    ),
    tags=[CfnTag(
        key="key",
        value="value"
    )],
    timeout=123,
    validation_configurations=[databrew.CfnJob.ValidationConfigurationProperty(
        ruleset_arn="rulesetArn",

        # the properties below are optional
        validation_mode="validationMode"
    )]
)
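
The generated example above sets every property with placeholder values. As a more focused sketch (all names, ARNs, and bucket names below are hypothetical), a profile job and a recipe job typically use different subsets of these properties:

import aws_cdk.aws_databrew as databrew

# A profile job analyzes a dataset and writes its report to an Amazon S3 location.
profile_job_props = databrew.CfnJobProps(
    name="sales-data-profile",
    role_arn="arn:aws:iam::111122223333:role/DataBrewJobRole",
    type="PROFILE",
    dataset_name="sales-data",
    output_location=databrew.CfnJob.OutputLocationProperty(
        bucket="my-output-bucket",
        key="profiles/"
    )
)

# A recipe job applies transformation steps and writes the result to one or more outputs.
recipe_job_props = databrew.CfnJobProps(
    name="sales-data-transform",
    role_arn="arn:aws:iam::111122223333:role/DataBrewJobRole",
    type="RECipe".upper(),  # i.e. "RECIPE"
    dataset_name="sales-data",
    recipe=databrew.CfnJob.RecipeProperty(
        name="sales-data-recipe",
        version="1.0"
    ),
    outputs=[databrew.CfnJob.OutputProperty(
        location=databrew.CfnJob.S3LocationProperty(
            bucket="my-output-bucket",
            key="transformed/"
        )
    )]
)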

Attributes

data_catalog_outputs

One or more artifacts that represent the AWS Glue Data Catalog output from running the job.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-datacatalogoutputs

database_outputs

Represents a list of JDBC database output objects that define the output destinations that a DataBrew recipe job writes to.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-databaseoutputs

dataset_name

A dataset that the job is to process.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-datasetname

encryption_key_arn

The Amazon Resource Name (ARN) of an encryption key that is used to protect the job output.

For more information, see Encrypting data written by DataBrew jobs.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-encryptionkeyarn

encryption_mode

The encryption mode for the job, which can be one of the following:

  • SSE-KMS - Server-side encryption with keys managed by AWS KMS.

  • SSE-S3 - Server-side encryption with keys managed by Amazon S3.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-encryptionmode
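
As a hedged sketch (the role and key ARNs are placeholders), SSE-KMS is typically paired with encryption_key_arn, while SSE-S3 requires no key:

import aws_cdk.aws_databrew as databrew

# Profile job whose output is encrypted with a customer managed KMS key (placeholder ARN).
encrypted_job_props = databrew.CfnJobProps(
    name="encrypted-profile",
    role_arn="arn:aws:iam::111122223333:role/DataBrewJobRole",
    type="PROFILE",
    dataset_name="sales-data",
    encryption_mode="SSE-KMS",
    encryption_key_arn="arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
)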

job_sample

A sample configuration for profile jobs only, which determines the number of rows on which the profile job is run.

If a JobSample value isn’t provided, the default value is used. The default value is CUSTOM_ROWS for the mode parameter and 20,000 for the size parameter.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-jobsample
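
For example, the sketch below (values are illustrative) profiles only the first 5,000 rows; a mode of FULL_DATASET profiles every row instead:

import aws_cdk.aws_databrew as databrew

# Profile only the first 5,000 rows rather than the 20,000-row default.
job_sample = databrew.CfnJob.JobSampleProperty(
    mode="CUSTOM_ROWS",
    size=5000
)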

log_subscription

The current status of Amazon CloudWatch logging for the job.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-logsubscription

max_capacity

The maximum number of nodes that can be consumed when the job processes data.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-maxcapacity

max_retries

The maximum number of times to retry the job after a job run fails.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-maxretries

name

The unique name of the job.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-name

output_location

The Amazon S3 location where the job writes its output.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-outputlocation

outputs

One or more artifacts that represent output from running the job.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-outputs

profile_configuration

Configuration for profile jobs.

Configuration can be used to select columns, do evaluations, and override default parameters of evaluations. When configuration is undefined, the profile job will apply default settings to all supported columns.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-profileconfiguration
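
The generated example above shows the full shape of this property; a common minimal case (the regex below is hypothetical) restricts profiling to a subset of columns and leaves every other setting at its default:

import aws_cdk.aws_databrew as databrew

# Profile only columns whose names start with "sales_".
profile_configuration = databrew.CfnJob.ProfileConfigurationProperty(
    profile_columns=[databrew.CfnJob.ColumnSelectorProperty(
        regex="^sales_.*$"
    )]
)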

project_name

The name of the project that the job is associated with.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-projectname

recipe

A series of data transformation steps that the job runs.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-recipe

role_arn

The Amazon Resource Name (ARN) of the role to be assumed for this job.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-rolearn

tags

Metadata tags that have been applied to the job.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-tags

timeout

The job’s timeout in minutes.

A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-timeout

type

The job type, which must be one of the following:

  • PROFILE - A job to analyze a dataset, to determine its size, data types, data distribution, and more.

  • RECIPE - A job to apply one or more transformations to a dataset.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-type

validation_configurations

List of validation configurations that are applied to the profile job.

Link:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html#cfn-databrew-job-validationconfigurations