AWS Tools for Windows PowerShell
Command Reference

Synopsis

Calls the AWS Glue DataBrew CreateProfileJob API operation.

Syntax

New-GDBProfileJob
-Name <String>
-EntityDetectorConfiguration_AllowedStatistic <AllowedStatistics[]>
-OutputLocation_Bucket <String>
-Configuration_ColumnStatisticsConfiguration <ColumnStatisticsConfiguration[]>
-DatasetName <String>
-EncryptionKeyArn <String>
-EncryptionMode <EncryptionMode>
-EntityDetectorConfiguration_EntityType <String[]>
-DatasetStatisticsConfiguration_IncludedStatistic <String[]>
-OutputLocation_Key <String>
-LogSubscription <LogSubscription>
-MaxCapacity <Int32>
-MaxRetry <Int32>
-JobSample_Mode <SampleMode>
-DatasetStatisticsConfiguration_Override <StatisticOverride[]>
-Configuration_ProfileColumn <ColumnSelector[]>
-RoleArn <String>
-JobSample_Size <Int64>
-Tag <Hashtable>
-Timeout <Int32>
-ValidationConfiguration <ValidationConfiguration[]>
-Select <String>
-PassThru <SwitchParameter>
-Force <SwitchParameter>

Description

Creates a new job to analyze a dataset and create its data profile.
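
A minimal sketch of a call that supplies only the four parameters marked as required in this reference (Name, DatasetName, RoleArn, and OutputLocation_Bucket). The job, dataset, bucket, and role names are hypothetical placeholders, and the call assumes the AWS Tools for PowerShell module for DataBrew is loaded and credentials are configured.

```powershell
# Hypothetical values; replace them with resources that exist in your account.
$jobParams = @{
    Name                  = 'orders-profile-job'
    DatasetName           = 'orders-dataset'
    RoleArn               = 'arn:aws:iam::123456789012:role/DataBrewProfileRole'
    OutputLocation_Bucket = 'example-databrew-output'
}

# Splat the hashtable onto the cmdlet. By default the cmdlet returns the job name.
$jobName = New-GDBProfileJob @jobParams
```

Splatting keeps long parameter lists readable and makes it easy to add optional parameters later.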

Parameters

-Configuration_ColumnStatisticsConfiguration <ColumnStatisticsConfiguration[]>
List of configurations for column evaluations. ColumnStatisticsConfigurations are used to select evaluations and override parameters of evaluations for particular columns. When ColumnStatisticsConfigurations is undefined, the profile job will profile all supported columns and run all supported evaluations.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: Configuration_ColumnStatisticsConfigurations
-Configuration_ProfileColumn <ColumnSelector[]>
List of column selectors. ProfileColumns can be used to select columns from the dataset. When ProfileColumns is undefined, the profile job will profile all supported columns.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: Configuration_ProfileColumns
-DatasetName <String>
The name of the dataset that this job is to act upon.
Required? True
Position? Named
Accept pipeline input? True (ByPropertyName)
-DatasetStatisticsConfiguration_IncludedStatistic <String[]>
List of included evaluations. When the list is undefined, all supported evaluations will be included.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: Configuration_DatasetStatisticsConfiguration_IncludedStatistics
-DatasetStatisticsConfiguration_Override <StatisticOverride[]>
List of overrides for evaluations.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: Configuration_DatasetStatisticsConfiguration_Overrides
-EncryptionKeyArn <String>
The Amazon Resource Name (ARN) of an encryption key that is used to protect the job.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-EncryptionMode <EncryptionMode>
The encryption mode for the job, which can be one of the following:
  • SSE-KMS - Server-side encryption with KMS-managed keys.
  • SSE-S3 - Server-side encryption with keys managed by Amazon S3.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
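
As an illustration of the two encryption parameters, the sketch below requests SSE-KMS and passes a key ARN alongside it. Pairing the key ARN with SSE-KMS is an assumption made for the example, and every name and ARN shown is a hypothetical placeholder.

```powershell
# Hypothetical values; the KMS key ARN is a placeholder, not a real key.
$jobParams = @{
    Name                  = 'orders-profile-job-encrypted'
    DatasetName           = 'orders-dataset'
    RoleArn               = 'arn:aws:iam::123456789012:role/DataBrewProfileRole'
    OutputLocation_Bucket = 'example-databrew-output'
    EncryptionMode        = 'SSE-KMS'   # or 'SSE-S3' for keys managed by Amazon S3
    EncryptionKeyArn      = 'arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID'
}

New-GDBProfileJob @jobParams
```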
-EntityDetectorConfiguration_AllowedStatistic <AllowedStatistics[]>
Configuration of statistics that are allowed to be run on columns that contain detected entities. When undefined, no statistics will be computed on columns that contain detected entities.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: Configuration_EntityDetectorConfiguration_AllowedStatistics
-EntityDetectorConfiguration_EntityType <String[]>
Entity types to detect. Can be any of the following:
  • USA_SSN
  • EMAIL
  • USA_ITIN
  • USA_PASSPORT_NUMBER
  • PHONE_NUMBER
  • USA_DRIVING_LICENSE
  • BANK_ACCOUNT
  • CREDIT_CARD
  • IP_ADDRESS
  • MAC_ADDRESS
  • USA_DEA_NUMBER
  • USA_HCPCS_CODE
  • USA_NATIONAL_PROVIDER_IDENTIFIER
  • USA_NATIONAL_DRUG_CODE
  • USA_HEALTH_INSURANCE_CLAIM_NUMBER
  • USA_MEDICARE_BENEFICIARY_IDENTIFIER
  • USA_CPT_CODE
  • PERSON_NAME
  • DATE
The Entity type group USA_ALL is also supported, and includes all of the above entity types except PERSON_NAME and DATE.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: Configuration_EntityDetectorConfiguration_EntityTypes
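
Because USA_ALL excludes PERSON_NAME and DATE, a job that should detect those as well has to list them explicitly. The following sketch shows one way to pass the entity types as a string array; the job, dataset, bucket, and role names are hypothetical.

```powershell
# USA_ALL covers the USA_* and general types listed above, but not PERSON_NAME or DATE,
# so those two are added explicitly.
$entityTypes = @('USA_ALL', 'PERSON_NAME', 'DATE')

# Hypothetical resource names.
$jobParams = @{
    Name                                   = 'customers-pii-profile-job'
    DatasetName                            = 'customers-dataset'
    RoleArn                                = 'arn:aws:iam::123456789012:role/DataBrewProfileRole'
    OutputLocation_Bucket                  = 'example-databrew-output'
    EntityDetectorConfiguration_EntityType = $entityTypes
}

New-GDBProfileJob @jobParams
```

Since EntityDetectorConfiguration_AllowedStatistic is left undefined here, no statistics are computed on columns that contain detected entities.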
-Force <SwitchParameter>
This parameter overrides confirmation prompts to force the cmdlet to continue its operation. This parameter should always be used with caution.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-JobSample_Mode <SampleMode>
A value that determines whether the profile job is run on the entire dataset or a specified number of rows. This value must be one of the following:
  • FULL_DATASET - The profile job is run on the entire dataset.
  • CUSTOM_ROWS - The profile job is run on the number of rows specified in the Size parameter.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-JobSample_Size <Int64>
The Size parameter is only required when the mode is CUSTOM_ROWS. The profile job is run on the specified number of rows. The maximum value for size is Long.MAX_VALUE (9223372036854775807).
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
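
The two sampling parameters work as a pair. A minimal sketch, assuming a hypothetical dataset and a sample of 5000 rows, could look like this; the upper bound mentioned above corresponds to [long]::MaxValue in PowerShell.

```powershell
# Upper bound for JobSample_Size: the largest Int64 value.
$maxSampleSize = [long]::MaxValue   # 9223372036854775807

# Hypothetical resource names; the sample size of 5000 rows is arbitrary.
$jobParams = @{
    Name                  = 'orders-sample-profile-job'
    DatasetName           = 'orders-dataset'
    RoleArn               = 'arn:aws:iam::123456789012:role/DataBrewProfileRole'
    OutputLocation_Bucket = 'example-databrew-output'
    JobSample_Mode        = 'CUSTOM_ROWS'   # use 'FULL_DATASET' to profile every row
    JobSample_Size        = [long]5000      # only needed when the mode is CUSTOM_ROWS
}

New-GDBProfileJob @jobParams
```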
-LogSubscription <LogSubscription>
Enables or disables Amazon CloudWatch logging for the job. If logging is enabled, CloudWatch writes one log stream for each job run.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-MaxCapacity <Int32>
The maximum number of nodes that DataBrew can use when the job processes data.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-MaxRetry <Int32>
The maximum number of times to retry the job after a job run fails.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: MaxRetries
-Name <String>
The name of the job to be created. Valid characters are alphanumeric (A-Z, a-z, 0-9), hyphen (-), period (.), and space.
Required? True
Position? 1
Accept pipeline input? True (ByValue, ByPropertyName)
-OutputLocation_Bucket <String>
The Amazon S3 bucket name.
Required? True
Position? Named
Accept pipeline input? True (ByPropertyName)
-OutputLocation_Key <String>
The unique name of the object in the bucket.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-PassThru <SwitchParameter>
Changes the cmdlet behavior to return the value passed to the Name parameter. The -PassThru parameter is deprecated; use -Select '^Name' instead. This parameter will be removed in a future version.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-RoleArn <String>
The Amazon Resource Name (ARN) of the Identity and Access Management (IAM) role to be assumed when DataBrew runs the job.
Required? True
Position? Named
Accept pipeline input? True (ByPropertyName)
-Select <String>
Use the -Select parameter to control the cmdlet output. The default value is 'Name'. Specifying -Select '*' will result in the cmdlet returning the whole service response (Amazon.GlueDataBrew.Model.CreateProfileJobResponse). Specifying the name of a property of type Amazon.GlueDataBrew.Model.CreateProfileJobResponse will result in that property being returned. Specifying -Select '^ParameterName' will result in the cmdlet returning the selected cmdlet parameter value.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
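
To make the -Select forms concrete, the sketch below asks for the whole service response with '*'. The resource names are hypothetical; the Name property read at the end is the one this reference gives as the default -Select value.

```powershell
# Hypothetical resource names.
$jobParams = @{
    Name                  = 'orders-profile-job'
    DatasetName           = 'orders-dataset'
    RoleArn               = 'arn:aws:iam::123456789012:role/DataBrewProfileRole'
    OutputLocation_Bucket = 'example-databrew-output'
}

# '*' returns the full CreateProfileJobResponse instead of only the job name.
$response = New-GDBProfileJob @jobParams -Select '*'
$response.Name

# Alternative form (not run here, since the job would already exist):
#   New-GDBProfileJob @jobParams -Select '^DatasetName'
# would echo back the value passed to -DatasetName.
```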
-Tag <Hashtable>
Metadata tags to apply to this job.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: Tags
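
Since -Tag takes a Hashtable, tags can be written as key/value pairs in a hashtable literal. The tag keys and values here are made up for illustration, as are the resource names.

```powershell
# Hypothetical tags: each hashtable key is a tag key, each value a tag value.
$tags = @{
    Project     = 'orders-analytics'
    Environment = 'dev'
}

# Hypothetical resource names.
$jobParams = @{
    Name                  = 'orders-profile-job-tagged'
    DatasetName           = 'orders-dataset'
    RoleArn               = 'arn:aws:iam::123456789012:role/DataBrewProfileRole'
    OutputLocation_Bucket = 'example-databrew-output'
    Tag                   = $tags
}

New-GDBProfileJob @jobParams
```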
-Timeout <Int32>
The job's timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
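
The capacity, retry, and timeout limits are plain Int32 values and can be set together. The numbers in this sketch are arbitrary illustrations rather than recommended or validated settings, and the resource names are hypothetical.

```powershell
# Hypothetical resource names and arbitrary limit values.
$jobParams = @{
    Name                  = 'orders-profile-job-limited'
    DatasetName           = 'orders-dataset'
    RoleArn               = 'arn:aws:iam::123456789012:role/DataBrewProfileRole'
    OutputLocation_Bucket = 'example-databrew-output'
    MaxCapacity           = 5     # maximum number of nodes DataBrew can use
    MaxRetry              = 2     # retries after a failed job run
    Timeout               = 120   # minutes; a longer run ends with status TIMEOUT
}

New-GDBProfileJob @jobParams
```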
-ValidationConfiguration <ValidationConfiguration[]>
List of validation configurations that are applied to the profile job.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: ValidationConfigurations

Common Credential and Region Parameters

-AccessKey <String>
The AWS access key for the user account. This can be a temporary access key if the corresponding session token is supplied to the -SessionToken parameter.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: AK
-Credential <AWSCredentials>
An AWSCredentials object instance containing access and secret key information, and optionally a token for session-based credentials.
Required? False
Position? Named
Accept pipeline input? True (ByValue, ByPropertyName)
-EndpointUrl <String>
The endpoint to make the call against. Note: This parameter is primarily for internal AWS use and should not be specified for normal usage. The cmdlets normally determine which endpoint to call based on the region specified to the -Region parameter or set as default in the shell (via Set-DefaultAWSRegion). Only specify this parameter if you must direct the call to a specific custom endpoint.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
-NetworkCredential <PSCredential>
Used with SAML-based authentication when ProfileName references a SAML role profile. Contains the network credentials to be supplied during authentication with the configured identity provider's endpoint. This parameter is not required if the user's default network identity can or should be used during authentication.
Required? False
Position? Named
Accept pipeline input? True (ByValue, ByPropertyName)
-ProfileLocation <String>
Used to specify the name and location of the ini-format credential file (shared with the AWS CLI and other AWS SDKs). If this optional parameter is omitted, this cmdlet will search the encrypted credential file used by the AWS SDK for .NET and AWS Toolkit for Visual Studio first. If the profile is not found, the cmdlet will search the ini-format credential file at the default location: (user's home directory)\.aws\credentials. If this parameter is specified, this cmdlet will only search the ini-format credential file at the location given. Because the current folder can vary in a shell or during script execution, it is advised that you specify a fully qualified path instead of a relative path.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: AWSProfilesLocation, ProfilesLocation
-ProfileName <String>
The user-defined name of an AWS credentials or SAML-based role profile containing credential information. The profile is expected to be found in the secure credential file shared with the AWS SDK for .NET and AWS Toolkit for Visual Studio. You can also specify the name of a profile stored in the .ini-format credential file used with the AWS CLI and other AWS SDKs.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: StoredCredentials, AWSProfileName
-Region <Object>
The system name of an AWS region or an AWSRegion instance. This governs the endpoint that will be used when calling service operations. Note that the AWS resources referenced in a call are usually region-specific.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: RegionToCall
-SecretKey <String>
The AWS secret key for the user account. This can be a temporary secret key if the corresponding session token is supplied to the -SessionToken parameter.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: SK, SecretAccessKey
-SessionToken <String>
The session token if the access and secret keys are temporary session-based credentials.
Required? False
Position? Named
Accept pipeline input? True (ByPropertyName)
Aliases: ST
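
The credential and region parameters above can be supplied per call instead of relying on shell defaults. In this sketch, the profile name 'analytics-dev' is a hypothetical stored credential profile, the region is an arbitrary choice, and the resource names are placeholders.

```powershell
# Hypothetical resource names.
$jobParams = @{
    Name                  = 'orders-profile-job'
    DatasetName           = 'orders-dataset'
    RoleArn               = 'arn:aws:iam::123456789012:role/DataBrewProfileRole'
    OutputLocation_Bucket = 'example-databrew-output'
}

# Hypothetical profile and region, kept separate so they can be reused across cmdlets.
$awsContext = @{
    ProfileName = 'analytics-dev'
    Region      = 'us-east-1'
}

# Both hashtables can be splatted onto the same call.
New-GDBProfileJob @jobParams @awsContext
```

Keeping the credential and region settings in their own hashtable avoids repeating them on every AWS cmdlet in a script.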

Outputs

This cmdlet returns a System.String object. The service call response (type Amazon.GlueDataBrew.Model.CreateProfileJobResponse) can also be referenced from properties attached to the cmdlet entry in the $AWSHistory stack.
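
A short sketch of working with the output: the returned string is captured in a variable, and the full response is then read back from $AWSHistory. The LastServiceResponse property used here is an assumption about the $AWSHistory object and is worth confirming in your installed version; the resource names are hypothetical.

```powershell
# Hypothetical resource names.
$jobParams = @{
    Name                  = 'orders-profile-job'
    DatasetName           = 'orders-dataset'
    RoleArn               = 'arn:aws:iam::123456789012:role/DataBrewProfileRole'
    OutputLocation_Bucket = 'example-databrew-output'
}

# The default output is a System.String holding the job name.
$jobName = New-GDBProfileJob @jobParams

# Assumption: $AWSHistory exposes the most recent response as LastServiceResponse.
$lastResponse = $AWSHistory.LastServiceResponse
```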

Supported Version

AWS Tools for PowerShell: 2.x.y.z