StartPiiEntitiesDetectionJob
Starts an asynchronous PII entity detection job for a collection of documents.
Request Syntax
{
   "ClientRequestToken": "string",
   "DataAccessRoleArn": "string",
   "InputDataConfig": {
      "DocumentReaderConfig": {
         "DocumentReadAction": "string",
         "DocumentReadMode": "string",
         "FeatureTypes": [ "string" ]
      },
      "InputFormat": "string",
      "S3Uri": "string"
   },
   "JobName": "string",
   "LanguageCode": "string",
   "Mode": "string",
   "OutputDataConfig": {
      "KmsKeyId": "string",
      "S3Uri": "string"
   },
   "RedactionConfig": {
      "MaskCharacter": "string",
      "MaskMode": "string",
      "PiiEntityTypes": [ "string" ]
   },
   "Tags": [
      {
         "Key": "string",
         "Value": "string"
      }
   ]
}
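As a rough illustration of the request shape above, the following Python sketch assembles the request dictionary and hands it to the boto3 Comprehend client. The helper names `build_pii_job_request` and `start_pii_job`, the bucket URIs, and the role name are hypothetical placeholders. Optional fields are omitted when they are not supplied.

```python
from typing import Optional


def build_pii_job_request(
    input_s3_uri: str,
    output_s3_uri: str,
    data_access_role_arn: str,
    language_code: str = "en",
    mode: str = "ONLY_OFFSETS",
    job_name: Optional[str] = None,
    redaction_config: Optional[dict] = None,
    tags: Optional[list] = None,
) -> dict:
    """Assemble a StartPiiEntitiesDetectionJob request (hypothetical helper)."""
    request = {
        "DataAccessRoleArn": data_access_role_arn,
        "InputDataConfig": {"S3Uri": input_s3_uri},
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "LanguageCode": language_code,
        "Mode": mode,
    }
    # Only include optional top-level fields that were actually provided.
    optional = {
        "JobName": job_name,
        "RedactionConfig": redaction_config,
        "Tags": tags,
    }
    request.update({k: v for k, v in optional.items() if v is not None})
    return request


def start_pii_job(request: dict) -> dict:
    """Submit the job. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("comprehend")
    return client.start_pii_entities_detection_job(**request)


# Placeholder values for illustration only.
example_request = build_pii_job_request(
    input_s3_uri="s3://amzn-s3-demo-bucket/input/",
    output_s3_uri="s3://amzn-s3-demo-bucket/output/",
    data_access_role_arn="arn:aws:iam::111122223333:role/ExampleComprehendRole",
    job_name="example-pii-job",
)
```

Calling `start_pii_job(example_request)` would return a dictionary with the `JobArn`, `JobId`, and `JobStatus` fields described under Response Elements.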
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- ClientRequestToken
-
A unique identifier for the request. If you don't set the client request token, Amazon Comprehend generates one.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 64.
Pattern:
^[a-zA-Z0-9-]+$
Required: No
- DataAccessRoleArn
-
The Amazon Resource Name (ARN) of the IAM role that grants Amazon Comprehend read access to your input data.
Type: String
Length Constraints: Minimum length of 20. Maximum length of 2048.
Pattern:
arn:aws(-[^:]+)?:iam::[0-9]{12}:role/.+
Required: Yes
- InputDataConfig
-
The input properties for a PII entities detection job.
Type: InputDataConfig object
Required: Yes
- JobName
-
The identifier of the job.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 256.
Pattern:
^([\p{L}\p{Z}\p{N}_.:/=+\-%@]*)$
Required: No
- LanguageCode
-
The language of the input documents. Enter the language code for English (en) or Spanish (es).
Type: String
Valid Values:
en | es
Required: Yes
- Mode
-
Specifies whether the output provides the locations (offsets) of PII entities or a file in which PII entities are redacted.
Type: String
Valid Values:
ONLY_REDACTION | ONLY_OFFSETS
Required: Yes
- OutputDataConfig
-
Provides configuration parameters for the output of PII entity detection jobs.
Type: OutputDataConfig object
Required: Yes
- RedactionConfig
-
Provides configuration parameters for PII entity redaction.
This parameter is required if you set the Mode parameter to ONLY_REDACTION. In that case, you must provide a RedactionConfig definition that includes the PiiEntityTypes parameter.
Type: RedactionConfig object
Required: No
- Tags
-
Tags to associate with the PII entities detection job. A tag is a key-value pair that adds metadata to a resource used by Amazon Comprehend. For example, a tag with "Sales" as the key might be added to a resource to indicate its use by the sales department.
Type: Array of Tag objects
Required: No
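The constraints listed above can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not part of the API: `find_request_problems` is a hypothetical helper, it covers only the required fields, the `ClientRequestToken` length and pattern, the `LanguageCode` and `Mode` values, and the rule that `ONLY_REDACTION` needs a `RedactionConfig` with `PiiEntityTypes`. It assumes the token pattern must match the whole string.

```python
import re

# Constraints copied from the parameter descriptions above.
REQUIRED_FIELDS = (
    "DataAccessRoleArn",
    "InputDataConfig",
    "LanguageCode",
    "Mode",
    "OutputDataConfig",
)
TOKEN_PATTERN = re.compile(r"[a-zA-Z0-9-]+")
VALID_LANGUAGES = {"en", "es"}
VALID_MODES = {"ONLY_REDACTION", "ONLY_OFFSETS"}


def find_request_problems(request: dict) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = []

    for name in REQUIRED_FIELDS:
        if name not in request:
            problems.append(f"{name} is required")

    token = request.get("ClientRequestToken")
    if token is not None:
        # fullmatch: assume the pattern applies to the entire token.
        if not (1 <= len(token) <= 64 and TOKEN_PATTERN.fullmatch(token)):
            problems.append("ClientRequestToken must be 1-64 characters of [a-zA-Z0-9-]")

    language = request.get("LanguageCode")
    if language is not None and language not in VALID_LANGUAGES:
        problems.append("LanguageCode must be en or es")

    mode = request.get("Mode")
    if mode is not None and mode not in VALID_MODES:
        problems.append("Mode must be ONLY_REDACTION or ONLY_OFFSETS")

    # RedactionConfig is optional overall but required for ONLY_REDACTION.
    if mode == "ONLY_REDACTION":
        redaction = request.get("RedactionConfig") or {}
        if not redaction.get("PiiEntityTypes"):
            problems.append("ONLY_REDACTION requires RedactionConfig.PiiEntityTypes")

    return problems
```

A token produced with `str(uuid.uuid4())` is 36 characters of hexadecimal digits and hyphens, so it satisfies the `ClientRequestToken` constraints.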
Response Syntax
{
"JobArn": "string",
"JobId": "string",
"JobStatus": "string"
}
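The `JobArn` value in this response follows the `arn:<partition>:comprehend:<region>:<account-id>:pii-entities-detection-job/<job-id>` format given under Response Elements. As a small sketch, the ARN can be split into its components as follows; `PiiJobArn` and `parse_pii_job_arn` are hypothetical names, and the parser deliberately checks only the overall shape rather than the full documented pattern.

```python
from typing import NamedTuple


class PiiJobArn(NamedTuple):
    partition: str
    region: str
    account_id: str
    job_id: str


def parse_pii_job_arn(arn: str) -> PiiJobArn:
    """Split a PII entity detection job ARN into its documented parts."""
    # arn : partition : service : region : account-id : resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "comprehend":
        raise ValueError(f"not a Comprehend ARN: {arn!r}")

    resource_type, _, job_id = parts[5].partition("/")
    if resource_type != "pii-entities-detection-job" or not job_id:
        raise ValueError(f"not a PII entity detection job ARN: {arn!r}")

    return PiiJobArn(
        partition=parts[1],
        region=parts[3],
        account_id=parts[4],
        job_id=job_id,
    )


parsed = parse_pii_job_arn(
    "arn:aws:comprehend:us-west-2:111122223333:"
    "pii-entities-detection-job/1234abcd12ab34cd56ef1234567890ab"
)
```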
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- JobArn
-
The Amazon Resource Name (ARN) of the PII entity detection job. It is a unique, fully qualified identifier for the job. It includes the AWS account, AWS Region, and the job ID. The format of the ARN is as follows:
arn:<partition>:comprehend:<region>:<account-id>:pii-entities-detection-job/<job-id>
The following is an example job ARN:
arn:aws:comprehend:us-west-2:111122223333:pii-entities-detection-job/1234abcd12ab34cd56ef1234567890ab
Type: String
Length Constraints: Maximum length of 256.
Pattern:
arn:aws(-[^:]+)?:comprehend:[a-zA-Z0-9-]*:[0-9]{12}:[a-zA-Z0-9-]{1,64}/[a-zA-Z0-9](-*[a-zA-Z0-9])*((/dataset/[a-zA-Z0-9](-*[a-zA-Z0-9])*)|(/version/[a-zA-Z0-9](-*[a-zA-Z0-9])*))?
- JobId
-
The identifier generated for the job.
Type: String
Length Constraints: Minimum length of 1. Maximum length of 32.
Pattern:
^([\p{L}\p{Z}\p{N}_.:/=+\-%@]*)$
- JobStatus
-
The status of the job.
Type: String
Valid Values:
SUBMITTED | IN_PROGRESS | COMPLETED | FAILED | STOP_REQUESTED | STOPPED
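Because the job runs asynchronously, a caller usually polls until `JobStatus` stops changing. The sketch below is one way to do that, under the assumption that `COMPLETED`, `FAILED`, and `STOPPED` are the terminal states. `wait_for_job` is a hypothetical helper that takes any zero-argument callable returning the current status, so it can be exercised without AWS access; the poll interval and poll limit are arbitrary defaults.

```python
import time
from typing import Callable

# Assumption: these are the states after which the status no longer changes.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "STOPPED"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until a terminal status appears or max_polls is reached."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job did not finish after {max_polls} polls")


# With boto3, get_status could wrap DescribePiiEntitiesDetectionJob, for example:
#
#   client = boto3.client("comprehend")
#   get_status = lambda: client.describe_pii_entities_detection_job(
#       JobId=job_id
#   )["PiiEntitiesDetectionJobProperties"]["JobStatus"]
```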
Errors
For information about the errors that are common to all actions, see Common Errors.
- InternalServerException
-
An internal server error occurred. Retry your request.
HTTP Status Code: 500
- InvalidRequestException
-
The request is invalid.
HTTP Status Code: 400
- KmsKeyValidationException
-
The KMS customer managed key (CMK) entered cannot be validated. Verify the key and re-enter it.
HTTP Status Code: 400
- ResourceInUseException
-
The specified resource name is already in use. Use a different name and try your request again.
HTTP Status Code: 400
- TooManyRequestsException
-
The number of requests exceeds the limit. Resubmit your request later.
HTTP Status Code: 400
- TooManyTagsException
-
The request contains more tags than can be associated with a resource (50 tags per resource). The maximum number of tags includes both existing tags and those included in your current request.
HTTP Status Code: 400
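Of the errors above, `InternalServerException` and `TooManyRequestsException` are described as worth retrying. The following sketch shows one way to retry only those two with exponential backoff and jitter. `call_with_retries` is a hypothetical helper, and it assumes the raised exception carries a botocore-style `response["Error"]["Code"]` dictionary, as `botocore.exceptions.ClientError` does. The AWS SDKs also apply their own retry behavior, so a wrapper like this may be unnecessary in practice.

```python
import random
import time
from typing import Any, Callable

RETRYABLE_CODES = {"InternalServerException", "TooManyRequestsException"}


def error_code(exc: Exception) -> str:
    """Read the error code from a botocore-style exception, or '' if absent."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code", "")


def call_with_retries(
    call: Callable[[], Any],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run call(), retrying retryable errors with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            # Re-raise anything non-retryable, and the final failed attempt.
            if error_code(exc) not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)
```

Errors such as `InvalidRequestException` or `TooManyTagsException` are re-raised immediately, since resubmitting the same request would fail the same way.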