Classification Job Creation
The Classification Job Creation resource represents the collection of settings that define the scope and schedule for a classification job. A classification job, also referred to as a sensitive data discovery job, is a job that you create to analyze objects in Amazon Simple Storage Service (Amazon S3) general purpose buckets, and determine whether the objects contain sensitive data. To detect sensitive data, a job can use managed data identifiers that Amazon Macie provides, custom data identifiers that you define, or a combination of the two.
When you create a classification job, you can configure it to address specific scenarios. For example, you can use property- and tag-based conditions to perform targeted analysis of S3 buckets and objects that match specific criteria. You can also define a schedule for running the job on a recurring basis, such as every day or a specific day of each week or month. This can be helpful if you want to align your analysis with periodic updates to bucket objects or monitor buckets for the presence of sensitive data. In addition to these settings, you can configure a job to use one or more allow lists. Allow lists define specific text or text patterns that you want Macie to ignore when it analyzes objects. You can create and use allow lists in all the AWS Regions where Macie is currently available except the Asia Pacific (Osaka) Region. For more information about creating and configuring jobs, see Running sensitive data discovery jobs in the Amazon Macie User Guide.
You can use the Classification Job Creation resource to create and define the settings for a classification job. Note that you can't change any settings for a job after you create it. This helps to ensure that you have an immutable history of sensitive data findings and discovery results for data privacy and protection audits or investigations that you perform.
URI
/jobs
HTTP methods
POST
Operation ID: CreateClassificationJob
Creates and defines the settings for a classification job.
Status code | Response model | Description |
---|---|---|
200 | CreateClassificationJobResponse | The request succeeded. The specified job was created. |
400 | ValidationException | The request failed because the input doesn't satisfy the constraints specified by the service. |
402 | ServiceQuotaExceededException | The request failed because fulfilling the request would exceed one or more service quotas for your account. |
403 | AccessDeniedException | The request was denied because you don't have sufficient access to the specified resource. |
404 | ResourceNotFoundException | The request failed because the specified resource wasn't found. |
409 | ConflictException | The request failed because it conflicts with the current state of the specified resource. |
429 | ThrottlingException | The request failed because you sent too many requests during a certain amount of time. |
500 | InternalServerException | The request failed due to an unknown internal server error, exception, or failure. |
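The status codes in the table above can drive a simple client-side policy. The following is a minimal sketch, not part of the Macie API: the `ERROR_MODELS` mapping copies the table, while the helper names `describe_status` and `should_retry`, and the choice to treat only 429 and 500 as retryable, are assumptions made for illustration.

```python
# Status codes and response models copied from the table above.
ERROR_MODELS = {
    400: "ValidationException",
    402: "ServiceQuotaExceededException",
    403: "AccessDeniedException",
    404: "ResourceNotFoundException",
    409: "ConflictException",
    429: "ThrottlingException",
    500: "InternalServerException",
}

# Assumption: only throttling and internal errors are worth retrying unchanged.
RETRYABLE_STATUS_CODES = frozenset({429, 500})


def describe_status(status_code: int) -> str:
    """Return the response model name for a status code from the table."""
    if status_code == 200:
        return "CreateClassificationJobResponse"
    return ERROR_MODELS.get(status_code, "Unknown")


def should_retry(status_code: int) -> bool:
    """Hypothetical policy: retry only on 429 and 500."""
    return status_code in RETRYABLE_STATUS_CODES
```

When a request is retried, sending the same `clientToken` value keeps the retry idempotent.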
Schemas
Request bodies
{
  "allowListIds": [ "string" ],
  "clientToken": "string",
  "customDataIdentifierIds": [ "string" ],
  "description": "string",
  "initialRun": boolean,
  "jobType": enum,
  "managedDataIdentifierIds": [ "string" ],
  "managedDataIdentifierSelector": enum,
  "name": "string",
  "s3JobDefinition": {
    "bucketCriteria": {
      "excludes": {
        "and": [
          {
            "simpleCriterion": { "comparator": enum, "key": enum, "values": [ "string" ] },
            "tagCriterion": { "comparator": enum, "tagValues": [ { "key": "string", "value": "string" } ] }
          }
        ]
      },
      "includes": {
        "and": [
          {
            "simpleCriterion": { "comparator": enum, "key": enum, "values": [ "string" ] },
            "tagCriterion": { "comparator": enum, "tagValues": [ { "key": "string", "value": "string" } ] }
          }
        ]
      }
    },
    "bucketDefinitions": [ { "accountId": "string", "buckets": [ "string" ] } ],
    "scoping": {
      "excludes": {
        "and": [
          {
            "simpleScopeTerm": { "comparator": enum, "key": enum, "values": [ "string" ] },
            "tagScopeTerm": { "comparator": enum, "key": "string", "tagValues": [ { "key": "string", "value": "string" } ], "target": enum }
          }
        ]
      },
      "includes": {
        "and": [
          {
            "simpleScopeTerm": { "comparator": enum, "key": enum, "values": [ "string" ] },
            "tagScopeTerm": { "comparator": enum, "key": "string", "tagValues": [ { "key": "string", "value": "string" } ], "target": enum }
          }
        ]
      }
    }
  },
  "samplingPercentage": integer,
  "scheduleFrequency": {
    "dailySchedule": { },
    "monthlySchedule": { "dayOfMonth": integer },
    "weeklySchedule": { "dayOfWeek": enum }
  },
  "tags": { }
}
Response bodies
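CreateClassificationJobResponse schema
{ "jobArn": "string", "jobId": "string" }
ValidationException, ServiceQuotaExceededException, AccessDeniedException, ResourceNotFoundException, ConflictException, ThrottlingException, and InternalServerException schema
{ "message": "string" }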
Properties
AccessDeniedException
Provides information about an error that occurred due to insufficient access to a specified resource.
Property | Type | Required | Description |
---|---|---|---|
message | string | False | The explanation of the error that occurred. |
ConflictException
Provides information about an error that occurred due to a versioning conflict for a specified resource.
Property | Type | Required | Description |
---|---|---|---|
message | string | False | The explanation of the error that occurred. |
CreateClassificationJobRequest
Specifies the scope, schedule, and other settings for a classification job. You can't change any settings for a classification job after you create it. This helps to ensure that you have an immutable history of sensitive data findings and discovery results for data privacy and protection audits or investigations.
Property | Type | Required | Description |
---|---|---|---|
allowListIds | Array of type string | False | An array of unique identifiers, one for each allow list for the job to use when it analyzes data. |
clientToken | string | True | A unique, case-sensitive token that you provide to ensure the idempotency of the request. |
customDataIdentifierIds | Array of type string | False | An array of unique identifiers, one for each custom data identifier for the job to use when it analyzes data. To use only managed data identifiers, don't specify a value for this property and specify a value other than NONE for the managedDataIdentifierSelector property. |
description | string | False | A custom description of the job. The description can contain as many as 200 characters. |
initialRun | boolean | False | For a recurring job, specifies whether to analyze all existing, eligible objects immediately after the job is created (true). If you configure the job to run only once, don't specify a value for this property. |
jobType | JobType | True | The schedule for running the job. Valid values are ONE_TIME (run the job only once) and SCHEDULED (run the job on a recurring basis). |
managedDataIdentifierIds | Array of type string | False | An array of unique identifiers, one for each managed data identifier for the job to include (use) or exclude (not use) when it analyzes data. Inclusion or exclusion depends on the managed data identifier selection type that you specify for the job (managedDataIdentifierSelector). To retrieve a list of valid values for this property, use the ListManagedDataIdentifiers operation. |
managedDataIdentifierSelector | ManagedDataIdentifierSelector | False | The selection type to apply when determining which managed data identifiers the job uses to analyze data. Valid values are ALL, EXCLUDE, INCLUDE, NONE, and RECOMMENDED. If you don't specify a value for this property, the job uses the recommended set of managed data identifiers. If the job is a recurring job and you specify ALL or EXCLUDE, each job run automatically uses new managed data identifiers that are released. To learn about individual managed data identifiers or determine which ones are in the recommended set, see Using managed data identifiers or Recommended managed data identifiers in the Amazon Macie User Guide. |
name | string | True | A custom name for the job. The name can contain as many as 500 characters. |
s3JobDefinition | S3JobDefinition | True | The S3 buckets that contain the objects to analyze, and the scope of that analysis. |
samplingPercentage | integer Format: int32 | False | The sampling depth, as a percentage, for the job to apply when processing objects. This value determines the percentage of eligible objects that the job analyzes. If this value is less than 100, Amazon Macie selects the objects to analyze at random, up to the specified percentage, and analyzes all the data in those objects. |
scheduleFrequency | JobScheduleFrequency | False | The recurrence pattern for running the job. To run the job only once, don't specify a value for this property and set the value for the jobType property to ONE_TIME. |
tags | TagMap | False | A map of key-value pairs that specifies the tags to associate with the job. A job can have a maximum of 50 tags. Each tag consists of a tag key and an associated tag value. The maximum length of a tag key is 128 characters. The maximum length of a tag value is 256 characters. |
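As an illustration of the required properties in the table above, the following sketch assembles a request body for a job that runs once against a fixed set of buckets. It is only a sketch: the function name `build_one_time_job_request` is hypothetical, the account ID and bucket name are placeholders, and it assumes that `ONE_TIME` is the job type for a run-once job and that at least one bucket name must be supplied. The length limits come from the `name` and `description` rows.

```python
import json
import uuid

MAX_NAME_LENGTH = 500         # from the name property description
MAX_DESCRIPTION_LENGTH = 200  # from the description property description


def build_one_time_job_request(name, account_id, buckets, description=None):
    """Build a CreateClassificationJobRequest body for a job that runs once."""
    if not name or len(name) > MAX_NAME_LENGTH:
        raise ValueError("name is required and can contain as many as 500 characters")
    if description is not None and len(description) > MAX_DESCRIPTION_LENGTH:
        raise ValueError("description can contain as many as 200 characters")
    if not buckets:
        # Assumption for this sketch: an empty bucket list is not useful.
        raise ValueError("specify at least one bucket name")

    request = {
        # A fresh token per logical request; reuse it when retrying that request.
        "clientToken": str(uuid.uuid4()),
        "jobType": "ONE_TIME",
        "name": name,
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(buckets)}
            ]
        },
    }
    if description is not None:
        request["description"] = description
    return request


if __name__ == "__main__":
    # Placeholder account ID and bucket name.
    body = build_one_time_job_request(
        "example-job", "111122223333", ["amzn-s3-demo-bucket"]
    )
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized as the body of a POST request to /jobs, or passed to an SDK call that accepts the same property names.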
CreateClassificationJobResponse
Provides information about a classification job that was created in response to a request.
Property | Type | Required | Description |
---|---|---|---|
jobArn | string | False | The Amazon Resource Name (ARN) of the job. |
jobId | string | False | The unique identifier for the job. |
CriteriaBlockForJob
Specifies one or more property- and tag-based conditions that define criteria for including or excluding S3 buckets from a classification job.
Property | Type | Required | Description |
---|---|---|---|
and | Array of type CriteriaForJob | False | An array of conditions, one for each condition that determines which buckets to include or exclude from the job. If you specify more than one condition, Amazon Macie uses AND logic to join the conditions. |
CriteriaForJob
Specifies a property- or tag-based condition that defines criteria for including or excluding S3 buckets from a classification job.
Property | Type | Required | Description |
---|---|---|---|
simpleCriterion | SimpleCriterionForJob | False | A property-based condition that defines a property, operator, and one or more values for including or excluding buckets from the job. |
tagCriterion | TagCriterionForJob | False | A tag-based condition that defines an operator and tag keys, tag values, or tag key and value pairs for including or excluding buckets from the job. |
DailySchedule
Specifies that a classification job runs once a day, every day. This is an empty object.
InternalServerException
Provides information about an error that occurred due to an unknown internal server error, exception, or failure.
Property | Type | Required | Description |
---|---|---|---|
message | string | False | The explanation of the error that occurred. |
JobComparator
The operator to use in a condition. Depending on the type of condition, possible values are:
EQ
GT
GTE
LT
LTE
NE
CONTAINS
STARTS_WITH
JobScheduleFrequency
Specifies the recurrence pattern for running a classification job.
Property | Type | Required | Description |
---|---|---|---|
dailySchedule | DailySchedule | False | Specifies a daily recurrence pattern for running the job. |
monthlySchedule | MonthlySchedule | False | Specifies a monthly recurrence pattern for running the job. |
weeklySchedule | WeeklySchedule | False | Specifies a weekly recurrence pattern for running the job. |
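A recurrence pattern is a small nested object, so it can be built with a few helpers. The sketch below is illustrative only: the helper names are hypothetical, the uppercase day names in `DAYS_OF_WEEK` are an assumption about the `dayOfWeek` enum, and the rule that a schedule holds exactly one pattern is an assumption rather than something this page states.

```python
# Assumption: dayOfWeek accepts uppercase English day names.
DAYS_OF_WEEK = (
    "SUNDAY", "MONDAY", "TUESDAY", "WEDNESDAY",
    "THURSDAY", "FRIDAY", "SATURDAY",
)


def daily_schedule() -> dict:
    """DailySchedule is an empty object."""
    return {"dailySchedule": {}}


def weekly_schedule(day_of_week: str) -> dict:
    """Build a weekly recurrence pattern for the given day."""
    day = day_of_week.upper()
    if day not in DAYS_OF_WEEK:
        raise ValueError(f"unknown day of week: {day_of_week}")
    return {"weeklySchedule": {"dayOfWeek": day}}


def scheduled_job_settings(schedule: dict, initial_run: bool = False) -> dict:
    """Return the request properties that make a job recurring."""
    if len(schedule) != 1:
        # Assumption: one recurrence pattern per job.
        raise ValueError("specify exactly one recurrence pattern")
    return {
        "jobType": "SCHEDULED",
        "scheduleFrequency": schedule,
        "initialRun": initial_run,
    }
```

For example, `scheduled_job_settings(weekly_schedule("monday"), initial_run=True)` yields properties that can be merged into a request body.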
JobScopeTerm
Specifies a property- or tag-based condition that defines criteria for including or excluding S3 objects from a classification job. A JobScopeTerm object can contain only one simpleScopeTerm object or one tagScopeTerm object.
Property | Type | Required | Description |
---|---|---|---|
simpleScopeTerm | SimpleScopeTerm | False | A property-based condition that defines a property, operator, and one or more values for including or excluding objects from the job. |
tagScopeTerm | TagScopeTerm | False | A tag-based condition that defines the operator and tag keys or tag key and value pairs for including or excluding objects from the job. |
JobScopingBlock
Specifies one or more property- and tag-based conditions that define criteria for including or excluding S3 objects from a classification job.
Property | Type | Required | Description |
---|---|---|---|
and | Array of type JobScopeTerm | False | An array of conditions, one for each property- or tag-based condition that determines which objects to include or exclude from the job. If you specify more than one condition, Amazon Macie uses AND logic to join the conditions. |
JobType
The schedule for running a classification job. Valid values are:
ONE_TIME
SCHEDULED
ManagedDataIdentifierSelector
The selection type that determines which managed data identifiers a classification job uses to analyze data. Valid values are:
ALL
EXCLUDE
INCLUDE
NONE
RECOMMENDED
MonthlySchedule
Specifies a monthly recurrence pattern for running a classification job.
Property | Type | Required | Description |
---|---|---|---|
dayOfMonth | integer Format: int32 | False | The numeric day of the month when Amazon Macie runs the job. This value can be an integer from 1 through 31. If this value exceeds the number of days in a certain month, Macie doesn't run the job that month. Macie runs the job only during months that have the specified day. For example, if this value is 31 and a month has only 30 days, Macie doesn't run the job that month. |
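The rule that a job runs only in months that have the specified day can be checked with the standard library. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the accepted range of 1 through 31 is assumed from ordinary calendar limits.

```python
import calendar


def monthly_schedule(day_of_month: int) -> dict:
    """Build a scheduleFrequency value with a monthly recurrence pattern."""
    if not 1 <= day_of_month <= 31:
        # Assumption: valid days are 1 through 31.
        raise ValueError("dayOfMonth must be between 1 and 31")
    return {"monthlySchedule": {"dayOfMonth": day_of_month}}


def runs_in_month(day_of_month: int, year: int, month: int) -> bool:
    """Return True if the given month has the specified day."""
    days_in_month = calendar.monthrange(year, month)[1]
    return day_of_month <= days_in_month
```

With this check, a `dayOfMonth` of 31 skips April (30 days), and a value of 29 skips February in years that are not leap years.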
ResourceNotFoundException
Provides information about an error that occurred because a specified resource wasn't found.
Property | Type | Required | Description |
---|---|---|---|
message | string | False | The explanation of the error that occurred. |
S3BucketCriteriaForJob
Specifies property- and tag-based conditions that define criteria for including or excluding S3 buckets from a classification job. Exclude conditions take precedence over include conditions.
Property | Type | Required | Description |
---|---|---|---|
excludes | CriteriaBlockForJob | False | The property- and tag-based conditions that determine which buckets to exclude from the job. |
includes | CriteriaBlockForJob | False | The property- and tag-based conditions that determine which buckets to include in the job. |
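The way include and exclude conditions combine can be modeled locally. The sketch below is not Macie's implementation; it is a small evaluator that mirrors the documented logic: conditions in a block are joined with AND, multiple values in one condition are joined with OR, and exclude conditions take precedence. It handles only `simpleCriterion` conditions with the `EQ` and `NE` operators, represents a bucket as a plain dictionary keyed by property name, and assumes that a missing include block selects every bucket that isn't excluded. All function names and sample values are hypothetical.

```python
def simple_criterion_matches(criterion: dict, bucket: dict) -> bool:
    """Evaluate one simpleCriterion; multiple values are joined with OR."""
    comparator = criterion["comparator"]
    if comparator not in ("EQ", "NE"):
        raise ValueError(f"unsupported comparator in this sketch: {comparator}")
    # Exact, case-sensitive comparison; no partial values or wildcards.
    listed = bucket.get(criterion["key"]) in criterion["values"]
    return listed if comparator == "EQ" else not listed


def block_matches(block: dict, bucket: dict) -> bool:
    """Evaluate a criteria block; its conditions are joined with AND."""
    conditions = block.get("and", [])
    return all(
        simple_criterion_matches(cond["simpleCriterion"], bucket)
        for cond in conditions
    )


def bucket_is_selected(criteria: dict, bucket: dict) -> bool:
    """Apply exclude conditions first, then include conditions."""
    excludes = criteria.get("excludes", {})
    if excludes.get("and") and block_matches(excludes, bucket):
        return False
    return block_matches(criteria.get("includes", {}), bucket)


# Placeholder account ID and bucket names.
EXAMPLE_CRITERIA = {
    "includes": {"and": [
        {"simpleCriterion": {"comparator": "EQ", "key": "ACCOUNT_ID",
                             "values": ["111122223333"]}},
    ]},
    "excludes": {"and": [
        {"simpleCriterion": {"comparator": "EQ", "key": "S3_BUCKET_NAME",
                             "values": ["amzn-s3-demo-logs", "amzn-s3-demo-tmp"]}},
    ]},
}
```

Here a bucket owned by account 111122223333 is selected unless its name is one of the two excluded names.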
S3BucketDefinitionForJob
Specifies an AWS account that owns S3 buckets for a classification job to analyze, and one or more specific buckets to analyze for that account.
Property | Type | Required | Description |
---|---|---|---|
accountId | string | True | The unique identifier for the AWS account that owns the buckets. |
buckets | Array of type string | True | An array that lists the names of the buckets. |
S3JobDefinition
Specifies which S3 buckets contain the objects that a classification job analyzes, and the scope of that analysis. The bucket specification can be static (bucketDefinitions) or dynamic (bucketCriteria). If it's static, the job analyzes objects in the same predefined set of buckets each time the job runs. If it's dynamic, the job analyzes objects in any buckets that match the specified criteria each time the job starts to run.
Property | Type | Required | Description |
---|---|---|---|
bucketCriteria | S3BucketCriteriaForJob | False | The property- and tag-based conditions that determine which S3 buckets to include or exclude from the analysis. Each time the job runs, the job uses these criteria to determine which buckets contain objects to analyze. A job's definition can contain a bucketCriteria object or a bucketDefinitions array, not both. |
bucketDefinitions | Array of type S3BucketDefinitionForJob | False | An array of objects, one for each AWS account that owns specific S3 buckets to analyze. Each object specifies the account ID for an account and one or more buckets to analyze for that account. A job's definition can contain a bucketDefinitions array or a bucketCriteria object, not both. |
scoping | Scoping | False | The property- and tag-based conditions that determine which S3 objects to include or exclude from the analysis. Each time the job runs, the job uses these criteria to determine which objects to analyze. |
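The static and dynamic bucket specifications can be kept apart with a small constructor. This is a sketch only: the function name `s3_job_definition` is hypothetical, and it assumes that a definition carries exactly one of `bucketDefinitions` or `bucketCriteria`, never both and never neither.

```python
def s3_job_definition(bucket_definitions=None, bucket_criteria=None, scoping=None) -> dict:
    """Build an S3JobDefinition with a static or a dynamic bucket specification."""
    # Assumption: exactly one of the two bucket specifications is supplied.
    if bool(bucket_definitions) == bool(bucket_criteria):
        raise ValueError(
            "specify either bucket_definitions (static) or bucket_criteria (dynamic)"
        )
    definition = {}
    if bucket_definitions:
        definition["bucketDefinitions"] = list(bucket_definitions)
    else:
        definition["bucketCriteria"] = dict(bucket_criteria)
    if scoping:
        definition["scoping"] = dict(scoping)
    return definition
```

A static definition might pass `[{"accountId": "111122223333", "buckets": ["amzn-s3-demo-bucket"]}]` (placeholder values) as `bucket_definitions`.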
ScopeFilterKey
The property to use in a condition that determines whether an S3 object is included or excluded from a classification job. Valid values are:
OBJECT_EXTENSION
OBJECT_LAST_MODIFIED_DATE
OBJECT_SIZE
OBJECT_KEY
Scoping
Specifies one or more property- and tag-based conditions that define criteria for including or excluding S3 objects from a classification job. Exclude conditions take precedence over include conditions.
Property | Type | Required | Description |
---|---|---|---|
excludes | JobScopingBlock | False | The property- and tag-based conditions that determine which objects to exclude from the analysis. |
includes | JobScopingBlock | False | The property- and tag-based conditions that determine which objects to include in the analysis. |
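A Scoping object nests conditions two levels deep, which is easy to get wrong by hand. The following sketch builds one from property-based conditions. The helper names are hypothetical, the key and operator sets are copied from the ScopeFilterKey and JobComparator lists on this page, and the pairing of operators with keys in the example (`EQ` with `OBJECT_EXTENSION`, `STARTS_WITH` with `OBJECT_KEY`) is an assumption about which combinations the service accepts.

```python
SCOPE_FILTER_KEYS = {
    "OBJECT_EXTENSION", "OBJECT_LAST_MODIFIED_DATE", "OBJECT_SIZE", "OBJECT_KEY",
}
JOB_COMPARATORS = {"EQ", "GT", "GTE", "LT", "LTE", "NE", "CONTAINS", "STARTS_WITH"}


def simple_scope_term(key: str, comparator: str, values) -> dict:
    """Wrap a property-based condition in a JobScopeTerm object."""
    if key not in SCOPE_FILTER_KEYS:
        raise ValueError(f"unknown scope filter key: {key}")
    if comparator not in JOB_COMPARATORS:
        raise ValueError(f"unknown comparator: {comparator}")
    return {
        "simpleScopeTerm": {
            "comparator": comparator,
            "key": key,
            "values": [str(value) for value in values],  # values are strings
        }
    }


def build_scoping(includes=(), excludes=()) -> dict:
    """Build a Scoping object; empty blocks are omitted."""
    scoping = {}
    if includes:
        scoping["includes"] = {"and": list(includes)}
    if excludes:
        scoping["excludes"] = {"and": list(excludes)}
    return scoping


# Example: analyze PDF and DOCX objects, but skip keys under a "logs/" prefix.
example_scoping = build_scoping(
    includes=[simple_scope_term("OBJECT_EXTENSION", "EQ", ["pdf", "docx"])],
    excludes=[simple_scope_term("OBJECT_KEY", "STARTS_WITH", ["logs/"])],
)
```

Because exclude conditions take precedence, a PDF stored under the excluded prefix would not be analyzed.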
ServiceQuotaExceededException
Provides information about an error that occurred due to one or more service quotas for an account.
Property | Type | Required | Description |
---|---|---|---|
message | string | False | The explanation of the error that occurred. |
SimpleCriterionForJob
Specifies a property-based condition that determines whether an S3 bucket is included or excluded from a classification job.
Property | Type | Required | Description |
---|---|---|---|
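comparator | JobComparator | False | The operator to use in the condition. Valid values are EQ (equals) and NE (not equals). |
key | SimpleCriterionKeyForJob | False | The property to use in the condition. |
values | Array of type string | False | An array that lists one or more values to use in the condition. If you specify multiple values, Amazon Macie uses OR logic to join the values. Valid values for each supported property (key) are: ACCOUNT_ID, a string that represents the unique identifier for the AWS account that owns the bucket; S3_BUCKET_EFFECTIVE_PERMISSION, one of NOT_PUBLIC, PUBLIC, or UNKNOWN; S3_BUCKET_NAME, a string that represents the name of a bucket; S3_BUCKET_SHARED_ACCESS, one of EXTERNAL, INTERNAL, NOT_SHARED, or UNKNOWN. Values are case sensitive. Also, Macie doesn't support use of partial values or wildcard characters in these values. |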
SimpleCriterionKeyForJob
The property to use in a condition that determines whether an S3 bucket is included or excluded from a classification job. Valid values are:
ACCOUNT_ID
S3_BUCKET_NAME
S3_BUCKET_EFFECTIVE_PERMISSION
S3_BUCKET_SHARED_ACCESS
SimpleScopeTerm
Specifies a property-based condition that determines whether an S3 object is included or excluded from a classification job.
Property | Type | Required | Description |
---|---|---|---|
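comparator | JobComparator | False | The operator to use in the condition. Valid values for each supported property (key) are: OBJECT_EXTENSION, EQ (equals) or NE (not equals); OBJECT_KEY, STARTS_WITH; OBJECT_LAST_MODIFIED_DATE and OBJECT_SIZE, EQ (equals), GT (greater than), GTE (greater than or equals), LT (less than), LTE (less than or equals), or NE (not equals). |
key | ScopeFilterKey | False | The object property to use in the condition. |
values | Array of type string | False | An array that lists the values to use in the condition. If the value for the key property is OBJECT_EXTENSION or OBJECT_KEY, this array can specify multiple values and Amazon Macie uses OR logic to join the values. Otherwise, this array can specify only one value. Valid values for each supported property (key) are: OBJECT_EXTENSION, a string that represents the file name extension of an object; OBJECT_KEY, a string that represents the key prefix (folder name or path) of an object; OBJECT_LAST_MODIFIED_DATE, the date and time (in UTC and extended ISO 8601 format) when an object was created or last changed; OBJECT_SIZE, an integer that represents the storage size (in bytes) of an object. Macie doesn't support use of wildcard characters in these values. Also, string values are case sensitive. |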
TagCriterionForJob
Specifies a tag-based condition that determines whether an S3 bucket is included or excluded from a classification job.
Property | Type | Required | Description |
---|---|---|---|
comparator | JobComparator | False | The operator to use in the condition. Valid values are EQ (equals) and NE (not equals). |
tagValues | Array of type TagCriterionPairForJob | False | The tag keys, tag values, or tag key and value pairs to use in the condition. |
TagCriterionPairForJob
Specifies a tag key, a tag value, or a tag key and value (as a pair) to use in a tag-based condition that determines whether an S3 bucket is included or excluded from a classification job. Tag keys and values are case sensitive. Also, Amazon Macie doesn't support use of partial values or wildcard characters in tag-based conditions.
Property | Type | Required | Description |
---|---|---|---|
key | string | False | The value for the tag key to use in the condition. |
value | string | False | The tag value to use in the condition. |
TagMap
A string-to-string map of key-value pairs that specifies the tags (keys and values) for an Amazon Macie resource.
Property | Type | Required | Description |
---|---|---|---|
| string | False |
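Tag maps can be checked before a request is sent. The sketch below applies the limits stated for the `tags` property of CreateClassificationJobRequest (at most 50 tags, keys up to 128 characters, values up to 256 characters). The function name `validate_tags` and the wording of the messages are hypothetical, and the service may enforce additional rules that this sketch doesn't cover.

```python
MAX_TAGS = 50
MAX_TAG_KEY_LENGTH = 128
MAX_TAG_VALUE_LENGTH = 256


def validate_tags(tags: dict) -> list:
    """Return a list of problems; an empty list means the map is within limits."""
    problems = []
    if len(tags) > MAX_TAGS:
        problems.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
    for key, value in tags.items():
        if not isinstance(key, str) or not isinstance(value, str):
            problems.append(f"tag {key!r}: keys and values must be strings")
            continue
        if len(key) > MAX_TAG_KEY_LENGTH:
            problems.append(f"tag key exceeds {MAX_TAG_KEY_LENGTH} characters: {key[:20]}...")
        if len(value) > MAX_TAG_VALUE_LENGTH:
            problems.append(f"tag value for {key!r} exceeds {MAX_TAG_VALUE_LENGTH} characters")
    return problems
```

Returning a list of problems, rather than raising on the first one, lets a caller report every issue at once.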
TagScopeTerm
Specifies a tag-based condition that determines whether an S3 object is included or excluded from a classification job.
Property | Type | Required | Description |
---|---|---|---|
comparator | JobComparator | False | The operator to use in the condition. Valid values are EQ (equals) and NE (not equals). |
key | string | False | The object property to use in the condition. The only valid value is TAG. |
tagValues | Array of type TagValuePair | False | The tag keys or tag key and value pairs to use in the condition. To specify only tag keys in a condition, specify the keys in this array and set the value for each associated tag value to an empty string. |
target | TagTarget | False | The type of object to apply the condition to. |
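A tag-based object condition can be built from an ordinary mapping. In this sketch, a tag value of `None` stands for "match on the tag key only" and is converted to the empty string that the `tagValues` description calls for. The function name `tag_scope_term` is hypothetical, the comparator is passed through without validation, and the `key` value `TAG` is an assumption that should be checked against the API reference; `S3_OBJECT` is the only target listed under TagTarget.

```python
def tag_scope_term(tags: dict, comparator: str = "EQ") -> dict:
    """Build a JobScopeTerm that holds a tagScopeTerm.

    `tags` maps each tag key to a tag value, or to None to match on the key alone.
    """
    tag_values = [
        # A key-only condition is expressed with an empty-string value.
        {"key": key, "value": "" if value is None else value}
        for key, value in tags.items()
    ]
    return {
        "tagScopeTerm": {
            "comparator": comparator,
            "key": "TAG",  # assumption: the object property name for tag conditions
            "tagValues": tag_values,
            "target": "S3_OBJECT",
        }
    }
```

For example, `tag_scope_term({"Project": None, "Stage": "prod"})` produces one key-only entry and one key and value pair.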
TagTarget
The type of object to apply a tag-based condition to. Valid values are:
S3_OBJECT
TagValuePair
Specifies a tag key or tag key and value pair to use in a tag-based condition that determines whether an S3 object is included or excluded from a classification job. Tag keys and values are case sensitive. Also, Amazon Macie doesn't support use of partial values or wildcard characters in tag-based conditions.
Property | Type | Required | Description |
---|---|---|---|
key | string | False | The value for the tag key to use in the condition. |
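value | string | False | The tag value, associated with the specified tag key (key), to use in the condition. To specify only a tag key for a condition, specify the tag key for the key property and set this value to an empty string. |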
ThrottlingException
Provides information about an error that occurred because too many requests were sent during a certain amount of time.
Property | Type | Required | Description |
---|---|---|---|
message | string | False | The explanation of the error that occurred. |
ValidationException
Provides information about an error that occurred due to a syntax error in a request.
Property | Type | Required | Description |
---|---|---|---|
message | string | False | The explanation of the error that occurred. |
WeeklySchedule
Specifies a weekly recurrence pattern for running a classification job.
Property | Type | Required | Description |
---|---|---|---|
dayOfWeek | string Values: SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY | False | The day of the week when Amazon Macie runs the job. |