Running sensitive data discovery jobs in Amazon Macie

With Amazon Macie, you create and run sensitive data discovery jobs to automate discovery, logging, and reporting of sensitive data in Amazon Simple Storage Service (Amazon S3) buckets. A sensitive data discovery job is a series of automated processing and analysis tasks that Macie performs to analyze objects in S3 buckets and determine whether the objects contain sensitive data. Each job provides detailed reports of the sensitive data that Macie finds and the analysis that Macie performs.

To help you meet and maintain compliance with your data security and privacy requirements, Macie provides several options for scheduling and defining the scope of each job. With these options, you can build and maintain a comprehensive view of the data that your organization stores in Amazon S3 and any security or compliance risks for that data.

You can configure a job to run only once for on-demand analysis and assessment, or on a recurring basis for periodic analysis, assessment, and monitoring. You also define the breadth and depth of each job's analysis. When you create a job, you start by specifying which S3 buckets contain objects that you want the job to analyze: either specific buckets that you select or buckets that match specific criteria. You can then refine the scope of that analysis by choosing additional options. The options include custom include and exclude criteria that derive from properties of S3 objects, such as tags, prefixes, and the date when an object was last modified.
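As a rough illustration of these scheduling and scope options, the following Python sketch assembles a request for a job. The field names follow our understanding of the Macie CreateClassificationJob API operation, so verify them against the API reference before use. The helper build_job_request, the account ID, the bucket name, and the prefix are hypothetical values chosen for this example.

```python
from __future__ import annotations

import uuid
from typing import Any, Optional


def build_job_request(
    name: str,
    account_id: str,
    buckets: list[str],
    weekly_day: Optional[str] = None,
    include_prefix: Optional[str] = None,
    exclude_extensions: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble keyword arguments shaped like a CreateClassificationJob request.

    Hypothetical helper: it only builds a dictionary and calls no AWS API.
    """
    # Start with the specific buckets that the job should analyze.
    s3_definition: dict[str, Any] = {
        "bucketDefinitions": [{"accountId": account_id, "buckets": buckets}]
    }

    # Optionally refine the scope with include and exclude criteria
    # that derive from object properties.
    scoping: dict[str, Any] = {}
    if include_prefix:
        scoping["includes"] = {"and": [{"simpleScopeTerm": {
            "comparator": "STARTS_WITH",
            "key": "OBJECT_KEY",
            "values": [include_prefix],
        }}]}
    if exclude_extensions:
        scoping["excludes"] = {"and": [{"simpleScopeTerm": {
            "comparator": "EQ",
            "key": "OBJECT_EXTENSION",
            "values": list(exclude_extensions),
        }}]}
    if scoping:
        s3_definition["scoping"] = scoping

    # A job runs once unless a recurring schedule is requested.
    request: dict[str, Any] = {
        "clientToken": str(uuid.uuid4()),
        "name": name,
        "jobType": "ONE_TIME",
        "s3JobDefinition": s3_definition,
    }
    if weekly_day:
        request["jobType"] = "SCHEDULED"
        request["scheduleFrequency"] = {"weeklySchedule": {"dayOfWeek": weekly_day}}
    return request


# Example: a weekly job limited to objects under a placeholder prefix.
weekly_request = build_job_request(
    "weekly-exports-scan",
    "111122223333",
    ["amzn-s3-demo-bucket"],
    weekly_day="MONDAY",
    include_prefix="exports/",
    exclude_extensions=["png"],
)
```

If you use the AWS SDK for Python (Boto3), a dictionary like this could be passed to the create_classification_job method of a macie2 client as keyword arguments. That call is omitted here so that the sketch runs without AWS credentials.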

You also specify the types of sensitive data that you want to detect. You can configure a job to use managed data identifiers that Macie provides, custom data identifiers that you define, or a combination of the two. By selecting specific managed and custom data identifiers for a job, you can tailor the analysis to focus on specific types of sensitive data. To fine-tune the analysis, you can also configure a job to use allow lists that you define. Allow lists specify text and text patterns that you want Macie to ignore, typically sensitive data exceptions for your organization's particular scenarios or environment.
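The following Python sketch shows one way that these detection choices might be expressed as job settings. The field names (managedDataIdentifierSelector, managedDataIdentifierIds, customDataIdentifierIds, allowListIds) follow our understanding of the CreateClassificationJob API operation and should be checked against the API reference. The helper detection_settings and the custom data identifier and allow list IDs are hypothetical placeholders.

```python
from __future__ import annotations

from typing import Any, Optional


def detection_settings(
    managed_ids: Optional[list[str]] = None,
    custom_ids: Optional[list[str]] = None,
    allow_list_ids: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Build the detection-related part of a job request (hypothetical helper)."""
    settings: dict[str, Any] = {}

    if managed_ids:
        # Use only the managed data identifiers that are listed explicitly.
        settings["managedDataIdentifierSelector"] = "INCLUDE"
        settings["managedDataIdentifierIds"] = list(managed_ids)
    # If no managed IDs are given, the selector is omitted and the
    # service's default behavior applies (not asserted in this sketch).

    if custom_ids:
        settings["customDataIdentifierIds"] = list(custom_ids)
    if allow_list_ids:
        # Allow lists define text and patterns for Macie to ignore.
        settings["allowListIds"] = list(allow_list_ids)
    return settings


# Example: focus on two managed identifiers, one custom identifier,
# and one allow list. The last two IDs are placeholders.
settings = detection_settings(
    managed_ids=["CREDIT_CARD_NUMBER", "USA_SOCIAL_SECURITY_NUMBER"],
    custom_ids=["example-custom-data-identifier-id"],
    allow_list_ids=["example-allow-list-id"],
)
```

A dictionary like this could be merged into the other parameters of a job request before the job is created.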

Each job produces two types of records of the sensitive data that Macie finds and the analysis that Macie performs: sensitive data findings and sensitive data discovery results. A sensitive data finding is a detailed report of sensitive data that Macie found in an object. A sensitive data discovery result is a record that logs details about the analysis of an object. Macie creates a sensitive data discovery result for each object that you configure a job to analyze. This includes objects that don't contain sensitive data and therefore don't produce sensitive data findings. Each type of record adheres to a standardized schema, which can help you query, monitor, and process the records to meet your security and compliance requirements.
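Because the records follow a standardized schema, they lend themselves to simple programmatic processing. The following Python sketch tallies detections by type across a set of records that have already been loaded as dictionaries. It assumes that each record stores detections under classificationDetails.result.sensitiveData, which matches our understanding of the schema but should be confirmed against the schema reference. The helper count_detections and the sample records are hypothetical and invented purely for illustration.

```python
from __future__ import annotations

from collections import Counter
from typing import Any, Iterable


def count_detections(records: Iterable[dict[str, Any]]) -> Counter:
    """Sum detection counts by sensitive data type (hypothetical helper)."""
    totals: Counter = Counter()
    for record in records:
        details = record.get("classificationDetails") or {}
        result = details.get("result") or {}
        # Records for objects without sensitive data contribute nothing.
        for group in result.get("sensitiveData") or []:
            for detection in group.get("detections") or []:
                totals[detection["type"]] += detection.get("count", 0)
    return totals


# Invented sample records, trimmed to the fields that this sketch reads.
sample_records = [
    {"classificationDetails": {"result": {"sensitiveData": [
        {"category": "FINANCIAL_INFORMATION",
         "detections": [{"type": "CREDIT_CARD_NUMBER", "count": 3}]},
    ]}}},
    # An object that was analyzed but contained no sensitive data.
    {"classificationDetails": {"result": {"sensitiveData": []}}},
]

totals = count_detections(sample_records)
```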