Running sensitive data discovery jobs in Amazon Macie

With Amazon Macie, you can create and run sensitive data discovery jobs to automate discovery, logging, and reporting of sensitive data in Amazon Simple Storage Service (Amazon S3) general purpose buckets. A sensitive data discovery job is a series of automated processing and analysis tasks that Macie performs to detect and report sensitive data in Amazon S3 objects. Each job provides detailed reports of the sensitive data that Macie finds and the analysis that Macie performs. By creating and running jobs, you can build and maintain a comprehensive view of the data that your organization stores in Amazon S3 and any security or compliance risks for that data.

To help you meet and maintain compliance with your data security and privacy requirements, Macie provides several options for scheduling a job and defining its scope. You can configure a job to run only once for on-demand analysis and assessment, or on a recurring basis for periodic analysis, assessment, and monitoring. You also define the breadth and depth of a job's analysis: either specific S3 buckets that you select or buckets that match specific criteria. You can optionally refine the scope of that analysis with additional options, including custom include and exclude criteria that derive from properties of S3 objects, such as tags, prefixes, and when an object was last modified.
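As an illustration of these scheduling and scope options, the following Python sketch assembles the arguments for a job request. It assumes the request shape of the Macie `CreateClassificationJob` operation as exposed by the AWS SDK for Python (Boto3). The helper name `build_job_request`, the account ID, and the bucket name are hypothetical placeholders, and the sketch covers only a one-time or weekly job with an optional prefix filter.

```python
import uuid
from typing import Optional, Sequence


def build_job_request(
    name: str,
    account_id: str,
    bucket_names: Sequence[str],
    weekly_day: Optional[str] = None,   # e.g. "MONDAY"; None means run once
    key_prefix: Optional[str] = None,   # optional include criterion on object keys
) -> dict:
    """Assemble keyword arguments for a sensitive data discovery job (hypothetical helper)."""
    request = {
        "clientToken": str(uuid.uuid4()),  # idempotency token for the request
        "name": name,
        "jobType": "SCHEDULED" if weekly_day else "ONE_TIME",
        "s3JobDefinition": {
            # Breadth of the analysis: specific buckets that you select.
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(bucket_names)}
            ],
        },
    }
    if weekly_day:
        # Recurring job: run once a week on the given day.
        request["scheduleFrequency"] = {"weeklySchedule": {"dayOfWeek": weekly_day}}
    if key_prefix:
        # Refine the scope: include only objects whose key starts with the prefix.
        request["s3JobDefinition"]["scoping"] = {
            "includes": {
                "and": [
                    {
                        "simpleScopeTerm": {
                            "comparator": "STARTS_WITH",
                            "key": "OBJECT_KEY",
                            "values": [key_prefix],
                        }
                    }
                ]
            }
        }
    return request


# Placeholder values for illustration only.
weekly_job = build_job_request(
    "weekly-audit", "111122223333", ["amzn-s3-demo-bucket"],
    weekly_day="MONDAY", key_prefix="reports/",
)
```

With Boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("macie2").create_classification_job(**weekly_job)`. Check the current API reference before relying on any field name shown here.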

For each job, you also specify the types of sensitive data that you want Macie to detect and report. You can configure a job to use managed data identifiers that Macie provides, custom data identifiers that you define, or a combination of the two. By selecting specific managed and custom data identifiers for a job, you can tailor the analysis to focus on specific types of sensitive data. To fine-tune the analysis, you can also configure a job to use allow lists that you define. Allow lists specify text and text patterns that you want Macie to ignore, typically sensitive data exceptions for your organization's particular scenarios or environment.
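The choice of data identifiers and allow lists can be sketched as a small set of request fields. The Python example below assumes the field names used by the Macie `CreateClassificationJob` operation (`managedDataIdentifierSelector`, `managedDataIdentifierIds`, `customDataIdentifierIds`, `allowListIds`). The helper name `identifier_settings` and the custom identifier and allow list IDs are hypothetical placeholders.

```python
from typing import Optional, Sequence


def identifier_settings(
    managed_ids: Optional[Sequence[str]] = None,
    custom_ids: Sequence[str] = (),
    allow_list_ids: Sequence[str] = (),
) -> dict:
    """Build the detection-related fields of a job request (hypothetical helper)."""
    settings: dict = {}
    if managed_ids:
        # Use only the managed data identifiers that are listed explicitly.
        settings["managedDataIdentifierSelector"] = "INCLUDE"
        settings["managedDataIdentifierIds"] = list(managed_ids)
    # If no managed IDs are given, the selector is omitted and the
    # service default applies.
    if custom_ids:
        settings["customDataIdentifierIds"] = list(custom_ids)
    if allow_list_ids:
        # Text and text patterns that the job should ignore.
        settings["allowListIds"] = list(allow_list_ids)
    return settings


# Placeholder IDs for illustration only.
settings = identifier_settings(
    managed_ids=["CREDIT_CARD_NUMBER", "AWS_CREDENTIALS"],
    custom_ids=["example-custom-identifier-id"],
    allow_list_ids=["example-allow-list-id"],
)
```

These fields could be merged into the other arguments of a job request, for example with `{**job_request, **settings}`, where `job_request` stands for whatever dictionary holds the job's name, type, and S3 definition.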

Each job produces two types of records of the sensitive data that Macie finds and the analysis that Macie performs: sensitive data findings and sensitive data discovery results. A sensitive data finding is a detailed report of sensitive data that Macie found in an S3 object. A sensitive data discovery result is a record that logs details about the analysis of an S3 object. Macie creates a sensitive data discovery result for each object that you configure a job to analyze. This includes objects in which Macie doesn't find sensitive data, and which therefore don't produce sensitive data findings, and objects that Macie can't analyze due to errors or issues. Each type of record adheres to a standardized schema, which can help you query, monitor, and process the records to meet your security and compliance requirements.
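Because the records follow a standardized schema, they lend themselves to simple programmatic processing. The Python sketch below tallies detections by type across a list of sensitive data findings. It assumes that each finding nests its detections under `classificationDetails.result.sensitiveData[].detections[]` with `type` and `count` fields. The helper name `count_detections` is hypothetical, and the sample records are heavily trimmed illustrations, not real Macie output.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_detections(findings: Iterable[Mapping]) -> Counter:
    """Sum detection counts per sensitive data type (hypothetical helper)."""
    totals: Counter = Counter()
    for finding in findings:
        # Use .get() with defaults so records without these fields are skipped.
        result = finding.get("classificationDetails", {}).get("result", {})
        for category in result.get("sensitiveData", []):
            for detection in category.get("detections", []):
                totals[detection["type"]] += detection.get("count", 0)
    return totals


# Trimmed, illustrative records; real findings carry many more fields.
sample_findings = [
    {"classificationDetails": {"result": {"sensitiveData": [
        {"category": "FINANCIAL_INFORMATION",
         "detections": [{"type": "CREDIT_CARD_NUMBER", "count": 2}]},
    ]}}},
    {"classificationDetails": {"result": {"sensitiveData": [
        {"category": "FINANCIAL_INFORMATION",
         "detections": [{"type": "CREDIT_CARD_NUMBER", "count": 1}]},
        {"category": "CREDENTIALS",
         "detections": [{"type": "AWS_CREDENTIALS", "count": 4}]},
    ]}}},
]

totals = count_detections(sample_findings)
```

A tally like this is only a starting point. The same traversal could feed a dashboard or an alerting rule, depending on your organization's requirements.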