Running sensitive data discovery jobs in Amazon Macie

With Amazon Macie, you create and run sensitive data discovery jobs to automate discovery, logging, and reporting of sensitive data in Amazon Simple Storage Service (Amazon S3) buckets. A sensitive data discovery job analyzes objects in S3 buckets to determine whether the objects contain sensitive data, and it provides detailed reports of the sensitive data that it finds and the analysis that it performs.

When Macie runs a job, it uses a combination of criteria and techniques, such as machine learning and pattern matching, to analyze objects in the S3 buckets that you specify. These criteria and techniques, referred to as managed data identifiers, can detect a large and growing list of sensitive data types for many countries and regions, including multiple types of personally identifiable information (PII), personal health information (PHI), and financial data. You can optionally supplement these managed data identifiers by creating custom data identifiers for your particular data and scenarios.
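As a rough illustration of how a custom data identifier can supplement the managed data identifiers, the following Python sketch builds a request for the CreateCustomDataIdentifier operation as exposed by the boto3 macie2 client. The employee ID format, the keywords, and the helper function names are hypothetical and exist only for this example. The local check with Python's re module only approximates the regex syntax that Macie supports.

```python
import re

# Hypothetical pattern for this example: employee IDs such as "EMP-123456".
EMPLOYEE_ID_REGEX = r"EMP-\d{6}"


def build_custom_identifier_request(name, regex, keywords=None, max_distance=50):
    """Build keyword arguments for macie2.create_custom_data_identifier."""
    # Fail early on an invalid pattern. Python's syntax is only an
    # approximation of the regex syntax that Macie accepts.
    re.compile(regex)
    request = {"name": name, "regex": regex}
    if keywords:
        # Keywords must appear near the matching text; maximumMatchDistance
        # bounds that proximity in characters.
        request["keywords"] = list(keywords)
        request["maximumMatchDistance"] = max_distance
    return request


def create_custom_identifier(request):
    """Send the request to Macie. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("macie2")
    response = client.create_custom_data_identifier(**request)
    return response["customDataIdentifierId"]


request = build_custom_identifier_request(
    "employee-id", EMPLOYEE_ID_REGEX, keywords=["employee", "staff id"]
)
```

Separating the request builder from the API call makes it possible to review or unit test the definition before anything is created in an account.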

To help you meet and maintain compliance with your data security and privacy requirements, Macie provides several options for scheduling and defining the scope of each job. You can configure a job to run only once for on-demand analysis and assessment, or on a recurring basis for periodic analysis, assessment, and monitoring. In addition, you control the breadth and depth of each job's analysis. When you create a job, you start by specifying which S3 buckets you want the job to analyze: either specific buckets that you select or buckets that match specific criteria. You can then refine the scope of that analysis by choosing various options, including custom include and exclude criteria that derive from properties of S3 objects. With these scheduling and scope options, you can build and maintain a comprehensive view of the data that your organization stores in Amazon S3 and any security or compliance risks for that data.
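The scheduling and scope options described above map to parameters of the CreateClassificationJob operation. The following Python sketch, which assumes the boto3 macie2 client, shows one way those options might fit together. The account ID and bucket name are placeholders, the helper function names are hypothetical, and the exclusion of objects with a log extension is an arbitrary choice for illustration.

```python
import uuid


def build_job_request(name, account_id, buckets, recurring=False):
    """Build keyword arguments for macie2.create_classification_job."""
    request = {
        "clientToken": str(uuid.uuid4()),  # idempotency token
        "name": name,
        # ONE_TIME runs once; SCHEDULED runs on a recurring basis.
        "jobType": "SCHEDULED" if recurring else "ONE_TIME",
        "s3JobDefinition": {
            # Breadth: buckets selected explicitly for this job.
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(buckets)}
            ],
            # Depth: exclude criteria derived from S3 object properties.
            "scoping": {
                "excludes": {
                    "and": [
                        {
                            "simpleScopeTerm": {
                                "key": "OBJECT_EXTENSION",
                                "comparator": "EQ",
                                "values": ["log"],
                            }
                        }
                    ]
                }
            },
        },
    }
    if recurring:
        # Assumed schedule for this example: every Monday.
        request["scheduleFrequency"] = {"weeklySchedule": {"dayOfWeek": "MONDAY"}}
    return request


def create_job(request):
    """Send the request to Macie. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    return boto3.client("macie2").create_classification_job(**request)["jobId"]


# Placeholder account ID and bucket name.
one_time = build_job_request("audit-once", "111122223333", ["amzn-s3-demo-bucket"])
weekly = build_job_request(
    "audit-weekly", "111122223333", ["amzn-s3-demo-bucket"], recurring=True
)
```

To select buckets by criteria instead of by name, the job definition would carry a bucketCriteria element in place of bucketDefinitions. That variant is omitted here to keep the sketch short.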

In Macie, each job produces two types of records of the sensitive data that it finds and the analysis that it performs: sensitive data findings and sensitive data discovery results. A sensitive data finding is a detailed report of sensitive data that Macie found in an object. A sensitive data discovery result is a record that logs details about the analysis of an object. Macie creates a sensitive data discovery result for each object that you configure a job to analyze, including objects that don’t contain sensitive data. Each type of record adheres to a standardized schema, which can help you query, monitor, and process the records to meet your security and compliance requirements.
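Because the records follow a standardized schema, they can be processed with ordinary code. The following Python sketch summarizes one sensitive data finding by affected object and detection type. The field names reflect the finding structure as returned by the GetFindings operation, as best understood here, and should be checked against the current schema. The sample record is made up for illustration and omits most fields of a real finding.

```python
from collections import Counter


def summarize_finding(finding):
    """Return (bucket, key, Counter of detection type -> count) for a finding."""
    resources = finding.get("resourcesAffected", {})
    bucket = resources.get("s3Bucket", {}).get("name")
    key = resources.get("s3Object", {}).get("key")

    counts = Counter()
    result = finding.get("classificationDetails", {}).get("result", {})
    # Each sensitiveData entry groups detections under one category.
    for group in result.get("sensitiveData", []):
        for detection in group.get("detections", []):
            counts[detection["type"]] += detection.get("count", 0)
    return bucket, key, counts


# Made-up record for illustration only.
sample_finding = {
    "type": "SensitiveData:S3Object/Personal",
    "resourcesAffected": {
        "s3Bucket": {"name": "amzn-s3-demo-bucket"},
        "s3Object": {"key": "exports/customers.csv"},
    },
    "classificationDetails": {
        "result": {
            "sensitiveData": [
                {
                    "category": "PERSONAL_INFORMATION",
                    "detections": [
                        {"type": "USA_SOCIAL_SECURITY_NUMBER", "count": 3}
                    ],
                }
            ]
        }
    },
}

bucket, key, counts = summarize_finding(sample_finding)
```

The function reads fields defensively with get so that a record with missing optional fields yields an empty summary instead of raising an error.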