Performing automated sensitive data discovery with Amazon Macie

For broad visibility into where sensitive data might reside in your Amazon Simple Storage Service (Amazon S3) data estate, configure Amazon Macie to perform automated sensitive data discovery for your account or organization. With automated sensitive data discovery, Macie continually evaluates your S3 bucket inventory and uses sampling techniques to identify and select representative S3 objects in your buckets. Macie then retrieves and analyzes the selected objects, inspecting them for sensitive data.
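As a rough illustration of turning the feature on programmatically, the following sketch checks the current status and enables automated sensitive data discovery only if it is not already enabled. It assumes `client` is an object shaped like the boto3 `macie2` client (for example, `boto3.client("macie2")`); the helper name `enable_automated_discovery` is hypothetical and not part of Macie.

```python
def enable_automated_discovery(client):
    """Enable automated sensitive data discovery if it is not already on.

    `client` is assumed to expose the Macie operations
    get_automated_discovery_configuration and
    update_automated_discovery_configuration (as the boto3 macie2 client does).
    Returns True if an update request was sent, False otherwise.
    """
    current = client.get_automated_discovery_configuration()
    if current.get("status") == "ENABLED":
        return False  # nothing to do
    client.update_automated_discovery_configuration(status="ENABLED")
    return True
```

Checking the status first keeps the helper safe to call repeatedly, for example from a scheduled setup script.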

By default, Macie selects and analyzes objects from all of your S3 general purpose buckets. If you're the Macie administrator for an organization, this includes objects in buckets that your member accounts own. You can adjust the scope of the analyses by excluding specific buckets. For example, you might exclude buckets that typically store AWS logging data. If you're a Macie administrator, an additional option is to enable or disable automated sensitive data discovery on a case-by-case basis for individual accounts in your organization.
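The two scope adjustments described above might be scripted along the following lines. This is a minimal sketch, assuming `client` behaves like the boto3 `macie2` client; the helper names, the bucket name, and the account IDs in the usage note are hypothetical placeholders.

```python
def exclude_buckets(client, bucket_names):
    """Add buckets to the exclusion list for automated discovery.

    The classification scope ID is read from the automated discovery
    configuration, then the buckets are appended with operation "ADD".
    """
    config = client.get_automated_discovery_configuration()
    scope_id = config["classificationScopeId"]
    client.update_classification_scope(
        id=scope_id,
        s3={"excludes": {"bucketNames": list(bucket_names), "operation": "ADD"}},
    )
    return scope_id


def set_member_discovery(client, account_statuses):
    """Enable or disable automated discovery per member account.

    `account_statuses` maps an account ID to True (enable) or False (disable).
    Returns the list of per-account errors reported in the response, if any.
    """
    accounts = [
        {"accountId": account_id, "status": "ENABLED" if enabled else "DISABLED"}
        for account_id, enabled in account_statuses.items()
    ]
    response = client.batch_update_automated_discovery_accounts(accounts=accounts)
    return response.get("errors", [])
```

For example, a Macie administrator could call `exclude_buckets(client, ["example-logging-bucket"])` to skip a logging bucket, and `set_member_discovery(client, {"111122223333": False})` to turn the feature off for one member account.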

You can tailor the analyses to focus on specific types of sensitive data. By default, Macie analyzes S3 objects by using the set of managed data identifiers that we recommend for automated sensitive data discovery. To tailor the analyses, you can configure Macie to use specific managed data identifiers that Macie provides, custom data identifiers that you define, or a combination of the two. You can also refine the analyses by configuring Macie to use allow lists that you specify.
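One possible way to apply these settings in code is to update the sensitivity inspection template that backs automated discovery. The sketch below assumes `client` behaves like the boto3 `macie2` client; the helper name `tailor_inspection_template` is hypothetical, and any identifier or allow list IDs you pass are your own values.

```python
def tailor_inspection_template(
    client,
    include_managed=(),
    custom_ids=(),
    allow_list_ids=(),
    exclude_managed=(),
):
    """Update the sensitivity inspection template used for automated discovery.

    include_managed: managed data identifier IDs to add to the default set
    custom_ids:      IDs of custom data identifiers to use
    allow_list_ids:  IDs of allow lists to apply
    exclude_managed: managed data identifier IDs to leave out
    """
    config = client.get_automated_discovery_configuration()
    template_id = config["sensitivityInspectionTemplateId"]
    client.update_sensitivity_inspection_template(
        id=template_id,
        includes={
            "managedDataIdentifierIds": list(include_managed),
            "customDataIdentifierIds": list(custom_ids),
            "allowListIds": list(allow_list_ids),
        },
        excludes={"managedDataIdentifierIds": list(exclude_managed)},
    )
    return template_id
```

The sketch sends all four lists in a single request, so it is safest to pass the complete settings you want each time rather than only the changes.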

As the analysis progresses each day, Macie produces records of the sensitive data that it finds and the analysis that it performs: sensitive data findings, which report sensitive data that Macie finds in individual S3 objects, and sensitive data discovery results, which log details about the analysis of individual S3 objects. Macie also updates statistics, inventory data, and other information that it provides about your Amazon S3 data. For example, an interactive heat map on the console provides a visual representation of data sensitivity across your data estate:

[Figure: The S3 buckets heat map. It contains squares of different colors, one for each S3 bucket, grouped by account.]

These features are designed to help you evaluate data sensitivity across your Amazon S3 data estate, and drill down to investigate and assess individual accounts, buckets, and objects. They can also help you determine where to perform deeper, more immediate analysis by running sensitive data discovery jobs. Combined with information that Macie provides about the security and privacy of your Amazon S3 data, you can also use these features to identify cases where immediate remediation might be necessary. An example is a publicly accessible bucket in which Macie found sensitive data.
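The remediation check mentioned above can be approximated by combining each bucket's public access status with its sensitivity score. The following sketch assumes `client` behaves like the boto3 `macie2` client and that bucket records carry `publicAccess.effectivePermission` and `sensitivityScore` fields as returned by `describe_buckets`. The helper names and the default threshold of 51 are assumptions for illustration; choose a threshold that matches how you interpret the scores.

```python
def iter_buckets(client):
    """Yield every bucket record from describe_buckets, following nextToken."""
    token = None
    while True:
        kwargs = {"nextToken": token} if token else {}
        page = client.describe_buckets(**kwargs)
        yield from page.get("buckets", [])
        token = page.get("nextToken")
        if not token:
            break


def public_sensitive_buckets(buckets, min_score=51):
    """Return (accountId, bucketName, score) for public buckets at or above min_score.

    Results are sorted from highest to lowest sensitivity score.
    Buckets without a score are treated as -1 and therefore skipped.
    """
    flagged = []
    for bucket in buckets:
        permission = bucket.get("publicAccess", {}).get("effectivePermission")
        score = bucket.get("sensitivityScore", -1)
        if permission == "PUBLIC" and score >= min_score:
            flagged.append((bucket.get("accountId"), bucket["bucketName"], score))
    return sorted(flagged, key=lambda item: -item[2])
```

A typical call would be `public_sensitive_buckets(iter_buckets(client))`, which gives a short, prioritized list of buckets to review first.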

To configure and manage automated sensitive data discovery, you must be the Macie administrator for an organization or have a standalone Macie account.