Reviewing your S3 bucket inventory with Amazon Macie

On the Amazon Macie console, the S3 buckets page provides detailed insight into the security and privacy of your Amazon Simple Storage Service (Amazon S3) data in the current AWS Region. With this page, you can review and analyze a complete inventory of your S3 buckets in the current Region, and review detailed information and statistics for individual buckets. If you're the Macie administrator for an organization, your inventory includes details and statistics for S3 buckets that are owned by member accounts in your organization.

The S3 buckets page also indicates when Macie most recently retrieved bucket or object metadata from Amazon S3 for your account. You can find this information in the Last updated field at the top of the page. If you're the Macie administrator for an organization, this field indicates the earliest date and time when Macie retrieved the data for an account in your organization. For more information, see Data refreshes.

Note that most inventory data is limited to buckets that Macie is allowed to access for your account. If a bucket's permissions settings prevent Macie from retrieving information about the bucket or the bucket's objects, Macie can provide only a subset of information about the bucket. In that case, Macie displays a warning icon (a red triangle that contains a red exclamation point) and a message for the bucket in your bucket inventory. For the bucket's details, Macie displays only a subset of fields and data: the account ID for the AWS account that owns the bucket; the bucket's name, Amazon Resource Name (ARN), creation date, and Region; and when Macie most recently retrieved both bucket and object metadata for the bucket as part of the daily refresh cycle. To investigate the issue, review the bucket’s policy and permissions settings in Amazon S3. For example, the bucket might have a restrictive bucket policy. For more information, see Allowing Macie to access S3 buckets and objects.
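As a rough programmatic counterpart to the warning icon, the following sketch separates buckets that Macie couldn't fully evaluate from the rest of an inventory. It assumes bucket records shaped like the output of the DescribeBuckets operation, in which a restricted bucket carries an errorCode field (for example, ACCESS_DENIED). The field names and the function name are assumptions to verify against the Amazon Macie API Reference.

```python
from typing import Iterable


def split_by_access(buckets: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Return (accessible, restricted) bucket records."""
    accessible: list[dict] = []
    restricted: list[dict] = []
    for bucket in buckets:
        # Assumption: a non-empty errorCode means Macie could not retrieve
        # full metadata for the bucket or the bucket's objects.
        if bucket.get("errorCode"):
            restricted.append(bucket)
        else:
            accessible.append(bucket)
    return accessible, restricted
```

For each restricted bucket, you would then review the bucket policy and permissions settings in Amazon S3, as described in the preceding paragraph.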

If you prefer to access and query your inventory data programmatically, you can use the DescribeBuckets operation of the Amazon Macie API.
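The following is a minimal sketch of how you might page through the results of the DescribeBuckets operation. It assumes that the client argument is an AWS SDK for Python (Boto3) macie2 client, that results arrive in a buckets list, and that a nextToken value signals another page. The function name iter_buckets is hypothetical.

```python
from typing import Any, Iterator, Optional


def iter_buckets(client: Any, criteria: Optional[dict] = None) -> Iterator[dict]:
    """Yield every bucket record, following nextToken pagination."""
    kwargs: dict = {}
    if criteria:
        # Optional filter conditions, passed through unchanged.
        kwargs["criteria"] = criteria
    while True:
        page = client.describe_buckets(**kwargs)
        yield from page.get("buckets", [])
        token = page.get("nextToken")
        if not token:
            break  # No more pages to retrieve.
        kwargs["nextToken"] = token
```

With Boto3 installed and credentials configured, you might call this as iter_buckets(boto3.client("macie2")). Because the function only relies on a describe_buckets method, you can also pass a stub object when testing.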

Reviewing your S3 bucket inventory

The S3 buckets page on the Amazon Macie console provides information about your S3 buckets in the current AWS Region. On this page, a table displays summary information for each bucket in your inventory. To customize your view, you can sort and filter the table. If you choose a bucket in the table, the details panel displays additional information about the bucket. This includes details and statistics for settings and metrics that provide insight into the security and privacy of the bucket’s data. You can optionally export data from the table to a comma-separated values (CSV) file.

If automated sensitive data discovery is enabled for your account, you also have the option of reviewing your inventory by using an interactive heat map. The map provides a visual representation of data sensitivity across your Amazon S3 data estate. It captures the results of automated sensitive data discovery activities that Macie has performed for your account or organization. To learn about this map, see Visualizing data sensitivity with the S3 buckets map.

To review your S3 bucket inventory
  1. Open the Amazon Macie console at https://console.aws.amazon.com/macie/.

  2. In the navigation pane, choose S3 buckets. The S3 buckets page displays your bucket inventory.

    If the page displays an interactive map of your bucket inventory, choose table (the table view button, which contains three black horizontal lines) at the top of the page. Macie then displays the number of buckets in your inventory and a table of the buckets.

  3. At the top of the page, optionally choose refresh (the refresh button, which contains an empty, dark gray circle with an arrow) to retrieve the latest bucket metadata from Amazon S3.

    If the information icon (a blue circle with a blue, lowercase letter i in it) appears next to any bucket names, we recommend that you refresh the data. This icon indicates that a bucket was created during the past 24 hours, possibly after Macie last retrieved bucket and object metadata from Amazon S3 as part of the daily refresh cycle.

  4. On the S3 buckets page, use the table to review a subset of information about each bucket in your inventory:

    • Sensitivity – The bucket's current sensitivity score. This column appears only if automated sensitive data discovery is enabled for your account. For information about the range of sensitivity scores that Macie defines, see Sensitivity scoring for S3 buckets.

    • Bucket – The name of the bucket.

    • Account – The account ID for the AWS account that owns the bucket.

    • Classifiable objects – The total number of objects that Macie can analyze to detect sensitive data in the bucket.

    • Classifiable size – The total storage size of all the objects that Macie can analyze to detect sensitive data in the bucket.

      Note that this value doesn’t reflect the actual size of any compressed objects after they're decompressed. Also, if versioning is enabled for the bucket, this value is based on the storage size of the latest version of each object in the bucket.

    • Monitored by job – Whether any sensitive data discovery jobs are configured to periodically analyze objects in the bucket on a daily, weekly, or monthly basis.

      If the value for this field is Yes, the bucket is explicitly included in a periodic job or the bucket matched the criteria for a periodic job within the past 24 hours. In addition, the status of at least one of those jobs is not Cancelled. Macie updates this data on a daily basis.

    • Latest job run – If any one-time or periodic sensitive data discovery jobs are configured to analyze objects in the bucket, the value for this field indicates the most recent date and time when one of those jobs started to run. Otherwise, this field is empty.

    In the preceding data, objects are classifiable if they use a supported Amazon S3 storage class and they have a file name extension for a supported file or storage format. You can detect sensitive data in the objects by using Macie. For more information, see Supported storage classes and formats.

  5. To analyze your inventory by using the table, do any of the following:

    • To sort the table by a specific field, click the column heading for the field. To change the sort order, click the column heading again.

    • To filter the table and display only those buckets that have a specific value for a field, place your cursor in the filter box, and then add a filter condition for the field. To further refine the results, add filter conditions for additional fields. For more information, see Filtering your S3 bucket inventory.

  6. To review details and statistics for a particular bucket, choose the bucket's name in the table, and then refer to the details panel.

    Tip

    You can pivot and drill down on many of the fields in the bucket details panel. To show buckets that have the same value for a field, choose the magnifying glass with a plus sign in that field. To show buckets that have other values for a field, choose the magnifying glass with a minus sign in that field.

  7. To export data from the table to a CSV file, select the check box for each row that you want to export, or select the check box in the selection column heading to select all rows. Then choose Export to CSV at the top of the page. You can export up to 50,000 rows from the table.
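If you retrieve inventory data programmatically instead, you can produce a similar file yourself. The following sketch writes a few of the table's columns to CSV text. It assumes bucket records that use DescribeBuckets-style field names. The column selection and the buckets_to_csv function are hypothetical, and the output isn't meant to match the layout of the console's export.

```python
import csv
import io
from itertools import islice
from typing import Iterable

# (column heading, assumed field name in each bucket record)
COLUMNS = [
    ("Bucket", "bucketName"),
    ("Account", "accountId"),
    ("Classifiable objects", "classifiableObjectCount"),
    ("Classifiable size", "classifiableSizeInBytes"),
]


def buckets_to_csv(buckets: Iterable[dict], limit: int = 50_000) -> str:
    """Return CSV text with a header row and at most `limit` data rows."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([heading for heading, _ in COLUMNS])
    for bucket in islice(buckets, limit):
        # Missing fields are written as empty cells.
        writer.writerow([bucket.get(field, "") for _, field in COLUMNS])
    return out.getvalue()
```

The default limit mirrors the 50,000-row limit of the console export. It's a choice in this sketch, not a restriction of the API.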

Reviewing the details of S3 buckets

On the Amazon Macie console, you can use the details panel on the S3 buckets page to review statistics and other information about individual S3 buckets in your bucket inventory. This includes details and statistics for settings and metrics that provide insight into the security and privacy of a bucket’s data.

For example, you can review breakdowns of an S3 bucket’s public access settings, and determine whether a bucket is configured to replicate objects or is shared with other AWS accounts. You can also determine whether any sensitive data discovery jobs are configured to inspect the bucket for sensitive data. If there are, you can access details about the job that ran most recently, and optionally display any findings that the job produced.

If automated sensitive data discovery is enabled for your account, you can also use the details panel to review sensitive data discovery statistics and other information about individual S3 buckets. The panel captures the results of automated sensitive data discovery activities that Macie has performed thus far for a bucket. To learn about these details, see Reviewing data sensitivity details for individual S3 buckets.

To review the details of an S3 bucket
  1. Open the Amazon Macie console at https://console.aws.amazon.com/macie/.

  2. In the navigation pane, choose S3 buckets. The S3 buckets page displays your bucket inventory.

  3. At the top of the page, optionally choose refresh (the refresh button, which contains an empty, dark gray circle with an arrow) to retrieve the latest bucket metadata from Amazon S3.

  4. In the S3 buckets table or map, choose the bucket whose details you want to review. The details panel displays statistics and other information about the bucket.

In the details panel, bucket statistics and information are organized into the following primary sections:

Overview | Object statistics | Server-side encryption | Sensitive data discovery | Public access | Replication | Tags

As you review the information in each section, you can optionally pivot and drill down on certain fields. To show buckets that have the same value for a field, choose the magnifying glass with a plus sign in that field. To show buckets that have other values for a field, choose the magnifying glass with a minus sign in that field.

Overview

This section provides general information about the bucket, such as the bucket’s name, when the bucket was created, and the account ID for the AWS account that owns the bucket. Of special note, the Last updated field indicates when Macie most recently retrieved metadata from Amazon S3 for the bucket or the bucket’s objects.

The Shared access field indicates whether the bucket is shared with another AWS account, an Amazon CloudFront origin access identity (OAI), or a CloudFront origin access control (OAC):

  • External – The bucket is shared with one or more of the following or any combination of the following: a CloudFront OAI, a CloudFront OAC, or an account that's external to (not part of) your organization.

  • Internal – The bucket is shared with one or more accounts that are internal to (part of) your organization. It isn't shared with a CloudFront OAI or OAC.

  • Not shared – The bucket isn't shared with another account, a CloudFront OAI, or a CloudFront OAC.

  • Unknown – Macie wasn't able to evaluate the shared access settings for the bucket.

To determine whether a bucket is shared with another AWS account, Macie analyzes the bucket policy and access control list (ACL) for the bucket. The analysis is limited to bucket-level settings. It doesn’t reflect any object-level settings for sharing specific objects in the bucket. In addition, an organization is defined as a set of Macie accounts that are centrally managed as a group of related accounts through AWS Organizations or by Macie invitation. To learn about Amazon S3 options for sharing buckets, see Identity and access management in Amazon S3 in the Amazon Simple Storage Service User Guide.
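To make the category definitions concrete, the following sketch encodes them as a small function. It's only an illustration of the definitions above, not a description of how Macie evaluates bucket policies and ACLs. The function name and its inputs are hypothetical, and the Unknown value isn't modeled because it depends on whether Macie could evaluate the settings.

```python
def classify_shared_access(
    shared_account_ids: set[str],
    org_account_ids: set[str],
    shared_with_cloudfront: bool = False,
) -> str:
    """Map sharing facts about a bucket to a Shared access value."""
    # Accounts that the bucket is shared with but that aren't in the organization.
    external_accounts = set(shared_account_ids) - set(org_account_ids)
    if shared_with_cloudfront or external_accounts:
        return "External"
    if shared_account_ids:
        return "Internal"
    return "Not shared"
```

Note that sharing with a CloudFront OAI or OAC results in External even when every account involved is part of your organization.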

Note

In certain cases, Macie might incorrectly indicate that a bucket is shared with an AWS account that's external to (not part of) your organization. This can occur if Macie isn’t able to fully evaluate the relationship between the Principal element in the bucket’s policy and certain AWS global condition context keys or Amazon S3 condition keys in the Condition element of the policy. The applicable condition keys are: aws:PrincipalAccount, aws:PrincipalArn, aws:PrincipalOrgID, aws:PrincipalOrgPaths, aws:PrincipalTag, aws:PrincipalType, aws:SourceAccount, aws:SourceArn, aws:userid, s3:DataAccessPointAccount, and s3:DataAccessPointArn. We recommend that you review the bucket’s policy to determine whether this access is intended and safe.
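When you review a bucket's policy for this reason, it can help to list the statements that use any of these condition keys. The following sketch does that for a policy document supplied as JSON text. The function name is hypothetical, and the check only locates statements to review. It doesn't determine whether the access is intended or safe.

```python
import json

# Condition keys named in the preceding note, lowercased for comparison.
REVIEW_KEYS = {
    "aws:principalaccount", "aws:principalarn", "aws:principalorgid",
    "aws:principalorgpaths", "aws:principaltag", "aws:principaltype",
    "aws:sourceaccount", "aws:sourcearn", "aws:userid",
    "s3:dataaccesspointaccount", "s3:dataaccesspointarn",
}


def statements_to_review(policy_json: str) -> list[dict]:
    """Return policy statements whose Condition uses a listed key."""
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]  # A single statement may not be in a list.
    flagged = []
    for statement in statements:
        used = set()
        for keys in statement.get("Condition", {}).values():
            for key in keys:
                # Drop a suffix such as "/team" from aws:PrincipalTag/team.
                used.add(key.lower().split("/")[0])
        if used & REVIEW_KEYS:
            flagged.append(statement)
    return flagged
```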

To determine whether a bucket is shared with a CloudFront OAI or OAC, Macie analyzes the bucket policy for the bucket. A CloudFront OAI or OAC allows users to access a bucket's objects through one or more specified CloudFront distributions. To learn about CloudFront OAIs and OACs, see Restricting access to an Amazon S3 origin in the Amazon CloudFront Developer Guide.

The Overview section of the panel also includes the Latest automated discovery run field. If automated sensitive data discovery is enabled for your account, this field indicates when Macie most recently analyzed objects in the bucket while performing automated discovery for your account. If automated sensitive data discovery is disabled for your account, a dash (–) appears in this field.

Object statistics

This section provides information about the objects in the bucket, starting with the total number of objects in the bucket (Total count), the total storage size of all those objects (Total storage size), and the total storage size of all the objects that are compressed (.gz, .gzip, or .zip) files (Total compressed size). Additional statistics in this section can help you assess how much data Macie can analyze to detect sensitive data in the bucket.

If you recently created the bucket or made significant changes to the bucket's objects during the past 24 hours, optionally choose refresh (the refresh button, which contains an empty, dark gray circle with an arrow) to retrieve the latest metadata for the bucket's objects. Macie displays the information icon (a blue circle with a blue, lowercase letter i in it) to help you determine whether this might be the case. The refresh option is available if a bucket contains 30,000 or fewer objects.

As you review the statistics in this section, keep the following in mind:

  • If versioning is enabled for the bucket, size values are based on the storage size of the latest version of each object in the bucket.

  • If the bucket contains compressed objects, size values don't reflect the actual size of those objects after they're decompressed.

  • If you refresh object metadata for a bucket, Macie temporarily reports Unknown for encryption statistics that apply to the objects. Macie will re-evaluate and update the data for these statistics when it performs the next daily refresh of bucket and object metadata, which is within 24 hours.

  • By default, object counts and size values include data for any object parts that the bucket contains as a result of incomplete multipart uploads. If you refresh object metadata for a bucket, Macie excludes data for object parts from the recalculated values. When Macie performs the next daily refresh of bucket and object metadata (within 24 hours), Macie recalculates and updates the values for these statistics and includes data for object parts in the values again.

    Note that Macie can't analyze object parts to detect sensitive data. Amazon S3 must first finish assembling the parts into one or more objects for Macie to analyze. For information about multipart uploads and object parts, including how to delete parts automatically with lifecycle rules, see Uploading and copying objects using multipart upload in the Amazon Simple Storage Service User Guide. To identify buckets that contain object parts, you can refer to incomplete multipart upload metrics in Amazon S3 Storage Lens. For more information, see Assessing your storage activity and usage in the Amazon Simple Storage Service User Guide.

Object statistics are organized as follows.

Classifiable objects

This section indicates the total number of objects that Macie can analyze to detect sensitive data and the total storage size of those objects. These objects use a supported Amazon S3 storage class and have a file name extension for a supported file or storage format. You can detect sensitive data in the objects by using Macie. For more information, see Supported storage classes and formats.

Unclassifiable objects

This section indicates the total number of objects that Macie can’t analyze to detect sensitive data and the total storage size of those objects. These objects don’t use a supported Amazon S3 storage class or they don’t have a file name extension for a supported file or storage format.

Unclassifiable objects: Storage class

This section provides a breakdown of the number and storage size of the objects that Macie can’t analyze because the objects don’t use a supported Amazon S3 storage class.

Unclassifiable objects: File type

This section provides a breakdown of the number and storage size of the objects that Macie can’t analyze because the objects don’t have a file name extension for a supported file or storage format.

Objects by encryption type

This section provides a breakdown of the number of objects that use each type of encryption that Amazon S3 supports:

  • Customer provided – The number of objects that are encrypted with a customer-provided key. These objects use SSE-C encryption.

  • AWS KMS managed – The number of objects that are encrypted with an AWS KMS key, either an AWS managed key or a customer managed key. These objects use DSSE-KMS or SSE-KMS encryption.

  • Amazon S3 managed – The number of objects that are encrypted with an Amazon S3 managed key. These objects use SSE-S3 encryption.

  • No encryption – The number of objects that aren’t encrypted or use client-side encryption. (If an object is encrypted using client-side encryption, Macie can't access and report encryption data for the object.)

  • Unknown – The number of objects that Macie doesn't have current encryption metadata for. This typically occurs if you recently chose to manually refresh the metadata for the bucket's objects. Macie will update the encryption statistics when it performs the next daily refresh of bucket and object metadata, which is within 24 hours.

For information about each supported encryption type, see Protecting data with encryption in the Amazon Simple Storage Service User Guide.
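If you work with this breakdown programmatically, you might convert the counts to shares of the total. The following sketch assumes a counts dictionary with the field names customerManaged, kmsManaged, s3Managed, unencrypted, and unknown, as in the objectCountByEncryptionType data that DescribeBuckets is assumed to return. Treat the mapping of those names to the console labels as an assumption.

```python
# Assumed API field name -> label shown in the details panel.
LABELS = {
    "customerManaged": "Customer provided",
    "kmsManaged": "AWS KMS managed",
    "s3Managed": "Amazon S3 managed",
    "unencrypted": "No encryption",
    "unknown": "Unknown",
}


def encryption_breakdown(counts: dict) -> dict[str, float]:
    """Return each encryption type's share of the bucket's objects."""
    total = sum(counts.get(field, 0) for field in LABELS)
    return {
        # Avoid dividing by zero for an empty bucket.
        label: (counts.get(field, 0) / total if total else 0.0)
        for field, label in LABELS.items()
    }
```

A large Unknown share typically means that object metadata was refreshed manually and the next daily refresh hasn't run yet.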

Server-side encryption

This section provides insight into the server-side encryption settings for the bucket.

The Encryption required by bucket policy field indicates whether the bucket's policy requires server-side encryption of objects when objects are added to the bucket:

  • No – The bucket doesn't have a bucket policy or the bucket's policy doesn't require server-side encryption of new objects. If a bucket policy exists, it doesn't require PutObject requests to include a valid server-side encryption header.

  • Yes – The bucket's policy requires server-side encryption of new objects. PutObject requests for the bucket must include a valid server-side encryption header. Otherwise, Amazon S3 denies the request.

  • Unknown – Macie wasn't able to evaluate the bucket's policy to determine whether it requires server-side encryption of new objects.

For this assessment, valid server-side encryption headers are: x-amz-server-side-encryption with a value of AES256 or aws:kms, and x-amz-server-side-encryption-customer-algorithm with a value of AES256. For information about using bucket policies to require server-side encryption of new objects, see Protecting data with server-side encryption in the Amazon Simple Storage Service User Guide.
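The header and value pairs above can be expressed as a simple check. The following sketch tests whether a set of request headers includes one of them. It illustrates only the list of valid headers. It doesn't reproduce how Macie evaluates a bucket policy, and the function name is hypothetical.

```python
# Header names (lowercase) and the values treated as valid in this assessment.
VALID_SSE_HEADERS = {
    "x-amz-server-side-encryption": {"AES256", "aws:kms"},
    "x-amz-server-side-encryption-customer-algorithm": {"AES256"},
}


def has_valid_sse_header(headers: dict) -> bool:
    """Return True if the headers include a valid server-side encryption header."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    normalized = {name.lower(): value for name, value in headers.items()}
    return any(
        normalized.get(name) in allowed
        for name, allowed in VALID_SSE_HEADERS.items()
    )
```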

The Default encryption field indicates which server-side encryption algorithm the bucket is configured to apply by default to objects that are added to the bucket:

  • AES256 – The bucket's default encryption settings are configured to encrypt new objects with an Amazon S3 managed key. New objects are encrypted automatically using SSE-S3 encryption.

  • aws:kms – The bucket's default encryption settings are configured to encrypt new objects with an AWS KMS key, either an AWS managed key or a customer managed key. New objects are encrypted automatically using SSE-KMS encryption. The AWS KMS key field shows the Amazon Resource Name (ARN) or unique identifier (key ID) for the key that's used.

  • aws:kms:dsse – The bucket's default encryption settings are configured to encrypt new objects with an AWS KMS key, either an AWS managed key or a customer managed key. New objects are encrypted automatically using DSSE-KMS encryption. The AWS KMS key field shows the ARN or key ID for the key that's used.

  • None – The bucket's default encryption settings don't specify server-side encryption behavior for new objects.

Starting January 5, 2023, Amazon S3 automatically applies server-side encryption with Amazon S3 managed keys (SSE-S3) as the base level of encryption for objects that are added to buckets. You can optionally configure a bucket's default encryption settings to instead use server-side encryption with an AWS KMS key (SSE-KMS) or dual-layer server-side encryption with an AWS KMS key (DSSE-KMS). For information about default encryption settings and options, see Setting default server-side encryption behavior for S3 buckets in the Amazon Simple Storage Service User Guide.

Sensitive data discovery

This section indicates whether any sensitive data discovery jobs are configured to periodically analyze objects in the bucket on a daily, weekly, or monthly basis. If the value for the Actively monitored by job field is Yes, the bucket is explicitly included in a periodic job or the bucket matched the criteria for a periodic job within the past 24 hours. In addition, the status of at least one of those jobs is not Cancelled. Macie updates this data on a daily basis.

If any type of sensitive data discovery job (either a periodic job or a one-time job) is configured to inspect the bucket, the Latest job field provides the unique identifier for the job that most recently started to run. The Latest job run field indicates when that job started to run.

Tip

To display all the sensitive data findings that the job produced, choose the link in the Latest job field. In the job details panel that appears, choose Show results at the top of the panel, and then choose Show findings.

Public access

This section indicates whether the bucket is publicly accessible. It also provides a breakdown of the various account- and bucket-level settings that determine whether this is the case. The Effective permission field indicates the cumulative result of these settings:

  • Not public – The bucket isn’t publicly accessible.

  • Public – The bucket is publicly accessible.

  • Unknown – Macie wasn’t able to evaluate all the public access settings for the bucket.

Note that this data is limited to account- and bucket-level settings. It doesn’t reflect object-level settings that enable public access to specific objects in a bucket.

To learn about Amazon S3 settings for managing public access to buckets and bucket data, see Identity and access management in Amazon S3 and Blocking public access to your Amazon S3 storage in the Amazon Simple Storage Service User Guide.
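To review this setting across many buckets at once, you might group bucket names by their effective permission. The following sketch assumes bucket records in which publicAccess.effectivePermission holds PUBLIC, NOT_PUBLIC, or UNKNOWN, as in the data that DescribeBuckets is assumed to return. The function name is hypothetical.

```python
from typing import Iterable

# Assumed API value -> label shown in the Effective permission field.
PERMISSION_LABELS = {
    "PUBLIC": "Public",
    "NOT_PUBLIC": "Not public",
    "UNKNOWN": "Unknown",
}


def group_by_effective_permission(buckets: Iterable[dict]) -> dict[str, list[str]]:
    """Group bucket names under Public, Not public, or Unknown."""
    groups: dict[str, list[str]] = {label: [] for label in PERMISSION_LABELS.values()}
    for bucket in buckets:
        value = bucket.get("publicAccess", {}).get("effectivePermission", "UNKNOWN")
        # Treat missing or unrecognized values as Unknown.
        label = PERMISSION_LABELS.get(value, "Unknown")
        groups[label].append(bucket.get("bucketName", ""))
    return groups
```

As with the console, a result of Not public reflects only account- and bucket-level settings. Individual objects might still be publicly accessible.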

Replication

In this section, the Replicated field indicates whether the bucket is configured to replicate objects to other buckets. If the value for this field is Yes, one or more replication rules are configured and enabled for the bucket. This section then also lists the account ID for each AWS account that owns a destination bucket.

The Replicated externally field indicates whether the bucket is configured to replicate objects to buckets for AWS accounts that are external to (not part of) your organization. An organization is a set of Macie accounts that are centrally managed as a group of related accounts through AWS Organizations or by Macie invitation. If the value for this field is Yes, a replication rule is configured and enabled for the bucket, and the rule is configured to replicate objects to a bucket that's owned by an external AWS account.

Note

Under certain conditions, Macie might incorrectly indicate that a bucket is configured to replicate objects to a bucket that's owned by an external AWS account. This can occur if the destination bucket was created in a different AWS Region during the preceding 24 hours, after Macie retrieved bucket and object metadata from Amazon S3 as part of the daily refresh cycle.

To investigate the issue by using Macie, choose refresh (the refresh button, which contains an empty, dark gray circle with an arrow) to retrieve the latest bucket metadata from Amazon S3. Then review the list of account IDs in this section. For deeper investigation, use Amazon S3 to review the replication rules for the bucket.

To learn about Amazon S3 options and settings for replicating bucket objects, see Replicating objects in the Amazon Simple Storage Service User Guide.
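If you have a list of the account IDs in your organization, you can cross-check the destination accounts yourself. The following sketch assumes a bucket record with a replicationDetails object that contains a replicated flag and a replicationAccounts list, as in the data that DescribeBuckets is assumed to return. The function name and the org_account_ids input are hypothetical.

```python
def external_replication_accounts(bucket: dict, org_account_ids: set[str]) -> list[str]:
    """Return destination account IDs that aren't part of the organization."""
    details = bucket.get("replicationDetails", {})
    if not details.get("replicated"):
        return []  # No enabled replication rules for the bucket.
    destinations = set(details.get("replicationAccounts", []))
    return sorted(destinations - set(org_account_ids))
```

An empty result for a bucket that Macie reports as replicated externally might point to the timing issue described in the preceding note.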

Tags

If tags are associated with the bucket, this section appears in the panel and lists those tags. Tags are labels that you can define and assign to certain types of AWS resources, including S3 buckets. Each tag consists of a required tag key and an optional tag value.

To learn about tagging buckets, see Using cost allocation S3 bucket tags in the Amazon Simple Storage Service User Guide.
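When you process inventory data programmatically, tags are often easier to work with as a dictionary. The following sketch assumes that each bucket record has a tags list of objects with key and value fields, as in the data that DescribeBuckets is assumed to return. Both function names are hypothetical.

```python
from typing import Iterable, Optional


def tags_as_dict(bucket: dict) -> dict[str, str]:
    """Convert a bucket's tags list to a {key: value} dictionary."""
    # A tag value is optional, so default to an empty string.
    return {tag["key"]: tag.get("value", "") for tag in bucket.get("tags", [])}


def buckets_with_tag(
    buckets: Iterable[dict], key: str, value: Optional[str] = None
) -> list[str]:
    """Return names of buckets that have the tag key and, if given, the value."""
    matches = []
    for bucket in buckets:
        tags = tags_as_dict(bucket)
        if key in tags and (value is None or tags[key] == value):
            matches.append(bucket.get("bucketName", ""))
    return matches
```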