Filtering your S3 bucket inventory with Amazon Macie

To identify and focus on buckets that have specific characteristics, you can filter your S3 bucket inventory on the Amazon Macie console and in queries that you submit programmatically using the Amazon Macie API. When you create a filter, you use specific bucket attributes to define criteria for including or excluding buckets from a view or from query results. A bucket attribute is a field that stores specific metadata for a bucket.

In Macie, a filter consists of one or more conditions. Each condition, also referred to as a criterion, consists of three parts:

  • An attribute-based field, such as Bucket name, Tag key, or Defined in job.

  • An operator, such as equals or not equals.

  • One or more values. The type and number of values depend on the field and operator that you choose.

How you define and apply filter conditions depends on whether you use the Amazon Macie console or the Amazon Macie API.

Filtering your inventory on the Amazon Macie console

If you use the Amazon Macie console to filter your bucket inventory, Macie provides options to help you choose fields, operators, and values for individual conditions. You access these options by using the filter bar on the S3 buckets page, as shown in the following image.


Image: The filter bar above the table on the S3 buckets page.

When you place your cursor in the filter bar, Macie displays a list of fields that you can use in filter conditions. The fields are organized by logical category. For example, the Common fields category includes fields that store general information about a bucket, and the Public access category includes fields that store data about the various types of public access settings that can apply to a bucket. The fields are sorted alphabetically within each category.

To add a condition, start by choosing a field from the list. To find a field, browse the complete list, or enter part of the field's name to narrow the list of fields.

Depending on the field that you choose, Macie displays different options. The options reflect the type and nature of the field that you choose. For example, if you choose the Defined in job field, Macie displays a list of values to choose from. If you choose the Bucket name field, Macie displays a text box in which you can enter a bucket name. Whichever field you choose, Macie guides you through the steps to add a condition that includes the required settings for the field.

After you add a condition, Macie applies the condition's criteria and adds the condition to a filter box in the filter bar, as shown in the following image.


Image: The filter bar, above the table on the S3 buckets page, with a filter box for a condition.

In this example, the condition is configured to include all buckets that are publicly accessible, and to exclude all other buckets. It returns buckets where the value for the Effective permission field equals Public.

As you add more conditions, Macie applies their criteria and adds them to the filter bar. If you add multiple conditions, Macie uses AND logic to join the conditions and evaluate the filter criteria. This means that a bucket meets the filter criteria only if it matches all the conditions in the filter.

You can refer to the filter bar at any time to see which criteria you've applied.
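The AND logic described above can be illustrated with a small local sketch. It doesn't call Macie. The bucket records and the field names `effectivePermission` and `unencrypted` are simplified, hypothetical stand-ins for bucket metadata, and `matches_filter` is an illustrative helper, not part of any AWS tool.

```python
from typing import Any, Callable, Dict, List

# A condition is modeled here as a predicate over one bucket's metadata.
Condition = Callable[[Dict[str, Any]], bool]


def matches_filter(bucket: Dict[str, Any], conditions: List[Condition]) -> bool:
    """Return True only if the bucket satisfies every condition (AND logic)."""
    return all(condition(bucket) for condition in conditions)


# Hypothetical, simplified bucket records (not real Macie output).
buckets = [
    {"name": "bucket-a", "effectivePermission": "PUBLIC", "unencrypted": 4},
    {"name": "bucket-b", "effectivePermission": "PUBLIC", "unencrypted": 0},
    {"name": "bucket-c", "effectivePermission": "NOT_PUBLIC", "unencrypted": 9},
]

conditions = [
    lambda b: b["effectivePermission"] == "PUBLIC",  # publicly accessible
    lambda b: b["unencrypted"] >= 1,                 # has unencrypted objects
]

matching = [b["name"] for b in buckets if matches_filter(b, conditions)]
print(matching)
```

Only `bucket-a` satisfies both conditions, so it is the only bucket kept. Adding a condition can only narrow the result, never widen it.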

To filter your inventory by using the console

  1. Open the Macie console at https://console.aws.amazon.com/macie/.

  2. In the navigation pane, choose S3 buckets. The S3 buckets page opens and displays the number of buckets in your inventory and a table of the buckets.

  3. To retrieve the latest bucket metadata from Amazon S3, choose refresh (the button that shows an empty, dark gray circle with an arrow) at the top of the page.

  4. Place your cursor in the filter bar, and then choose the field to use for the condition.

  5. Choose or enter the appropriate type of value for the field, keeping the following tips in mind.

    Dates, times, and time ranges

    For dates and times, use the From and To boxes to define an inclusive time range:

    • To define a fixed time range, use the From and To boxes to specify the first date and time and the last date and time in the range, respectively.

    • To define a relative time range that starts at a certain date and time and ends at the current time, enter the start date and time in the From boxes, and delete any text in the To boxes.

    • To define a relative time range that ends at a certain date and time, enter the end date and time in the To boxes, and delete any text in the From boxes.

    Note that time values use 24-hour notation. If you use the date picker to choose dates, you can refine the values by entering text directly in the From and To boxes.

    Numbers and numeric ranges

    For numeric values, use the From and To boxes to enter integers that define an inclusive numeric range:

    • To define a fixed numeric range, use the From and To boxes to specify the lowest and highest numbers in the range, respectively.

    • To define a fixed numeric range that's limited to one specific value, enter the value in both the From and To boxes. For example, to include only those buckets that contain exactly 15 objects, enter 15 in the From and To boxes.

    • To define a relative numeric range that starts at a certain number, enter the number in the From box, and don’t enter any text in the To box.

    • To define a relative numeric range that ends at a certain number, enter the number in the To box, and don’t enter any text in the From box.

    Text (string) values

    For this type of value, enter a complete, valid value for the field. Values are case sensitive.

    Note that you can’t use a partial value or wildcard characters in this type of value. The only exception is the Bucket name field. For that field, you can specify a prefix instead of a complete bucket name. For example, to find all S3 buckets whose names begin with my-S3, enter my-S3 as the filter value for the Bucket name field. If you enter any other value, such as My-s3 or my*, Macie won’t return the buckets.

  6. When you finish adding a value for the field, choose Apply. Macie applies the filter criteria and adds the condition to a filter box in the filter bar.

    Tip

    For many fields, you can change a condition's operator from equals to not equals by choosing the equals icon (a solid, dark gray circle) in the filter box. If you do this, Macie changes the operator to not equals and displays the not equals icon (an empty, dark gray circle with a backslash) in the filter box. To switch to the equals operator again, choose the not equals icon.

  7. Repeat steps 4 through 6 for each additional condition that you want to add.

  8. To remove a condition, choose the remove condition icon (a circle with an X in it) in the filter box for the condition.

  9. To change a condition, remove the condition by choosing the remove condition icon (a circle with an X in it) in the filter box for the condition. Then repeat steps 4 through 6 to add a condition with the correct settings.
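The value rules from step 5 can be summarized in a short sketch: ranges are inclusive, either end of a range can be left open, and text values are case sensitive with prefix matching allowed only for bucket names. The helpers `in_inclusive_range` and `bucket_name_matches` are hypothetical names that mimic these rules locally. They are not Macie code.

```python
from typing import Optional


def in_inclusive_range(value: int,
                       low: Optional[int] = None,
                       high: Optional[int] = None) -> bool:
    """Mimic the From/To boxes: both ends inclusive, an empty box means no bound."""
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def bucket_name_matches(name: str, prefix: str) -> bool:
    """Case-sensitive prefix match; wildcard characters are treated literally."""
    return name.startswith(prefix)


# Exactly 15 objects: the same number in From and To.
print(in_inclusive_range(15, low=15, high=15))
# Open-ended range that starts at 1: From is set, To is empty.
print(in_inclusive_range(100, low=1))
# Prefix matching for a bucket name, including a wrong-case and a wildcard attempt.
print(bucket_name_matches("my-S3-logs", "my-S3"))
print(bucket_name_matches("my-S3-logs", "My-s3"))
print(bucket_name_matches("my-S3-logs", "my*"))
```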

Filtering your inventory programmatically with the Amazon Macie API

To filter your bucket inventory programmatically, specify filter criteria in queries that you submit using the DescribeBuckets operation of the Amazon Macie API. This operation returns an array of objects. Each object contains statistical data and other information about a bucket that meets the filter criteria.

To specify filter criteria in a query, include a map of filter conditions in your request. For each condition, specify a field, an operator, and one or more values for the field. The type and number of values depend on the field and operator that you choose. For information about the fields, operators, and types of values that you can use in a condition, see Amazon S3 Data Source in the Amazon Macie API Reference.
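As a rough illustration of that map, the following sketch assembles conditions into the nested shape used by the CLI examples in this topic: field name, then operator, then value. The `Condition` type and `build_criteria` helper are hypothetical conveniences. Only the field and operator names are taken from those examples.

```python
import json
from typing import Any, Dict, NamedTuple


class Condition(NamedTuple):
    field: str     # JSON name of the field, e.g. "publicAccess.effectivePermission"
    operator: str  # operator name, e.g. "eq", "gte", or "prefix"
    value: Any     # a list for "eq", a number for "gte", a string for "prefix"


def build_criteria(*conditions: Condition) -> Dict[str, Dict[str, Any]]:
    """Fold conditions into a {field: {operator: value}} map."""
    criteria: Dict[str, Dict[str, Any]] = {}
    for condition in conditions:
        criteria.setdefault(condition.field, {})[condition.operator] = condition.value
    return criteria


criteria = build_criteria(
    Condition("publicAccess.effectivePermission", "eq", ["PUBLIC"]),
    Condition("objectCountByEncryptionType.unencrypted", "gte", 1),
)
print(json.dumps(criteria, indent=2))
```

Keeping conditions as small records until the last step makes it easier to add or drop a condition without editing a JSON string by hand.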

The following examples show you how to specify filter criteria in queries that you submit using the AWS Command Line Interface (AWS CLI). You can also do this by sending HTTPS requests directly to Macie, or by using a current version of another AWS command line tool or an AWS SDK. For information about AWS tools and SDKs, see Tools to Build on AWS.

The examples use the describe-buckets command. If an example runs successfully, Macie returns a buckets array. The array contains an object for each bucket that’s in the current AWS Region and meets the filter criteria. The following is an example of this output.

In this example, the buckets array provides details about two buckets that met the filter criteria specified in a query.

{
  "buckets": [
    {
      "accountId": "123456789012",
      "bucketArn": "arn:aws:s3:::DOC-EXAMPLE-BUCKET1",
      "bucketCreatedAt": "2020-05-18T19:54:00+00:00",
      "bucketName": "DOC-EXAMPLE-BUCKET1",
      "allowsUnencryptedObjectUploads": "FALSE",
      "classifiableObjectCount": 13,
      "classifiableSizeInBytes": 1592088,
      "jobDetails": {
        "isDefinedInJob": "TRUE",
        "isMonitoredByJob": "TRUE",
        "lastJobId": "08c81dc4a2f3377fae45c9ddaexample",
        "lastJobRunTime": "2021-04-26T14:55:30.270000+00:00"
      },
      "lastUpdated": "2021-04-30T07:33:06.337000+00:00",
      "objectCount": 13,
      "objectCountByEncryptionType": {
        "customerManaged": 0,
        "kmsManaged": 2,
        "s3Managed": 7,
        "unencrypted": 4,
        "unknown": 0
      },
      "publicAccess": {
        "effectivePermission": "NOT_PUBLIC",
        "permissionConfiguration": {
          "accountLevelPermissions": {
            "blockPublicAccess": {
              "blockPublicAcls": true,
              "blockPublicPolicy": true,
              "ignorePublicAcls": true,
              "restrictPublicBuckets": true
            }
          },
          "bucketLevelPermissions": {
            "accessControlList": {
              "allowsPublicReadAccess": false,
              "allowsPublicWriteAccess": false
            },
            "blockPublicAccess": {
              "blockPublicAcls": true,
              "blockPublicPolicy": true,
              "ignorePublicAcls": true,
              "restrictPublicBuckets": true
            },
            "bucketPolicy": {
              "allowsPublicReadAccess": false,
              "allowsPublicWriteAccess": false
            }
          }
        }
      },
      "region": "us-east-1",
      "replicationDetails": {
        "replicated": false,
        "replicatedExternally": false,
        "replicationAccounts": []
      },
      "serverSideEncryption": {
        "kmsMasterKeyId": null,
        "type": "NONE"
      },
      "sharedAccess": "NOT_SHARED",
      "sizeInBytes": 4549746,
      "sizeInBytesCompressed": 0,
      "tags": [
        { "key": "Division", "value": "HR" },
        { "key": "Team", "value": "Recruiting" }
      ],
      "unclassifiableObjectCount": {
        "fileType": 0,
        "storageClass": 0,
        "total": 0
      },
      "unclassifiableObjectSizeInBytes": {
        "fileType": 0,
        "storageClass": 0,
        "total": 0
      },
      "versioning": false
    },
    {
      "accountId": "123456789012",
      "bucketArn": "arn:aws:s3:::DOC-EXAMPLE-BUCKET2",
      "bucketCreatedAt": "2020-11-25T18:24:38+00:00",
      "bucketName": "DOC-EXAMPLE-BUCKET2",
      "allowsUnencryptedObjectUploads": "TRUE",
      "classifiableObjectCount": 8,
      "classifiableSizeInBytes": 133810,
      "jobDetails": {
        "isDefinedInJob": "TRUE",
        "isMonitoredByJob": "FALSE",
        "lastJobId": "188d4f6044d621771ef7d65f2example",
        "lastJobRunTime": "2021-04-09T19:37:11.511000+00:00"
      },
      "lastUpdated": "2021-04-30T07:33:06.337000+00:00",
      "objectCount": 8,
      "objectCountByEncryptionType": {
        "customerManaged": 0,
        "kmsManaged": 0,
        "s3Managed": 8,
        "unencrypted": 0,
        "unknown": 0
      },
      "publicAccess": {
        "effectivePermission": "NOT_PUBLIC",
        "permissionConfiguration": {
          "accountLevelPermissions": {
            "blockPublicAccess": {
              "blockPublicAcls": true,
              "blockPublicPolicy": true,
              "ignorePublicAcls": true,
              "restrictPublicBuckets": true
            }
          },
          "bucketLevelPermissions": {
            "accessControlList": {
              "allowsPublicReadAccess": false,
              "allowsPublicWriteAccess": false
            },
            "blockPublicAccess": {
              "blockPublicAcls": true,
              "blockPublicPolicy": true,
              "ignorePublicAcls": true,
              "restrictPublicBuckets": true
            },
            "bucketPolicy": {
              "allowsPublicReadAccess": false,
              "allowsPublicWriteAccess": false
            }
          }
        }
      },
      "region": "us-east-1",
      "replicationDetails": {
        "replicated": false,
        "replicatedExternally": false,
        "replicationAccounts": []
      },
      "serverSideEncryption": {
        "kmsMasterKeyId": null,
        "type": "AES256"
      },
      "sharedAccess": "EXTERNAL",
      "sizeInBytes": 175978,
      "sizeInBytesCompressed": 0,
      "tags": [
        { "key": "Division", "value": "HR" },
        { "key": "Team", "value": "Recruiting" }
      ],
      "unclassifiableObjectCount": {
        "fileType": 0,
        "storageClass": 0,
        "total": 0
      },
      "unclassifiableObjectSizeInBytes": {
        "fileType": 0,
        "storageClass": 0,
        "total": 0
      },
      "versioning": true
    }
  ]
}

If no buckets meet the filter criteria, Macie returns an empty buckets array.

{ "buckets": [] }
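To show how this output might be consumed, the following sketch parses a trimmed copy of the example response and pulls out a few fields per bucket. The `summarize` helper is hypothetical, and the embedded JSON keeps only a subset of the fields shown in the example output.

```python
import json
from typing import Any, Dict, List, Tuple

# Trimmed copy of the example output; most fields are omitted for brevity.
response_text = """
{
  "buckets": [
    {"bucketName": "DOC-EXAMPLE-BUCKET1", "objectCount": 13,
     "objectCountByEncryptionType": {"unencrypted": 4}},
    {"bucketName": "DOC-EXAMPLE-BUCKET2", "objectCount": 8,
     "objectCountByEncryptionType": {"unencrypted": 0}}
  ]
}
"""


def summarize(response: Dict[str, Any]) -> List[Tuple[str, int, int]]:
    """Return (bucket name, object count, unencrypted object count) per bucket."""
    return [
        (
            bucket["bucketName"],
            bucket["objectCount"],
            bucket["objectCountByEncryptionType"]["unencrypted"],
        )
        for bucket in response.get("buckets", [])
    ]


summary = summarize(json.loads(response_text))
for name, total, unencrypted in summary:
    print(f"{name}: {unencrypted} of {total} objects unencrypted")

# An empty buckets array produces an empty summary instead of an error.
print(summarize({"buckets": []}))
```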

Example 1: Find buckets by bucket name

This example uses the describe-buckets command to query metadata for all buckets whose names begin with my-S3 and are in the current AWS Region.

For Linux, macOS, or Unix:

$ aws macie2 describe-buckets --criteria '{"bucketName":{"prefix":"my-S3"}}'

For Microsoft Windows:

C:\> aws macie2 describe-buckets --criteria={\"bucketName\":{\"prefix\":\"my-S3\"}}

Where:

  • bucketName specifies the JSON name of the Bucket name field.

  • prefix specifies the prefix operator.

  • my-S3 is the value for the Bucket name field.
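Quoting the criteria by hand is easy to get wrong, so one option is to generate the command string instead. The sketch below serializes a criteria map with `json.dumps` and quotes it for a POSIX shell with `shlex.join` (Python 3.8 or later). The `render_command` helper is hypothetical, and the result applies to Linux, macOS, or Unix shells only, not to the Windows command prompt.

```python
import json
import shlex
from typing import Any, Dict


def render_command(criteria: Dict[str, Any]) -> str:
    """Build a POSIX-shell command line for describe-buckets with quoted criteria."""
    # Compact separators keep the JSON free of spaces, as in the examples.
    criteria_json = json.dumps(criteria, separators=(",", ":"))
    return shlex.join(
        ["aws", "macie2", "describe-buckets", "--criteria", criteria_json]
    )


command = render_command({"bucketName": {"prefix": "my-S3"}})
print(command)
```

Because `shlex.join` wraps the JSON in single quotes, the double quotes inside the criteria reach the AWS CLI unchanged.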

Example 2: Find buckets that are publicly accessible

This example uses the describe-buckets command to query metadata for buckets that are in the current AWS Region and, based on a combination of permissions settings, are publicly accessible.

For Linux, macOS, or Unix:

$ aws macie2 describe-buckets --criteria '{"publicAccess.effectivePermission":{"eq":["PUBLIC"]}}'

For Microsoft Windows:

C:\> aws macie2 describe-buckets --criteria={\"publicAccess.effectivePermission\":{\"eq\":[\"PUBLIC\"]}}

Where:

  • publicAccess.effectivePermission specifies the JSON name of the Effective permission field.

  • eq specifies the equals operator.

  • PUBLIC is an enumerated value for the Effective permission field.

Example 3: Find buckets that contain unencrypted objects

This example uses the describe-buckets command to query metadata for buckets that are in the current AWS Region and contain unencrypted objects.

For Linux, macOS, or Unix:

$ aws macie2 describe-buckets --criteria '{"objectCountByEncryptionType.unencrypted":{"gte":1}}'

For Microsoft Windows:

C:\> aws macie2 describe-buckets --criteria={\"objectCountByEncryptionType.unencrypted\":{\"gte\":1}}

Where:

  • objectCountByEncryptionType.unencrypted specifies the JSON name of the No encryption field.

  • gte specifies the greater than or equal to operator.

  • 1 is the lowest value in an inclusive, relative numeric range for the No encryption field.

Example 4: Find buckets that aren’t monitored by a job

This example uses the describe-buckets command to query metadata for buckets that are in the current AWS Region and aren’t associated with any periodic sensitive data discovery jobs.

For Linux, macOS, or Unix:

$ aws macie2 describe-buckets --criteria '{"jobDetails.isMonitoredByJob":{"eq":["FALSE"]}}'

For Microsoft Windows:

C:\> aws macie2 describe-buckets --criteria={\"jobDetails.isMonitoredByJob\":{\"eq\":[\"FALSE\"]}}

Where:

  • jobDetails.isMonitoredByJob specifies the JSON name of the Actively monitored by job field.

  • eq specifies the equals operator.

  • FALSE is an enumerated value for the Actively monitored by job field.

Example 5: Find buckets that replicate data to external accounts

This example uses the describe-buckets command to query metadata for buckets that are in the current AWS Region and are configured to replicate objects to an AWS account that isn’t part of your organization.

For Linux, macOS, or Unix:

$ aws macie2 describe-buckets --criteria '{"replicationDetails.replicatedExternally":{"eq":["true"]}}'

For Microsoft Windows:

C:\> aws macie2 describe-buckets --criteria={\"replicationDetails.replicatedExternally\":{\"eq\":[\"true\"]}}

Where:

  • replicationDetails.replicatedExternally specifies the JSON name of the Replicated externally field.

  • eq specifies the equals operator.

  • true specifies a Boolean value for the Replicated externally field.

Example 6: Find buckets based on multiple criteria

This example uses the describe-buckets command to query metadata for buckets that are in the current AWS Region and meet the following criteria: are publicly accessible based on a combination of permission settings; contain unencrypted objects; and aren’t associated with any periodic sensitive data discovery jobs.

For Linux, macOS, or Unix, using the backslash (\) line-continuation character to improve readability:

$ aws macie2 describe-buckets \
--criteria '{"publicAccess.effectivePermission":{"eq":["PUBLIC"]},"objectCountByEncryptionType.unencrypted":{"gte":1},"jobDetails.isMonitoredByJob":{"eq":["FALSE"]}}'

For Microsoft Windows, using the caret (^) line-continuation character to improve readability:

C:\> aws macie2 describe-buckets ^
--criteria={\"publicAccess.effectivePermission\":{\"eq\":[\"PUBLIC\"]},\"objectCountByEncryptionType.unencrypted\":{\"gte\":1},\"jobDetails.isMonitoredByJob\":{\"eq\":[\"FALSE\"]}}

Where:

  • publicAccess.effectivePermission specifies the JSON name of the Effective permission field, and:

    • eq specifies the equals operator.

    • PUBLIC is an enumerated value for the Effective permission field.

  • objectCountByEncryptionType.unencrypted specifies the JSON name of the No encryption field, and:

    • gte specifies the greater than or equal to operator.

    • 1 is the lowest value in an inclusive, relative numeric range for the No encryption field.

  • jobDetails.isMonitoredByJob specifies the JSON name of the Actively monitored by job field, and:

    • eq specifies the equals operator.

    • FALSE is an enumerated value for the Actively monitored by job field.
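The same multi-condition query could also be submitted through an AWS SDK. The following sketch assumes a client object that behaves like the `macie2` client in the AWS SDK for Python (Boto3): a `describe_buckets` method that accepts `criteria` and `nextToken`, and that returns `buckets` plus an optional `nextToken`. To stay runnable offline, it uses a hypothetical `FakeMacieClient` stand-in. With real credentials, you would pass `boto3.client("macie2")` instead.

```python
from typing import Any, Dict, Iterator, List

# Same criteria as the CLI command in this example.
CRITERIA: Dict[str, Any] = {
    "publicAccess.effectivePermission": {"eq": ["PUBLIC"]},
    "objectCountByEncryptionType.unencrypted": {"gte": 1},
    "jobDetails.isMonitoredByJob": {"eq": ["FALSE"]},
}


def iter_buckets(client: Any, criteria: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every matching bucket, requesting further pages while a token is returned."""
    kwargs: Dict[str, Any] = {"criteria": criteria}
    while True:
        page = client.describe_buckets(**kwargs)
        yield from page.get("buckets", [])
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token


class FakeMacieClient:
    """Hypothetical offline stand-in that replays canned response pages."""

    def __init__(self, pages: List[Dict[str, Any]]) -> None:
        self._pages = pages
        self.calls: List[Dict[str, Any]] = []

    def describe_buckets(self, **kwargs: Any) -> Dict[str, Any]:
        self.calls.append(kwargs)
        return self._pages[len(self.calls) - 1]


client = FakeMacieClient([
    {"buckets": [{"bucketName": "DOC-EXAMPLE-BUCKET1"}], "nextToken": "page-2"},
    {"buckets": [{"bucketName": "DOC-EXAMPLE-BUCKET2"}]},
])
names = [bucket["bucketName"] for bucket in iter_buckets(client, CRITERIA)]
print(names)
```

Passing the client in as a parameter keeps the pagination loop testable without network access or credentials.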