Storing and retaining sensitive data discovery results with Amazon Macie

When Amazon Macie runs a sensitive data discovery job, it creates a record for each Amazon S3 object that the job is configured to analyze. This includes objects that don't contain sensitive data, and therefore don't produce a finding, and objects that Macie can't analyze due to issues such as permissions settings. If an object does contain sensitive data, the record indicates where Macie found each occurrence of sensitive data in the object. Macie stores these records, referred to as sensitive data discovery results, for 90 days. To learn more about sensitive data discovery results, see Reviewing job statistics and results.

To access your sensitive data discovery results and enable long-term storage and retention of them, configure Macie to store the results in an S3 bucket and encrypt them using an AWS Key Management Service (AWS KMS) key. If you do this, Macie writes your sensitive data discovery results to JSON Lines files, which it adds to the S3 bucket as GNU Zip (GZ) files. Consequently, the S3 bucket can serve as a definitive, long-term repository for all of your sensitive data discovery results.
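Because the stored files are gzip-compressed JSON Lines, they can be read with standard tooling after you download them. The following is a minimal sketch, not an official parser: the function name is hypothetical, and the sample records are made up for the demonstration and don't reflect the actual fields that Macie writes.

```python
import gzip
import json
from typing import Iterator


def iter_discovery_results(gz_bytes: bytes) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a gzipped JSON Lines payload."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    for line in text.split("\n"):
        line = line.strip()
        if line:  # skip blank lines, such as a trailing newline at the end of the file
            yield json.loads(line)


# Demonstration with made-up records (hypothetical, not real Macie output).
sample = b"\n".join(json.dumps(r).encode("utf-8") for r in [{"n": 1}, {"n": 2}])
records = list(iter_discovery_results(gzip.compress(sample)))
```

Each line is parsed independently, so a large file can be processed record by record without building the whole result set first.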

This topic walks you through the process of configuring this type of repository for your discovery results. The configuration is a combination of an S3 bucket that stores the results, an AWS KMS key that encrypts the results, and Macie settings that indicate which bucket and key to use. When you configure the settings in Macie, your choices apply only to the current AWS Region. If your account is the Macie master account for an organization, your choices apply only to your account. They don't apply to any associated member accounts.

If you use Macie in multiple Regions, you need to configure the repository settings separately for each Region in which you use Macie. If you prefer to store the discovery results for all Regions in one S3 bucket, choose the same bucket, located in one specific Region, when you configure the settings in each of those Regions.

Step 1: Verify your permissions

Before you configure a repository for your sensitive data discovery results, verify that you're allowed to perform the following Macie action:

macie2:PutClassificationExportConfiguration

This action allows you to add or change the repository settings in Macie.

Also verify that you're allowed to perform the following Amazon S3 actions:

  • s3:CreateBucket

  • s3:GetBucketLocation

  • s3:ListAllMyBuckets

  • s3:PutBucketAcl

  • s3:PutBucketPolicy

  • s3:PutBucketPublicAccessBlock

  • s3:PutObject

These actions enable you to access and configure Amazon S3 buckets that can serve as the repository.

Finally, verify that you're allowed to perform the kms:ListAliases action. This action enables you to retrieve information about AWS KMS keys that can encrypt the data in the repository. If you plan to create a new AWS KMS key to encrypt the data, you also need to be allowed to perform the kms:CreateKey, kms:GetKeyPolicy, and kms:PutKeyPolicy actions.

You can verify your permissions by using the AWS Identity and Access Management (IAM) console. To do this, choose Users in the navigation pane of the IAM console. Then choose your user name. The Permissions tab lists the policies that apply to your user. Choose a policy to view its details.
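As an alternative to the console check, the required actions can be evaluated with the IAM policy simulator. The following sketch assumes that you pass in an IAM client object (for example, one created with boto3) and that your identity is allowed to call SimulatePrincipalPolicy. The function and constant names are hypothetical. The simulation evaluates identity-based policies only, so treat the result as an indication rather than a guarantee.

```python
# Actions listed in this step. Add the kms:CreateKey, kms:GetKeyPolicy, and
# kms:PutKeyPolicy actions if you plan to create a new key.
REQUIRED_ACTIONS = [
    "macie2:PutClassificationExportConfiguration",
    "s3:CreateBucket",
    "s3:GetBucketLocation",
    "s3:ListAllMyBuckets",
    "s3:PutBucketAcl",
    "s3:PutBucketPolicy",
    "s3:PutBucketPublicAccessBlock",
    "s3:PutObject",
    "kms:ListAliases",
]


def find_denied_actions(iam_client, principal_arn, actions=REQUIRED_ACTIONS):
    """Return the actions that the policy simulator does not report as allowed."""
    response = iam_client.simulate_principal_policy(
        PolicySourceArn=principal_arn,
        ActionNames=list(actions),
    )
    return [
        result["EvalActionName"]
        for result in response["EvaluationResults"]
        if result["EvalDecision"] != "allowed"
    ]
```

An empty list means that the simulator reported every listed action as allowed for the specified user or role.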

Step 2: Define the AWS KMS key and policy

When you configure Macie to store your sensitive data discovery results in an S3 bucket, you specify which AWS KMS key you want Macie to use to encrypt the results. This key must be a symmetric customer master key (CMK) that's in the same AWS Region as the S3 bucket where you want to store the results. The key can be an existing CMK, or a new CMK that you create before you configure the repository settings in Macie.

If you want to use a new CMK, create the key before proceeding. To learn how, see Creating keys in the AWS Key Management Service Developer Guide. If you want to use an existing key that's owned by another account, sign in to the account that owns the key and note the Amazon Resource Name (ARN) of the key. You'll need to enter this ARN when you configure the repository settings in Macie. To learn how to find the ARN of a key, see Finding the key ID and ARN in the AWS Key Management Service Developer Guide.

After you determine which CMK you want to use, you have to give Macie permission to use the key. Otherwise, Macie won't be able to encrypt (or store) discovery results in the repository. To give Macie permission to use the key, change the key's policy.

To change the key's policy

  1. Open the AWS KMS console at https://console.aws.amazon.com/kms.

  2. To change the AWS Region, use the Region selector in the upper-right corner of the page.

  3. Choose the key that you want to use to encrypt the results.

  4. On the Key policy tab, choose Edit.

  5. Add the following statement to the policy:

    {
        "Sid": "Allow Macie to use the key",
        "Effect": "Allow",
        "Principal": {
            "Service": "macie.amazonaws.com"
        },
        "Action": [
            "kms:GenerateDataKey",
            "kms:Encrypt"
        ],
        "Resource": "*"
    }

    When you add this statement to the policy, make sure that the syntax is valid. Policies use JSON format, so statements in the policy must be separated by commas. If you add the statement as the last statement, add a comma after the closing curly brace of the preceding statement. If you add it as the first statement or between two existing statements, add a comma after the closing curly brace of the statement that you add. The following examples show how to add the statement to a default key policy.

    The following example shows a default key policy that doesn't grant any additional permissions:

    {
        "Id": "key-consolepolicy",
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Enable IAM user permissions",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::111122223333:root"
                },
                "Action": "kms:*",
                "Resource": "*"
            }
        ]
    }

    The following example shows how to add the statement as the first statement for the policy:

    {
        "Id": "key-consolepolicy",
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Allow Macie to use the key",
                "Effect": "Allow",
                "Principal": {"Service": "macie.amazonaws.com"},
                "Action": [
                    "kms:GenerateDataKey",
                    "kms:Encrypt"
                ],
                "Resource": "*"
            },   <-- Add a comma after this curly brace
            {
                "Sid": "Enable IAM user permissions",
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
                "Action": "kms:*",
                "Resource": "*"
            }
        ]
    }

    The following example shows how to add the statement as the last statement for the policy:

    {
        "Id": "key-consolepolicy",
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Enable IAM user permissions",
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
                "Action": "kms:*",
                "Resource": "*"
            },   <-- Add a comma after this curly brace
            {
                "Sid": "Allow Macie to use the key",
                "Effect": "Allow",
                "Principal": {"Service": "macie.amazonaws.com"},
                "Action": [
                    "kms:GenerateDataKey",
                    "kms:Encrypt"
                ],
                "Resource": "*"
            }
        ]
    }

  6. When you finish adding the statement, choose Save changes.
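If you edit key policies with a script instead of by hand, parsing the policy as JSON and appending the statement avoids the comma placement issue described in step 5. The following is a minimal sketch under that assumption. The function name is hypothetical, and how you retrieve and save the policy text (for example, with the kms:GetKeyPolicy and kms:PutKeyPolicy actions mentioned in Step 1) is left out.

```python
import json

# The statement from step 5 of the procedure.
MACIE_KEY_STATEMENT = {
    "Sid": "Allow Macie to use the key",
    "Effect": "Allow",
    "Principal": {"Service": "macie.amazonaws.com"},
    "Action": ["kms:GenerateDataKey", "kms:Encrypt"],
    "Resource": "*",
}


def add_macie_statement(policy_json: str) -> str:
    """Return the policy with the Macie statement appended, unless it is already present."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a policy may hold a single statement object
        statements = [statements]
    if not any(s.get("Sid") == MACIE_KEY_STATEMENT["Sid"] for s in statements):
        statements.append(dict(MACIE_KEY_STATEMENT))
    policy["Statement"] = statements
    return json.dumps(policy, indent=2)


# The default key policy shown in the examples for step 5.
default_policy = json.dumps({
    "Id": "key-consolepolicy",
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Enable IAM user permissions",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": "kms:*",
            "Resource": "*",
        }
    ],
})
updated_policy = add_macie_statement(default_policy)
```

The check on the Sid value makes the function safe to run more than once. It matches by Sid only, so a statement that grants the same permissions under a different Sid is not detected.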

Step 3: Specify the S3 bucket to use

After you verify your permissions and define the AWS KMS key to use, you're ready to specify which S3 bucket you want to use as the repository for your sensitive data discovery results. You have two options:

  • Use a new S3 bucket that Macie creates – If you choose this option, Macie automatically creates a new S3 bucket for your discovery results. It also applies a bucket policy to the bucket. The policy allows Macie to create (put) objects in the bucket. To review this policy, choose View policy on the Amazon Macie console after you enter a name for the bucket.

  • Use an existing S3 bucket that you create – If you prefer to store your discovery results in a particular S3 bucket that you create, create the bucket before you proceed. Then update the bucket's policy to ensure that it allows Macie to create (put) objects in the bucket. This topic explains how to update the policy. It also provides samples of the statements to add to the policy.

The following sections provide step-by-step instructions for each of these options. Choose the section for the option that you want.

Use a new S3 bucket that Macie creates

If you choose to use a new S3 bucket that Macie creates for you, the final step in the process is to configure the repository settings in Macie.

To configure the repository settings in Macie

  1. Open the Macie console at https://console.aws.amazon.com/macie/.

  2. In the navigation pane, under Settings, choose Discovery results.

  3. Under Repository for sensitive data discovery results, choose Create bucket.

  4. In the Create a bucket box, enter a name for the bucket. The name must be unique across all S3 buckets. In addition, the name must start with a lowercase letter or a number.

  5. (Optional) To specify a path prefix to use in the path to a location in the bucket, expand the Advanced section. Then, for Data discovery result prefix, enter the path prefix to use.

    When you enter a value, Macie updates the example below the field to show the path to the bucket location where it will store your discovery results.

  6. For Block all public access, choose whether to enable all block public access settings for the bucket. For information about these settings, see Using Amazon S3 block public access in the Amazon Simple Storage Service Developer Guide.

  7. Under KMS encryption, specify the AWS KMS key that you want to use to encrypt the results:

    • To use a key for your own account, choose Select a key from your account. Then, from the KMS key alias list, choose the alias of the key to use.

    • To use a key that's owned by another account and that you're authorized to use, choose Enter the ARN of a key in another account. Then, in the KMS key ARN field, enter the ARN of the key to use.

    The key must be a symmetric customer master key (CMK) that's in the same Region as the S3 bucket.

  8. When you finish entering the settings, choose Save. Macie then tests the settings to verify that they're correct. If any settings are incorrect, it displays an error message to help you address the issue.

After you save the repository settings, Macie adds existing discovery results for the preceding 90 days to the repository. Macie also starts adding new discovery results to the repository.

Use an existing S3 bucket that you create

If you prefer to store your sensitive data discovery results in a particular S3 bucket that you create, create the bucket before you proceed.

After you create the bucket, add a bucket policy that allows Macie to retrieve information about the bucket and create (put) objects in the bucket. You can then configure the repository settings in Macie.

Important

If you change the bucket path after you configure the repository settings in Macie, you have to update the bucket policy. Otherwise, Macie won't be allowed to add discovery results to the bucket.

To add the bucket policy to the bucket

  1. Open the Amazon S3 console at https://console.aws.amazon.com/s3/.

  2. Choose the bucket that you want to store your discovery results in.

  3. On the Permissions tab, choose Bucket Policy.

  4. Copy the following example policy to your clipboard:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Allow Macie to use the GetBucketLocation operation",
                "Effect": "Allow",
                "Principal": {
                    "Service": "macie.amazonaws.com"
                },
                "Action": "s3:GetBucketLocation",
                "Resource": "arn:aws:s3:::myBucketName"
            },
            {
                "Sid": "Allow Macie to upload objects to the bucket",
                "Effect": "Allow",
                "Principal": {
                    "Service": "macie.amazonaws.com"
                },
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::myBucketName/[optional prefix]/*"
            },
            {
                "Sid": "Deny unencrypted object uploads. This is optional",
                "Effect": "Deny",
                "Principal": {
                    "Service": "macie.amazonaws.com"
                },
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::myBucketName/[optional prefix]/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                }
            },
            {
                "Sid": "Deny incorrect encryption headers. This is optional",
                "Effect": "Deny",
                "Principal": {
                    "Service": "macie.amazonaws.com"
                },
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::myBucketName/[optional prefix]/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption-aws-kms-key-id": "arn:aws:kms:Region:111122223333:key/KMSKeyId"
                    }
                }
            },
            {
                "Sid": "Deny non-HTTPS access",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": "arn:aws:s3:::myBucketName/*",
                "Condition": {
                    "Bool": {
                        "aws:SecureTransport": "false"
                    }
                }
            }
        ]
    }

  5. Paste the example policy in the Bucket policy editor on the Amazon S3 console. Then replace the placeholder values with the correct values for your environment, where:

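    • myBucketName is the name of the bucket.

    • [optional prefix] is the path prefix, if any, that you plan to specify in the repository settings in Macie. If you don't plan to use a prefix, omit this placeholder and its trailing slash so that the resource ends with myBucketName/*.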

    • Region is the AWS Region that hosts the AWS KMS customer master key (CMK) to use for encryption of the discovery results.

    • 111122223333 is your AWS account ID, or the AWS account ID for the account that owns the AWS KMS CMK to use for encryption of the discovery results.

    • KMSKeyId is the key ID of the AWS KMS CMK to use for encryption of the discovery results.

  6. When you finish updating the bucket policy, choose Save.
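If you manage several buckets or prefixes, generating the policy from parameters can reduce copy-and-paste mistakes when you replace the placeholders. The following sketch rebuilds the sample policy from step 4. The function name is hypothetical, the Region (us-east-1) and prefix (macie-results) in the usage lines are illustrative values only, and the sketch assumes that a bucket without a prefix uses myBucketName/* as the object resource.

```python
import json
from typing import Optional

MACIE_PRINCIPAL = {"Service": "macie.amazonaws.com"}


def build_bucket_policy(bucket: str, kms_key_arn: str, prefix: Optional[str] = None) -> dict:
    """Return the sample bucket policy with the placeholder values filled in."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    cleaned = (prefix or "").strip("/")
    # With a prefix, the Macie statements apply only to objects below that path.
    objects_arn = f"{bucket_arn}/{cleaned}/*" if cleaned else f"{bucket_arn}/*"

    def macie_put(sid: str, effect: str, condition: Optional[dict] = None) -> dict:
        statement = {
            "Sid": sid,
            "Effect": effect,
            "Principal": MACIE_PRINCIPAL,
            "Action": "s3:PutObject",
            "Resource": objects_arn,
        }
        if condition is not None:
            statement["Condition"] = condition
        return statement

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Allow Macie to use the GetBucketLocation operation",
                "Effect": "Allow",
                "Principal": MACIE_PRINCIPAL,
                "Action": "s3:GetBucketLocation",
                "Resource": bucket_arn,
            },
            macie_put("Allow Macie to upload objects to the bucket", "Allow"),
            macie_put(
                "Deny unencrypted object uploads. This is optional",
                "Deny",
                {"StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}},
            ),
            macie_put(
                "Deny incorrect encryption headers. This is optional",
                "Deny",
                {"StringNotEquals": {"s3:x-amz-server-side-encryption-aws-kms-key-id": kms_key_arn}},
            ),
            {
                "Sid": "Deny non-HTTPS access",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": f"{bucket_arn}/*",
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


# Illustrative values only; replace them with the values for your environment.
policy = build_bucket_policy(
    "myBucketName",
    "arn:aws:kms:us-east-1:111122223333:key/KMSKeyId",
    prefix="macie-results",
)
policy_json = json.dumps(policy, indent=2)
```

The resulting policy_json string can be pasted into the Bucket policy editor in place of the hand-edited sample.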

You can now configure the repository settings in Macie.

To configure the repository settings in Macie

  1. Open the Macie console at https://console.aws.amazon.com/macie/.

  2. In the navigation pane, under Settings, choose Discovery results.

  3. Under Repository for sensitive data discovery results, choose Existing bucket.

  4. For Choose a bucket, select the bucket that you want to store your discovery results in.

  5. (Optional) To specify a path prefix to use in the path to a location in the bucket, expand the Advanced section. Then, for Data discovery result prefix, enter the path prefix to use.

    When you enter a value, Macie updates the example below the field to show the path to the bucket location where it will store your discovery results.

  6. Under KMS encryption, specify the AWS KMS key that you want to use to encrypt the results:

    • To use a key for your own account, choose Select a key from your account. Then, from the KMS key alias list, choose the alias of the key to use.

    • To use a key that's owned by another account and that you're authorized to use, choose Enter the ARN of a key in another account. Then, in the KMS key ARN field, enter the ARN of the key to use.

    The key must be a symmetric customer master key (CMK) that's in the same Region as the S3 bucket that you specified.

  7. When you finish entering the settings, choose Save. Macie then tests the settings to verify that they're correct. If any settings are incorrect, it displays an error message to help you address the issue.

After you save the repository settings, Macie adds existing discovery results for the preceding 90 days to the repository. Macie also starts adding new discovery results to the repository.
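The repository settings can also be applied programmatically through the macie2:PutClassificationExportConfiguration action listed in Step 1. The following sketch assumes that you pass in a Macie client object (for example, one created with boto3 for the Region that you want to configure) and that the bucket and its policy already exist. The function names are hypothetical.

```python
def build_export_configuration(bucket, kms_key_arn, prefix=None):
    """Build the request payload that identifies the bucket, key, and optional prefix."""
    destination = {"bucketName": bucket, "kmsKeyArn": kms_key_arn}
    if prefix:
        destination["keyPrefix"] = prefix
    return {"s3Destination": destination}


def configure_repository(macie_client, bucket, kms_key_arn, prefix=None):
    """Apply the repository settings in the Region that the client is bound to."""
    configuration = build_export_configuration(bucket, kms_key_arn, prefix)
    macie_client.put_classification_export_configuration(configuration=configuration)
    return configuration
```

Because the settings apply only to the current Region, a script that covers several Regions would call configure_repository once for each Region, each time with a client created for that Region and with the same bucket name and key ARN.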

Troubleshooting errors

If an error occurs when Macie tries to add sensitive data discovery results to the repository, Macie displays an error message on the Repository for sensitive data discovery results page of the console. Macie also sends a notification to the email address that's associated with your account. If you don't address the error, Macie stores backups of your discovery results for up to 90 days.

Errors typically occur because Macie loses access to the repository. For example, the S3 bucket was deleted or permissions for the bucket were changed. Errors also occur if the AWS KMS key that's used to encrypt the results becomes inaccessible. If an error occurs, use the information in this topic as a guide to walk through possible causes and solutions for the error. For example, review the policy for the AWS KMS key and confirm that it's still correct.

After you address the error, update the configuration settings in Macie. Macie then starts adding new discovery results to the repository. Macie also adds any existing results that it created and stored while the error existed (for up to 90 days).
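When you review the key policy during troubleshooting, a quick scripted check can confirm that the statement from Step 2 is still present. The following is a rough sketch with a hypothetical function name. It only looks for Allow statements that name the Macie service principal, and it ignores Deny statements and conditions, so a result of True doesn't prove that Macie can use the key.

```python
import json


def _as_list(value):
    """Normalize a policy element that may be a single value or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def key_policy_allows_macie(policy_json: str) -> bool:
    """Check whether Allow statements grant Macie both actions from Step 2."""
    required = {"kms:GenerateDataKey", "kms:Encrypt"}
    granted = set()
    for statement in _as_list(json.loads(policy_json).get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        services = _as_list(principal.get("Service")) if isinstance(principal, dict) else []
        if "macie.amazonaws.com" not in services:
            continue
        actions = set(_as_list(statement.get("Action")))
        if "kms:*" in actions or "*" in actions:
            return True  # a wildcard covers both required actions
        granted |= actions & required
    return required <= granted
```

If the function returns False for the key that you configured, re-add the statement as described in Step 2 and then update the configuration settings in Macie.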