Storing and retaining sensitive data discovery results with Amazon Macie

When Amazon Macie runs a sensitive data discovery job, it creates a record for each Amazon S3 object that you configure the job to analyze. This includes objects that don't contain sensitive data, and therefore don't produce a finding, and objects that Macie can't analyze due to issues such as permissions settings or use of an unsupported format. If an object does contain sensitive data, the record includes data from the corresponding finding. It provides additional information too, such as the location of as many as 1,000 occurrences of each type of sensitive data that Macie found in the object. Macie stores these records, referred to as sensitive data discovery results, for 90 days. To learn more about sensitive data discovery results, see Reviewing job statistics and results.

To access your sensitive data discovery results and enable long-term storage and retention of them, configure Macie to store the results in an S3 bucket and encrypt them using an AWS Key Management Service (AWS KMS) key. If you do this, Macie writes your sensitive data discovery results to JSON Lines files, which it adds to the S3 bucket as GNU Zip (GZ) files. The S3 bucket can then serve as a definitive, long-term repository for all of your sensitive data discovery results.
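Once results are in the bucket, a downloaded file can be read with standard tools. The following is a minimal sketch, assuming you have already copied one of the GZ files to local disk. The function name read_discovery_results is hypothetical, and the sketch makes no assumptions about the fields inside each record.

```python
import gzip
import json
from typing import Iterator


def read_discovery_results(path: str) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a gzip-compressed JSON Lines file."""
    # "rt" mode decompresses and decodes the file as text, one JSON document per line.
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)
```

Because the function is a generator, large files are processed one record at a time instead of being loaded into memory at once.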

This topic walks you through the process of configuring this type of repository for your discovery results. The configuration is a combination of an S3 bucket that stores the results, an AWS KMS key that encrypts the results, and Macie settings that indicate which bucket and key to use.

When you configure the settings in Macie, your choices apply only to the current AWS Region. If your account is the Macie administrator account for an organization, your choices apply only to your account. They don't apply to any associated member accounts.

If you use Macie in multiple Regions, configure the repository settings for each Region in which you use Macie. To store the discovery results for all Regions in one S3 bucket, choose the same bucket, located in one specific Region, when you configure the settings for every Region in which you use Macie.
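If you script this instead of using the console, the per-Region configuration could look like the following sketch. It assumes a client object that exposes put_classification_export_configuration, as the boto3 macie2 client does. The function name configure_all_regions and the client_for_region factory are hypothetical, and error handling is omitted.

```python
from typing import Callable, Iterable


def configure_all_regions(
    client_for_region: Callable[[str], object],
    regions: Iterable[str],
    bucket_name: str,
    kms_key_arn: str,
    key_prefix: str = "",
) -> list:
    """Point Macie in every listed Region at the same bucket and key."""
    destination = {"bucketName": bucket_name, "kmsKeyArn": kms_key_arn}
    if key_prefix:
        destination["keyPrefix"] = key_prefix

    configured = []
    for region in regions:
        # Settings are per Region, so each Region needs its own client and call.
        client = client_for_region(region)
        client.put_classification_export_configuration(
            configuration={"s3Destination": dict(destination)}
        )
        configured.append(region)
    return configured
```

With boto3 installed, the factory could be lambda region: boto3.client("macie2", region_name=region). Passing the factory in as a parameter keeps the sketch free of third-party imports and easy to exercise with a stub.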

Step 1: Verify your permissions

Before you configure a repository for your sensitive data discovery results, verify that you have the permissions that you need. You can do this by using the AWS Identity and Access Management (IAM) console:

  1. Sign in to the AWS Management Console and open the IAM console at https://console.aws.amazon.com/iam/.

  2. In the navigation pane, choose Users.

  3. Choose your user name.

The Permissions tab lists all the IAM policies that are attached to your user name. Choose a policy to view its details. Then compare the information in the policy to the following list of actions that you must be allowed to perform in order to configure the repository.

Macie

For Macie, verify that you're allowed to perform the following action:

macie2:PutClassificationExportConfiguration

This action allows you to add or change the repository settings in Macie.

Amazon S3

For Amazon S3, verify that you're allowed to perform the following actions:

  • s3:CreateBucket

  • s3:GetBucketLocation

  • s3:ListAllMyBuckets

  • s3:PutBucketAcl

  • s3:PutBucketPolicy

  • s3:PutBucketPublicAccessBlock

  • s3:PutObject

These actions enable you to access and configure an S3 bucket that can serve as the repository.

AWS KMS

For AWS KMS, verify that you're allowed to perform the following action:

kms:ListAliases

This action enables you to retrieve information about AWS KMS keys that can encrypt the data in the repository. If you plan to create a new AWS KMS key to encrypt the data, you also need to be allowed to perform the following actions: kms:CreateKey, kms:GetKeyPolicy, and kms:PutKeyPolicy.

If you're not allowed to perform one or more of the preceding actions, ask your AWS administrator for assistance before you proceed to the next step.
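As a first-pass aid for the comparison described in this step, the following sketch lists which of the required actions a single policy document does not explicitly allow. It is an approximation under stated assumptions: it looks only at Allow statements and Action elements, and it ignores Deny statements, NotAction, conditions, resource scoping, and permissions boundaries. The names REQUIRED_ACTIONS and missing_actions are hypothetical.

```python
from fnmatch import fnmatchcase

# Actions named in this step (the optional kms:CreateKey, kms:GetKeyPolicy,
# and kms:PutKeyPolicy actions are left out).
REQUIRED_ACTIONS = [
    "macie2:PutClassificationExportConfiguration",
    "s3:CreateBucket",
    "s3:GetBucketLocation",
    "s3:ListAllMyBuckets",
    "s3:PutBucketAcl",
    "s3:PutBucketPolicy",
    "s3:PutBucketPublicAccessBlock",
    "s3:PutObject",
    "kms:ListAliases",
]


def missing_actions(policy: dict, required=REQUIRED_ACTIONS) -> list:
    """Return the required actions that no Allow statement in the policy matches."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]

    allowed = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        allowed.extend(actions)

    # Action names are compared case-insensitively; "*" and "?" act as wildcards.
    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), pattern.lower()) for pattern in allowed)
    ]
```

An empty result for one policy does not prove that you have access, and a non-empty result does not prove that you lack it, because other attached policies also apply. Treat the output as a checklist for the manual review.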

Step 2: Define the AWS KMS key and policy

When you configure Macie to store your sensitive data discovery results in an S3 bucket, you specify which AWS KMS key you want Macie to use to encrypt the results. This key must be a symmetric customer master key (CMK) that's in the same AWS Region as the S3 bucket where you want to store the results. The key can be an existing CMK or a new CMK that you create before you configure the repository settings in Macie.

If you want to use a new CMK, create the key before proceeding. To learn how, see Creating keys in the AWS Key Management Service Developer Guide. If you want to use an existing key that's owned by another account, sign in to the account that owns the key and note the Amazon Resource Name (ARN) of the key. You'll need to enter this ARN when you configure the repository settings in Macie. To learn how to find the ARN of a key, see Finding the key ID and ARN in the AWS Key Management Service Developer Guide.

After you determine which CMK you want to use, give Macie permission to use the key. Otherwise, Macie won't be able to encrypt or store discovery results in the repository. To give Macie permission to use the key, change the key's policy.

To change the key's policy

  1. Open the AWS KMS console at https://console.aws.amazon.com/kms.

  2. To change the AWS Region, use the Region selector in the upper-right corner of the page.

  3. Choose the key that you want to use to encrypt the results.

  4. On the Key policy tab, choose Edit.

  5. Add the following statement to the policy:

    {
        "Sid": "Allow Macie to use the key",
        "Effect": "Allow",
        "Principal": {
            "Service": "macie.amazonaws.com"
        },
        "Action": [
            "kms:GenerateDataKey",
            "kms:Encrypt"
        ],
        "Resource": "*"
    }

    When you add this statement to the policy, make sure that the syntax is valid. Policies use JSON format. This means that you need to also add a comma before or after the statement, depending on where you add the statement to the policy.

    If you add the statement as the last statement, add a comma after the closing curly brace of the preceding statement. If you add it as the first statement or between two existing statements, add a comma after the closing curly brace of the statement that you add. The following examples show you how to add the statement to a default key policy.

    The following example shows a default key policy that doesn't grant any additional permissions:

    {
        "Id": "key-consolepolicy",
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Enable IAM user permissions",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::111122223333:root"
                },
                "Action": "kms:*",
                "Resource": "*"
            }
        ]
    }

    The following example shows you how to add the statement as the first statement in the policy:

    {
        "Id": "key-consolepolicy",
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Allow Macie to use the key",
                "Effect": "Allow",
                "Principal": {"Service": "macie.amazonaws.com"},
                "Action": [
                    "kms:GenerateDataKey",
                    "kms:Encrypt"
                ],
                "Resource": "*"
            },   <-- Add a comma after this curly brace
            {
                "Sid": "Enable IAM user permissions",
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
                "Action": "kms:*",
                "Resource": "*"
            }
        ]
    }

    The following example shows you how to add the statement as the last statement in the policy:

    {
        "Id": "key-consolepolicy",
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Enable IAM user permissions",
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
                "Action": "kms:*",
                "Resource": "*"
            },   <-- Add a comma after this curly brace
            {
                "Sid": "Allow Macie to use the key",
                "Effect": "Allow",
                "Principal": {"Service": "macie.amazonaws.com"},
                "Action": [
                    "kms:GenerateDataKey",
                    "kms:Encrypt"
                ],
                "Resource": "*"
            }
        ]
    }

  6. When you finish adding the statement, choose Save changes.
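If you edit key policies with a script rather than by hand, one way to avoid comma mistakes is to parse the policy, append the statement, and let a JSON serializer write the separators. The following is a minimal sketch of that idea, assuming the policy text is already available as a string. The function name add_macie_statement is hypothetical, and the sketch does not call AWS KMS.

```python
import json

# The statement from step 5 of the preceding procedure.
MACIE_KEY_STATEMENT = {
    "Sid": "Allow Macie to use the key",
    "Effect": "Allow",
    "Principal": {"Service": "macie.amazonaws.com"},
    "Action": ["kms:GenerateDataKey", "kms:Encrypt"],
    "Resource": "*",
}


def add_macie_statement(policy_json: str) -> str:
    """Return the policy with the Macie statement appended as the last statement."""
    policy = json.loads(policy_json)

    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # normalize a lone statement to a list
        statements = [statements]
    policy["Statement"] = statements

    # Skip the append if a statement with the same Sid is already present.
    if not any(s.get("Sid") == MACIE_KEY_STATEMENT["Sid"] for s in statements):
        statements.append(dict(MACIE_KEY_STATEMENT))

    # json.dumps places commas between statements, so none are added by hand.
    return json.dumps(policy, indent=4)
```

The output can be pasted into the key policy editor, or reviewed side by side with the examples above before you save any change.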

Step 3: Specify the S3 bucket to use

After you verify your permissions and define the AWS KMS key to use, you're ready to specify which S3 bucket you want to use as the repository for your sensitive data discovery results. You have two options:

  • Use a new S3 bucket that Macie creates – If you choose this option, Macie automatically creates a new S3 bucket for your discovery results. It also applies a bucket policy to the bucket. The policy allows Macie to create (put) objects in the bucket. To review this policy, choose View policy on the Amazon Macie console after you enter a name for the bucket.

  • Use an existing S3 bucket that you create – If you prefer to store your discovery results in a particular S3 bucket that you create, create the bucket before you proceed. Then update the bucket's policy to ensure that it allows Macie to create (put) objects in the bucket. This topic explains how to update the policy. It also provides samples of the statements to add to the policy.

The following sections provide step-by-step instructions for each of these options. Choose the section for the option that you want.

Use a new S3 bucket that Macie creates

If you prefer to use a new S3 bucket that Macie creates for you, the final step in the process is to configure the repository settings in Macie.

To configure the repository settings in Macie

  1. Open the Macie console at https://console.aws.amazon.com/macie/.

  2. In the navigation pane, under Settings, choose Discovery results.

  3. Under Repository for sensitive data discovery results, choose Create bucket.

  4. In the Create a bucket box, enter a name for the bucket. The name must be unique across all S3 buckets. In addition, the name must start with a lowercase letter or a number.

  5. (Optional) To specify a path prefix to use in the path to a location in the bucket, expand the Advanced section. Then, for Data discovery result prefix, enter the path prefix to use.

    When you enter a value, Macie updates the example below the field to show the path to the bucket location where it will store your discovery results.

  6. For Block all public access, choose whether to enable all block public access settings for the bucket. For information about these settings, see Using Amazon S3 block public access in the Amazon Simple Storage Service Developer Guide.

  7. Under KMS encryption, specify the AWS KMS key that you want to use to encrypt the results:

    • To use a key for your own account, choose Select a key from your account. Then, from the KMS key alias list, choose the alias of the key to use.

    • To use a key that's owned by another account and that you're authorized to use, choose Enter the ARN of a key in another account. Then, in the KMS key ARN field, enter the ARN of the key to use.

    The key must be a symmetric customer master key (CMK) that's in the same Region as the S3 bucket.

  8. When you finish entering the settings, choose Save. Macie then tests the settings to verify that they're correct. If any settings are incorrect, it displays an error message to help you address the issue.

After you save the repository settings, Macie adds existing discovery results for the preceding 90 days to the repository. Macie also starts adding new discovery results to the repository.

Use an existing S3 bucket that you create

If you prefer to store your sensitive data discovery results in a particular S3 bucket that you create, create the bucket before you proceed.

After you create the bucket, add a bucket policy that allows Macie to retrieve information about the bucket and create (put) objects in the bucket. You can then configure the repository settings in Macie.

Important

If you change the bucket path after you configure the repository settings in Macie, you have to update the bucket policy. Otherwise, Macie won't be allowed to add discovery results to the bucket.

To add the bucket policy to the bucket

  1. Open the Amazon S3 console at https://console.aws.amazon.com/s3/.

  2. Choose the bucket that you want to store your discovery results in.

  3. Choose the Permissions tab.

  4. In the Bucket policy section, choose Edit.

  5. Copy the following example policy to your clipboard:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Allow Macie to use the GetBucketLocation operation",
                "Effect": "Allow",
                "Principal": {
                    "Service": "macie.amazonaws.com"
                },
                "Action": "s3:GetBucketLocation",
                "Resource": "arn:aws:s3:::myBucketName"
            },
            {
                "Sid": "Allow Macie to upload objects to the bucket",
                "Effect": "Allow",
                "Principal": {
                    "Service": "macie.amazonaws.com"
                },
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::myBucketName/[optional prefix]/*"
            },
            {
                "Sid": "Deny unencrypted object uploads. This is optional",
                "Effect": "Deny",
                "Principal": {
                    "Service": "macie.amazonaws.com"
                },
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::myBucketName/[optional prefix]/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                }
            },
            {
                "Sid": "Deny incorrect encryption headers. This is optional",
                "Effect": "Deny",
                "Principal": {
                    "Service": "macie.amazonaws.com"
                },
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::myBucketName/[optional prefix]/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption-aws-kms-key-id": "arn:aws:kms:Region:111122223333:key/KMSKeyId"
                    }
                }
            },
            {
                "Sid": "Deny non-HTTPS access",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": "arn:aws:s3:::myBucketName/*",
                "Condition": {
                    "Bool": {
                        "aws:SecureTransport": "false"
                    }
                }
            }
        ]
    }

  6. Paste the example policy in the Bucket policy editor on the Amazon S3 console. Then replace the placeholder values with the correct values for your environment, where:

    • myBucketName is the name of the bucket.

    • Region is the AWS Region that hosts the AWS KMS customer master key (CMK) to use for encryption of the discovery results.

    • 111122223333 is your AWS account ID, or the AWS account ID for the account that owns the AWS KMS CMK to use for encryption of the discovery results.

    • KMSKeyId is the key ID of the AWS KMS CMK to use for encryption of the discovery results.

  7. When you finish updating the bucket policy, choose Save changes.
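The placeholder substitution in step 6 can also be done programmatically. The following sketch builds the same five statements from a bucket name, a key ARN, and an optional prefix. The function name build_bucket_policy is hypothetical. The sketch assumes that when no prefix is used, the object resource is simply the bucket followed by /*; confirm the path that Macie shows in the console before relying on it.

```python
import json


def build_bucket_policy(bucket_name: str, kms_key_arn: str, prefix: str = "") -> dict:
    """Fill in the example bucket policy for one bucket, key, and optional prefix."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    cleaned = prefix.strip("/")
    # Assumption: with no prefix, results may land anywhere in the bucket.
    results_arn = f"{bucket_arn}/{cleaned}/*" if cleaned else f"{bucket_arn}/*"
    macie = {"Service": "macie.amazonaws.com"}

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Allow Macie to use the GetBucketLocation operation",
                "Effect": "Allow",
                "Principal": macie,
                "Action": "s3:GetBucketLocation",
                "Resource": bucket_arn,
            },
            {
                "Sid": "Allow Macie to upload objects to the bucket",
                "Effect": "Allow",
                "Principal": macie,
                "Action": "s3:PutObject",
                "Resource": results_arn,
            },
            {
                "Sid": "Deny unencrypted object uploads. This is optional",
                "Effect": "Deny",
                "Principal": macie,
                "Action": "s3:PutObject",
                "Resource": results_arn,
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
                },
            },
            {
                "Sid": "Deny incorrect encryption headers. This is optional",
                "Effect": "Deny",
                "Principal": macie,
                "Action": "s3:PutObject",
                "Resource": results_arn,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption-aws-kms-key-id": kms_key_arn
                    }
                },
            },
            {
                "Sid": "Deny non-HTTPS access",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": f"{bucket_arn}/*",
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }
```

Calling json.dumps(build_bucket_policy(...), indent=4) produces text that can be pasted into the Bucket policy editor. Because the prefix appears in three Resource values, regenerating the policy is a convenient way to keep them consistent if the bucket path changes later.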

You can now configure the repository settings in Macie.

To configure the repository settings in Macie

  1. Open the Macie console at https://console.aws.amazon.com/macie/.

  2. In the navigation pane, under Settings, choose Discovery results.

  3. Under Repository for sensitive data discovery results, choose Existing bucket.

  4. For Choose a bucket, select the bucket that you want to store your discovery results in.

  5. (Optional) To specify a path prefix to use in the path to a location in the bucket, expand the Advanced section. Then, for Data discovery result prefix, enter the path prefix to use.

    When you enter a value, Macie updates the example below the field to show the path to the bucket location where it will store your discovery results.

  6. Under KMS encryption, specify the AWS KMS key that you want to use to encrypt the results:

    • To use a key for your own account, choose Select a key from your account. Then, from the KMS key alias list, choose the alias of the key to use.

    • To use a key that's owned by another account and that you're authorized to use, choose Enter the ARN of a key in another account. Then, in the KMS key ARN field, enter the ARN of the key to use.

    The key must be a symmetric customer master key (CMK) that's in the same Region as the S3 bucket that you specified.

  7. When you finish entering the settings, choose Save. Macie then tests the settings to verify that they're correct. If any settings are incorrect, it displays an error message to help you address the issue.

After you save the repository settings, Macie adds existing discovery results for the preceding 90 days to the repository. Macie also starts adding new discovery results to the repository.
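The same settings can be applied outside the console. The following sketch compares the current settings with the desired ones and writes them only when they differ. It assumes a client that exposes get_classification_export_configuration and put_classification_export_configuration, as the boto3 macie2 client does, and it assumes that an unconfigured account returns a response with no s3Destination entry. The function name ensure_export_configuration is hypothetical.

```python
def ensure_export_configuration(
    macie_client,
    bucket_name: str,
    kms_key_arn: str,
    key_prefix: str = "",
) -> bool:
    """Apply the repository settings if needed; return True when an update was made."""
    desired = {"bucketName": bucket_name, "kmsKeyArn": kms_key_arn}
    if key_prefix:
        desired["keyPrefix"] = key_prefix

    # Read the settings for the client's Region before changing anything.
    current = macie_client.get_classification_export_configuration()
    existing = current.get("configuration", {}).get("s3Destination")
    if existing == desired:
        return False

    macie_client.put_classification_export_configuration(
        configuration={"s3Destination": desired}
    )
    return True
```

With boto3 installed, macie_client could be created with boto3.client("macie2"). The call affects only the Region of that client, which matches the per-Region scope of the settings described earlier in this topic.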

Troubleshooting errors

If an error occurs when Macie tries to add sensitive data discovery results to the repository, Macie displays an error message on the Repository for sensitive data discovery results page of the console. In addition, we notify you by sending email to the address that's associated with your AWS account. If you don't address the error, Macie stores backups of your discovery results for up to 90 days.

Errors typically occur because Macie loses access to the repository—for example, the S3 bucket was deleted or permissions for the bucket were changed. They also occur if the AWS KMS key that's used to encrypt the results becomes inaccessible. If an error occurs, use the information in this topic as a guide to walk through possible causes and solutions for the error. For example, review the policy for the AWS KMS key and confirm that it's still correct.

After you address the error, update the configuration settings in Macie. Macie then starts adding new discovery results to the repository. Macie also adds any existing results that it created and stored while the error existed (for up to 90 days).
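When you review the key policy during troubleshooting, a quick structural check can confirm that the statement from Step 2 is still present. The following sketch is deliberately strict and rests on stated assumptions: it looks only for Allow statements whose Principal names the macie.amazonaws.com service and whose Action lists the two actions explicitly, so a policy that grants access through wildcards or other principals is reported as not matching. The function name macie_can_use_key is hypothetical.

```python
def _as_list(value) -> list:
    """Wrap a lone string or dict in a list; policy elements may be either form."""
    if isinstance(value, (str, dict)):
        return [value]
    return list(value)


def macie_can_use_key(policy: dict) -> bool:
    """Return True if the policy explicitly allows Macie both actions from Step 2."""
    needed = {"kms:generatedatakey", "kms:encrypt"}
    granted = set()

    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        services = principal.get("Service", []) if isinstance(principal, dict) else []
        if "macie.amazonaws.com" not in _as_list(services):
            continue
        # Action names are case-insensitive, so compare in lowercase.
        granted.update(action.lower() for action in _as_list(statement.get("Action", [])))

    return needed <= granted
```

A False result is a prompt to open the key policy and compare it with the statement in Step 2. It is not proof that Macie lacks access.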