

# Storage & backup
<a name="storageandbackup-pattern-list"></a>

**Topics**
+ [Allow EC2 instances write access to S3 buckets in AMS accounts](allow-ec2-instances-write-access-to-s3-buckets-in-ams-accounts.md)
+ [Automate data stream ingestion into a Snowflake database by using Snowflake Snowpipe, Amazon S3, Amazon SNS, and Amazon Data Firehose](automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose.md)
+ [Automatically encrypt existing and new Amazon EBS volumes](automatically-encrypt-existing-and-new-amazon-ebs-volumes.md)
+ [Back up Sun SPARC servers in the Stromasys Charon-SSP emulator on the AWS Cloud](back-up-sun-sparc-servers-in-the-stromasys-charon-ssp-emulator-on-the-aws-cloud.md)
+ [Back up and archive data to Amazon S3 with Veeam Backup & Replication](back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication.md)
+ [Copy data from an Amazon S3 bucket to another account and Region by using the AWS CLI](copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli.md)
+ [Enable DB2 log archiving directly to Amazon S3 in an IBM Db2 database](enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database.md)
+ [Migrate data from an on-premises Hadoop environment to Amazon S3 using DistCp with AWS PrivateLink for Amazon S3](migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3.md)
+ [More patterns](storageandbackup-more-patterns-pattern-list.md)

# Allow EC2 instances write access to S3 buckets in AMS accounts
<a name="allow-ec2-instances-write-access-to-s3-buckets-in-ams-accounts"></a>

*Mansi Suratwala, Amazon Web Services*

## Summary
<a name="allow-ec2-instances-write-access-to-s3-buckets-in-ams-accounts-summary"></a>

AWS Managed Services (AMS) helps you operate your AWS infrastructure more efficiently and securely. AMS accounts have security guardrails for standardized administration of your AWS resources. One guardrail is that default Amazon Elastic Compute Cloud (Amazon EC2) instance profiles don’t allow write access to Amazon Simple Storage Service (Amazon S3) buckets. However, your organization might have multiple S3 buckets and require more control over access by EC2 instances. For example, you might want to store database backups from EC2 instances in an S3 bucket.

This pattern explains how to use requests for change (RFCs) to allow your EC2 instances write access to S3 buckets in your AMS account. An RFC is a request created by you or AMS to make a change in your managed environment and that includes a [change type](https://docs.aws.amazon.com/managedservices/latest/ctref/classifications.html) (CT) ID for a particular operation.

## Prerequisites and limitations
<a name="allow-ec2-instances-write-access-to-s3-buckets-in-ams-accounts-prereqs"></a>

**Prerequisites**
+ An AMS Advanced account. For more information about this, see [AMS operations plans](https://docs.aws.amazon.com/managedservices/latest/accelerate-guide/what-is-ams-op-plans.html) in the AMS documentation. 
+ Access to the AWS Identity and Access Management (IAM) `customer-mc-user-role` role to submit RFCs. 
+ AWS Command Line Interface (AWS CLI), installed and configured on the EC2 instances in your AMS account. 
+ An understanding of how to create and submit RFCs in AMS. For more information about this, see [What are AMS change types?](https://docs.aws.amazon.com/managedservices/latest/ctref/what-are-change-types.html) in the AMS documentation.
+ An understanding of manual and automated change types (CTs). For more information about this, see [Automated and manual CTs](https://docs.aws.amazon.com/managedservices/latest/userguide/ug-automated-or-manual.html) in the AMS documentation.

## Architecture
<a name="allow-ec2-instances-write-access-to-s3-buckets-in-ams-accounts-architecture"></a>

**Technology stack**
+ AMS
+ AWS CLI
+ Amazon EC2
+ Amazon S3
+ IAM

## Tools
<a name="allow-ec2-instances-write-access-to-s3-buckets-in-ams-accounts-tools"></a>
+ [AWS Command Line Interface (AWS CLI)](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html) is an open-source tool that helps you interact with AWS services through commands in your command-line shell.
+ [AWS Identity and Access Management (IAM)](https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html) helps you securely manage access to your AWS resources by controlling who is authenticated and authorized to use them.
+ [AWS Managed Services (AMS)](https://docs.aws.amazon.com/managedservices/latest/userguide/what-is-ams.html) helps you operate your AWS infrastructure more efficiently and securely. 
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) is a cloud-based object storage service that helps you store, protect, and retrieve any amount of data.
+ [Amazon Elastic Compute Cloud (Amazon EC2)](https://docs.aws.amazon.com/ec2/) provides scalable computing capacity in the AWS Cloud. You can launch as many virtual servers as you need and quickly scale them up or down.

## Epics
<a name="allow-ec2-instances-write-access-to-s3-buckets-in-ams-accounts-epics"></a>

### Create an S3 bucket with an RFC
<a name="create-an-s3-bucket-with-an-rfc"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create an S3 bucket by using an automated RFC. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/allow-ec2-instances-write-access-to-s3-buckets-in-ams-accounts.html)Make sure that you record the S3 bucket's name. | AWS systems administrator, AWS developer | 

### Create an IAM instance profile and associate it with the EC2 instances
<a name="create-an-iam-instance-profile-and-associate-it-with-the-ec2-instances"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Submit a manual RFC to create an IAM role. | When an AMS account is onboarded, a default IAM instance profile named `customer-mc-ec2-instance-profile` is created and associated with each EC2 instance in your AMS account. However, the instance profile doesn’t have write permissions to your S3 buckets. To add the write permissions, submit the **Create IAM Resource** manual RFC to create an IAM role that has the following three policies: `customer_ec2_instance_`, `customer_deny_policy`, and `customer_ec2_s3_integration_policy`. The `customer_ec2_instance_` and `customer_deny_policy` policies already exist in your AMS account. However, you need to create `customer_ec2_s3_integration_policy` by using the following sample policy:<pre>{<br />  "Version": "2012-10-17",<br />  "Statement": [<br />    {<br />      "Sid": "",<br />      "Effect": "Allow",<br />      "Principal": {<br />        "Service": "ec2.amazonaws.com"<br />      },<br />      "Action": "sts:AssumeRole"<br />    }<br />  ]<br />}<br /><br />Role Permissions:<br />{<br />  "Version": "2012-10-17",<br />  "Statement": [<br />    {<br />      "Action": [<br />        "s3:ListBucket",<br />        "s3:GetBucketLocation"<br />      ],<br />      "Resource": "arn:aws:s3:::<S3 bucket>",<br />      "Effect": "Allow"<br />    },<br />    {<br />      "Action": [<br />        "s3:GetObject",<br />        "s3:PutObject",<br />        "s3:ListMultipartUploadParts",<br />        "s3:AbortMultipartUpload"<br />      ],<br />      "Resource": "arn:aws:s3:::<S3 bucket>/*",<br />      "Effect": "Allow"<br />    }<br />  ]<br />}</pre> | AWS systems administrator, AWS developer | 
| Submit a manual RFC to replace the IAM instance profile. | Submit a manual RFC to associate the target EC2 instances with the new IAM instance profile. | AWS systems administrator, AWS developer | 
| Test a copy operation to the S3 bucket. | Test a copy operation to the S3 bucket by running the following command in the AWS CLI:<pre>aws s3 cp test.txt s3://<S3 bucket>/test2.txt</pre> | AWS systems administrator, AWS developer | 
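
If you script the policy instead of pasting it into the RFC, the sample `customer_ec2_s3_integration_policy` permissions shown above can be generated for a specific bucket. The following Python sketch is illustrative only; the helper name and the bucket name `amzn-s3-demo-bucket` are assumptions, not part of the AMS RFC:

```python
import json

def build_s3_integration_policy(bucket_name):
    """Build the sample customer_ec2_s3_integration_policy permissions
    document, with both statements scoped to one bucket (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permissions: list the bucket and find its Region.
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
                "Effect": "Allow",
            },
            {
                # Object-level permissions: reads, writes, and multipart uploads.
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListMultipartUploadParts",
                    "s3:AbortMultipartUpload",
                ],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Effect": "Allow",
            },
        ],
    }

policy_json = json.dumps(build_s3_integration_policy("amzn-s3-demo-bucket"), indent=2)
print(policy_json)
```

Scoping the `Resource` ARNs to a single bucket (rather than `*`) keeps the instance profile aligned with the AMS guardrail intent: write access only where you explicitly grant it.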

## Related resources
<a name="allow-ec2-instances-write-access-to-s3-buckets-in-ams-accounts-resources"></a>
+ [Create an IAM instance profile for your Amazon EC2 instances](https://docs.aws.amazon.com/codedeploy/latest/userguide/getting-started-create-iam-instance-profile.html)
+ [Creating an S3 bucket (using the Amazon S3 console, AWS SDKs, or AWS CLI)](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-bucket.html)

# Automate data stream ingestion into a Snowflake database by using Snowflake Snowpipe, Amazon S3, Amazon SNS, and Amazon Data Firehose
<a name="automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose"></a>

*Bikash Chandra Rout, Amazon Web Services*

## Summary
<a name="automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-summary"></a>

This pattern describes how you can use services on the Amazon Web Services (AWS) Cloud to process a continuous stream of data and load it into a Snowflake database. The pattern uses Amazon Data Firehose to deliver the data to Amazon Simple Storage Service (Amazon S3), Amazon Simple Notification Service (Amazon SNS) to send notifications when new data is received, and Snowflake Snowpipe to load the data into a Snowflake database.

By following this pattern, you can have continuously generated data available for analysis in seconds, avoid multiple manual `COPY` commands, and have full support for semi-structured data on load.

## Prerequisites and limitations
<a name="automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-prereqs"></a>

**Prerequisites**
+ An active AWS account.
+ A data source that is continuously sending data to a Firehose delivery stream.
+ An existing S3 bucket that is receiving the data from the Firehose delivery stream.
+ An active Snowflake account.

**Limitations**
+ Snowflake Snowpipe doesn't connect directly to Firehose.

## Architecture
<a name="automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-architecture"></a>

![\[Data ingested by Firehose goes to Amazon S3, Amazon SNS, Snowflake Snowpipe, and the Snowflake DB.\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/images/pattern-img/0c6f473b-973f-4229-a12e-ef697ae9b299/images/0adee3fb-1b90-4f7d-b2d0-b3b958f62c75.png)


**Technology stack**
+ Amazon Data Firehose
+ Amazon SNS
+ Amazon S3
+ Snowflake Snowpipe
+ Snowflake database

## Tools
<a name="automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-tools"></a>
+ [Amazon Data Firehose](https://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html) is a fully managed service for delivering real-time streaming data to destinations such as Amazon S3, Amazon Redshift, Amazon OpenSearch Service, Splunk, and any custom HTTP endpoint or HTTP endpoints owned by supported third-party service providers.
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html) is storage for the internet.
+ [Amazon Simple Notification Service (Amazon SNS)](https://docs.aws.amazon.com/sns/latest/dg/welcome.html) coordinates and manages the delivery or sending of messages to subscribing endpoints or clients.
+ [Snowflake](https://www.snowflake.com/) is an analytic data warehouse provided as Software-as-a-Service (SaaS).
+ [Snowflake Snowpipe](https://docs.snowflake.com/en/user-guide/data-load-snowpipe-intro.html) loads data from files as soon as they’re available in a Snowflake stage.

## Epics
<a name="automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-epics"></a>

### Set up a Snowflake Snowpipe
<a name="set-up-a-snowflake-snowpipe"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create a CSV file format in Snowflake. | Sign in to Snowflake and run the `CREATE FILE FORMAT` command to create a CSV file format with a specified field delimiter. For more information about this and other Snowflake commands, see the [Additional information](#automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-additional) section. | Developer | 
| Create an external Snowflake stage. | Run the `CREATE STAGE` command to create an external Snowflake stage that references the CSV file you created earlier. Important: You will need the URL for the S3 bucket, your AWS access key, and your AWS secret access key. Run the `SHOW STAGES` command to verify that the Snowflake stage is created. | Developer  | 
| Create the Snowflake target table. | Run the `CREATE TABLE` command to create the Snowflake table. | Developer | 
| Create a pipe. | Run the `CREATE PIPE` command; make sure that `auto_ingest=true` is in the command. Run the `SHOW PIPES` command to verify that the pipe is created. Copy and save the `notification_channel` column value. This value will be used to configure Amazon S3 event notifications. | Developer | 
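
The four tasks above map to four SQL statements. A minimal sketch follows, wrapped in a Python helper for use with the Snowflake Python connector; every object name (`MY_CSV_FORMAT`, `MY_S3_STAGE`, `MY_TABLE`, `MY_PIPE`) and every `<...>` value is a placeholder, not a name this pattern requires:

```python
# Sketch of the Snowpipe setup sequence from the epic above.
# All object names and <...> values are placeholders -- substitute your own.
SETUP_STATEMENTS = [
    # 1. File format with a '|' field delimiter.
    "CREATE FILE FORMAT MY_CSV_FORMAT TYPE = 'CSV' FIELD_DELIMITER = '|' SKIP_HEADER = 1;",
    # 2. External stage over the S3 bucket that Firehose delivers to.
    "CREATE STAGE MY_S3_STAGE URL = 's3://<bucket>/<path>/' "
    "CREDENTIALS = (AWS_KEY_ID = '<key>' AWS_SECRET_KEY = '<secret>') "
    "FILE_FORMAT = MY_CSV_FORMAT;",
    # 3. Target table for the loaded rows.
    "CREATE TABLE MY_TABLE (event_time TIMESTAMP, payload VARCHAR);",
    # 4. Pipe with auto_ingest = true so S3 event notifications trigger loads.
    "CREATE PIPE MY_PIPE AUTO_INGEST = TRUE AS "
    "COPY INTO MY_TABLE FROM @MY_S3_STAGE;",
]

def run_setup(cursor):
    """Execute the statements with a snowflake-connector-style cursor (sketch)."""
    for stmt in SETUP_STATEMENTS:
        cursor.execute(stmt)
```

After running the sequence, `SHOW STAGES` and `SHOW PIPES` confirm the objects exist, and the `notification_channel` column of `SHOW PIPES` supplies the SQS ARN used later for the S3 event notification.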

### Configure the S3 bucket
<a name="configure-the-s3-bucket"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create a 30-day lifecycle policy for the S3 bucket. | Sign in to the AWS Management Console and open the Amazon S3 console. Choose the S3 bucket that contains the data from Firehose. Then choose the **Management** tab in the S3 bucket and choose **Add lifecycle rule**. Enter a name for your rule in the **Lifecycle rule** dialog box, and configure a 30-day lifecycle rule for your bucket. For help with this and other stories, see the [Related resources](#automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-resources) section. | System Administrator, Developer | 
| Create an IAM policy for the S3 bucket. | Open the AWS Identity and Access Management (IAM) console and choose **Policies**. Choose **Create policy**, and choose the **JSON** tab. Copy and paste the policy from the [Additional information](#automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-additional) section into the JSON field. This policy will grant `PutObject` and `DeleteObject` permissions, as well as `GetObject`, `GetObjectVersion`, and `ListBucket` permissions. Choose **Review policy**, enter a policy name, and then choose **Create policy**. | System Administrator, Developer | 
| Assign the policy to an IAM role. | Open the IAM console, choose **Roles**, and then choose **Create role**. Choose **Another AWS account** as the trusted entity. Enter your AWS account ID, and choose **Require external ID**. Enter a placeholder ID that you will change later. Choose **Next**, and assign the IAM policy you created earlier. Then create the IAM role. | System Administrator, Developer | 
| Copy the Amazon Resource Name (ARN) for the IAM role. | Open the IAM console, and choose **Roles**. Choose the IAM role you created earlier, and then copy and store the **Role ARN**. | System Administrator, Developer | 

### Set up a storage integration in Snowflake
<a name="set-up-a-storage-integration-in-snowflake"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create a storage integration in Snowflake. | Sign in to Snowflake and run the `CREATE STORAGE INTEGRATION` command. This will modify the trust relationship, grant access to Snowflake, and provide the external ID for your Snowflake stage. | System Administrator, Developer | 
| Retrieve the IAM role for your Snowflake account. | Run the `DESC INTEGRATION` command to retrieve the ARN for the IAM role. In the command, `<integration_name>` is the name of the Snowflake storage integration you created earlier. | System Administrator, Developer | 
| Record two column values. | Copy and save the values for the `storage_aws_iam_user_arn` and `storage_aws_external_id` columns. | System Administrator, Developer | 

### Allow Snowflake Snowpipe to access the S3 bucket
<a name="allow-snowflake-snowpipe-to-access-the-s3-bucket"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Modify the IAM role policy. | Open the IAM console and choose **Roles**. Choose the IAM role you created earlier and choose the **Trust relationships** tab. Choose **Edit trust relationship**. Replace `snowflake_external_id` with the `storage_aws_external_id` value you copied earlier. Replace `snowflake_user_arn` with the `storage_aws_iam_user_arn` value you copied earlier. Then choose **Update trust policy**. | System Administrator, Developer | 
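
The edited trust policy has a standard shape: it allows the Snowflake-owned IAM user to assume the role, gated on the external ID. The following sketch fills in the two values copied from `DESC INTEGRATION`; the ARN and external ID shown are placeholders:

```python
import json

def build_snowflake_trust_policy(storage_aws_iam_user_arn, storage_aws_external_id):
    """Trust policy allowing the Snowflake IAM user to assume the role,
    conditioned on the external ID (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": storage_aws_iam_user_arn},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": storage_aws_external_id}
                },
            }
        ],
    }

print(json.dumps(
    build_snowflake_trust_policy(
        "arn:aws:iam::123456789012:user/example-snowflake-user",  # placeholder ARN
        "EXAMPLE_EXTERNAL_ID",  # placeholder external ID
    ),
    indent=2,
))
```

The external ID condition is what prevents the confused-deputy problem: even though the Snowflake account is trusted, it can assume the role only when it presents the ID tied to your storage integration.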

### Turn on and configure SNS notifications for the S3 bucket
<a name="turn-on-and-configure-sns-notifications-for-the-s3-bucket"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Turn on event notifications for the S3 bucket. | Open the Amazon S3 console and choose your bucket. Choose **Properties**, and under **Advanced settings**, choose **Events**. Choose **Add notification**, and enter a name for this event. If you don't enter a name, a globally unique identifier (GUID) will be used. | System Administrator, Developer | 
| Configure Amazon SNS notifications for the S3 bucket. | Under **Events**, choose **ObjectCreate (All)**, and then choose **SQS Queue** in the **Send to** dropdown list. In the **SNS** list, choose **Add SQS queue ARN**, and paste the `notification_channel` value you copied earlier. Then choose **Save**. | System Administrator, Developer | 
| Subscribe the Snowflake SQS queue to the SNS topic. | Subscribe the Snowflake SQS queue to the SNS topic you created. For help with this step, see the [Related resources](#automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-resources) section. | System Administrator, Developer | 
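
Equivalently, the console steps above produce a bucket notification configuration document that routes `ObjectCreated` events to the Snowpipe queue. A sketch of that document (the queue ARN is the `notification_channel` value you copied; the one shown is a placeholder):

```python
def build_notification_config(notification_channel_arn):
    """S3 bucket notification configuration sending ObjectCreated events to
    the Snowpipe SQS queue (sketch; same shape as the document used by
    put-bucket-notification-configuration)."""
    return {
        "QueueConfigurations": [
            {
                # ARN from the notification_channel column of SHOW PIPES.
                "QueueArn": notification_channel_arn,
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }

config = build_notification_config(
    "arn:aws:sqs:us-east-1:123456789012:sf-snowpipe-EXAMPLE"  # placeholder ARN
)
```

With this in place, every new object Firehose writes to the bucket triggers a notification, and Snowpipe loads the file without any manual `COPY` command.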

### Check the Snowflake stage integration
<a name="check-the-snowflake-stage-integration"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Check and test Snowpipe. | Sign in to Snowflake and open the Snowflake stage. Drop files into your S3 bucket and check if the Snowflake table loads them. Amazon S3 will send SNS notifications to Snowpipe when new objects appear in the S3 bucket. | System Administrator, Developer | 

## Related resources
<a name="automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-resources"></a>
+ [Managing your storage lifecycle](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-lifecycle.html)
+ [Subscribe the Snowflake SQS Queue to the Amazon SNS Topic](https://docs.snowflake.com/en/user-guide/data-load-snowpipe-auto-s3.html#prerequisite-create-an-amazon-sns-topic-and-subscription)

## Additional information
<a name="automate-data-stream-ingestion-into-a-snowflake-database-by-using-snowflake-snowpipe-amazon-s3-amazon-sns-and-amazon-data-firehose-additional"></a>

**Create a file format:**

```
CREATE FILE FORMAT <name>
TYPE = 'CSV'
FIELD_DELIMITER = '|'
SKIP_HEADER = 1;
```

**Create an external stage:**

```
externalStageParams (for Amazon S3) ::=
  URL = 's3://<bucket>[/<path>/]'

  [ { STORAGE_INTEGRATION = <integration_name> } | { CREDENTIALS = ( { { AWS_KEY_ID = '<string>' AWS_SECRET_KEY = '<string>' [ AWS_TOKEN = '<string>' ] } | AWS_ROLE = '<string>' } ) } ]
  [ ENCRYPTION = ( [ TYPE = 'AWS_CSE' ] [ MASTER_KEY = '<string>' ] |
                   [ TYPE = 'AWS_SSE_S3' ] |
                   [ TYPE = 'AWS_SSE_KMS' [ KMS_KEY_ID = '<string>' ] ] |
                   [ TYPE = NONE ] ) ]
```

**Create a table:**

```
CREATE [ OR REPLACE ] [ { [ LOCAL | GLOBAL ] TEMP[ORARY] | VOLATILE } | TRANSIENT ] TABLE [ IF NOT EXISTS ]
  <table_name>
    ( <col_name> <col_type> [ { DEFAULT <expr>
                               | { AUTOINCREMENT | IDENTITY } [ ( <start_num> , <step_num> ) | START <num> INCREMENT <num> ] } ]
                                /* AUTOINCREMENT / IDENTITY supported only for numeric data types (NUMBER, INT, etc.) */
                            [ inlineConstraint ]
      [ , <col_name> <col_type> ... ]
      [ , outoflineConstraint ]
      [ , ... ] )
  [ CLUSTER BY ( <expr> [ , <expr> , ... ] ) ]
  [ STAGE_FILE_FORMAT = ( { FORMAT_NAME = '<file_format_name>'
                           | TYPE = { CSV | JSON | AVRO | ORC | PARQUET | XML } [ formatTypeOptions ] } ) ]
  [ STAGE_COPY_OPTIONS = ( copyOptions ) ]
  [ DATA_RETENTION_TIME_IN_DAYS = <num> ]
  [ COPY GRANTS ]
  [ COMMENT = '<string_literal>' ]
```

**Show stages:**

```
SHOW STAGES;
```

**Create a pipe:**

```
CREATE [ OR REPLACE ] PIPE [ IF NOT EXISTS ] <name>
  [ AUTO_INGEST = [ TRUE | FALSE ] ]
  [ AWS_SNS_TOPIC = '<string>' ]
  [ INTEGRATION = '<string>' ]
  [ COMMENT = '<string_literal>' ]
  AS <copy_statement>
```

**Show pipes:**

```
SHOW PIPES [ LIKE '<pattern>' ]           
           [ IN { ACCOUNT | [ DATABASE ] <db_name> | [ SCHEMA ] <schema_name> } ]
```

**Create a storage integration:**

```
CREATE STORAGE INTEGRATION <integration_name>
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = S3
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = '<iam_role>'
  STORAGE_ALLOWED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/')
  [ STORAGE_BLOCKED_LOCATIONS = ('s3://<bucket>/<path>/', 's3://<bucket>/<path>/') ]
```

Example:

```
create storage integration s3_int
  type = external_stage
  storage_provider = s3
  enabled = true
  storage_aws_role_arn = 'arn:aws:iam::001234567890:role/myrole'
  storage_allowed_locations = ('s3://amzn-s3-demo-bucket1/mypath1/', 's3://amzn-s3-demo-bucket2/mypath2/')
  storage_blocked_locations = ('s3://amzn-s3-demo-bucket1/mypath1/sensitivedata/', 's3://amzn-s3-demo-bucket2/mypath2/sensitivedata/');
```

For more information about this step, see [Configuring a Snowflake storage integration to access Amazon S3](https://docs.snowflake.com/en/user-guide/data-load-s3-config-storage-integration.html) from the Snowflake documentation.

**Describe an integration:**

```
DESC INTEGRATION <integration_name>;
```

**S3 bucket policy:**

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
              "s3:PutObject",
              "s3:GetObject",
              "s3:GetObjectVersion",
              "s3:DeleteObject",
              "s3:DeleteObjectVersion"
            ],
            "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::<bucket>",
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "<prefix>/*"
                    ]
                }
            }
        }
    ]
}
```

# Automatically encrypt existing and new Amazon EBS volumes
<a name="automatically-encrypt-existing-and-new-amazon-ebs-volumes"></a>

*Tony DeMarco and Josh Joy, Amazon Web Services*

## Summary
<a name="automatically-encrypt-existing-and-new-amazon-ebs-volumes-summary"></a>

Encryption of Amazon Elastic Block Store (Amazon EBS) volumes is an important part of an organization's data protection strategy and a key step in establishing a well-architected environment. Although there is no direct way to encrypt existing unencrypted EBS volumes or snapshots, you can encrypt them by creating a new volume or snapshot. For more information, see [Encrypt EBS resources](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html#encryption-parameters) in the Amazon EC2 documentation. This pattern provides preventative and detective controls for encrypting your EBS volumes, both new and existing. In this pattern, you configure account settings, create automated remediation processes, and implement access controls.

## Prerequisites and limitations
<a name="automatically-encrypt-existing-and-new-amazon-ebs-volumes-prereqs"></a>

**Prerequisites**
+ An active Amazon Web Services (AWS) account
+ [AWS Command Line Interface (AWS CLI)](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html), installed and configured on macOS, Linux, or Windows
+ [jq](https://stedolan.github.io/jq/download/), installed and configured on macOS, Linux, or Windows
+ AWS Identity and Access Management (IAM) permissions, provisioned with read and write access to AWS CloudFormation, Amazon Elastic Compute Cloud (Amazon EC2), AWS Systems Manager, AWS Config, and AWS Key Management Service (AWS KMS)
+ AWS Organizations, configured with all features enabled (a requirement for service control policies)
+ AWS Config, enabled in the target accounts

**Limitations**
+ In your target AWS account, there must be no AWS Config rules named **encrypted-volumes**. This solution deploys a rule with this name. Preexisting rules with this name can cause the deployment to fail and result in unnecessary charges related to processing the same rule more than once.
+ This solution encrypts all EBS volumes with the same AWS KMS key.
+ If you enable encryption of EBS volumes for the account, this setting is Region-specific. If you enable it for an AWS Region, you cannot disable it for individual volumes or snapshots in that Region. For more information, see [Encryption by default](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html#encryption-by-default) in the Amazon EC2 documentation.
+ When you remediate existing, unencrypted EBS volumes, ensure that the EC2 instance is not in use. This automation shuts down the instance in order to detach the unencrypted volume and attach the encrypted one. There is downtime while the remediation is in progress. If this is a critical piece of infrastructure for your organization, make sure that [manual](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/scenarios-enis.html#create-a-low-budget-high-availability-solution) or [automatic](https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html) high-availability configurations are in place so as to not impact the availability of any applications running on the instance. We recommend that you remediate critical resources only during standard maintenance windows.

## Architecture
<a name="automatically-encrypt-existing-and-new-amazon-ebs-volumes-architecture"></a>

**Automation workflow**

![\[High-level architecture diagram showing the automation process and services\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/images/pattern-img/484fd5fe-e10a-41f6-aafe-260ea824883b/images/483f551c-ca1d-4c1e-b3c7-989df7d3b059.png)


1. AWS Config detects an unencrypted EBS volume.

1. An administrator uses AWS Config to send a remediation command to Systems Manager.

1. The Systems Manager automation takes a snapshot of the unencrypted EBS volume.

1. The Systems Manager automation uses AWS KMS to create an encrypted copy of the snapshot.

1. The Systems Manager automation does the following:

   1. Stops the affected EC2 instance if it is running

   1. Attaches the new, encrypted copy of the volume to the EC2 instance

   1. Returns the EC2 instance to its original state
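
The detection in step 1 amounts to checking the `Encrypted` flag on each volume. A minimal sketch of that check against a `describe-volumes`-shaped response (the volume IDs are made up for illustration):

```python
def find_unencrypted_volumes(describe_volumes_response):
    """Return the IDs of volumes whose Encrypted flag is False -- the same
    condition the encrypted-volumes AWS Config rule flags (sketch)."""
    return [
        v["VolumeId"]
        for v in describe_volumes_response.get("Volumes", [])
        if not v.get("Encrypted", False)
    ]

# Sample response in the shape returned by ec2 describe-volumes (IDs made up).
sample = {
    "Volumes": [
        {"VolumeId": "vol-0aaa1111bbbb2222c", "Encrypted": False},
        {"VolumeId": "vol-0ddd3333eeee4444f", "Encrypted": True},
    ]
}
print(find_unencrypted_volumes(sample))  # ['vol-0aaa1111bbbb2222c']
```

Each volume ID returned by a check like this is what the remediation in steps 2 through 5 then snapshots, encrypts, and reattaches.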

## Tools
<a name="automatically-encrypt-existing-and-new-amazon-ebs-volumes-tools"></a>

**AWS services**
+ [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html) – The AWS Command Line Interface (AWS CLI) provides direct access to the public application programming interfaces (APIs) of AWS services. You can explore a service's capabilities with the AWS CLI and develop shell scripts to manage your resources. In addition to the low-level API-equivalent commands, several AWS services provide customizations for the AWS CLI. Customizations can include higher-level commands that simplify using a service with a complex API.
+ [AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html) – AWS CloudFormation is a service that helps you model and set up your AWS resources. You create a template that describes all the AWS resources that you want (such as Amazon EC2 instances), and CloudFormation provisions and configures those resources for you.
+ [AWS Config](https://docs.aws.amazon.com/config/latest/developerguide/WhatIsConfig.html) – AWS Config provides a detailed view of the configuration of AWS resources in your AWS account. This includes how the resources are related to one another and how they were configured in the past so that you can see how the configurations and relationships change over time.
+ [Amazon EC2](https://docs.aws.amazon.com/ec2/?id=docs_gateway) – Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable computing capacity that you use to build and host your software systems.
+ [AWS KMS](https://docs.aws.amazon.com/kms/latest/developerguide/overview.html) – AWS Key Management Service (AWS KMS) is an encryption and key management service scaled for the cloud. AWS KMS keys and functionality are used by other AWS services, and you can use them to protect data in your AWS environment.
+ [AWS Organizations](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_introduction.html) – AWS Organizations is an account management service that enables you to consolidate multiple AWS accounts into an organization that you create and centrally manage.
+ [AWS Systems Manager Automation](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-automation.html) – Systems Manager Automation simplifies common maintenance and deployment tasks for Amazon EC2 instances and other AWS resources.

**Other services**
+ [jq](https://stedolan.github.io/jq/download/) – jq is a lightweight and flexible command-line JSON processor. You use this tool to extract key information from the AWS CLI output.

**Code**
+ The code for this pattern is available in the GitHub [Automatically remediate unencrypted EBS Volumes using customer KMS keys](https://github.com/aws-samples/aws-system-manager-automation-unencrypted-to-encrypted-resources/tree/main/ebs) repository.

## Epics
<a name="automatically-encrypt-existing-and-new-amazon-ebs-volumes-epics"></a>

### Automate remediation of unencrypted volumes
<a name="automate-remediation-of-unencrypted-volumes"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Download scripts and CloudFormation templates. | Download the shell script, JSON file, and CloudFormation templates from the GitHub [Automatically remediate unencrypted EBS Volumes using customer KMS keys](https://github.com/aws-samples/aws-system-manager-automation-unencrypted-to-encrypted-resources/tree/main/ebs) repository. | AWS administrator, General AWS | 
| Identify the administrator for the AWS KMS key. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/automatically-encrypt-existing-and-new-amazon-ebs-volumes.html) | AWS administrator, General AWS | 
| Deploy the Stack1 CloudFormation template. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/automatically-encrypt-existing-and-new-amazon-ebs-volumes.html)For more information about deploying a CloudFormation template, see [Working with AWS CloudFormation templates](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/template-guide.html) in the CloudFormation documentation. | AWS administrator, General AWS | 
| Deploy the Stack2 CloudFormation template. | In CloudFormation, deploy the `Stack2.yaml` template. Note the following deployment details:[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/automatically-encrypt-existing-and-new-amazon-ebs-volumes.html) | AWS administrator, General AWS | 
| Create an unencrypted volume for testing. | Create an EC2 instance with an unencrypted EBS volume. For instructions, see [Create an Amazon EBS volume](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-creating-volume.html) in the Amazon EC2 documentation. The instance type does not matter, and access to the instance is not needed. You can create a t2.micro instance to stay in the free tier, and you don’t need to create a key pair. | AWS administrator, General AWS | 
| Test the AWS Config rule. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/automatically-encrypt-existing-and-new-amazon-ebs-volumes.html) You can view the remediation progress and status in Systems Manager as follows: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/automatically-encrypt-existing-and-new-amazon-ebs-volumes.html) | AWS administrator, General AWS | 
| Configure additional accounts or AWS Regions. | As needed for your use case, repeat this epic for any additional accounts or AWS Regions. | AWS administrator, General AWS | 
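
Before and after testing the AWS Config rule, it can help to check which EBS volumes in a Region are still unencrypted. The sketch below filters `aws ec2 describe-volumes` output with jq (listed under *Other resources* for this pattern); the helper function name `unencrypted_ids` is illustrative, but the `Encrypted` field is part of the standard `describe-volumes` response.

```shell
#!/usr/bin/env bash
# List the IDs of unencrypted EBS volumes from `aws ec2 describe-volumes` JSON.
# unencrypted_ids is an illustrative helper name: it reads the JSON from stdin
# and prints one volume ID per line for every volume with "Encrypted": false.
unencrypted_ids() {
  jq -r '.Volumes[] | select(.Encrypted == false) | .VolumeId'
}

# Example invocation (requires AWS credentials and jq):
# aws ec2 describe-volumes --region us-east-1 | unencrypted_ids
```

After the Systems Manager Automation remediation completes, the same check should return no volume IDs for the test volume.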

### Enable account-level encryption of EBS volumes
<a name="enable-account-level-encryption-of-ebs-volumes"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Run the enable script. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/automatically-encrypt-existing-and-new-amazon-ebs-volumes.html) | AWS administrator, General AWS, bash | 
| Confirm the settings are updated. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/automatically-encrypt-existing-and-new-amazon-ebs-volumes.html) | AWS administrator, General AWS | 
| Configure additional accounts or AWS Regions. | As needed for your use case, repeat this epic for any additional accounts or AWS Regions. | AWS administrator, General AWS | 

### Prevent creation of unencrypted instances
<a name="prevent-creation-of-unencrypted-instances"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create a service control policy. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/automatically-encrypt-existing-and-new-amazon-ebs-volumes.html) | AWS administrator, General AWS | 
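
A service control policy for this task might deny API calls that would create unencrypted EBS volumes. The following is a minimal sketch, not a tested policy from this pattern; the `Sid` values are illustrative, and `ec2:Encrypted` is the documented condition key for these actions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedVolumes",
      "Effect": "Deny",
      "Action": "ec2:CreateVolume",
      "Resource": "*",
      "Condition": { "Bool": { "ec2:Encrypted": "false" } }
    },
    {
      "Sid": "DenyUnencryptedLaunches",
      "Effect": "Deny",
      "Action": "ec2:RunInstances",
      "Resource": "arn:aws:ec2:*:*:volume/*",
      "Condition": { "Bool": { "ec2:Encrypted": "false" } }
    }
  ]
}
```

Attach the policy to the target organizational unit in AWS Organizations, and verify it in a sandbox account before rolling it out broadly.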

## Related resources
<a name="automatically-encrypt-existing-and-new-amazon-ebs-volumes-resources"></a>

**AWS service documentation**
+ [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html)
+ [AWS Config](https://docs.aws.amazon.com/config/latest/developerguide/WhatIsConfig.html)
+ [AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html)
+ [Amazon EC2](https://docs.aws.amazon.com/ec2/?id=docs_gateway)
+ [AWS KMS](https://docs.aws.amazon.com/kms/latest/developerguide/overview.html)
+ [AWS Organizations](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_introduction.html)
+ [AWS Systems Manager Automation](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-automation.html)

**Other resources**
+ [jq manual](https://stedolan.github.io/jq/manual/) (jq website)
+ [jq download](https://github.com/stedolan/jq) (GitHub)

# Back up Sun SPARC servers in the Stromasys Charon-SSP emulator on the AWS Cloud
<a name="back-up-sun-sparc-servers-in-the-stromasys-charon-ssp-emulator-on-the-aws-cloud"></a>

*Kevin Yung and Rohit Darji, Amazon Web Services*

*Luis Ramos, Stromasys*

## Summary
<a name="back-up-sun-sparc-servers-in-the-stromasys-charon-ssp-emulator-on-the-aws-cloud-summary"></a>

This pattern provides four options for backing up your Sun Microsystems SPARC servers after a migration from an on-premises environment to the Amazon Web Services (AWS) Cloud. These backup options help you to implement a backup plan that meets your organization’s recovery point objective (RPO) and recovery time objective (RTO), uses automated approaches, and lowers your overall operational costs. The pattern provides an overview of the four backup options and steps to implement them.

If you use a Sun SPARC server hosted as a guest on a [Stromasys Charon-SSP emulator](https://www.stromasys.com/solution/charon-on-the-aws-cloud/), you can use one of the following three backup options:
+ **Backup option 1: Stromasys virtual tape** – Use the Charon-SSP virtual tape feature to set up a backup facility in the Sun SPARC server and archive your backup files to [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) by using [AWS Systems Manager Automation](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-automation.html).
+ **Backup option 2: Stromasys snapshot** – Use the Charon-SSP snapshot feature to set up a backup facility for the Sun SPARC guest servers in Charon-SSP.
+ **Backup option 3: Amazon Elastic Block Store (Amazon EBS) volume snapshot** – If you host the Charon-SSP emulator on Amazon Elastic Compute Cloud (Amazon EC2), you can use an [Amazon EBS volume snapshot](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html) to create backups for a Sun SPARC file system.

If you use a Sun SPARC server hosted as a guest on hardware and Charon-SSP on Amazon EC2, you can use the following backup option:
+ **Backup option 4: AWS Storage Gateway virtual tape library (VTL)** – Use a backup application with a [Storage Gateway](https://docs.aws.amazon.com/storagegateway/latest/userguide/WhatIsStorageGateway.html) VTL Tape Gateway to back up the Sun SPARC servers.

If you use a Sun SPARC server hosted as a branded zone in a Sun SPARC server, you can use backup options 1, 2, and 4.

[Stromasys](https://www.stromasys.com) provides software and services to emulate legacy SPARC, Alpha, VAX, and PA-RISC critical systems. For more information about migrating to the AWS Cloud using Stromasys emulation, see [Rehosting SPARC, Alpha, or other legacy systems to AWS with Stromasys](https://aws.amazon.com/blogs/apn/re-hosting-sparc-alpha-or-other-legacy-systems-to-aws-with-stromasys/) on the AWS Blog.  

## Prerequisites and limitations
<a name="back-up-sun-sparc-servers-in-the-stromasys-charon-ssp-emulator-on-the-aws-cloud-prereqs"></a>

**Prerequisites**
+ An active AWS account. 
+ Existing Sun SPARC servers.
+ Existing licenses for Charon-SSP. Charon-SSP licenses are available from AWS Marketplace, and Stromasys Virtual Environment (VE) licenses are available from Stromasys. For more information, contact [Stromasys sales](https://www.stromasys.com/contact/).
+ Familiarity with Sun SPARC servers and Linux backups. 
+ Familiarity with Charon-SSP emulation technology. For more information about this, see [Stromasys legacy server emulation](https://www.stromasys.com/solutions/charon-on-the-aws-cloud/) in the Stromasys documentation.
+ If you want to use the virtual tape facility or backup applications for your Sun SPARC server file systems, you must create and configure the backup facilities for those file systems. 
+ An understanding of RPO and RTO. For more information about this, see [Disaster recovery objectives](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/disaster-recovery-dr-objectives.html) from the [Reliability Pillar](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/welcome.html) whitepaper in the AWS Well-Architected Framework documentation. 
+ To use **Backup option 4**, you must have the following: 
  + A software-based backup application that supports a Storage Gateway VTL Tape Gateway. For more information about this, see [Working with VTL devices](https://docs.aws.amazon.com/storagegateway/latest/tgw/WhatIsStorageGateway.html) in the AWS Storage Gateway documentation. 
  + Bacula Director or a similar backup application, installed and configured. For more information about this, see the [Bacula Director](https://www.bacula.org/5.2.x-manuals/en/main/main/Configuring_Director.html) documentation.

The following table provides information about the four backup options in this pattern.


| **Backup options** | **Achieves crash consistency?** | **Achieves application consistency?** | **Virtual backup appliance solution?** | **Typical use cases** | 
| --- |--- |--- |--- |--- |
| **Option 1 – Stromasys virtual tape** | **Yes.** You can automate Sun SPARC file system snapshots (for example, UFS or ZFS snapshots) to back up data to a virtual tape. | **Yes.** This backup option requires an automated script to flush in-flight transactions, configure a read-only or temporary offline mode during the file system snapshot, or take an application data dump. You might also require application downtime or read-only mode. | **Yes** | Sun SPARC server file system backups with .tar or .zip files; application data backups | 
| **Option 2 – Stromasys snapshot** | **Yes.** You must configure [Charon-SSP Manager](https://stromasys.atlassian.net/wiki/spaces/DocCHSSP40preAWS/pages/522190974/Charon-SSP+Manager+Installation%20/) or use a command-line startup argument to enable this feature. You must also run a Linux command that asks the Charon-SSP emulator to save the Sun SPARC guest server state into a snapshot file, and you must shut down the Sun SPARC guest server during the snapshot. | **Yes.** This backup option creates a snapshot of the emulated guest server, including its virtual disks and memory dump. You must shut down the Sun SPARC guest server during the snapshot. | **No** | Sun SPARC server snapshots; application data backups | 
| **Option 3 – Amazon EBS volume snapshot** | **Yes.** You can use AWS Backup to automate the Amazon EBS snapshot. | **Yes.** This backup option requires an automated script to flush in-flight transactions and configure a read-only mode or a temporary stop of the Amazon EC2 instance during the Amazon EBS volume snapshot. This backup option might require application downtime or read-only mode to achieve application consistency. | **No** | Sun SPARC server file system snapshots; application data backups | 
| **Option 4 – AWS Storage Gateway VTL** | **Yes.** You can automatically back up Sun SPARC file system data to the VTL by using a backup agent. | **Yes.** This backup option requires an automated script to flush in-flight transactions and configure a read-only or temporary offline mode during the file system snapshot or application data dump. This backup option might require application downtime or read-only mode. | **Yes** | Backups for a large fleet of Sun SPARC server file systems; application data backups | 

**Limitations**
+ This pattern's approaches back up individual Sun SPARC servers. However, if you have applications that run in a cluster, you can also use these backup options for shared data.

## Tools
<a name="back-up-sun-sparc-servers-in-the-stromasys-charon-ssp-emulator-on-the-aws-cloud-tools"></a>

**Backup option 1: Stromasys virtual tape**
+ [Stromasys Charon-SSP emulator](https://stromasys.atlassian.net/wiki/spaces/KBP/pages/39158045/CHARON-SSP) creates the virtual replica of the original SPARC hardware inside a standard 64-bit x86 compatible computer system. It runs the original SPARC binary code, including operating systems (OSs) such as SunOS or Solaris, their layered products, and applications.
+ [Amazon Elastic Compute Cloud (Amazon EC2)](https://docs.aws.amazon.com/ec2/index.html) is a web service that provides resizable computing capacity that you use to build and host your software systems.
+ [Amazon Elastic File System (Amazon EFS)](https://docs.aws.amazon.com/efs/latest/ug/whatisefs.html) provides a simple, serverless, set-and-forget elastic file system for use with AWS services and on-premises resources.
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) is storage for the internet. 
+ [AWS Systems Manager Automation](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-automation.html) simplifies common maintenance and deployment tasks of Amazon EC2 instances and other AWS resources.

 

**Backup option 2: Stromasys snapshot**
+ [Stromasys Charon-SSP emulator](https://stromasys.atlassian.net/wiki/spaces/KBP/pages/39158045/CHARON-SSP) creates the virtual replica of the original SPARC hardware inside a standard 64-bit x86 compatible computer system. It runs the original SPARC binary code, including OSs such as SunOS or Solaris, their layered products, and applications.
+ [Amazon Elastic Compute Cloud (Amazon EC2)](https://docs.aws.amazon.com/ec2/index.html) is a web service that provides resizable computing capacity that you use to build and host your software systems.
+ [Amazon Elastic File System (Amazon EFS)](https://docs.aws.amazon.com/efs/latest/ug/whatisefs.html) provides a simple, serverless, set-and-forget elastic file system for use with AWS services and on-premises resources.
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) is storage for the internet. 
+ [AWS Systems Manager Automation](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-automation.html) simplifies common maintenance and deployment tasks of Amazon EC2 instances and other AWS resources.

 

**Backup option 3: Amazon EBS volume snapshot**
+ [Stromasys Charon-SSP emulator](https://stromasys.atlassian.net/wiki/spaces/KBP/pages/39158045/CHARON-SSP) creates the virtual replica of the original SPARC hardware inside a standard 64-bit x86 compatible computer system. It runs the original SPARC binary code, including OSs such as SunOS or Solaris, their layered products, and applications.
+ [AWS Backup](https://docs.aws.amazon.com/aws-backup/latest/devguide/whatisbackup.html) is a fully managed data protection service that makes it easy to centralize and automate data protection across AWS services, in the cloud, and on premises.
+ [Amazon Elastic Block Store (Amazon EBS)](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html) provides block level storage volumes for use with Amazon EC2 instances.
+ [Amazon Elastic Compute Cloud (Amazon EC2)](https://docs.aws.amazon.com/ec2/index.html) is a web service that provides resizable computing capacity that you use to build and host your software systems.

 

**Backup option 4: AWS Storage Gateway VTL**
+ [Stromasys Charon-SSP emulator](https://stromasys.atlassian.net/wiki/spaces/KBP/pages/39158045/CHARON-SSP) creates the virtual replica of the original SPARC hardware inside a standard 64-bit x86 compatible computer system. It runs the original SPARC binary code, including OSs such as SunOS or Solaris, their layered products, and applications.
+ [Bacula](https://www.baculasystems.com/try/?gclid=EAIaIQobChMInsywntC98gIVkT2tBh16ug3_EAAYASAAEgL-nPD_BwE) is an open-source, enterprise-level computer backup system. For more information about whether your existing backup application supports Tape Gateway, see [Supported third-party backup applications for a Tape Gateway](https://docs.aws.amazon.com/storagegateway/latest/userguide/Requirements.html#requirements-backup-sw-for-vtl) in the AWS Storage Gateway documentation. 
+ [Amazon Elastic Compute Cloud (Amazon EC2)](https://docs.aws.amazon.com/ec2/index.html) is a web service that provides resizable computing capacity that you use to build and host your software systems.
+ [Amazon Relational Database Service (Amazon RDS) for MySQL](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_MySQL.html) supports DB instances running several versions of MySQL. 
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) is storage for the internet. 
+ [AWS Storage Gateway](https://docs.aws.amazon.com/storagegateway/latest/userguide/WhatIsStorageGateway.html) connects an on-premises software appliance with cloud-based storage to provide seamless integration with data security features between your on-premises IT environment and the AWS storage infrastructure.

## Epics
<a name="back-up-sun-sparc-servers-in-the-stromasys-charon-ssp-emulator-on-the-aws-cloud-epics"></a>

### Backup option 1 – Create a Stromasys virtual tape backup
<a name="backup-option-1-ndash-create-a-stromasys-virtual-tape-backup"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create an Amazon EFS shared file system for virtual tape file storage. | Sign in to the AWS Management Console or use the AWS Command Line Interface (AWS CLI) to create an Amazon EFS file system. For more information about this, see [Create an Amazon EFS file system](https://docs.aws.amazon.com/efs/latest/ug/gs-step-two-create-efs-resources.html) in the Amazon EFS documentation. | Cloud architect | 
| Configure the Linux host to mount the shared file system. | Install the Amazon EFS driver on the Amazon EC2 Linux instance and configure the Linux OS to mount the Amazon EFS shared file system during startup. For more information about this, see [Mounting file systems using the Amazon EFS mount helper](https://docs.aws.amazon.com/efs/latest/ug/efs-mount-helper.html) in the Amazon EFS documentation. | DevOps engineer | 
| Install the Charon-SSP emulator. | Install the Charon-SSP emulator on the Amazon EC2 Linux instance. For more information about this, see [Setting up an AWS Cloud instance for Charon-SSP](https://stromasys.atlassian.net/wiki/spaces/DocCHSSP405AWS/pages/718241894/Setting+up+a+Charon-SSP+AWS+Cloud+Instance) in the Stromasys documentation. | DevOps engineer | 
| Create a virtual tape file container in the shared file system for each Sun SPARC guest server. | Run the `touch <vtape-container-name>` command to create a virtual tape file container in the shared file system for each Sun SPARC guest server deployed in the Charon-SSP emulator. | DevOps engineer | 
| Configure Charon-SSP Manager to create virtual tape devices for the Sun SPARC guest servers. | Log in to Charon-SSP Manager, create virtual tape devices, and configure them to use the virtual tape container files for each Sun SPARC guest server. For more information about this, see the [Charon-SSP 5.2 for Linux user guide](https://stromasys.atlassian.net/wiki/spaces/KBP/pages/76429819926/CHARON-SSP+V5.2+for+Linux) in the Stromasys documentation. | DevOps engineer | 
| Validate that the virtual tape device is available in the Sun SPARC guest servers. | Log in to each Sun SPARC guest server and run the `mt -f /dev/rmt/1` command to validate that the virtual tape device is configured in the OS. | DevOps engineer | 
| Develop the Systems Manager Automation runbook and automation. | Develop the Systems Manager Automation runbook and set up maintenance windows and associations in Systems Manager for scheduling the backup process. For more information about this, see [Automation walkthroughs](https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-walk.html) and [Setting up maintenance windows](https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-maintenance-permissions.html) in the AWS Systems Manager documentation. | Cloud architect | 
| Configure Systems Manager Automation to archive rotated virtual tape container files. | Use the code sample from **Backup option 1** in the *Additional information* section to develop a Systems Manager Automation runbook to archive rotated virtual tape container files to Amazon S3. | Cloud architect | 
| Deploy the Systems Manager Automation runbook for archiving and scheduling. | Deploy the Systems Manager Automation runbook and schedule it to automatically run in Systems Manager. For more information about this, see [Automation walkthroughs](https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-walk.html) in the Systems Manager documentation. | Cloud architect | 

### Backup option 2 – Create a Stromasys snapshot
<a name="backup-option-2-ndash-create-a-stromasys-snapshot"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create an Amazon EFS shared file system for virtual tape file storage. | Sign in to the AWS Management Console or use the AWS CLI to create an Amazon EFS file system. For more information about this, see [Create your Amazon EFS file system](https://docs.aws.amazon.com/efs/latest/ug/gs-step-two-create-efs-resources.html) in the Amazon EFS documentation. | Cloud architect | 
| Configure the Linux host to mount the shared file system. | Install the Amazon EFS driver in the Amazon EC2 Linux instance and configure the Linux OS to mount the Amazon EFS shared file system during startup. For more information about this, see [Mounting file systems using the Amazon EFS mount helper](https://docs.aws.amazon.com/efs/latest/ug/efs-mount-helper.html) in the Amazon EFS documentation. | DevOps engineer | 
| Install the Charon-SSP emulator. | Install the Charon-SSP emulator on the Amazon EC2 Linux instance. For more information about this, see [Setting up an AWS Cloud instance for Charon-SSP](https://stromasys.atlassian.net/wiki/spaces/DocCHSSP44xAWSGS/pages/7239901201/Setting+up+an+AWS+Cloud+Instance+for+Charon-SSP) in the Stromasys documentation. | DevOps engineer | 
| Configure the Sun SPARC guest servers to start up with the snapshot option. | Use Charon-SSP Manager to set up the snapshot option for each Sun SPARC guest server. For more information about this, see the [Charon-SSP 5.2 for Linux user guide](https://stromasys.atlassian.net/wiki/spaces/KBP/pages/76429819926/CHARON-SSP+V5.2+for+Linux) in the Stromasys documentation. | DevOps engineer | 
| Develop the Systems Manager Automation runbook. | Use the code sample from **Backup option 2** in the *Additional information* section to develop a Systems Manager Automation runbook to remotely run the snapshot command on a Sun SPARC guest server during a maintenance window. | Cloud architect | 
| Deploy the Systems Manager Automation runbook and set up the association to the Amazon EC2 Linux hosts. | Deploy the Systems Manager Automation runbook and set up maintenance windows and associations in Systems Manager for scheduling the backup process. For more information about this, see [Automation walkthroughs](https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-walk.html) and [Setting up maintenance windows](https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-maintenance-permissions.html) in the AWS Systems Manager documentation. | Cloud architect | 
| Archive snapshots into long-term storage. | Use the runbook sample code from the *Additional information* section to develop a Systems Manager Automation runbook to archive snapshot files to Amazon S3. | Cloud architect | 

### Backup option 3 – Create an Amazon EBS volume snapshot
<a name="backup-option-3-create-an-ebs-volume-snapshot"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Install the Charon-SSP emulator. | Install the Charon-SSP emulator on the Amazon EC2 Linux instance. For more information about this, see [Setting up an AWS Cloud instance for Charon-SSP](https://stromasys.atlassian.net/wiki/spaces/DocCHSSP44xAWSGS/pages/7239901201/Setting+up+an+AWS+Cloud+Instance+for+Charon-SSP) in the Stromasys documentation. | DevOps engineer | 
| Create Amazon EBS volumes for the Sun SPARC guest servers. | Sign in to the AWS Management Console, open the Amazon EBS console, and then create Amazon EBS volumes for the Sun SPARC guest servers. For more information about this, see [Setting up an AWS Cloud instance for Charon-SSP](https://stromasys.atlassian.net/wiki/spaces/DocCHSSP44xAWSGS/pages/7239901201/Setting+up+an+AWS+Cloud+Instance+for+Charon-SSP) in the Stromasys documentation. | Cloud architect | 
| Attach the Amazon EBS volumes to the Amazon EC2 Linux instance. | On the Amazon EC2 console, attach the Amazon EBS volumes to the Amazon EC2 Linux instance. For more information about this, see [Attach an Amazon EBS volume to an instance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-attaching-volume.html) in the Amazon EC2 documentation. | AWS DevOps | 
| Map Amazon EBS volumes as SCSI drives in the Charon-SSP emulator. | Configure Charon-SSP Manager to map the Amazon EBS volumes as SCSI drives in the Sun SPARC guest servers. For more information about this, see the *SCSI storage configuration* section of the [Charon-SSP V5.2 for Linux](https://stromasys.atlassian.net/wiki/spaces/KBP/pages/76429819926/CHARON-SSP+V5.2+for+Linux) guide in the Stromasys documentation. | AWS DevOps | 
| Configure the AWS Backup schedule for snapshotting the Amazon EBS volumes. | Set up the AWS Backup policy and schedules to snapshot the Amazon EBS volumes. For more information about this, see the [Amazon EBS backup and restore using AWS Backup](https://aws.amazon.com/getting-started/hands-on/amazon-ebs-backup-and-restore-using-aws-backup/) tutorial in the AWS Developer Center documentation. | AWS DevOps | 
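
The AWS Backup schedule in the last task is expressed as a backup plan document. The following is a hedged sketch of the JSON you might pass to `aws backup create-backup-plan`; the plan name, rule name, cron schedule, and retention period are assumptions for illustration, not values prescribed by this pattern:

```json
{
  "BackupPlanName": "charon-ssp-ebs-daily",
  "Rules": [
    {
      "RuleName": "DailyEbsSnapshots",
      "TargetBackupVaultName": "Default",
      "ScheduleExpression": "cron(0 5 ? * * *)",
      "StartWindowMinutes": 60,
      "Lifecycle": { "DeleteAfterDays": 35 }
    }
  ]
}
```

After you create the plan, assign the Amazon EBS volumes to it with a backup selection (for example, by tag) so the snapshots run on the schedule.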

### Backup option 4 – Create an AWS Storage Gateway VTL
<a name="backup-option-4-create-an-awssglong-vtl"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create a Tape Gateway device. | Sign in to the AWS Management Console, open the AWS Storage Gateway console, and then create a Tape Gateway device in a VPC. For more information about this, see [Creating a gateway](https://docs.aws.amazon.com/storagegateway/latest/tgw/create-tape-gateway.html) in the AWS Storage Gateway documentation. | Cloud architect | 
| Create an Amazon RDS DB instance for the Bacula Catalog. | Open the Amazon RDS console and create an Amazon RDS for MySQL DB instance. For more information about this, see [Creating a MySQL DB instance and connecting to a database on a MySQL DB instance](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_GettingStarted.CreatingConnecting.MySQL.html) in the Amazon RDS documentation. | Cloud architect | 
| Deploy the backup application controller in the VPC. | Install Bacula on the Amazon EC2 instance, deploy the backup application controller, and then configure the backup storage to connect with the Tape Gateway device. You can use the sample Bacula storage daemon configuration in the `Bacula-storage-daemon-config.txt` file (attached). For more information about this, see the [Bacula documentation](https://www.bacula.org/11.0.x-manuals/en/main/main.pdf). | AWS DevOps | 
| Set up the backup application on the Sun SPARC guest servers. | Install and set up the backup application client on the Sun SPARC guest servers by using the sample Bacula configuration in the `SUN-SPARC-Guest-Bacula-Config.txt` file (attached). | DevOps engineer | 
| Set up the backup configuration and schedule. | Set up the backup configuration and schedules in the backup application controller by using the sample Bacula Director configuration in the `Bacula-Directory-Config.txt` file (attached). For more information about this, see the [Bacula documentation](https://www.bacula.org/11.0.x-manuals/en/main/main.pdf). | DevOps engineer | 
| Validate that the backup configuration and schedules are correct. | Follow the instructions in the [Bacula documentation](https://www.bacula.org/11.0.x-manuals/en/main/main.pdf) to perform the validation and backup testing for your setup in the Sun SPARC guest servers. For example, you can use the following commands to validate the configuration files: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/back-up-sun-sparc-servers-in-the-stromasys-charon-ssp-emulator-on-the-aws-cloud.html) | DevOps engineer | 
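
As a reference point for the attached storage daemon configuration, a Tape Gateway VTL drive typically appears to Bacula as a standard tape device. The following is an illustrative, minimal Bacula `Device` resource, not a copy of the attached file; the resource name and the `/dev/sg1` path are assumptions, so verify the actual VTL drive paths on your host (for example, with `lsscsi`):

```
Device {
  Name = "VTL-Drive-0"        # illustrative name
  Media Type = LTO            # must match the Director's Storage resource
  Archive Device = /dev/sg1   # assumed iSCSI tape drive path; verify on your host
  AutomaticMount = yes
  AlwaysOpen = yes
  RemovableMedia = yes
  RandomAccess = no
}
```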

## Related resources
<a name="back-up-sun-sparc-servers-in-the-stromasys-charon-ssp-emulator-on-the-aws-cloud-resources"></a>
+ [Charon virtual SPARC with VE licensing](https://aws.amazon.com/marketplace/pp/B08TBQS8NZ?qid=1621489108444&sr=0-2&ref_=srh_res_product_title)
+ [Charon virtual SPARC](https://aws.amazon.com/marketplace/pp/B07XF228LH?qid=1621489108444&sr=0-1&ref_=srh_res_product_title)
+ [Using cloud services and object storage with Bacula Enterprise Edition](https://www.baculasystems.com/wp-content/uploads/ObjectStorage_Bacula_Enterprise.pdf)
+ [Disaster recovery (DR) objectives](https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/disaster-recovery-dr-objectives.html)
+ [Charon legacy system emulation solutions](https://www.stromasys.com/solution/charon-ssp/)

## Additional information
<a name="back-up-sun-sparc-servers-in-the-stromasys-charon-ssp-emulator-on-the-aws-cloud-additional"></a>

**Backup option 1 – Create a Stromasys virtual tape**

You can use the following sample Systems Manager Automation runbook code to automatically start the backup and then swap the tapes:

```
...
# example backup script saved in the Sun SPARC server
 #!/usr/bin/bash
 # /dev/rmt/1 is the virtual tape device configured earlier;
 # replace <files-to-back-up> with the paths you want to archive
 mt -f /dev/rmt/1 rewind
 tar -cvf /dev/rmt/1 <files-to-back-up>
 mt -f /dev/rmt/1 offline
...        
         mainSteps:
         - action: aws:runShellScript
           name: validateTapeBackupContainer
           inputs:
             onFailure: Abort
             timeoutSeconds: "1200"
             runCommand:
             - |
               # Validate that the tape backup container file exists
               if [ ! -f {{TapeBackupContainerFile}} ]; then
                 logger -s -p local3.warning "Tape backup container file does not exist - {{TapeBackupContainerFile}}; creating a new one"
                 touch {{TapeBackupContainerFile}}
               fi
         - action: aws:runShellScript
           name: startBackup
           inputs:
             onFailure: Abort
             timeoutSeconds: "1200"
             runCommand:
             - |
               user={{BACKUP_USER}}
               keypair={{KEYPAIR_PATH}}
               server={{SUN_SPARC_IP}}
               backup_script={{BACKUP_SCRIPT}}
               ssh -i $keypair $user@$server "/usr/bin/bash $backup_script"
         - action: aws:runShellScript
           name: swapVirtualDiskContainer
           inputs:
             onFailure: Abort
             timeoutSeconds: "1200"
             runCommand:
             - |
               mv {{TapeBackupContainerFile}} {{TapeBackupContainerFile}}.$(date +%s)
               touch {{TapeBackupContainerFile}}
         - action: aws:runShellScript
           name: uploadBackupArchiveToS3
           inputs:
             onFailure: Abort
             timeoutSeconds: "1200"
             runCommand:
             - |
               aws s3 cp {{TapeBackupContainerFile}} s3://{{BACKUP_BUCKET}}/{{SUN_SPARC_IP}}/$(date '+%Y-%m-%d')/
 ...
```

**Backup option 2 – Stromasys snapshot**

You can use the following sample Systems Manager Automation runbook code to automate the backup process:

```
      ...

         mainSteps:
         - action: aws:runShellScript
           name: startSnapshot
           inputs:
             onFailure: Abort
             timeoutSeconds: "1200"
             runCommand:
             - |
               # You may consider some graceful stop of the application before taking a snapshot
               # Query SSP PID by configuration file
               # Example: ps ax | grep ssp-4 | grep Solaris10.cfg | awk '{print $1" "$5}' | grep ssp4 | cut -f1 -d" "
               pid=`ps ax | grep ssp-4 | grep {{SSP_GUEST_CONFIG_FILE}} | awk '{print $1" "$5}' | grep ssp4 | cut -f1 -d" "`
               if [ -n "${pid}" ]; then
                 kill -SIGTSTP ${pid}
               else
                 echo "No PID found for SPARC guest with config {{SSP_GUEST_CONFIG_FILE}}"
                 exit 1
               fi
         - action: aws:runShellScript
           name: startBackup
           inputs:
             onFailure: Abort
             timeoutSeconds: "1200"
             runCommand:
             - |
               # upload snapshot and virtual disk files into S3
               aws s3 sync {{SNAPSHOT_FOLDER}} s3://{{BACKUP_BUCKET}}/$(date '+%Y-%m-%d')/
               aws s3 cp {{VIRTUAL_DISK_FILE}} s3://{{BACKUP_BUCKET}}/$(date '+%Y-%m-%d')/
         - action: aws:runShellScript
            name: restartSPARCGuest
           inputs:
             onFailure: Abort
             timeoutSeconds: "1200"
             runCommand:
             - |
               /opt/charon-ssp/ssp-4u/ssp4u -f {{SSP_GUEST_CONFIG_FILE}} -d -a {{SPARC_GUEST_NAME}} --snapshot {{SNAPSHOT_FOLDER}}
 ...
```
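The sample runbooks above write each backup under a date-stamped S3 prefix. The following sketch shows how that key layout is composed; the bucket name and guest IP are hypothetical placeholders standing in for the `BACKUP_BUCKET` and `SUN_SPARC_IP` runbook parameters.

```shell
# Compose the date-stamped destination prefix that the runbook's
# "aws s3 cp" step targets. These values are placeholders.
BACKUP_BUCKET="example-sparc-backup-bucket"
SUN_SPARC_IP="10.0.0.25"
DATESTAMP="$(date '+%Y-%m-%d')"
DEST="s3://${BACKUP_BUCKET}/${SUN_SPARC_IP}/${DATESTAMP}/"
echo "${DEST}"
```

Because the prefix changes daily, each run lands under its own dated folder, so successive backups never overwrite each other.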

**Backup option 4 – AWS Storage Gateway VTL**

If you use Solaris non-global zones to run virtualized legacy Sun SPARC servers, the backup application approach also applies to the non-global zones (for example, the backup client can run inside them). Alternatively, the backup client can run on the Solaris host and take snapshots of the non-global zones, which can then be backed up to tape.

The following sample configuration adds the file system that hosts the Solaris non-global zones into the backup configuration for the Solaris host:

```
FileSet {
   Name = "Branded Zones"
   Include {
     Options {
       signature = MD5
     }
     File = /zones
   }
 }
```

## Attachments
<a name="attachments-9688ae50-9d0c-4d61-ab40-93df2bce4b7d"></a>

To access additional content that is associated with this document, unzip the following file: [attachment.zip](samples/p-attach/9688ae50-9d0c-4d61-ab40-93df2bce4b7d/attachments/attachment.zip)

# Back up and archive data to Amazon S3 with Veeam Backup & Replication
<a name="back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication"></a>

*Jeanna James, Anthony Fiore (AWS), and William Quigley, Amazon Web Services*

## Summary
<a name="back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-summary"></a>

This pattern details the process for sending backups created by Veeam Backup & Replication to supported Amazon Simple Storage Service (Amazon S3) object storage classes by using the Veeam scale-out backup repository capability. 

Veeam supports multiple Amazon S3 storage classes to best fit your specific needs. You can choose the storage class based on the data access, resiliency, and cost requirements of your backup or archive data. For example, you can store data that you don’t plan to access for 30 days or longer in S3 Standard-Infrequent Access (S3 Standard-IA) to reduce costs. If you’re planning to archive data for 90 days or longer, you can use S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive with Veeam’s archive tier. You can also use S3 Object Lock to make backups immutable within Amazon S3.

This pattern doesn’t cover how to set up Veeam Backup & Replication with a tape gateway in AWS Storage Gateway. For information about that topic, see [Veeam Backup & Replication using AWS VTL Gateway - Deployment Guide](https://www.veeam.com/resources/wp-using-aws-vtl-gateway-deployment-guide.html) on the Veeam website.


**Warning**  
This scenario requires AWS Identity and Access Management (IAM) users with programmatic access and long-term credentials, which present a security risk. To help mitigate this risk, we recommend that you provide these users with only the permissions they require to perform the task and that you remove these users when they are no longer needed. Access keys can be updated if necessary. For more information, see [Updating access keys](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_RotateAccessKey) in the *IAM User Guide*.

## Prerequisites and limitations
<a name="back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-prereqs"></a>

**Prerequisites**
+ Veeam Backup & Replication, including Veeam Availability Suite or Veeam Backup Essentials, installed (you can register for a [free trial](https://www.veeam.com/backup-replication-virtual-physical-cloud.html))
+ Veeam Backup & Replication license with Enterprise or Enterprise Plus functionality, which includes Veeam Universal License (VUL)
+ An active IAM user with access to an Amazon S3 bucket
+ An active IAM user with access to Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Virtual Private Cloud (Amazon VPC), if using archive tier
+ Network connectivity from on premises to AWS services with available bandwidth for backup and restore traffic through a public internet connection or an AWS Direct Connect public virtual interface (VIF)
+ The following network ports and endpoints opened to ensure proper communication with object storage repositories:
  + Amazon S3 storage – TCP – port 443: Used to communicate with Amazon S3 storage.
  + Amazon S3 storage – cloud endpoints – `*.amazonaws.com` for AWS Regions and the AWS GovCloud (US) Regions, or `*.amazonaws.com.cn` for China Regions: Used to communicate with Amazon S3 storage. For a complete list of connection endpoints, see [Amazon S3 endpoints](https://docs.aws.amazon.com/general/latest/gr/s3.html#s3_region) in the AWS documentation.
  + Amazon S3 storage – TCP HTTP – port 80: Used to verify certificate status. Certificate verification endpoints (certificate revocation list (CRL) URLs and Online Certificate Status Protocol (OCSP) servers) are subject to change; the actual list of addresses can be found in the certificate itself.
  + Amazon S3 storage – certificate verification endpoints – `*.amazontrust.com`: Used to verify certificate status. These endpoints (CRL URLs and OCSP servers) are subject to change; the actual list of addresses can be found in the certificate itself.

**Limitations**
+ Veeam doesn’t support S3 Lifecycle policies on any S3 buckets that are used as Veeam object storage repositories. This includes policies with Amazon S3 storage class transitions and S3 Lifecycle expiration rules. Veeam **must** be the sole entity that manages these objects. Enabling S3 Lifecycle policies might have unexpected results, including data loss.
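To help enforce this limitation, you can inspect a bucket's lifecycle configuration before adding the bucket as a Veeam repository. The following minimal sketch assumes an input dict shaped like the S3 `GetBucketLifecycleConfiguration` API response; the rule shown is a hypothetical 90-day Glacier transition.

```python
def has_conflicting_lifecycle_rules(lifecycle_config: dict) -> bool:
    """Return True if any enabled rule transitions or expires objects."""
    for rule in lifecycle_config.get("Rules", []):
        if rule.get("Status") != "Enabled":
            continue
        if (rule.get("Transitions") or rule.get("Expiration")
                or rule.get("NoncurrentVersionTransitions")
                or rule.get("NoncurrentVersionExpiration")):
            return True
    return False

# Hypothetical configuration: transitions objects to S3 Glacier after 90 days.
config = {
    "Rules": [
        {
            "ID": "archive-old-objects",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
        }
    ]
}
print(has_conflicting_lifecycle_rules(config))  # True: unsafe as a Veeam repository
```

If the check returns `True`, remove the lifecycle rules or choose a different bucket before configuring it in Veeam.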

**Product versions**
+ Veeam Backup & Replication v9.5 Update 4 or later (backup only or capacity tier)
+ Veeam Backup & Replication v10 or later (backup or capacity tier and S3 Object Lock)
+ Veeam Backup & Replication v11 or later (backup or capacity tier, archive or archive tier, and S3 Object Lock)
+ Veeam Backup & Replication v12 or later (performance tier, backup or capacity tier, archive or archive tier, and S3 Object Lock)
+ S3 Standard
+ S3 Standard-IA
+ S3 One Zone-IA
+ S3 Glacier Flexible Retrieval (v11 and later only)
+ S3 Glacier Deep Archive (v11 and later only)
+ S3 Glacier Instant Retrieval (v12 and later only)

## Architecture
<a name="back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-architecture"></a>

**Source technology stack**
+ On-premises Veeam Backup & Replication installation with connectivity from a Veeam backup server or a Veeam gateway server to Amazon S3

**Target technology stack**
+ Amazon S3
+ Amazon VPC and Amazon EC2 (if using archive tier)

**Target architecture: SOBR**

The following diagram shows the scale-out backup repository (SOBR) architecture.

![\[SOBR architecture for backing up data from Veeam to Amazon S3\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/images/pattern-img/7f3a36f7-31dc-45c8-87b2-c5dd37f7a01e/images/b48fd0cd-b66c-4ef7-b6fa-0ed53354e1a2.png)


Veeam Backup & Replication software protects data from logical errors such as system failures, application errors, or accidental deletion. In this diagram, backups run on premises first, and a secondary copy is sent directly to Amazon S3. A backup represents a point-in-time copy of the data.

The workflow consists of three primary components that are required for tiering or copying backups to Amazon S3, and one optional component:
+ Veeam Backup & Replication (1) – The backup server that is responsible for coordinating, controlling, and managing backup infrastructure, settings, jobs, recovery tasks, and other processes.
+ Veeam gateway server (not shown in the diagram) – An optional on-premises gateway server that is required if the Veeam backup server doesn’t have outbound connectivity to Amazon S3.
+ Scale-out backup repository (2) – Repository system with horizontal scaling support for multi-tier storage of data. The scale-out backup repository consists of one or more backup repositories that provide fast access to data and can be expanded with Amazon S3 object storage repositories for long-term storage (capacity tier) and archiving (archive tier). Veeam uses the scale-out backup repository to tier data automatically between local (performance tier) and Amazon S3 object storage (capacity and archive tiers). 
**Note**  
Starting with Veeam Backup & Replication v12.2, the Direct to S3 Glacier feature makes the S3 capacity tier optional. A SOBR can be configured with a performance tier and an S3 Glacier archive tier. This configuration is useful for users who have significant investments in local (on-premises) storage for the capacity tier and who require only long-term archive retention in the cloud. For more information, see the [Veeam Backup & Replication documentation](https://helpcenter.veeam.com/docs/backup/vsphere/archive_tier.html?ver=120).
+ Amazon S3 (3) – AWS object storage service that offers scalability, data availability, security, and performance.

**Target architecture: DTO**

The following diagram shows the direct-to-object (DTO) architecture.

![\[DTO architecture for backing up data from Veeam to Amazon S3\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/images/pattern-img/7f3a36f7-31dc-45c8-87b2-c5dd37f7a01e/images/9debe53a-d70a-43fa-844c-f93fa22124eb.png)


In this diagram, backup data goes directly to Amazon S3 without being stored on premises first. Secondary copies can be stored in S3 Glacier.

**Automation and scale**

You can automate the creation of IAM resources and S3 buckets by using the CloudFormation templates provided in the [VeeamHub GitHub repository](https://github.com/VeeamHub/veeam-aws-cloudformation/tree/master/veeam-backup-and-replication). The templates include both standard and immutable options.

## Tools
<a name="back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-tools"></a>

**Tools and AWS services**
+ [Veeam Backup & Replication](https://www.veeam.com/vm-backup-recovery-replication-software.html) is a solution from Veeam for protecting, backing up, replicating, and restoring your virtual and physical workloads.
+ [CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html) helps you model and set up your AWS resources, provision them quickly and consistently, and manage them throughout their lifecycle. You can use a template to describe your resources and their dependencies, and launch and configure them together as a stack, instead of managing resources individually. You can manage and provision stacks across multiple AWS accounts and AWS Regions.
+ [Amazon Elastic Compute Cloud (Amazon EC2)](https://docs.aws.amazon.com/ec2/) provides scalable computing capacity in the AWS Cloud. You can use Amazon EC2 to launch as many or as few virtual servers as you need, and you can scale out or scale in.
+ [AWS Identity and Access Management (IAM)](https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html) is a web service for securely controlling access to AWS services. With IAM, you can centrally manage users, security credentials such as access keys, and permissions that control which AWS resources users and applications can access.
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) is an object storage service. You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web.
+ [Amazon S3 Glacier (S3 Glacier)](https://docs.aws.amazon.com/amazonglacier/latest/dev/introduction.html) is a secure and durable service for low-cost data archiving and long-term backup.
+ [Amazon Virtual Private Cloud (Amazon VPC)](https://docs.aws.amazon.com/vpc/) provisions a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you've defined. This virtual network closely resembles a traditional network that you'd operate in your own data center, with the benefits of using the scalable infrastructure of AWS.

**Code**

Use the CloudFormation templates provided in the [VeeamHub GitHub repository](https://github.com/VeeamHub/veeam-aws-cloudformation/tree/master/veeam-backup-and-replication) to automatically create the IAM resources and S3 buckets for this pattern. If you prefer to create these resources manually, follow the steps in the *Epics* section.

## Best practices
<a name="back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-best-practices"></a>

In accordance with IAM best practices, we strongly recommend that you regularly rotate long-term IAM user credentials, such as the IAM user that you use for writing Veeam Backup & Replication backups to Amazon S3. For more information, see [Security best practices](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#rotate-credentials) in the IAM documentation.

## Epics
<a name="back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-epics"></a>

### Configure Amazon S3 storage in your account
<a name="configure-s3-storage-in-your-account"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create an IAM user. | Follow the [instructions in the IAM documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html) to create an IAM user. This user should not have AWS console access, and you will need to create an access key for this user. Veeam uses this entity to authenticate with AWS to read and write to your S3 buckets. You must grant least privilege (that is, grant only the permissions required to perform a task) so the user doesn’t have more authority than it needs. For example IAM policies that you can attach to your Veeam IAM user, see the [Additional information](#back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-additional) section. Alternatively, you can use the CloudFormation templates provided in the [VeeamHub GitHub repository](https://github.com/VeeamHub/veeam-aws-cloudformation/tree/master/veeam-backup-and-replication) to create an IAM user and S3 bucket for this pattern. | AWS administrator | 
| Create an S3 bucket. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication.html)For more information, see [Creating a bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) in the Amazon S3 documentation. | AWS administrator | 

### Add Amazon S3 and S3 Glacier Flexible Retrieval (or S3 Glacier Deep Archive) to Veeam Backup & Replication
<a name="add-s3-and-s3-storage-class-glacier-or-s3-storage-class-deep-archive-to-veeam-backup-amp-replication"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Launch the New Object Repository wizard. | Before you set up the object storage and scale-out backup repositories in Veeam, you must add the Amazon S3 and S3 Glacier storage repositories that you want to use for the capacity and archive tiers. In the next epic, you’ll connect these storage repositories to your scale-out backup repository.[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication.html) | AWS administrator, App owner | 
| Add Amazon S3 storage for the capacity tier. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication.html) | AWS administrator, App owner | 
| Add S3 Glacier storage for the archive tier. | If you want to create an archive tier, use the IAM permissions detailed in the [Additional information](#back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-additional) section. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication.html) | AWS administrator, App owner | 

### Add scale-out backup repositories
<a name="add-scale-out-backup-repositories"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Launch the New Scale-Out Backup Repository wizard. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication.html) | App owner, AWS systems administrator | 
| Add a scale-out backup repository and configure capacity and archive tiers. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication.html) | App owner, AWS systems administrator | 

## Related resources
<a name="back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-resources"></a>
+ [Creating an IAM user in your AWS account](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html) (IAM documentation)
+ [Creating a bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) (Amazon S3 documentation)
+ [Blocking public access to your Amazon S3 storage](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-control-block-public-access.html) (Amazon S3 documentation)
+ [Using S3 Object Lock](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock.html) (Amazon S3 documentation)
+ [Veeam technical documentation](https://www.veeam.com/documentation-guides-datasheets.html)
+ [How to Create Secure IAM Policy for Connection to S3 Object Storage](https://www.veeam.com/kb3151) (Veeam documentation)

## Additional information
<a name="back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-additional"></a>

The following sections provide sample IAM policies you can use when you create an IAM user in the [Epics](#back-up-and-archive-data-to-amazon-s3-with-veeam-backup-replication-epics) section of this pattern.

**IAM policy for capacity tier**

**Note**  
Change the name of the S3 buckets in the example policy from `<yourbucketname>` to the name of the S3 bucket that you want to use for Veeam capacity tier backups. Also note that the policy should be restricted to the specific resources used for Veeam (indicated by the `Resource` specification in the following policy), and that the first statement denies object uploads that use server-side encryption with customer-provided keys (SSE-C), as discussed in the AWS blog post [Preventing unintended encryption of Amazon S3 objects](https://aws.amazon.com/blogs/security/preventing-unintended-encryption-of-amazon-s3-objects/).

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RestrictSSECObjectUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<yourbucketname>/*",
            "Condition": {
                "Null": {
                    "s3:x-amz-server-side-encryption-customer-algorithm": "false"
                }
            }
        },
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObjectVersion",
                "s3:ListBucketVersions",
                "s3:ListBucket",
                "s3:PutObjectLegalHold",
                "s3:GetBucketVersioning",
                "s3:GetObjectLegalHold",
                "s3:GetBucketObjectLockConfiguration",
                "s3:PutObject*",
                "s3:GetObject*",
                "s3:GetEncryptionConfiguration",
                "s3:PutObjectRetention",
                "s3:PutBucketObjectLockConfiguration",
                "s3:DeleteObject*",
                "s3:DeleteObjectVersion",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::<yourbucketname>",
                "arn:aws:s3:::<yourbucketname>/*"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:ListBucket"
            ],
            "Resource": "arn:aws:s3:::*"
        }
    ]
}
```
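When scripting the setup, you can substitute your bucket name into the policy and validate the result before passing it to IAM. The following is a minimal sketch under stated assumptions: `POLICY_TEMPLATE` is an abbreviated stand-in for the full capacity-tier policy above, and the bucket name is hypothetical.

```python
import json

# Abbreviated stand-in for the capacity-tier policy; not the full document.
POLICY_TEMPLATE = """
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetObject*", "s3:PutObject*"],
            "Resource": [
                "arn:aws:s3:::<yourbucketname>",
                "arn:aws:s3:::<yourbucketname>/*"
            ]
        }
    ]
}
"""

def render_policy(template: str, bucket_name: str) -> str:
    """Fill in the bucket placeholder and fail fast on malformed JSON."""
    policy = template.replace("<yourbucketname>", bucket_name)
    json.loads(policy)  # raises ValueError if the result is not valid JSON
    return policy

rendered = render_policy(POLICY_TEMPLATE, "veeam-capacity-tier-backups")
```

You can then pass the rendered document to `aws iam put-user-policy` or supply it as a CloudFormation parameter.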

**IAM policy for archive tier**

**Note**  
Change the name of the S3 buckets in the example policies from `<bucket-name>` to the name of the S3 bucket that you want to use for Veeam archive tier backups.

**To use your existing VPC, subnet, and security groups:**

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3Permissions",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:GetObject",
                "s3:RestoreObject",
                "s3:ListBucket",
                "s3:AbortMultipartUpload",
                "s3:GetBucketVersioning",
                "s3:ListAllMyBuckets",
                "s3:GetBucketLocation",
                "s3:GetBucketObjectLockConfiguration",
                "s3:PutObjectRetention",
                "s3:GetObjectVersion",
                "s3:PutObjectLegalHold",
                "s3:GetObjectRetention",
                "s3:DeleteObjectVersion",
                "s3:ListBucketVersions"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket-name>",
                "arn:aws:s3:::<bucket-name>/*"
            ]
        }
    ]
}

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EC2Permissions",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:CreateKeyPair",
                "ec2:DescribeKeyPairs",
                "ec2:RunInstances",
                "ec2:DeleteKeyPair",
                "ec2:DescribeVpcAttribute",
                "ec2:CreateTags",
                "ec2:DescribeSubnets",
                "ec2:TerminateInstances",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeImages",
                "ec2:DescribeVpcs"
            ],
            "Resource": "arn:aws:ec2:<region>:<account-id>:*"
        }
    ]
}
```

**To create new VPC, subnet, and security groups:**

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "S3Permissions",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:GetObject",
                "s3:RestoreObject",
                "s3:ListBucket",
                "s3:AbortMultipartUpload",
                "s3:GetBucketVersioning",
                "s3:ListAllMyBuckets",
                "s3:GetBucketLocation",
                "s3:GetBucketObjectLockConfiguration",
                "s3:PutObjectRetention",
                "s3:GetObjectVersion",
                "s3:PutObjectLegalHold",
                "s3:GetObjectRetention",
                "s3:DeleteObjectVersion",
                "s3:ListBucketVersions"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket-name>",
                "arn:aws:s3:::<bucket-name>/*"
            ]
        }
    ]
}

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EC2Permissions",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:CreateKeyPair",
                "ec2:DescribeKeyPairs",
                "ec2:RunInstances",
                "ec2:DeleteKeyPair",
                "ec2:DescribeVpcAttribute",
                "ec2:CreateTags",
                "ec2:DescribeSubnets",
                "ec2:TerminateInstances",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeImages",
                "ec2:DescribeVpcs",
                "ec2:CreateVpc",
                "ec2:CreateSubnet",
                "ec2:DescribeAvailabilityZones",
                "ec2:CreateRoute",
                "ec2:CreateInternetGateway",
                "ec2:AttachInternetGateway",
                "ec2:ModifyVpcAttribute",
                "ec2:CreateSecurityGroup",
                "ec2:DeleteSecurityGroup",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:DescribeRouteTables",
                "ec2:DescribeInstanceTypes"
            ],
            "Resource": "*"
        }
    ]
}
```

# Copy data from an Amazon S3 bucket to another account and Region by using the AWS CLI
<a name="copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli"></a>

*Appasaheb Bagali and Purushotham G K, Amazon Web Services*

## Summary
<a name="copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli-summary"></a>

This pattern describes how to migrate data from a source Amazon Simple Storage Service (Amazon S3) bucket in an AWS account to a destination Amazon S3 bucket in another AWS account, either in the same AWS Region or in a different Region.

The source Amazon S3 bucket allows AWS Identity and Access Management (IAM) access by using an attached resource policy. A user in the destination account has to assume a role that has `PutObject` and `GetObject` permissions for the source bucket. Finally, you run `copy` and `sync` commands to transfer data from the source Amazon S3 bucket to the destination Amazon S3 bucket.

Accounts own the objects that they upload to Amazon S3 buckets. If you copy objects across accounts and Regions, you grant the destination account ownership of the copied objects. You can change the ownership of an object by changing its [access control list (ACL)](https://docs.aws.amazon.com/AmazonS3/latest/dev/S3_ACLs_UsingACLs.html) to `bucket-owner-full-control`. However, we recommend that you grant programmatic cross-account permissions to the destination account because ACLs can be difficult to manage for multiple objects.
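After you assume the role, the AWS CLI reads the temporary credentials from environment variables. The sketch below turns an assume-role response into the corresponding `export` statements; the response shown is a fabricated mock, not real `aws sts assume-role` output.

```python
import json

# Fabricated mock of an `aws sts assume-role` JSON response.
mock_response = json.dumps({
    "Credentials": {
        "AccessKeyId": "ASIAEXAMPLEKEYID",
        "SecretAccessKey": "example-secret",
        "SessionToken": "example-token",
        "Expiration": "2099-01-01T00:00:00+00:00",
    }
})

def to_env_exports(response_json):
    """Map STS credential fields to the environment variables the AWS CLI reads."""
    creds = json.loads(response_json)["Credentials"]
    return "\n".join([
        f'export AWS_ACCESS_KEY_ID="{creds["AccessKeyId"]}"',
        f'export AWS_SECRET_ACCESS_KEY="{creds["SecretAccessKey"]}"',
        f'export AWS_SESSION_TOKEN="{creds["SessionToken"]}"',
    ])

print(to_env_exports(mock_response))
```

With these variables set in your shell, subsequent `aws s3 cp` and `aws s3 sync` commands run under the assumed role.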

**Warning**  
This scenario requires IAM users with programmatic access and long-term credentials, which present a security risk. To help mitigate this risk, we recommend that you provide these users with only the permissions they require to perform the task and that you remove these users when they are no longer needed. Access keys can be updated if necessary. For more information, see [Updating access keys](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_RotateAccessKey) in the IAM documentation.

## Prerequisites and limitations
<a name="copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli-prereqs"></a>

*Prerequisites*
+ Two active AWS accounts in the same or different AWS Regions.
+ An existing Amazon S3 bucket in the source account. 
+ If your source or destination Amazon S3 bucket has [default encryption](https://docs.aws.amazon.com/AmazonS3/latest/dev/bucket-encryption.html) enabled, you must modify the AWS Key Management Service (AWS KMS) key permissions. For more information, see the [AWS re:Post article](https://repost.aws/knowledge-center/s3-bucket-access-default-encryption) on this topic. 
+ Familiarity with cross-account permissions.

*Limitations*
+ This pattern covers one-time migration. For scenarios that require continuous and automatic migration of new objects from a source bucket to a destination bucket, you can use [Amazon S3 Batch Replication](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-batch-replication-batch.html).
+ This pattern uses session credentials (`AccessKeyId`, `SecretAccessKey`, and `SessionToken`) that are temporary and non-persistent. The expiration timestamp in the output indicates when these credentials expire. The role is configured with the maximum session duration. The copy job will be canceled if the session expires.
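Because the copy job is canceled when the session expires, it's worth checking the remaining session time before starting a long transfer. The sketch below parses a mock assume-role response; the field names follow the STS API, but every value is fabricated for illustration.

```python
import json
from datetime import datetime, timezone

# Fabricated mock; real output comes from `aws sts assume-role`.
mock_response = json.dumps({
    "Credentials": {
        "AccessKeyId": "ASIAEXAMPLEKEYID",
        "SecretAccessKey": "example-secret",
        "SessionToken": "example-token",
        "Expiration": "2099-01-01T00:00:00+00:00",
    }
})

def seconds_remaining(response_json, now=None):
    """Seconds until the temporary session credentials expire."""
    creds = json.loads(response_json)["Credentials"]
    expiry = datetime.fromisoformat(creds["Expiration"])
    now = now or datetime.now(timezone.utc)
    return (expiry - now).total_seconds()

if seconds_remaining(mock_response) < 3600:
    print("Less than an hour left; re-run assume-role before starting the copy.")
```

If the remaining time is shorter than your expected transfer duration, assume the role again to get fresh credentials before running the `sync` command.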

## Architecture
<a name="copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli-architecture"></a>

 

![\[Copying Amazon S3 data to another account or Region\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/images/pattern-img/a574c26b-fdd9-4472-842b-b34c3eb2bfe9/images/5e4dec53-dfc8-478b-a7c4-503d63c8ac4e.png)


## Tools
<a name="copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli-tools"></a>
+ [AWS Command Line Interface (AWS CLI)](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html) is an open-source tool that helps you interact with AWS services through commands in your command line shell.
+ [AWS Identity and Access Management (IAM)](https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html) helps you securely manage access to your AWS resources by controlling who is authenticated and authorized to use them.
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) is a cloud-based object storage service that helps you store, protect, and retrieve any amount of data.

## Best practices
<a name="copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli-best-practices"></a>
+ [Security best practices in IAM](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html) (IAM documentation)
+ [Applying least-privilege permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#grant-least-privilege) (IAM documentation)

## Epics
<a name="copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli-epics"></a>

### Create an IAM user and role in the destination AWS account
<a name="create-an-iam-user-and-role-in-the-destination-aws-account"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create an IAM user and get the access key. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli.html) | AWS DevOps | 
| Create an IAM identity-based policy. | Create an IAM identity-based policy named `S3MigrationPolicy` by using the following permissions. Modify the source and destination bucket names according to your use case. This identity-based policy allows the user who is assuming this role to access the source bucket and destination bucket. For detailed instructions, see [Creating IAM policies](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create-console.html) in the IAM documentation. <pre>{<br />    "Version": "2012-10-17",		 	 	 <br />    "Statement": [<br />        {<br />            "Effect": "Allow",<br />            "Action": [<br />                "s3:ListBucket",<br />                "s3:ListObjectsV2",<br />                "s3:GetObject",<br />                "s3:GetObjectTagging",<br />                "s3:GetObjectVersion",<br />                "s3:GetObjectVersionTagging"<br />            ],<br />            "Resource": [<br />                "arn:aws:s3:::amazon-s3-demo-source-bucket",<br />                "arn:aws:s3:::amazon-s3-demo-source-bucket/*"<br />            ]<br />        },<br />        {<br />            "Effect": "Allow",<br />            "Action": [<br />                "s3:ListBucket",<br />                "s3:PutObject",<br />                "s3:PutObjectAcl",<br />                "s3:PutObjectTagging",<br />                "s3:GetObjectTagging",<br />                "s3:ListObjectsV2",<br />                "s3:GetObjectVersion",<br />                "s3:GetObjectVersionTagging"<br />            ],<br />            "Resource": [<br />                "arn:aws:s3:::amazon-s3-demo-destination-bucket",<br />                "arn:aws:s3:::amazon-s3-demo-destination-bucket/*"<br />            ]<br />        }<br />    ]<br />}</pre> | AWS DevOps | 
| Create an IAM role. | Create an IAM role named `S3MigrationRole` by using the following trust policy. Modify the Amazon Resource Name (ARN) of the destination IAM role or user name in the trust policy according to your use case. This trust policy allows the newly created IAM user to assume `S3MigrationRole`. Attach the previously created `S3MigrationPolicy`. For detailed steps, see [Creating a role to delegate permissions to an IAM user](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user.html) in the IAM documentation.<pre>{<br />    "Version": "2012-10-17",		 	 	 <br />    "Statement": [<br />        {<br />            "Effect": "Allow",<br />            "Principal": {<br />                "AWS": "arn:aws:iam::<destination_account>:user/<user_name>"<br />            },<br />            "Action": "sts:AssumeRole",<br />            "Condition": {}<br />        }<br />    ]<br />}</pre> | AWS DevOps | 
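The two policy documents above can also be generated programmatically, which keeps the bucket ARNs and the principal ARN consistent. The following Python sketch is illustrative only — the bucket names, account ID, and user name are placeholders from the steps above, not values your account will have:

```python
import json

# Actions mirror the S3MigrationPolicy statements shown above.
READ_ACTIONS = ["s3:ListBucket", "s3:ListObjectsV2", "s3:GetObject",
                "s3:GetObjectTagging", "s3:GetObjectVersion", "s3:GetObjectVersionTagging"]
WRITE_ACTIONS = READ_ACTIONS + ["s3:PutObject", "s3:PutObjectAcl", "s3:PutObjectTagging"]

def bucket_statement(actions, bucket):
    """Allow the given actions on a bucket and on every object in it."""
    return {"Effect": "Allow", "Action": list(actions),
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]}

def migration_policy(src_bucket, dst_bucket):
    """Build the identity-based policy: read the source bucket, write the destination bucket."""
    return {"Version": "2012-10-17",
            "Statement": [bucket_statement(READ_ACTIONS, src_bucket),
                          bucket_statement(WRITE_ACTIONS, dst_bucket)]}

def trust_policy(account_id, user_name):
    """Build the trust policy that lets the destination-account user assume the role."""
    return {"Version": "2012-10-17",
            "Statement": [{"Effect": "Allow",
                           "Principal": {"AWS": f"arn:aws:iam::{account_id}:user/{user_name}"},
                           "Action": "sts:AssumeRole",
                           "Condition": {}}]}

print(json.dumps(migration_policy("amazon-s3-demo-source-bucket",
                                  "amazon-s3-demo-destination-bucket"), indent=4))
```

You can paste the printed JSON into the IAM console when creating `S3MigrationPolicy`, and the `trust_policy` output when creating `S3MigrationRole`.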

### Create and attach the Amazon S3 bucket policy in the source account
<a name="create-and-attach-the-s3-bucket-policy-in-the-source-account"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create and attach an Amazon S3 bucket policy. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli.html) | Cloud administrator | 

### Configure the destination Amazon S3 bucket
<a name="configure-the-destination-s3-bucket"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create a destination Amazon S3 bucket. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli.html) | Cloud administrator | 

### Copy data to the destination Amazon S3 bucket
<a name="copy-data-to-the-destination-s3-bucket"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Configure the AWS CLI with the newly created user credentials. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli.html) | AWS DevOps | 
| Assume the Amazon S3 migration role. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli.html) For more information, see [How do I use the AWS CLI to assume an IAM role?](https://repost.aws/knowledge-center/iam-assume-role-cli) | AWS administrator | 
| Copy and synchronize data from the source bucket to the destination bucket. | When you have assumed the role `S3MigrationRole`, you can copy the data by using the [copy](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/cp.html) (`cp`) or [synchronize](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/sync.html) (`sync`) command. Copy:<pre>aws s3 cp s3://amazon-s3-demo-source-bucket/ \<br />    s3://amazon-s3-demo-destination-bucket/ \<br />    --recursive --source-region SOURCE-REGION-NAME --region DESTINATION-REGION-NAME</pre>Synchronize:<pre>aws s3 sync s3://amazon-s3-demo-source-bucket/ \<br />    s3://amazon-s3-demo-destination-bucket/ \<br />    --source-region SOURCE-REGION-NAME --region DESTINATION-REGION-NAME</pre> | Cloud administrator | 
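After `aws sts assume-role` returns, the temporary credentials must be exported as environment variables before you run `cp` or `sync`. The following Python sketch only illustrates the shape of that step: it parses a sample response with the same structure as the STS output (all key and token values here are fabricated placeholders) and prints the export commands the AWS CLI reads.

```python
def exports_from_sts(response):
    """Map an assume-role response's Credentials block to the env vars the AWS CLI reads."""
    creds = response["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }

# Fabricated sample with the same shape as `aws sts assume-role` JSON output.
sample = {"Credentials": {"AccessKeyId": "ASIAEXAMPLE",
                          "SecretAccessKey": "examplesecret",
                          "SessionToken": "exampletoken",
                          "Expiration": "2024-01-01T00:00:00Z"}}

for name, value in exports_from_sts(sample).items():
    print(f"export {name}={value}")
```

In practice you would pipe the real STS response into a step like this (or use `jq`) rather than hard-coding values.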

## Troubleshooting
<a name="copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli-troubleshooting"></a>


| Issue | Solution | 
| --- | --- | 
| An error occurred (`AccessDenied`) when calling the `ListObjects` operation | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli.html) | 

## Related resources
<a name="copy-data-from-an-s3-bucket-to-another-account-and-region-by-using-the-aws-cli-resources"></a>
+ [Creating an Amazon S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) (Amazon S3 documentation)
+ [Amazon S3 bucket policies and user policies](https://docs.aws.amazon.com/AmazonS3/latest/dev/using-iam-policies.html) (Amazon S3 documentation)
+ [IAM identities (users, groups, and roles)](https://docs.aws.amazon.com/IAM/latest/UserGuide/id.html?icmpid=docs_iam_console) (IAM documentation)
+ [cp command](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/cp.html) (AWS CLI documentation)
+ [sync command](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/sync.html) (AWS CLI documentation)

# Enable DB2 log archiving directly to Amazon S3 in an IBM Db2 database
<a name="enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database"></a>

*Ambarish Satarkar, Amazon Web Services*

## Summary
<a name="enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database-summary"></a>

This pattern describes how to use Amazon Simple Storage Service (Amazon S3) as catalog storage for archive logs that are generated by IBM Db2, without using a staging area. 

You can specify [DB2REMOTE](https://www.ibm.com/docs/en/db2/12.1.0?topic=storage-db2remote-identifiers) Amazon S3 storage for the [logarchmeth1](https://www.ibm.com/docs/en/db2/12.1.0?topic=parameters-logarchmeth1-primary-log-archive-method) and [logarchmeth2](https://www.ibm.com/docs/en/db2/12.1.0?topic=parameters-logarchmeth2-secondary-log-archive-method) log archive method configuration parameters. You can use the `logarchmeth1` parameter to specify the primary destination for logs that are archived from the current log path. With this capability, you can archive and retrieve transaction logs to and from Amazon S3 directly, without using a staging area.

[Amazon S3](https://aws.amazon.com/s3/) stores the data that’s uploaded to it across at least three devices in a single AWS Region. Customers of all sizes and industries use Amazon S3 to store enterprise backups because of its high availability, flexible storage options, lifecycle policies, and security.

## Prerequisites and limitations
<a name="enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database-prereqs"></a>

**Prerequisites**
+ An active AWS account.
+ An IBM Db2 database running on an Amazon Elastic Compute Cloud (Amazon EC2) instance.
+ AWS Command Line Interface (AWS CLI), installed.
+ [libcurl](https://curl.se/libcurl/) and [libxml2](https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home) installed on the Db2 EC2 instance.

**Limitations**
+ Only [Db2 11.5.7](https://www.ibm.com/docs/en/db2/11.5.x?topic=new-1157) or later allows log archiving directly to Amazon S3 storage.
+ Some AWS services aren’t available in all AWS Regions. For Region availability, see [AWS Services by Region](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/). For specific endpoints, see [Service endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/aws-service-information.html), and choose the link for the service.
+ In all configurations, the following limitations exist for Amazon S3:
  + AWS Key Management Service (AWS KMS) is not supported.
  + AWS role-based (AWS Identity and Access Management (IAM)) or token-based (AWS Security Token Service (AWS STS)) credentials are not supported.

**Product versions**
+ AWS CLI version 2 or later
+ IBM Db2 11.5.7 or later
+ Linux SUSE Linux Enterprise Server (SLES) 11 or later
+ Red Hat Enterprise Linux (RHEL) 6 or later
+ Windows Server 2008 R2, 2012 (R2), 2016, or 2019

## Architecture
<a name="enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database-architecture"></a>

The following diagram shows the components and workflow for this pattern.

![\[Workflow to use Amazon S3 for catalog storage for archive logs generated by Db2.\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/images/pattern-img/7a10333e-07be-4144-9913-45c60a2f51ea/images/0437d348-1688-4c3e-9aa5-43535afe08c6.png)


The architecture on the AWS Cloud includes the following:
+ **Virtual private cloud (VPC)** – A logically isolated section of the AWS Cloud where you launch resources.
+ **Availability Zone** – Provides high availability by running the Db2 LUW (Linux, Unix, Windows) workload in an isolated data center within the AWS Region.
+ **Public subnet** – Provides RDP (Remote Desktop Protocol) access for administrators and internet connectivity through a NAT gateway.
+ **Private subnet** – Hosts the Db2 LUW database. The Db2 LUW instance is configured with the `LOGARCHMETH1` parameter. The parameter writes database log archive files directly to an Amazon S3 path through the gateway endpoint.

The following AWS services provide support:
+ **Amazon S3** – Serves as the durable, scalable storage location for Db2 log archive files.
+ **Amazon Elastic File System (Amazon EFS)** – Provides a shared, fully managed file system that Db2 can use for database backups and staging. Db2 can also use Amazon EFS as a mount point for log files before they are archived to Amazon S3.
+ **Amazon CloudWatch** – Collects and monitors metrics, logs, and events from Db2 and the underlying EC2 instances. You can use CloudWatch to create alarms, dashboards, and automated responses to performance or availability issues.

**Automation and scale**
+ This pattern provides a fully automated solution for storing Db2 archive logs.
+ You can use the same Amazon S3 bucket to enable log archiving for multiple Db2 databases.

## Tools
<a name="enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database-tools"></a>

**AWS services**
+ [Amazon CloudWatch](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html) helps you monitor the metrics of your AWS resources and the applications you run on AWS in real time.
+ [AWS Command Line Interface (AWS CLI)](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html) is an open source tool that helps you interact with AWS services through commands in your command-line shell.
+ [Amazon Elastic Compute Cloud (Amazon EC2)](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html) provides scalable computing capacity in the AWS Cloud. You can launch as many virtual servers as you need and quickly scale them up or down.
+ [Amazon Elastic File System (Amazon EFS)](https://docs.aws.amazon.com/efs/latest/ug/whatisefs.html) helps you create and configure shared file systems in the AWS Cloud.
+ [AWS IAM Identity Center](https://docs.aws.amazon.com/singlesignon/latest/userguide/what-is.html) helps you centrally manage single sign-on (SSO) access to all of your AWS accounts and cloud applications.
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) is a cloud-based object storage service that helps you store, protect, and retrieve any amount of data.
+ [Amazon Virtual Private Cloud (Amazon VPC)](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html) helps you launch AWS resources into a virtual network that you’ve defined. This virtual network resembles a traditional network that you’d operate in your own data center, with the benefits of using the scalable infrastructure of AWS.

**Other tools**
+ [libcurl](https://curl.se/libcurl/) is a free client-side URL transfer library.
+ [libxml2](https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home) is a free XML C parser and toolkit.

## Best practices
<a name="enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database-best-practices"></a>
+ Follow the principle of least privilege and grant the minimum permissions required to perform a task. For more information, see [Grant least privilege](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html#grant-least-priv) and [Security best practices](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html) in the IAM documentation.

## Epics
<a name="enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database-epics"></a>

### Configure AWS services
<a name="configure-aws-services"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Set up the AWS CLI. | To [download and install the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html), use the following commands:<pre>curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"<br />unzip awscliv2.zip<br />sudo ./aws/install</pre> | AWS systems administrator, AWS administrator | 
| Configure the AWS CLI. | To [configure the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html), use the following commands:<pre>$ aws configure<br />AWS Access Key ID [None]: *******************************<br />AWS Secret Access Key [None]: ***************************<br />Default region name [None]: <aws-region><br />Default output format [None]: text</pre> | AWS systems administrator, AWS administrator | 
| Create an IAM user. | To create an IAM user to use later for the Db2 database connection with Amazon S3, use the following command: `aws iam create-user --user-name <unique username>`. Following is an example of the command: `aws iam create-user --user-name db_backup_user`. This scenario requires IAM users with programmatic access and long-term credentials, which presents a security risk. To mitigate this risk, we recommend that you provide these users with only the permissions they require to perform the task and that you remove these users when they are no longer needed. Access keys can be updated if necessary. For more information, see [AWS security credentials](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html#access-keys-and-secret-access-keys) and [Manage access keys for IAM users](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_RotateAccessKey) in the IAM documentation. | AWS systems administrator | 
| Create an Amazon S3 bucket. | To create an Amazon S3 bucket for storing the database backup, use the following command: `aws s3api create-bucket --bucket <unique bucket name> --region <aws region>`. Following is an example command: `aws s3api create-bucket --bucket myfirstbucket --region af-south-1`. | AWS systems administrator | 
| Authorize the IAM user. | To grant the newly created IAM user the required Amazon S3 permissions, use the following steps: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database.html) | AWS systems administrator, AWS administrator | 
| Create an access key. | To generate an access key to programmatically access Amazon S3 from the DB2 instance, use the following command: `aws iam create-access-key --user-name <username>`. Following is an example of the command: `aws iam create-access-key --user-name db_backup_user`. This scenario requires IAM users with programmatic access and long-term credentials, which presents a security risk. To mitigate this risk, we recommend that you provide these users with only the permissions they require to perform the task and that you remove these users when they are no longer needed. Access keys can be updated if necessary. For more information, see [AWS security credentials](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html#access-keys-and-secret-access-keys) and [Manage access keys for IAM users](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_RotateAccessKey) in the IAM documentation. | AWS systems administrator | 
| Create a PKCS keystore. | To create a PKCS keystore that stores the access key and secret access key used to transfer data to Amazon S3, use the following command: <pre>gsk8capicmd_64 -keydb -create -db "/db2/db2<sid>/.keystore/db6-s3.p12" -pw "<password>" -type pkcs12 -stash</pre> | AWS systems administrator | 
| Configure DB2 to use the keystore. | To configure DB2 to use the keystore with the `keystore_location` and `keystore_type` parameters, use the following command:<pre>db2 "update dbm cfg using keystore_location /db2/db2<sid>/.keystore/db6-s3.p12 keystore_type pkcs12"</pre> | AWS systems administrator | 
| Create a DB2 storage access alias. | A storage access alias specifies the Amazon S3 bucket to use. It also provides the connection details such as the username and password that are stored in the local keystore in an encrypted format. For more information, see [CATALOG STORAGE ACCESS command](https://www.ibm.com/docs/en/db2/12.1.0?topic=commands-catalog-storage-access) in the IBM Db2 documentation.To create a storage access alias, use the following syntax:<pre>db2 "catalog storage access alias <alias_name> vendor S3 server <S3 endpoint> user '<access_key>' password '<secret_access_key>' container '<bucket_name>'"</pre>Following is an example:<pre>db2 "catalog storage access alias DB2BKPS3 vendor S3 server s3.us-west-2.amazonaws.com user '*******************' password '*********************' container 'myfirstbucket'"</pre> | AWS systems administrator | 
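The `catalog storage access alias` command is easy to malform because it nests single quotes inside a double-quoted `db2` invocation. As a sketch only, the following Python snippet assembles the command from its parts so the quoting stays consistent; the alias, endpoint, keys, and bucket here are placeholder values, not credentials to use:

```python
def catalog_alias_command(alias, endpoint, access_key, secret_key, bucket):
    """Assemble the Db2 CATALOG STORAGE ACCESS command with consistent quoting."""
    return (f"db2 \"catalog storage access alias {alias} vendor S3 "
            f"server {endpoint} user '{access_key}' "
            f"password '{secret_key}' container '{bucket}'\"")

# Placeholder values modeled on the example in the table above.
cmd = catalog_alias_command("DB2BKPS3", "s3.us-west-2.amazonaws.com",
                            "AKIAEXAMPLE", "examplesecret", "myfirstbucket")
print(cmd)
```

Substitute the access key and secret access key generated earlier before running the printed command as the Db2 instance owner.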

### Update logarchmeth1 location in DB2 and restart DB2
<a name="update-logarchmeth1-location-in-db2-and-restart-db2"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Update the `LOGARCHMETH1` location. | To point the `LOGARCHMETH1` database parameter at the storage access alias that you defined earlier, use the following command:<pre>db2 update db cfg for <DBNAME> using LOGARCHMETH1 'DB2REMOTE://<storage_alias_name>//<sub folder>'</pre>To separate the logs from other files, specify a subdirectory (that is, an Amazon S3 bucket prefix) such as `TESTDB_LOGS` in which to save the logs within the S3 bucket. Following is an example:<pre>db2 update db cfg for ABC using LOGARCHMETH1 'DB2REMOTE://DB2BKPS3//TESTDB_LOGS/'</pre>You should see the following message: `DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.` | AWS systems administrator | 
| Restart DB2. | Restart the DB2 instance after reconfiguring it for log archiving. However, if `LOGARCHMETH1` was previously set to any file system location, a restart is not required. | AWS administrator, AWS systems administrator | 
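The `DB2REMOTE` target string has a specific shape: the alias, a double slash, and then an optional subdirectory prefix ending in a slash. A minimal sketch that composes and sanity-checks that string (the alias and prefix follow the example above, and the format check is an illustrative assumption, not a Db2 validation):

```python
import re

def db2remote_target(alias, prefix=""):
    """Compose the DB2REMOTE URI used as the LOGARCHMETH1 value."""
    target = f"DB2REMOTE://{alias}//{prefix}"
    # Alias, double slash, then an optional prefix that ends in a slash.
    assert re.fullmatch(r"DB2REMOTE://\w+//(?:[\w/]+/)?", target), target
    return target

print(db2remote_target("DB2BKPS3", "TESTDB_LOGS/"))
```

Using a distinct prefix per database keeps the archived logs of multiple databases separated inside one shared bucket.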

### Check the archive log path in Amazon S3 and db2diag.log
<a name="check-the-archive-log-path-in-s3-and-db2diag-log"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Check the archive log in Amazon S3. | At this point, your database is fully configured to archive the transaction logs directly to Amazon S3 storage. To confirm the configuration, run transactional activity on the database so that log space is consumed and archived. Then, check the archive logs in Amazon S3. | AWS administrator, AWS systems administrator | 
| Check archive log configuration in `db2diag.log`. | After you check the archive log in Amazon S3, look for messages like the following in the DB2 diagnostic log `db2diag.log`:`MESSAGE : ADM1846I  Completed archive for log file "S0000079.LOG" to DB2REMOTE://<AWS S3 Bucket Name>/<SID>/log1/db2<sid>/<SID>/NODE0000/LOGSTREAM0000/C0000001/ from /db2/<SID>/log_dir/NODE0000/LOGSTREAM0000/.` `MESSAGE : ADM1846I  Completed archive for log file "S0000080.LOG" to DB2REMOTE://<AWS S3 Bucket Name>/<SID>/log1/db2<sid>/<SID>/NODE0000/LOGSTREAM0000/C0000001/ from /db2/<SID>/log_dir/NODE0000/LOGSTREAM0000/.`These messages confirm that the closed DB2 transaction log files are being archived to the (remote) Amazon S3 storage. | AWS systems administrator | 
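The ADM1846I confirmation lines can also be scanned mechanically. The following Python sketch pulls the archived log file name and its `DB2REMOTE` destination out of `db2diag.log` text; the sample line is modeled on the message above with placeholder alias and prefix values, and the regular expression is a best-effort assumption about the message layout:

```python
import re

# Match: ADM1846I Completed archive for log file "SNNNNNNN.LOG" to DB2REMOTE://...
ADM1846I = re.compile(
    r'ADM1846I\s+Completed archive for log file "?(?P<log>S\d+\.LOG)"?\s+to\s+'
    r'(?P<dest>DB2REMOTE://\S+)')

def archived_logs(diag_text):
    """Return (log file, destination) pairs for every ADM1846I line in the diagnostic log."""
    return [(m.group("log"), m.group("dest")) for m in ADM1846I.finditer(diag_text)]

sample = ('MESSAGE : ADM1846I  Completed archive for log file "S0000079.LOG" to '
          'DB2REMOTE://DB2BKPS3//TESTDB_LOGS/NODE0000/LOGSTREAM0000/C0000001/ '
          'from /db2/ABC/log_dir/NODE0000/LOGSTREAM0000/.')
print(archived_logs(sample))
```

Running a scan like this over the full diagnostic log gives a quick inventory of which log files have reached Amazon S3.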

## Related resources
<a name="enable-db2-logarchive-directly-to-amazon-s3-in-ibm-db2-database-resources"></a>

**AWS service documentation**
+ [AWS security credentials ](https://docs.aws.amazon.com/IAM/latest/UserGuide/security-creds.html#access-keys-and-secret-access-keys)(IAM documentation)
+ [Grant least privilege](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html#grant-least-priv) (IAM documentation)
+ [Manage access keys for IAM users](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_RotateAccessKey) (IAM documentation)
+ [Security best practices in IAM](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html) (IAM documentation)

**IBM resources**
+ [IBM Db2 Database](https://www.ibm.com/products/db2-database)
+ [logarchmeth1 - Primary log archive method configuration parameter](https://www.ibm.com/docs/en/db2/12.1.0?topic=parameters-logarchmeth1-primary-log-archive-method)
+ [logarchmeth2 - Secondary log archive method configuration parameter](https://www.ibm.com/docs/en/db2/12.1.0?topic=parameters-logarchmeth2-secondary-log-archive-method)
+ [Remote storage](https://www.ibm.com/docs/en/db2/12.1.0?topic=databases-remote-storage) 

# Migrate data from an on-premises Hadoop environment to Amazon S3 using DistCp with AWS PrivateLink for Amazon S3
<a name="migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3"></a>

*Jason Owens, Andres Cantor, Jeff Klopfenstein, Bruno Rocha Oliveira, and Samuel Schmidt, Amazon Web Services*

## Summary
<a name="migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3-summary"></a>

This pattern demonstrates how to migrate nearly any amount of data from an on-premises Apache Hadoop environment to the Amazon Web Services (AWS) Cloud by using the Apache open-source tool [DistCp](https://hadoop.apache.org/docs/r1.2.1/distcp.html) with AWS PrivateLink for Amazon Simple Storage Service (Amazon S3). Instead of using the public internet or a proxy solution to migrate data, you can use [AWS PrivateLink for Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/privatelink-interface-endpoints.html) to migrate data to Amazon S3 over a private network connection between your on-premises data center and an Amazon Virtual Private Cloud (Amazon VPC). If you use DNS entries in Amazon Route 53 or add entries in the **/etc/hosts** file in all nodes of your on-premises Hadoop cluster, then you are automatically directed to the correct interface endpoint.
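If you choose the **/etc/hosts** route, each Hadoop node needs entries that map the bucket's Regional S3 hostname to the interface endpoint's IP addresses. The following Python sketch only renders those lines; the bucket name and Region are placeholders, and the IP addresses reuse the illustrative endpoint IPs shown later in this pattern:

```python
def hosts_entries(endpoint_ips, bucket, region):
    """Render /etc/hosts lines that resolve the bucket's S3 hostname to the endpoint IPs."""
    host = f"{bucket}.s3.{region}.amazonaws.com"
    return [f"{ip} {host}" for ip in endpoint_ips]

# Placeholder interface-endpoint IPs (one per subnet / Availability Zone).
lines = hosts_entries(["10.104.88.6", "10.104.71.141"], "my-example-bucket", "us-east-2")
print("\n".join(lines))
```

Append the rendered lines to **/etc/hosts** on every node, or create the equivalent records in Amazon Route 53 instead.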

This guide provides instructions for using DistCp for migrating data to the AWS Cloud. DistCp is the most commonly used tool, but other migration tools are available. For example, you can use offline AWS tools like [AWS Snowball](https://docs.aws.amazon.com/whitepapers/latest/how-aws-pricing-works/aws-snow-family.html#aws-snowball) or [AWS Snowmobile](https://docs.aws.amazon.com/whitepapers/latest/how-aws-pricing-works/aws-snow-family.html#aws-snowmobile), or online AWS tools like [AWS Storage Gateway](https://docs.aws.amazon.com/storagegateway/latest/userguide/migrate-data.html) or [AWS DataSync](https://aws.amazon.com/about-aws/whats-new/2021/11/aws-datasync-hadoop-aws-storage-services/). Additionally, you can use other open-source tools like [Apache NiFi](https://nifi.apache.org/).

## Prerequisites and limitations
<a name="migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3-prereqs"></a>

**Prerequisites**
+ An active AWS account with a private network connection between your on-premises data center and the AWS Cloud
+ [Hadoop](https://hadoop.apache.org/releases.html), installed on premises with [DistCp](https://hadoop.apache.org/docs/r1.2.1/distcp.html)
+ A Hadoop user with access to the migration data in the Hadoop Distributed File System (HDFS)
+ AWS Command Line Interface (AWS CLI), [installed](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html)
+ [Permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_rw-bucket-console.html) to put objects into an S3 bucket

**Limitations**

Virtual private cloud (VPC) limitations apply to AWS PrivateLink for Amazon S3. For more information, see [Interface endpoint properties and limitations](https://docs.aws.amazon.com/vpc/latest/privatelink/vpce-interface.html#vpce-interface-limitations) and [AWS PrivateLink quotas](https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-limits-endpoints.html) (AWS PrivateLink documentation).

AWS PrivateLink for Amazon S3 doesn't support the following:
+ [Federal Information Processing Standard (FIPS) endpoints](https://aws.amazon.com/compliance/fips/)
+ [Website endpoints](https://docs.aws.amazon.com/AmazonS3/latest/userguide/WebsiteEndpoints.html)
+ [Legacy global endpoints](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#deprecated-global-endpoint)

## Architecture
<a name="migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3-architecture"></a>

**Source technology stack**
+ Hadoop cluster with DistCp installed

**Target technology stack**
+ Amazon S3
+ Amazon VPC

**Target architecture**

![\[Hadoop cluster with DistCp copies data from on-premises environment through Direct Connect to S3.\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/images/pattern-img/8d2b47ae-e854-4e5d-8f19-b9c2606f2c59/images/b8a249bd-307b-41ec-b939-5039d0ae7123.png)


The diagram shows how the Hadoop administrator uses DistCp to copy data from an on-premises environment through a private network connection, such as AWS Direct Connect, to Amazon S3 through an Amazon S3 interface endpoint.

## Tools
<a name="migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3-tools"></a>

**AWS services**
+ [AWS Identity and Access Management (IAM)](https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html) helps you securely manage access to your AWS resources by controlling who is authenticated and authorized to use them.
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) is a cloud-based object storage service that helps you store, protect, and retrieve any amount of data.
+ [Amazon Virtual Private Cloud (Amazon VPC)](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html) helps you launch AWS resources into a virtual network that you’ve defined. This virtual network resembles a traditional network that you’d operate in your own data center, with the benefits of using the scalable infrastructure of AWS.

**Other tools**
+ [Apache Hadoop DistCp](https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html) (distributed copy) is a tool for large inter-cluster and intra-cluster copying. DistCp uses Apache MapReduce for distribution, error handling and recovery, and reporting.

## Epics
<a name="migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3-epics"></a>

### Migrate data to the AWS Cloud
<a name="migrate-data-to-the-aws-cloud"></a>


| Task | Description | Skills required | 
| --- | --- | --- | 
| Create an endpoint for AWS PrivateLink for Amazon S3. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3.html) | AWS administrator | 
| Verify the endpoints and find the DNS entries. | [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3.html) | AWS administrator | 
| Check the firewall rules and routing configurations. | To confirm that your firewall rules are open and that your networking configuration is correctly set up, use Telnet to test the endpoint on port 443. For example:<pre>$ telnet vpce-<your-VPC-endpoint-ID>.s3.us-east-2.vpce.amazonaws.com 443<br /><br />Trying 10.104.88.6...<br /><br />Connected to vpce-<your-VPC-endpoint-ID>.s3.us-east-2.vpce.amazonaws.com.<br /><br />...<br /><br />$ telnet vpce-<your-VPC-endpoint-ID>.s3.us-east-2.vpce.amazonaws.com 443<br /><br />Trying 10.104.71.141...<br /><br />Connected to vpce-<your-VPC-endpoint-ID>.s3.us-east-2.vpce.amazonaws.com.</pre>If you use the Regional entry, a successful test shows that the DNS is alternating between the two IP addresses that you can see on the **Subnets** tab for your selected endpoint in the Amazon VPC console. | Network administrator, AWS administrator | 
| Configure the name resolution. | You must configure the name resolution to allow Hadoop to access the Amazon S3 interface endpoint. You can’t use the endpoint name itself. Instead, you must resolve `<your-bucket-name>.s3.<your-aws-region>.amazonaws.com` or `*.s3.<your-aws-region>.amazonaws.com`. For more information on this naming limitation, see [Introducing the Hadoop S3A client](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#Introducing_the_Hadoop_S3A_client) (Hadoop website). Choose one of the following configuration options:[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/migrate-data-from-an-on-premises-hadoop-environment-to-amazon-s3-using-distcp-with-aws-privatelink-for-amazon-s3.html) | AWS administrator | 
| Configure authentication for Amazon S3. | To authenticate to Amazon S3 through Hadoop, we recommend that you export temporary role credentials to the Hadoop environment. For more information, see [Authenticating with S3](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#Authenticating_with_S3) (Hadoop website). For long-running jobs, you can create a user and assign a policy that has permissions to put data into an S3 bucket only. The access key and secret key can be stored on Hadoop, accessible only to the DistCp job itself and to the Hadoop administrator. For more information on storing secrets, see [Storing secrets with Hadoop Credential Providers](https://hadoop.apache.org/docs/r3.1.1/hadoop-aws/tools/hadoop-aws/index.html#hadoop_credential_providers) (Hadoop website). For more information on other authentication methods, see [How to get credentials of an IAM role for use with CLI access to an AWS account](https://docs.aws.amazon.com/singlesignon/latest/userguide/howtogetcredentials.html) in the documentation for AWS IAM Identity Center (successor to AWS Single Sign-On). To use temporary credentials, add the temporary credentials to your credentials file, or run the following commands to export the credentials to your environment:<pre>export AWS_SESSION_TOKEN=SECRET-SESSION-TOKEN<br />export AWS_ACCESS_KEY_ID=SESSION-ACCESS-KEY<br />export AWS_SECRET_ACCESS_KEY=SESSION-SECRET-KEY</pre>If you have a traditional access key and secret key combination, run the following commands:<pre>export AWS_ACCESS_KEY_ID=my.aws.key<br />export AWS_SECRET_ACCESS_KEY=my.secret.key</pre>If you use an access key and secret key combination, then change the credentials provider in the DistCp commands from `"org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"` to `"org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"`. | AWS administrator | 
| Transfer data by using DistCp. | To use DistCp to transfer data, run the following commands:<pre>hadoop distcp -Dfs.s3a.aws.credentials.provider=\<br />"org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider" \<br />-Dfs.s3a.access.key="${AWS_ACCESS_KEY_ID}" \<br />-Dfs.s3a.secret.key="${AWS_SECRET_ACCESS_KEY}" \<br />-Dfs.s3a.session.token="${AWS_SESSION_TOKEN}" \<br />-Dfs.s3a.path.style.access=true \<br />-Dfs.s3a.connection.ssl.enabled=true \<br />-Dfs.s3a.endpoint=s3.<your-aws-region>.amazonaws.com \<br />hdfs:///user/root/ s3a://<your-bucket-name></pre>The AWS Region of the endpoint isn’t automatically discovered when you use the DistCp command with AWS PrivateLink for Amazon S3. Hadoop 3.3.2 and later versions resolve this issue by providing an option to explicitly set the AWS Region of the S3 bucket. For more information, see [S3A to add option fs.s3a.endpoint.region to set AWS region](https://issues.apache.org/jira/browse/HADOOP-17705) (Hadoop website). For more information about additional S3A providers, see [General S3A Client configuration](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#General_S3A_Client_configuration) (Hadoop website). For example, if you use encryption, you can add the following option to the preceding series of commands, depending on your type of encryption:<pre>-Dfs.s3a.server-side-encryption-algorithm=AES256 [or SSE-C or SSE-KMS]</pre>To use the interface endpoint with S3A, you must create a DNS alias entry that maps the S3 Regional name (for example, `s3.<your-aws-region>.amazonaws.com`) to the interface endpoint. See the *Configure authentication for Amazon S3* section for instructions. This workaround is required for versions earlier than Hadoop 3.3.2; later versions of S3A don’t require it. If you have signature issues with Amazon S3, add an option to use Signature Version 4 (SigV4) signing:<pre>-Dmapreduce.map.java.opts="-Dcom.amazonaws.services.s3.enableV4=true"</pre> | Migration engineer, AWS administrator | 
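The DNS alias workaround described in the preceding step can be sketched as a hosts-file entry on each Hadoop node. The IP address and Region below are placeholders; substitute the private IP address of your interface endpoint and your own Region:

```
# /etc/hosts entry that maps the S3 Regional hostname to the private
# IP address of the interface endpoint (both values are placeholders)
10.0.1.25   s3.us-east-1.amazonaws.com
```

For more than a handful of nodes, a private DNS zone or a conditional forwarder is usually easier to maintain than editing `/etc/hosts` on every host.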

# More patterns
<a name="storageandbackup-more-patterns-pattern-list"></a>

**Topics**
+ [Access AWS services from IBM z/OS by installing the AWS CLI](access-aws-services-from-ibm-z-os-by-installing-aws-cli.md)
+ [Automatically archive items to Amazon S3 using DynamoDB TTL](automatically-archive-items-to-amazon-s3-using-dynamodb-ttl.md)
+ [Automatically back up SAP HANA databases using Systems Manager and EventBridge](automatically-back-up-sap-hana-databases-using-systems-manager-and-eventbridge.md)
+ [Back up and archive mainframe data to Amazon S3 using BMC AMI Cloud Data](back-up-and-archive-mainframe-data-to-amazon-s3-using-bmc-ami-cloud-data.md)
+ [Build an ETL service pipeline to load data incrementally from Amazon S3 to Amazon Redshift using AWS Glue](build-an-etl-service-pipeline-to-load-data-incrementally-from-amazon-s3-to-amazon-redshift-using-aws-glue.md)
+ [Configure model invocation logging in Amazon Bedrock by using AWS CloudFormation](configure-bedrock-invocation-logging-cloudformation.md)
+ [Convert and unpack EBCDIC data to ASCII on AWS by using Python](convert-and-unpack-ebcdic-data-to-ascii-on-aws-by-using-python.md)
+ [Convert VARCHAR2(1) data type for Oracle to Boolean data type for Amazon Aurora PostgreSQL](convert-varchar2-1-data-type-for-oracle-to-boolean-data-type-for-amazon-aurora-postgresql.md)
+ [Create an Amazon ECS task definition and mount a file system on EC2 instances using Amazon EFS](create-an-amazon-ecs-task-definition-and-mount-a-file-system-on-ec2-instances-using-amazon-efs.md)
+ [Deliver DynamoDB records to Amazon S3 using Kinesis Data Streams and Firehose with AWS CDK](deliver-dynamodb-records-to-amazon-s3-using-kinesis-data-streams-and-amazon-data-firehose-with-aws-cdk.md)
+ [Deploy a Lustre file system for high-performance data processing by using Terraform and DRA](deploy-lustre-file-system-for-high-performance-data-processing-with-terraform-dra.md)
+ [Estimate storage costs for an Amazon DynamoDB table](estimate-storage-costs-for-an-amazon-dynamodb-table.md)
+ [Identify public Amazon S3 buckets in AWS Organizations by using Security Hub CSPM](identify-public-s3-buckets-in-aws-organizations-using-security-hub.md)
+ [Migrate an on-premises SFTP server to AWS using AWS Transfer for SFTP](migrate-an-on-premises-sftp-server-to-aws-using-aws-transfer-for-sftp.md)
+ [Migrate an Oracle partitioned table to PostgreSQL by using AWS DMS](migrate-an-oracle-partitioned-table-to-postgresql-by-using-aws-dms.md)
+ [Migrate data from Microsoft Azure Blob to Amazon S3 by using Rclone](migrate-data-from-microsoft-azure-blob-to-amazon-s3-by-using-rclone.md)
+ [Migrate Oracle CLOB values to individual rows in PostgreSQL on AWS](migrate-oracle-clob-values-to-individual-rows-in-postgresql-on-aws.md)
+ [Migrate shared file systems in an AWS large migration](migrate-shared-file-systems-in-an-aws-large-migration.md)
+ [Migrate small sets of data from on premises to Amazon S3 using AWS SFTP](migrate-small-sets-of-data-from-on-premises-to-amazon-s3-using-aws-sftp.md)
+ [Monitor Amazon Aurora for instances without encryption](monitor-amazon-aurora-for-instances-without-encryption.md)
+ [Move mainframe files directly to Amazon S3 using Transfer Family](move-mainframe-files-directly-to-amazon-s3-using-transfer-family.md)
+ [Remove Amazon EC2 entries in the same AWS account from AWS Managed Microsoft AD by using AWS Lambda automation](remove-amazon-ec2-entries-in-the-same-aws-account-from-aws-managed-microsoft-ad.md)
+ [Run stateful workloads with persistent data storage by using Amazon EFS on Amazon EKS with AWS Fargate](run-stateful-workloads-with-persistent-data-storage-by-using-amazon-efs-on-amazon-eks-with-aws-fargate.md)
+ [Successfully import an S3 bucket as an AWS CloudFormation stack](successfully-import-an-s3-bucket-as-an-aws-cloudformation-stack.md)
+ [Synchronize data between Amazon EFS file systems in different AWS Regions by using AWS DataSync](synchronize-data-between-amazon-efs-file-systems-in-different-aws-regions-by-using-aws-datasync.md)
+ [View EBS snapshot details for your AWS account or organization](view-ebs-snapshot-details-for-your-aws-account-or-organization.md)