Automate backups for Amazon RDS for PostgreSQL DB instances by using AWS Batch - AWS Prescriptive Guidance

Automate backups for Amazon RDS for PostgreSQL DB instances by using AWS Batch

Created by Kirankumar Chandrashekar (AWS)

Environment: PoC or pilot

Technologies: Containers & microservices; Databases; DevOps

Workload: All other workloads

AWS services: Amazon RDS; AWS Batch; Amazon CloudWatch; AWS Lambda; Amazon S3

Summary

Backing up your PostgreSQL databases is an important task that you can typically complete with the pg_dump utility, which uses the COPY command by default to create a schema and data dump of a PostgreSQL database. However, this process can become repetitive if you require regular backups for multiple PostgreSQL databases. If your PostgreSQL databases are hosted in the cloud, you can also take advantage of the automated backup feature provided by Amazon Relational Database Service (Amazon RDS) for PostgreSQL. This pattern describes how to automate regular backups for Amazon RDS for PostgreSQL DB instances by using the pg_dump utility.

Note: The instructions assume that you're using Amazon RDS. However, you can also use this approach for PostgreSQL databases that are hosted outside Amazon RDS. To take backups, the AWS Lambda function must be able to access your databases.

A time-based Amazon CloudWatch Events event initiates a Lambda function that searches for specific backup tags applied to the metadata of the PostgreSQL DB instances on Amazon RDS. If the PostgreSQL DB instances have the bkp:AutomatedDBDump = Active tag and other required backup tags, the Lambda function submits individual jobs for each database backup to AWS Batch. 

AWS Batch processes these jobs and uploads the backup data to an Amazon Simple Storage Service (Amazon S3) bucket. This pattern uses a Dockerfile and an entrypoint.sh file to build a Docker container image that is used to make backups in the AWS Batch job. After the backup process is complete, AWS Batch records the backup details to an inventory table on Amazon DynamoDB. As an additional safeguard, a CloudWatch Events event initiates an Amazon Simple Notification Service (Amazon SNS) notification if a job fails in AWS Batch. 

Prerequisites and limitations

Prerequisites 

Architecture

Architecture diagram: backing up Amazon RDS for PostgreSQL DB instances by using the pg_dump utility.

Technology stack  

  • Amazon CloudWatch Events

  • Amazon DynamoDB

  • Amazon Elastic Container Registry (Amazon ECR)

  • Amazon RDS

  • Amazon SNS

  • Amazon S3

  • AWS Batch

  • AWS Key Management Service (AWS KMS)

  • AWS Lambda

  • AWS Secrets Manager

  • Docker

Tools

  • Amazon CloudWatch Events – CloudWatch Events delivers a near real-time stream of system events that describe changes in AWS resources.

  • Amazon DynamoDB – DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.

  • Amazon ECR – Amazon Elastic Container Registry (Amazon ECR) is a managed AWS container image registry service that is secure, scalable, and reliable.

  • Amazon RDS – Amazon Relational Database Service (Amazon RDS) is a web service that makes it easier to set up, operate, and scale a relational database in the AWS Cloud.

  • Amazon SNS – Amazon Simple Notification Service (Amazon SNS) is a managed service that provides message delivery from publishers to subscribers.

  • Amazon S3 – Amazon Simple Storage Service (Amazon S3) is storage for the internet.

  • AWS Batch – AWS Batch helps you run batch computing workloads on the AWS Cloud.

  • AWS KMS – AWS Key Management Service (AWS KMS) is a managed service that makes it easy for you to create and control the encryption keys used to encrypt your data.

  • AWS Lambda – Lambda is a compute service that helps you run code without provisioning or managing servers.

  • AWS Secrets Manager – Secrets Manager helps you replace hardcoded credentials in your code, including passwords, with an API call to Secrets Manager to retrieve the secret programmatically.

  • Docker – Docker helps developers easily pack, ship, and run any application as a lightweight, portable, and self-sufficient container.

Your PostgreSQL DB instances on Amazon RDS must have tags applied to their metadata. The Lambda function searches for these tags to identify the DB instances that should be backed up. The solution uses the following tags. For an example command that applies them, see the sketch that follows the table.

Tag | Description

bkp:AutomatedDBDump = Active | Identifies an Amazon RDS DB instance as a candidate for backups.

bkp:AutomatedBackupSecret = <secret_name> | Identifies the Secrets Manager secret that contains the Amazon RDS login credentials.

bkp:AutomatedDBDumpS3Bucket = <s3_bucket_name> | Identifies the S3 bucket to send backups to.

bkp:AutomatedDBDumpFrequency, bkp:AutomatedDBDumpTime | Identify the frequency and times when databases should be backed up.

bkp:pgdumpcommand = <pgdump_command> | Identifies the databases for which the backups need to be taken.
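
The following AWS CLI command is a minimal sketch of how these tags might be applied to a DB instance. The account ID, Region, DB instance name, secret name, bucket name, and frequency and time values are placeholder assumptions; replace them with your own values.

# Apply the backup tags to an existing Amazon RDS for PostgreSQL DB instance (example values).
aws rds add-tags-to-resource \
    --resource-name arn:aws:rds:us-east-1:111122223333:db:mydbinstance \
    --tags 'Key=bkp:AutomatedDBDump,Value=Active' \
           'Key=bkp:AutomatedBackupSecret,Value=mydb-credentials' \
           'Key=bkp:AutomatedDBDumpS3Bucket,Value=my-backup-bucket' \
           'Key=bkp:AutomatedDBDumpFrequency,Value=Daily' \
           'Key=bkp:AutomatedDBDumpTime,Value=01:00' \
           'Key=bkp:pgdumpcommand,Value=-d test:-d test1'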

Epics

Task | Description | Skills required

Create a table in DynamoDB.

Sign in to the AWS Management Console, open the Amazon DynamoDB console, and create a table. For help with this and other stories, see the Related resources section. For an example AWS CLI command, see the sketch that follows this epic.

Cloud administrator, Database administrator

Confirm that the table was created.

Run the aws dynamodb describe-table --table-name <table-name> | grep TableStatus command. If the table exists, the command returns "TableStatus": "ACTIVE".

Cloud administrator, Database administrator
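
If you prefer the AWS CLI, the following commands are a minimal sketch of this epic. The table name rds-postgres-dump-inventory and the key schema are assumptions; the key attributes are chosen to match the accountId and dumpFileUrl attributes that the sample entrypoint.sh file writes with dynamodb put-item.

# Create a hypothetical inventory table keyed on the attributes that the backup job writes.
aws dynamodb create-table \
    --table-name rds-postgres-dump-inventory \
    --attribute-definitions AttributeName=accountId,AttributeType=S AttributeName=dumpFileUrl,AttributeType=S \
    --key-schema AttributeName=accountId,KeyType=HASH AttributeName=dumpFileUrl,KeyType=RANGE \
    --billing-mode PAY_PER_REQUEST

# Confirm that the table reaches ACTIVE status.
aws dynamodb describe-table --table-name rds-postgres-dump-inventory | grep TableStatus
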
Task | Description | Skills required

Create an SNS topic.

Open the Amazon SNS console, choose Topics, and create an SNS topic with the name JobFailedAlert. Subscribe an active email address to the topic, and check your email inbox to confirm the SNS subscription email from AWS Notifications.

Cloud administrator

Create a failed job event rule for AWS Batch.

Open the Amazon CloudWatch console, choose Events, and then choose Create rule. Choose Show advanced options, and choose Edit. For Build a pattern that selects events for processing by your targets, replace the existing text with the “Failed job event” code from the Additional information section. This code defines a CloudWatch Events rule that initiates when an AWS Batch job enters the FAILED state.

Cloud administrator

Add event rule target.

In Targets, choose Add targets, and choose the JobFailedAlert SNS topic. Configure the remaining details and create the CloudWatch Events rule. For an AWS CLI sketch of the steps in this epic, see the example that follows it.

Cloud administrator
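
The following commands are a rough CLI equivalent of the console steps in this epic, assuming the rule name batch-job-failed and a placeholder email address. Note that when you create the rule from the CLI instead of the console, the SNS topic's access policy must also allow events.amazonaws.com to publish to the topic.

# Create the SNS topic and subscribe a monitored email address.
TOPIC_ARN=$(aws sns create-topic --name JobFailedAlert --query TopicArn --output text)
aws sns subscribe --topic-arn "$TOPIC_ARN" --protocol email --notification-endpoint ops@example.com

# Create a rule that matches the "Failed job event" pattern from the Additional information section.
aws events put-rule \
    --name batch-job-failed \
    --event-pattern '{"detail-type":["Batch Job State Change"],"source":["aws.batch"],"detail":{"status":["FAILED"]}}'

# Send matching events to the SNS topic.
aws events put-targets --rule batch-job-failed --targets "Id"="1","Arn"="$TOPIC_ARN"
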
Task | Description | Skills required

Create an Amazon ECR repository.

Open the Amazon ECR console and choose the AWS Region in which you want to create your repository. Choose Repositories, and then choose Create repository. Configure the repository according to your requirements.

Cloud administrator

Write a Dockerfile.

Sign in to Docker and use the “Sample Dockerfile” and “Sample entrypoint.sh file” from the Additional information section to create your Dockerfile and entrypoint.sh files.

DevOps engineer

Create a Docker image and push it to the Amazon ECR repository.

Build the Dockerfile into a Docker image and push it to the Amazon ECR repository. For help with this story, see the Related resources section. For an example sequence of commands, see the sketch that follows this epic.

DevOps engineer
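
The following sequence is a sketch of the build-and-push flow for this epic. The repository name rds-postgres-backup, the Region us-east-1, and the account ID 111122223333 are assumptions; substitute your own values.

# Create the Amazon ECR repository (one time).
aws ecr create-repository --repository-name rds-postgres-backup --region us-east-1

# Build the image from the sample Dockerfile and entrypoint.sh file.
docker build -t rds-postgres-backup .

# Authenticate Docker to Amazon ECR, then tag and push the image.
aws ecr get-login-password --region us-east-1 | \
    docker login --username AWS --password-stdin 111122223333.dkr.ecr.us-east-1.amazonaws.com
docker tag rds-postgres-backup:latest 111122223333.dkr.ecr.us-east-1.amazonaws.com/rds-postgres-backup:latest
docker push 111122223333.dkr.ecr.us-east-1.amazonaws.com/rds-postgres-backup:latest
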
Task | Description | Skills required

Create an AWS Batch job definition.

Open the AWS Batch console and create a job definition that includes the Amazon ECR repository’s Uniform Resource Identifier (URI) as the Image property. For an example job definition registered through the AWS CLI, see the sketch that follows this epic.

Cloud administrator

Configure the AWS Batch job queue.

On the AWS Batch console, choose Job queues, and then choose Create queue. Create a job queue that will store jobs until AWS Batch runs them on the resources within your compute environment. Important: Make sure you write logic for AWS Batch to record the backup details to the DynamoDB inventory table.

Cloud administrator
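
The following is a minimal sketch of a job definition registered through the AWS CLI, assuming the image pushed in the previous epic and an EC2 compute environment. The environment variable names are the ones that the sample entrypoint.sh file reads; the values shown are placeholders, and in practice your Lambda function can override them per job when it submits jobs. You also need to attach an appropriate job role so that the container can use the ECS container credentials that the sample entrypoint.sh file relies on.

# Register a job definition that points at the backup container image.
aws batch register-job-definition \
    --job-definition-name rds-postgres-dump \
    --type container \
    --container-properties '{
        "image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/rds-postgres-backup:latest",
        "resourceRequirements": [
            {"type": "VCPU", "value": "1"},
            {"type": "MEMORY", "value": "2048"}
        ],
        "environment": [
            {"name": "RDS_INSTANCE_NAME", "value": "mydbinstance"},
            {"name": "PGSQL_HOST", "value": "mydbinstance.example.us-east-1.rds.amazonaws.com"},
            {"name": "SECRETID", "value": "mydb-credentials"},
            {"name": "BUCKET", "value": "my-backup-bucket"},
            {"name": "REGION", "value": "us-east-1"},
            {"name": "EXECUTE_COMMAND", "value": "-d test:-d test1"},
            {"name": "RDS_POSTGRES_DUMP_INVENTORY_TABLE", "value": "rds-postgres-dump-inventory"},
            {"name": "TargetAccountId", "value": "111122223333"},
            {"name": "TargetAccountRoleName", "value": "rds-backup-role"}
        ]
    }'

# The job queue requires an existing compute environment (replace the placeholder ARN).
# aws batch create-job-queue --job-queue-name rds-backup-queue --priority 1 \
#     --compute-environment-order order=1,computeEnvironment=<compute-environment-arn>
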
Task | Description | Skills required

Create a Lambda function to search for tags.

Create a Lambda function that searches for tags on your PostgreSQL DB instances and identifies backup candidates. Make sure your Lambda function can identify the bkp:AutomatedDBDump = Active tag and all other required tags. Important: The Lambda function must also be able to add jobs to the AWS Batch job queue.

DevOps engineer

Create a time-based CloudWatch Events event.

Open the Amazon CloudWatch console and create a CloudWatch Events event that uses a cron expression to run your Lambda function on a regular schedule. Important: All scheduled events use the UTC time zone. For an example schedule rule, see the CLI sketch that follows this epic.

Cloud administrator
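
This pattern doesn't supply the Lambda function code, so the following sketch only illustrates, as CLI calls, the lookups and job submission that the function needs to perform, along with a scheduled rule that invokes it. The function name rds-backup-dispatcher, the rule name, the queue and job definition names, and the cron expression are assumptions.

# What the Lambda function does, expressed as CLI calls: list DB instances, read their backup
# tags, and submit one AWS Batch job per database on instances tagged bkp:AutomatedDBDump = Active.
aws rds describe-db-instances --query 'DBInstances[].DBInstanceArn' --output text
aws rds list-tags-for-resource --resource-name arn:aws:rds:us-east-1:111122223333:db:mydbinstance
aws batch submit-job \
    --job-name mydbinstance-dump \
    --job-queue rds-backup-queue \
    --job-definition rds-postgres-dump

# Schedule the Lambda function with a time-based rule (cron expressions are evaluated in UTC).
aws events put-rule --name rds-backup-schedule --schedule-expression 'cron(0 1 * * ? *)'
aws events put-targets --rule rds-backup-schedule \
    --targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:111122223333:function:rds-backup-dispatcher"
aws lambda add-permission --function-name rds-backup-dispatcher \
    --statement-id allow-events --action lambda:InvokeFunction \
    --principal events.amazonaws.com \
    --source-arn arn:aws:events:us-east-1:111122223333:rule/rds-backup-schedule
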
Task | Description | Skills required

Create an AWS KMS key.

Open the AWS KMS console and create a KMS key that can be used to encrypt the Amazon RDS credentials stored in AWS Secrets Manager. For an example of this and the following stories, see the CLI sketch that follows this epic.

Cloud administrator

Create an AWS Secrets Manager secret.

Open the AWS Secrets Manager console and store your Amazon RDS for PostgreSQL database credentials as a secret.

Cloud administrator

Add the required tags to the PostgreSQL DB instances.

Open the Amazon RDS console and add tags to the PostgreSQL DB instances that you want to automatically back up. You can use the tags from the table in the Tools section. If you require backups for multiple PostgreSQL databases within the same Amazon RDS DB instance, use -d test:-d test1 as the value for the bkp:pgdumpcommand tag. Important: test and test1 are database names. Make sure that there is no space after the colon (:).

Cloud administrator

Verify the backup automation.

To verify the backup automation, you can either invoke the Lambda function manually or wait for the backup schedule to begin. After the backup process is complete, check that the DynamoDB inventory table has a valid backup entry for each of your PostgreSQL DB instances. If it does, the backup automation process is successful.

Cloud administrator
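
The following commands are a rough sketch of the credential setup and the verification step, with placeholder names. The secret's JSON keys must be DB_USERNAME and DB_PASSWORD, because those are the keys that the sample entrypoint.sh file reads.

# Create a KMS key and use it to encrypt the database credentials stored in Secrets Manager.
KEY_ID=$(aws kms create-key --description "RDS backup credentials" --query KeyMetadata.KeyId --output text)
aws secretsmanager create-secret \
    --name mydb-credentials \
    --kms-key-id "$KEY_ID" \
    --secret-string '{"DB_USERNAME":"postgres","DB_PASSWORD":"replace-me"}'

# After a scheduled run (or a manual Lambda invocation), confirm that the inventory table
# received a backup entry.
aws dynamodb scan --table-name rds-postgres-dump-inventory --max-items 5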

Related resources

  • Create an inventory table in DynamoDB

  • Create an SNS topic for failed job events in AWS Batch

  • Build a Docker image and push it to an Amazon ECR repository

  • Create the AWS Batch components

  • Create a Lambda function

  • Create a CloudWatch Events event

  • Test the backup automation

Additional information

Failed job event:

{ "detail-type": [ "Batch Job State Change" ], "source": [ "aws.batch" ], "detail": { "status": [ "FAILED" ] } }

Sample Dockerfile:

FROM alpine:latest

RUN apk --update add py-pip postgresql-client jq bash && \
    pip install awscli && \
    rm -rf /var/cache/apk/*

ADD entrypoint.sh /usr/bin/
RUN chmod +x /usr/bin/entrypoint.sh

ENTRYPOINT ["entrypoint.sh"]

Sample entrypoint.sh file:

#!/bin/bash
set -e

# Expected environment variables (supplied through the AWS Batch job definition or container
# overrides): RDS_INSTANCE_NAME, PGSQL_HOST, SECRETID, BUCKET, REGION, EXECUTE_COMMAND,
# RDS_POSTGRES_DUMP_INVENTORY_TABLE, TargetAccountId, TargetAccountRoleName.

DATETIME=`date +"%Y-%m-%d_%H_%M"`
FILENAME=RDS_PostGres_dump_${RDS_INSTANCE_NAME}
FILE=${FILENAME}_${DATETIME}

# Configure a profile that assumes the target account role by using the ECS container credentials.
aws configure --profile new-profile set role_arn arn:aws:iam::${TargetAccountId}:role/${TargetAccountRoleName}
aws configure --profile new-profile set credential_source EcsContainer

echo "Central Account access provider IAM role is: "
aws sts get-caller-identity

echo "Target Customer Account access provider IAM role is: "
aws sts get-caller-identity --profile new-profile

# Retrieve the database credentials from Secrets Manager.
securestring=$(aws secretsmanager get-secret-value --secret-id $SECRETID --output json --query 'SecretString' --region=$REGION --profile new-profile)

if [[ ${securestring} ]]; then
    echo "successfully accessed secrets manager and got the credentials"
    export PGPASSWORD=$(echo $securestring | jq --raw-output '.' | jq -r '.DB_PASSWORD')
    PGSQL_USER=$(echo $securestring | jq --raw-output '.' | jq -r '.DB_USERNAME')

    echo "Executing pg_dump for the PostGres endpoint ${PGSQL_HOST}"
    # pg_dump -h $PGSQL_HOST -U $PGSQL_USER -n dms_sample | gzip -9 -c | aws s3 cp - --region=$REGION --profile new-profile s3://$BUCKET/$FILE
    # in="-n public:-n private"

    # EXECUTE_COMMAND holds colon-separated pg_dump arguments, one entry per database (for example, "-d test:-d test1").
    IFS=':' list=($EXECUTE_COMMAND)
    for command in "${list[@]}";
    do
        echo $command
        pg_dump -h $PGSQL_HOST -U $PGSQL_USER ${command} | gzip -9 -c | aws s3 cp - --region=$REGION --profile new-profile s3://${BUCKET}/${FILE}-${command}".sql.gz"
        dump_status=$?
        echo $dump_status
        if [[ $dump_status -ne 0 ]]; then
            echo "Error occurred in database backup process. Exiting now....."
            exit 1
        else
            echo "Postgresql dump was successfully taken for the RDS endpoint ${PGSQL_HOST} and is uploaded to the following S3 location s3://${BUCKET}/${FILE}-${command}.sql.gz"
            # Write the details into the inventory table in the central account.
            echo "Writing to DynamoDB inventory table"
            aws dynamodb put-item --table-name ${RDS_POSTGRES_DUMP_INVENTORY_TABLE} --region=$REGION --item '{ "accountId": { "S": "'"${TargetAccountId}"'" }, "dumpFileUrl": {"S": "'"s3://${BUCKET}/${FILE}-${command}.sql.gz"'" }, "DumpAvailableTime": {"S": "'"`date +"%Y-%m-%d::%H::%M::%S"` UTC"'"}}'
            put_status=$?
            echo $put_status
            if [[ $put_status -ne 0 ]]; then
                echo "Error occurred while putting item to DynamoDb Inventory Table. Exiting now....."
                exit 1
            else
                echo "Successfully written to DynamoDb Inventory Table ${RDS_POSTGRES_DUMP_INVENTORY_TABLE}"
            fi
        fi
    done
else
    echo "Something went wrong {$?}"
    exit 1
fi

exec "$@"