Integrating operational recommendations into your application with AWS CloudFormation - AWS Resilience Hub

Integrating operational recommendations into your application with AWS CloudFormation

After you choose Create CloudFormation template in the Operational recommendations page, AWS Resilience Hub creates an AWS CloudFormation template that describes the specific alarm, standard operating procedure (SOP), or AWS FIS experiment for your application. The AWS CloudFormation template is stored in an Amazon S3 bucket, and you can check the S3 path to the template in the Template details tab on the Operational recommendations page.

For example, the listing below shows a JSON-formatted AWS CloudFormation template that describes an alarm recommendation rendered by AWS Resilience Hub. It's a Read Throttling Alarm for a DynamoDB table called Employees.

The Resources section of the template describes the AWS::CloudWatch::Alarm alarm that's activated when the number of read throttle events for the DynamoDB table exceeds 1. And the two AWS::SSM::Parameter resources define metadata that allow AWS Resilience Hub to identify installed resources without having to scan the actual application.

{ "AWSTemplateFormatVersion" : "2010-09-09", "Parameters" : { "SNSTopicARN" : { "Type" : "String", "Description" : "The ARN of the SNS topic to which alarm status changes are to be sent. This must be in the same region being deployed.", "AllowedPattern" : "^arn:(aws|aws-cn|aws-iso|aws-iso-[a-z]{1}|aws-us-gov):sns:([a-z]{2}-((iso[a-z]{0,1}-)|(gov-)){0,1}[a-z]+-[0-9]):[0-9]{12}:[A-Za-z0-9/][A-Za-z0-9:_/+=,@.-]{1,256}$" } }, "Resources" : { "ReadthrottleeventsthresholdexceededEmployeesONDEMAND0DynamoDBTablePXBZQYH3DCJ9Alarm" : { "Type" : "AWS::CloudWatch::Alarm", "Properties" : { "AlarmDescription" : "An Alarm by AWS Resilience Hub that alerts when the number of read-throttle events are greater than 1.", "AlarmName" : "ResilienceHub-ReadThrottleEventsAlarm-2020-04-01_Employees-ON-DEMAND-0-DynamoDBTable-PXBZQYH3DCJ9", "AlarmActions" : [ { "Ref" : "SNSTopicARN" } ], "MetricName" : "ReadThrottleEvents", "Namespace" : "AWS/DynamoDB", "Statistic" : "Sum", "Dimensions" : [ { "Name" : "TableName", "Value" : "Employees-ON-DEMAND-0-DynamoDBTable-PXBZQYH3DCJ9" } ], "Period" : 60, "EvaluationPeriods" : 1, "DatapointsToAlarm" : 1, "Threshold" : 1, "ComparisonOperator" : "GreaterThanOrEqualToThreshold", "TreatMissingData" : "notBreaching", "Unit" : "Count" }, "Metadata" : { "AWS::ResilienceHub::Monitoring" : { "recommendationId" : "dynamodb:alarm:health-read_throttle_events:2020-04-01" } } }, "dynamodbalarmhealthreadthrottleevents20200401EmployeesONDEMAND0DynamoDBTablePXBZQYH3DCJ9AlarmSSMParameter" : { "Type" : "AWS::SSM::Parameter", "Properties" : { "Name" : "/ResilienceHub/Alarm/3f904525-4bfa-430f-96ef-58ec9b19aa73/dynamodb-alarm-health-read-throttle-events-2020-04-01_Employees-ON-DEMAND-0-DynamoDBTable-PXBZQYH3DCJ9", "Type" : "String", "Value" : { "Fn::Sub" : "${ReadthrottleeventsthresholdexceededEmployeesONDEMAND0DynamoDBTablePXBZQYH3DCJ9Alarm}" }, "Description" : "SSM Parameter for identifying installed resources." } }, "dynamodbalarmhealthreadthrottleevents20200401EmployeesONDEMAND0DynamoDBTablePXBZQYH3DCJ9AlarmInfoSSMParameter" : { "Type" : "AWS::SSM::Parameter", "Properties" : { "Name" : "/ResilienceHub/Info/Alarm/3f904525-4bfa-430f-96ef-58ec9b19aa73/dynamodb-alarm-health-read-throttle-events-2020-04-01_Employees-ON-DEMAND-0-DynamoDBTable-PXBZQYH3DCJ9", "Type" : "String", "Value" : { "Fn::Sub" : "{\"alarmName\":\"${ReadthrottleeventsthresholdexceededEmployeesONDEMAND0DynamoDBTablePXBZQYH3DCJ9Alarm}\",\"referenceId\":\"dynamodb:alarm:health_read_throttle_events:2020-04-01\",\"resourceId\":\"Employees-ON-DEMAND-0-DynamoDBTable-PXBZQYH3DCJ9\",\"relatedSOPs\":[\"dynamodb:sop:update_provisioned_capacity:2020-04-01\"]}" }, "Description" : "SSM Parameter for identifying installed resources." } } } }

Modifying the AWS CloudFormation template

The easiest way to integrate an alarm, SOP, or AWS FIS resource into your main application is to simply add it as another resource in the template that describes your application template. The JSON-formatted file provided below provides a basic outline of how a DynamoDB table is described in an AWS CloudFormation template. A real application is likely to include several more resources, such as additional tables.

{ "AWSTemplateFormatVersion": "2010-09-09T00:00:00.000Z", "Description": "Application Stack with Employees Table", "Outputs": { "DynamoDBTable": { "Description": "The DynamoDB Table Name", "Value": {"Ref": "Employees"} } }, "Resources": { "Employees": { "Type": "AWS::DynamoDB::Table", "Properties": { "BillingMode": "PAY_PER_REQUEST", "AttributeDefinitions": [ { "AttributeName": "USER_ID", "AttributeType": "S" }, { "AttributeName": "RANGE_ATTRIBUTE", "AttributeType": "S" } ], "KeySchema": [ { "AttributeName": "USER_ID", "KeyType": "HASH" }, { "AttributeName": "RANGE_ATTRIBUTE", "KeyType": "RANGE" } ], "PointInTimeRecoverySpecification": { "PointInTimeRecoveryEnabled": true }, "Tags": [ { "Key": "Key", "Value": "Value" } ], "LocalSecondaryIndexes": [ { "IndexName": "resiliencehub-index-local-1", "KeySchema": [ { "AttributeName": "USER_ID", "KeyType": "HASH" }, { "AttributeName": "RANGE_ATTRIBUTE", "KeyType": "RANGE" } ], "Projection": { "ProjectionType": "ALL" } } ], "GlobalSecondaryIndexes": [ { "IndexName": "resiliencehub-index-1", "KeySchema": [ { "AttributeName": "USER_ID", "KeyType": "HASH" } ], "Projection": { "ProjectionType": "ALL" } } ] } } } }

To allow the alarm resource to be deployed with your application, you now need to replace the hardcoded resources with a dynamic reference in the application stacks.

So, in the AWS::CloudWatch::Alarm resource definition, change the following:

"Value" : "Employees-ON-DEMAND-0-DynamoDBTable-PXBZQYH3DCJ9"

to the below:

"Value" : {"Ref": "Employees"}

And under in the AWS::SSM::Parameter resource definition, change the following:

"Fn::Sub" : "{\"alarmName\":\"${ReadthrottleeventsthresholdexceededDynamoDBEmployeesONDEMAND0DynamoDBTablePXBZQYH3DCJ9Alarm}\",\"referenceId\":\"dynamodb:alarm:health_read_throttle_events:2020-04-01\",\"resourceId\":\"Employees-ON-DEMAND-0-DynamoDBTable-PXBZQYH3DCJ9\",\"relatedSOPs\":[\"dynamodb:sop:update_provisioned_capacity:2020-04-01\"]}"

to the below:

"Fn::Sub" : "{\"alarmName\":\"${ReadthrottleeventsthresholdexceededEmployeesONDEMAND0DynamoDBTablePXBZQYH3DCJ9Alarm}\",\"referenceId\":\"dynamodb:alarm:health_read_throttle_events:2020-04-01\",\"resourceId\":\"${Employees}\",\"relatedSOPs\":[\"dynamodb:sop:update_provisioned_capacity:2020-04-01\"]}"

When modifying AWS CloudFormation templates for SOPs and AWS FIS experiments, you will take the same approach, replacing hardcoded reference IDs with dynamic references that continue to work even after hardware changes.

By using a reference to the DynamoDB table, you allow AWS CloudFormation to do the following:

  • Create the database table first.

  • Always use the actual ID of the generated resource in the alarm, and update the alarm dynamically if AWS CloudFormation needs to replace the resource.

Note

You can choose more advanced methods for managing your application resources with AWS CloudFormation such as nesting stacks or referring to resource outputs in a separate AWS CloudFormation stack. (But if you want to keep the recommendation stack separate from the main stack, you need to configure a way to pass information between the two stacks.)

In addition, third-party tools, such as Terraform by HashiCorp, can also be used to provision Infrastructure as Code (IaC).