How AWS Resilience Hub works

AWS Resilience Hub helps you proactively prepare and protect your AWS applications from disruptions. It offers resiliency assessment and validation that integrate into your software development lifecycle to uncover resiliency weaknesses. AWS Resilience Hub helps you estimate whether your applications can meet their recovery time objective (RTO) and recovery point objective (RPO) targets, and helps you resolve issues before they are released into production.

After you deploy an AWS application into production, you can use AWS Resilience Hub to continue tracking the resiliency posture of your application. If an outage occurs, AWS Resilience Hub sends a notification to the operator to launch the associated recovery process.


Figure: Flowchart that shows how AWS Resilience Hub works.

The following steps provide a high-level outline of how AWS Resilience Hub works.

1. Describe the existing AWS application that you want to protect from disruptions as an AWS Resilience Hub application and then set resiliency objectives for the application.

When you describe the application, you import resources from AWS CloudFormation stacks, Terraform state files, AWS Resource Groups, Amazon Elastic Kubernetes Service (Amazon EKS) clusters, or AppRegistry applications. These resources form the structural basis of the application in AWS Resilience Hub. You can also build on the structure of an existing application. Then, you attach a resiliency policy to the application.

An AWS Resilience Hub resiliency policy contains the information and objectives that are used to assess whether your application can recover from a given disruption type, such as a software or hardware disruption. When you create a resiliency policy, you define RTO and RPO targets for each disruption type. These targets are used to estimate whether the application meets the resiliency policy.
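The relationship between a policy and the estimates it is checked against can be sketched in a few lines of Python. This is only an illustrative model, not the AWS Resilience Hub API: the `Target` class, the `unmet_targets` function, the disruption-type keys, and all numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """RTO and RPO values for one disruption type, in seconds."""
    rto_seconds: int
    rpo_seconds: int


# Hypothetical policy: keys and numbers are illustrative only,
# not values prescribed by AWS Resilience Hub.
policy = {
    "software": Target(rto_seconds=3600, rpo_seconds=900),
    "hardware": Target(rto_seconds=7200, rpo_seconds=1800),
}


def unmet_targets(policy, estimates):
    """Return the disruption types whose estimated RTO or RPO exceeds the target.

    `estimates` maps a disruption type to a Target holding estimated values.
    A disruption type with no estimate is treated as unmet.
    """
    unmet = []
    for disruption, target in policy.items():
        estimate = estimates.get(disruption)
        if (
            estimate is None
            or estimate.rto_seconds > target.rto_seconds
            or estimate.rpo_seconds > target.rpo_seconds
        ):
            unmet.append(disruption)
    return unmet


# Hypothetical estimates for the same application.
estimates = {
    "software": Target(rto_seconds=1800, rpo_seconds=600),
    "hardware": Target(rto_seconds=10800, rpo_seconds=1800),
}
print(unmet_targets(policy, estimates))  # ['hardware']
```

In this sketch the application meets the policy only when the returned list is empty; here the hardware RTO estimate of three hours exceeds the two-hour target.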

2. Assess the application to learn whether it meets your objectives.

After you describe your application and attach a resiliency policy to it, run a resiliency assessment. The assessment evaluates your application configuration against the resiliency policy that is attached to the application and generates a report. The report shows how your application measures against the objectives in your resiliency policy.

3. Receive recommendations to improve resiliency.

To improve resiliency, update your application and resiliency policy according to the recommendations from the assessment report. Recommendations cover component configurations, alarms, tests, and recovery standard operating procedures (SOPs). Then, you can run another assessment and compare the results with the previous report to see how much resiliency has improved. Repeat this process until your estimated workload RTO and estimated workload RPO meet your RTO and RPO targets.
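The assess, improve, and reassess loop can be illustrated with a small Python sketch. It assumes each assessment is reduced to a pair of estimated values in seconds; the function names, dictionary keys, and numbers are hypothetical and do not reflect the format of an actual AWS Resilience Hub report.

```python
def compare_estimates(previous, current):
    """Return the change in seconds per metric (negative means an improvement)."""
    return {metric: current[metric] - previous[metric] for metric in previous}


def meets_targets(estimate, targets):
    """True when every estimated value is at or below its target."""
    return all(estimate[metric] <= targets[metric] for metric in targets)


def first_compliant_run(assessments, targets):
    """Return the 1-based number of the first assessment that meets all targets."""
    for run, estimate in enumerate(assessments, start=1):
        if meets_targets(estimate, targets):
            return run
    return None  # No assessment met the targets yet.


# Hypothetical targets and successive assessment estimates.
targets = {"rto_seconds": 3600, "rpo_seconds": 900}
assessments = [
    {"rto_seconds": 7200, "rpo_seconds": 3600},
    {"rto_seconds": 5400, "rpo_seconds": 900},
    {"rto_seconds": 3000, "rpo_seconds": 600},
]

print(compare_estimates(assessments[0], assessments[1]))
# {'rto_seconds': -1800, 'rpo_seconds': -2700}
print(first_compliant_run(assessments, targets))  # 3
```

The second run already meets the RPO target but not the RTO target, so the loop continues until the third run, where both estimates fall within their targets.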

4. Validate objectives and disaster recovery procedures.

Run tests to measure the resiliency of your AWS resources and the amount of time it takes to recover from application, infrastructure, Availability Zone, and AWS Region outages. To measure resiliency, these tests simulate outages of your AWS resources. Examples of simulated outages include network-unavailable errors, failovers, stopped processes, Amazon RDS boot recovery, and problems with your Availability Zone.
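As a rough illustration of what such a test measures, the following Python sketch computes the observed recovery time from two timestamps and compares it with an RTO target. The function names and timestamps are hypothetical; how the start and restore times are captured depends on your own test tooling.

```python
from datetime import datetime, timedelta


def recovery_time(outage_started, service_restored):
    """Return the elapsed time between the simulated outage and recovery."""
    if service_restored < outage_started:
        raise ValueError("service_restored must not precede outage_started")
    return service_restored - outage_started


def within_rto(outage_started, service_restored, rto):
    """True when the observed recovery time is at or below the RTO target."""
    return recovery_time(outage_started, service_restored) <= rto


# Hypothetical timestamps recorded during a simulated outage.
started = datetime(2024, 1, 1, 12, 0, 0)
restored = datetime(2024, 1, 1, 12, 42, 0)

print(recovery_time(started, restored))  # 0:42:00
print(within_rto(started, restored, timedelta(hours=1)))  # True
```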

5. View and track your application resiliency over time.

After you deploy an AWS application into production, you can use AWS Resilience Hub to continue tracking the resiliency posture of the application. If an outage occurs, the operator can view the outage in AWS Resilience Hub and launch the associated recovery process.

6. Start recovery if there is a disruption.

If an application disruption occurs, AWS Resilience Hub helps identify the type of disruption and alerts the operator. Then, the operator can launch the associated SOP for recovery.