REL08-BP03 Integrate resiliency testing as part of your deployment - AWS Well-Architected Framework (2023-04-10)

REL08-BP03 Integrate resiliency testing as part of your deployment

Resiliency tests (using the principles of chaos engineering) are run as part of the automated deployment pipeline in a pre-production environment.

These tests are staged and run in the pipeline in a pre-production environment. They should also be run in production as part of game days.

Level of risk exposed if this best practice is not established: Medium

Implementation guidance

  • Integrate resiliency testing as part of your deployment. Use Chaos Engineering, the discipline of experimenting on a workload to build confidence in the workload’s capability to withstand turbulent conditions in production.

    • Resiliency tests inject faults or resource degradation to assess that your workload responds with its designed resilience.

    • These tests can be run regularly in pre-production environments in automated deployment pipelines.

    • They should also be run in production, as part of scheduled game days.

    • Using Chaos Engineering principles, propose hypotheses about how your workload will perform under various impairments, then test your hypotheses using resiliency testing.

Resources

Related documents:

Related examples: