Performance and capacity management
Monitor workload performance and ensure that capacity meets current and future demands.
To ensure that your applications fulfil their business purpose, it’s essential to measure performance and ensure that you do not reach capacity limits. Although the AWS Cloud can allow you to scale to unparalleled levels, it’s important to understand that there are performance considerations and service quotas that need to be measured and acted upon.
Start
Many AWS services publish metrics to Amazon CloudWatch that should form the basis of the absolute minimum metrics you should be monitoring and alerting upon. As stated in the observability section of this whitepaper, you should also monitor and alert upon metrics collected from AWS using the CloudWatch agent or AWS Distro for OpenTelemetry.
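As a minimal sketch of alerting on one such baseline metric, the following builds the parameters for a CloudWatch alarm on the EC2 CPUUtilization metric. The function name, the alarm naming scheme, the 80% default threshold, and the example instance ID and SNS topic ARN are all illustrative assumptions, not recommendations from this whitepaper.

```python
from typing import Any


def build_cpu_alarm(instance_id: str, sns_topic_arn: str,
                    threshold_percent: float = 80.0) -> dict[str, Any]:
    """Return keyword arguments for the CloudWatch PutMetricAlarm API.

    The threshold, period, and evaluation settings are placeholder
    assumptions; tune them to your own business requirements.
    """
    if not 0 < threshold_percent <= 100:
        raise ValueError("threshold_percent must be in (0, 100]")
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,             # seconds per datapoint
        "EvaluationPeriods": 3,    # alarm after 3 consecutive breaching periods
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "missing",
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    params = build_cpu_alarm(
        "i-0123456789abcdef0",                            # placeholder instance ID
        "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # placeholder topic ARN
    )
    # With boto3 installed and credentials configured, the alarm could be
    # created with: boto3.client("cloudwatch").put_metric_alarm(**params)
    print(params["AlarmName"])
```

Keeping the parameter construction separate from the API call makes the alarm definition easy to review and test without AWS credentials.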
While the cloud offers virtually infinite scalability, even for the largest organizations and applications, it’s important to remember that managed services have quotas (formerly referred to as limits) that are designed to help guarantee the availability of AWS resources and prevent accidental provisioning of more resources than needed. Anticipate these quotas by running load tests in pre-production environments that simulate the demand you expect in production. These tests are vital to ensure that you do not run into unexpected service quotas or limits imposed by the design of your application.
Use Amazon CloudWatch metrics that are provided by AMS, as well as metrics provided by your EC2 instances through the CloudWatch agent, to ensure that your application will respond according to your business requirements. It is vital that you test these against service quotas in pre-production before deploying to production, but you must also continuously monitor these metrics in production. It is possible, and often desirable, that your actual demand will outstrip your anticipated demand. If this is the case, you need to have mechanisms that alert you to such changes so that you can respond accordingly.
AWS Service Quotas is a service that enables you to view and manage your quotas for AWS services from a central location. Along with looking up quota values, you can request a quota increase, monitor the usage of specific service API actions, and create alerts for them directly from the Service Quotas console. AWS Trusted Advisor gives you additional insight into whether you are approaching or exceeding your quotas.
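The quota check described above can be sketched as a simple utilization calculation. This is an illustrative sketch only: the `QuotaUsage` type, the `quotas_at_risk` function, the 80% warning level, and the sample figures are hypothetical, and in practice the usage and quota values would come from Service Quotas, CloudWatch, or Trusted Advisor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaUsage:
    """Current usage of one quota (hypothetical structure for illustration)."""
    name: str
    used: float
    limit: float

    @property
    def utilization(self) -> float:
        # A non-positive limit is treated as fully exhausted.
        return self.used / self.limit if self.limit > 0 else float("inf")


def quotas_at_risk(usages: list[QuotaUsage],
                   warn_at: float = 0.8) -> list[QuotaUsage]:
    """Return quotas at or above the warning level, most utilized first."""
    at_risk = [u for u in usages if u.utilization >= warn_at]
    return sorted(at_risk, key=lambda u: u.utilization, reverse=True)


if __name__ == "__main__":
    sample = [  # made-up numbers, not real account data
        QuotaUsage("Running On-Demand instances (vCPUs)", used=230, limit=256),
        QuotaUsage("VPCs per Region", used=2, limit=5),
        QuotaUsage("Elastic IP addresses", used=5, limit=5),
    ]
    for quota in quotas_at_risk(sample):
        print(f"{quota.name}: {quota.utilization:.0%} of quota used")
```

Running the same check against load-test results in pre-production shows which quotas need an increase request before production traffic reaches them.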
Advance
To make full use of the tools available to you to measure performance, you need to monitor your metrics and make use of AWS services that give you enhanced insights. To gain full visibility into your applications’ performance, you need to implement distributed tracing, one of the three pillars of observability.
AWS X-Ray is a distributed tracing system that can help you gain insights into how your applications communicate with each other, measure the performance of functions or lines of code, and provide analytics capabilities correlated with other signals, such as metrics and logs. Additionally, CloudWatch Logs Insights, Lambda Insights, Container Insights, and Metrics Insights give you enhanced insights into how your application is performing.
While load testing gives you an idea of how well your application sustains its performance at a certain amount of traffic and its associated infrastructure capacity, your infrastructure will probably not always be operating at that level. This is why it’s vital that you implement Auto Scaling for horizontal scaling wherever possible, so that your infrastructure scales according to the traffic. AWS Auto Scaling helps your application scale by monitoring and adjusting capacity using metrics and user-defined thresholds.
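To illustrate the idea of adjusting capacity from a metric and a user-defined target, here is a simplified proportional scaling rule. It is a sketch for intuition only and is not the algorithm AWS Auto Scaling uses internally; the function name, parameters, and example numbers are assumptions.

```python
import math


def desired_capacity(current_capacity: int, observed_metric: float,
                     target_metric: float, min_capacity: int,
                     max_capacity: int) -> int:
    """Scale capacity in proportion to how far the metric is from its target.

    Simplified illustration: if average CPU is 75% against a 50% target,
    capacity grows by a factor of 75 / 50, rounded up and clamped to bounds.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    raw = math.ceil(current_capacity * observed_metric / target_metric)
    # Never scale below the floor or above the ceiling.
    return max(min_capacity, min(max_capacity, raw))


if __name__ == "__main__":
    # 4 instances at 75% average CPU with a 50% target, bounded to 2..10.
    print(desired_capacity(4, 75.0, 50.0, min_capacity=2, max_capacity=10))
```

The minimum and maximum bounds matter as much as the target: the maximum should itself sit within the relevant service quotas, otherwise scaling out can fail at the moment it is needed.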
Multiple AWS services provide serverless
Excel
Use testing to evaluate performance and capacity limits, and use scaling mechanisms to help you sustain your customer traffic and growth. You should adopt flexible architectures that enable you to scale globally. Extending your infrastructure to multiple AWS Regions can provide additional capacity. Using the tools discussed in the Observability section of this whitepaper, such as CloudWatch RUM, you can understand the performance impact on your customers around the globe and deploy accordingly.
Testing your applications should not be a one-time event before going to production. Continuously testing your applications helps you detect customer impact when issues occur and can surface issues before technical metrics become available. Use CloudWatch Synthetics and CloudWatch RUM, as described in the Observability section, to continuously monitor application performance, including when you have no active users. Run experiments and design your applications around failure so that they recover quickly. AWS FIS is a fully managed service that helps you run experiments safely and easily implement chaos engineering.
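The idea behind continuous synthetic testing can be sketched as a small probe that exercises a user-facing check on a schedule and fails on errors or slow responses. This is a generic illustration, not the CloudWatch Synthetics API; `run_probe`, `ProbeResult`, the injected `check` callable, and the latency budget are all hypothetical names and values.

```python
import time
from typing import Callable, NamedTuple


class ProbeResult(NamedTuple):
    ok: bool
    latency_s: float
    detail: str


def run_probe(check: Callable[[], bool], latency_budget_s: float,
              clock: Callable[[], float] = time.perf_counter) -> ProbeResult:
    """Run one synthetic check and classify the outcome.

    `check` stands in for a scripted user journey (for example, loading a
    page and verifying its content). It returns True on success.
    """
    start = clock()
    try:
        passed = check()
    except Exception as exc:  # any failure of the journey counts as an outage
        return ProbeResult(False, clock() - start, f"error: {exc}")
    latency = clock() - start
    if not passed:
        return ProbeResult(False, latency, "check returned a failing result")
    if latency > latency_budget_s:
        return ProbeResult(False, latency, "latency budget exceeded")
    return ProbeResult(True, latency, "ok")


if __name__ == "__main__":
    # A trivial stand-in check; a real one would call your application.
    result = run_probe(lambda: True, latency_budget_s=2.0)
    print(result.ok, result.detail)
```

Because the probe runs whether or not real users are active, it can surface a regression during quiet periods, before customer-facing metrics show it.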