PERF02-BP06 Continually evaluate compute needs based on metrics
Use a data-driven approach to continually evaluate and optimize the compute resources for your workload over time.
Desired outcome: Use system-level metrics to actively monitor the behavior and requirements of your workload over time. Evaluate the demands of your workload against available resources based on the collected data, and make changes to your compute environment to best match your workload's profile. For example, a workload might be observed over time to be more memory-intensive than initially specified, so moving to a different instance family or size could improve both performance and efficiency.
Common anti-patterns:
- Monitoring system-level metrics to gain insight into your workload and not re-evaluating compute needs.
- Architecting your compute needs for peak workload requirements.
- Oversizing the existing compute solution to meet scaling or performance requirements when moving to an alternative compute solution would more efficiently match your workload characteristics.
Benefits of establishing this best practice: Optimized compute resources based on real-world data and your desired balance of cost and performance.
Level of risk exposed if this best practice is not established: Low
Implementation guidance
Use a data-driven approach to optimize compute resources based on observed workload behavior. To achieve maximum performance and efficiency, use the data gathered over time from your workload to continually tune your resources. Look at trends in how your workload uses its current resources and determine where changes would better match its needs. When resources are over-committed, system performance degrades; when resources are underused, the system operates less efficiently and at a higher cost.
To optimize performance and resource utilization, you need a unified operational view, real-time granular data, and a historical reference. You can create automated dashboards to visualize this data and derive operational and utilization insights.
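As a rough illustration of turning historical utilization data into an insight, the sketch below summarizes a series of utilization samples and labels the resource as over-committed, under-used, or well-matched. The function name `classify_utilization` and the 80% and 20% thresholds are assumptions made for this example, not values prescribed by this best practice; choose thresholds that fit your own workload.

```python
from statistics import mean, quantiles


def classify_utilization(samples, high=80.0, low=20.0):
    """Summarize utilization percentages (0-100) and label the resource.

    The `high` and `low` thresholds are illustrative assumptions.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    p95 = quantiles(samples, n=100, method="inclusive")[94]
    avg = mean(samples)

    if p95 >= high:
        verdict = "over-committed"   # sustained pressure, performance at risk
    elif p95 <= low:
        verdict = "under-used"       # paying for capacity that sits idle
    else:
        verdict = "well-matched"

    return {"mean": avg, "p95": p95, "verdict": verdict}


if __name__ == "__main__":
    # Hypothetical CPU utilization samples collected over time.
    cpu_samples = [12.0, 15.5, 9.8, 14.2, 11.0, 13.7, 10.4, 16.1]
    print(classify_utilization(cpu_samples))
```

Using a high percentile instead of the single maximum keeps one short spike from dominating the result, which is one way to avoid sizing purely for peak.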
Implementation steps
- Collect compute-related metrics over time.
- Compare workload metrics against available resources in your selected compute solution.
- Determine any required configuration changes by right-sizing the existing solution or evaluating alternative compute solutions.
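The comparison and right-sizing steps above can be sketched as a small selection routine. Everything here is hypothetical: the `ComputeOption` type, the catalog entries, their sizes and prices, and the 20% headroom are placeholders for illustration, not real instance types or recommended values. The observed demand passed in would come from the metrics you collected, for example a high percentile of CPU and memory usage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeOption:
    name: str
    vcpus: int
    memory_gib: float
    hourly_cost: float


# Hypothetical catalog; names, sizes, and prices are made up for this sketch.
CATALOG = [
    ComputeOption("general-large", 2, 8.0, 0.10),
    ComputeOption("general-xlarge", 4, 16.0, 0.20),
    ComputeOption("memory-large", 2, 16.0, 0.13),
    ComputeOption("memory-xlarge", 4, 32.0, 0.26),
]


def recommend(observed_vcpus, observed_memory_gib, catalog=CATALOG, headroom=0.2):
    """Return the cheapest option that covers observed demand plus headroom.

    Returns None when nothing in the catalog is large enough.
    """
    need_cpu = observed_vcpus * (1 + headroom)
    need_mem = observed_memory_gib * (1 + headroom)
    fits = [
        option for option in catalog
        if option.vcpus >= need_cpu and option.memory_gib >= need_mem
    ]
    if not fits:
        return None
    return min(fits, key=lambda option: option.hourly_cost)


if __name__ == "__main__":
    # A memory-heavy profile: low CPU demand, high memory demand.
    choice = recommend(observed_vcpus=1.5, observed_memory_gib=12.0)
    print(choice.name if choice else "no suitable option")
```

With the memory-heavy profile shown, the routine picks the memory-oriented option over the larger general-purpose one, which mirrors the case where moving to a different family fits the workload better than oversizing the current one.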