PERF02-BP02 Understand the available compute configuration options - Performance Efficiency Pillar

Each compute solution has options and configurations available to you to support your workload characteristics. Learn how various options complement your workload, and which configuration options are best for your application. Examples of these options include instance family, sizes, features (GPU, I/O), bursting, time-outs, function sizes, container instances, and concurrency.

Desired outcome: The workload characteristics including CPU, memory, network throughput, GPU, IOPS, traffic patterns, and data access patterns are documented and used to configure the compute solution to match the workload characteristics. Each of these metrics plus custom metrics specific to your workload are recorded, monitored, and then used to optimize the compute configuration to best meet the requirements.

Common anti-patterns:

  • Using the same compute solution that was being used on premises.

  • Not reviewing the compute options or instance family to match workload characteristics.

  • Oversizing the compute to ensure bursting capability.

  • Using multiple compute management platforms for the same workload.

Benefits of establishing this best practice: Being familiar with the AWS compute offerings lets you determine the correct solution for each of your workloads. After you have selected the compute offerings for your workload, you can quickly experiment with them to determine how well they meet your workload needs. A compute solution that is optimized for your workload characteristics increases performance, lowers cost, and increases reliability.

Level of risk exposed if this best practice is not established: High

Implementation guidance

If your workload has been using the same compute option for more than four weeks and you anticipate that its characteristics will remain the same in the future, you can use AWS Compute Optimizer to get a recommendation based on your compute characteristics. If AWS Compute Optimizer is not an option because of a lack of metrics, an unsupported instance type, or a foreseeable change in your characteristics, you must predict your metrics based on load testing and experimentation.
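The choice described above can be sketched as a small check. This is a minimal illustration, not an AWS API: the `MetricHistory` type, its field names, and the `MIN_METRIC_DAYS` constant are hypothetical and only encode the three conditions named in the guidance.

```python
from dataclasses import dataclass

# Assumption: 28 days stands in for the "four weeks" of metrics in the guidance.
MIN_METRIC_DAYS = 28


@dataclass
class MetricHistory:
    """Hypothetical summary of what you know about a workload."""
    days_of_metrics: int            # days of utilization metrics collected so far
    instance_type_supported: bool   # whether Compute Optimizer supports the instance type
    stable_characteristics: bool    # whether you expect the workload to stay about the same


def sizing_approach(history: MetricHistory) -> str:
    """Return 'compute-optimizer' when all three conditions hold, else 'load-test'."""
    if (
        history.days_of_metrics >= MIN_METRIC_DAYS
        and history.instance_type_supported
        and history.stable_characteristics
    ):
        return "compute-optimizer"
    # Any missing condition means metrics must be predicted by load testing.
    return "load-test"
```

For example, `sizing_approach(MetricHistory(10, True, True))` falls back to load testing because fewer than four weeks of metrics are available.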

Implementation steps:

  1. Are you running on EC2 instances or containers with the EC2 Launch Type?

    1. Can your workload use GPUs to increase performance?

      1. Accelerated Computing instances are GPU-based instances that provide the highest performance for machine learning training, inference and high performance computing.

    2. Does your workload run machine learning inference applications?

      1. AWS Inferentia (Inf1): Inf1 instances are built to support machine learning inference applications. Using Inf1 instances, customers can run large-scale machine learning inference applications, such as image recognition, speech recognition, natural language processing, personalization, and fraud detection. You can build a model in one of the popular machine learning frameworks, such as TensorFlow, PyTorch, or MXNet, and use GPU instances to train it. After your model is trained to meet your requirements, you can deploy it on Inf1 instances by using AWS Neuron, a specialized software development kit (SDK) consisting of a compiler, runtime, and profiling tools that optimize the machine learning inference performance of Inferentia chips.

    3. Does your workload integrate with the low-level hardware to improve performance? 

      1. Field Programmable Gate Arrays (FPGA): Using FPGAs, you can optimize your most demanding workloads with custom hardware-accelerated operations. You can define your algorithms by using supported general programming languages such as C or Go, or hardware-oriented languages such as Verilog or VHDL.

    4. Do you have at least four weeks of metrics and can predict that your traffic pattern and metrics will remain about the same in the future?

      1. Use Compute Optimizer to get a machine learning-based recommendation on which compute configuration best matches your compute characteristics.

    5. Is your workload performance constrained by the CPU metrics? 

      1. Compute-optimized instances are ideal for the workloads that require high performing processors. 

    6. Is your workload performance constrained by the memory metrics? 

      1. Memory-optimized instances deliver large amounts of memory to support memory intensive workloads.

    7. Is your workload performance constrained by IOPS?

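      1. Storage-optimized instances are designed for workloads that require high, sequential read and write access to large data sets on local storage, and they are optimized to deliver a high number of low-latency I/O operations per second (IOPS).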

    8. Do your workload characteristics represent a balanced need across all metrics?

      1. Does your workload CPU need to burst to handle spikes in traffic?

        1. Burstable Performance instances are general-purpose instances that provide a baseline level of CPU performance with the ability to burst above that baseline when the workload needs it.

      2. General Purpose instances provide a balance of all characteristics to support a variety of workloads.

    9. Is your compute instance running on Linux and constrained by network throughput on the network interface card?

      1. Review Performance Question 5, Best Practice 2: Evaluate available networking features to find the right instance type and family to meet your performance needs.

    10. Does your workload need consistent and predictable instances in a specific Availability Zone that you can commit to for a year? 

      1. Reserved Instances scoped to a specific Availability Zone provide a capacity reservation in that zone. They are ideal when you require compute capacity in a specific Availability Zone.

    11. Does your workload have licenses that require dedicated hardware?

      1. Dedicated Hosts support existing software licenses and help you meet compliance requirements.

    12. Does your compute solution burst and require synchronous processing?

      1. On-Demand Instances let you use the compute capacity by the hour or second with no long-term commitment. These instances are good for bursting above performance baseline needs.

    13. Is your compute solution stateless, fault-tolerant, and asynchronous? 

      1. Spot Instances let you take advantage of unused instance capacity for your stateless, fault-tolerant workloads. 

  2. Are you running containers on Fargate?

    1. Is your task performance constrained by the memory or CPU?

      1. Use the Task Size to adjust your memory or CPU.

    2. Is your performance being affected by your traffic pattern bursts?

      1. Use the Auto Scaling configuration to match your traffic patterns.

  3. Is your compute solution on Lambda?

    1. Do you have at least four weeks of metrics and can predict that your traffic pattern and metrics will remain about the same in the future?

      1. Use Compute Optimizer to get a machine learning-based recommendation on which compute configuration best matches your compute characteristics.

    2. Do you not have enough metrics to use AWS Compute Optimizer?

      1. If you do not have metrics available to use Compute Optimizer, use AWS Lambda Power Tuning to help select the best configuration.

    3. Is your function performance constrained by the memory or CPU?

      1. Configure your Lambda function memory to meet your performance needs. Lambda allocates CPU power in proportion to the configured memory.

    4. Is your function timing out when running?

      1. Change the timeout settings.

    5. Is your function performance constrained by bursts of activity and concurrency? 

      1. Configure the concurrency settings to meet your performance requirements.

    6. Does your function run asynchronously and fail on retries?

      1. Configure the maximum age of the event and the maximum retry limit in the asynchronous configuration settings.
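The EC2 questions in step 1 of the list above can be sketched as two small decision functions. This is an illustrative simplification under stated assumptions: the `Ec2Profile` fields and function names are hypothetical, the questions are evaluated in the order they appear in the list, and the first match wins, whereas a real workload may match several questions at once.

```python
from dataclasses import dataclass


@dataclass
class Ec2Profile:
    """Hypothetical yes/no answers to the EC2 questions in step 1."""
    uses_gpu: bool = False
    ml_inference: bool = False
    needs_fpga: bool = False
    cpu_bound: bool = False
    memory_bound: bool = False
    iops_bound: bool = False
    bursty_cpu: bool = False
    needs_dedicated_hardware: bool = False
    one_year_zonal_commitment: bool = False
    stateless_fault_tolerant_async: bool = False


def instance_category(p: Ec2Profile) -> str:
    """Map workload constraints to an instance category (first match wins)."""
    if p.uses_gpu:
        return "Accelerated Computing (GPU)"
    if p.ml_inference:
        return "AWS Inferentia (Inf1)"
    if p.needs_fpga:
        return "FPGA"
    if p.cpu_bound:
        return "Compute-optimized"
    if p.memory_bound:
        return "Memory-optimized"
    if p.iops_bound:
        return "Storage-optimized"
    # Balanced need across all metrics: burstable if CPU must absorb spikes.
    return "Burstable Performance" if p.bursty_cpu else "General Purpose"


def purchasing_option(p: Ec2Profile) -> str:
    """Map licensing, commitment, and fault-tolerance answers to a purchasing option."""
    if p.needs_dedicated_hardware:
        return "Dedicated Hosts"
    if p.one_year_zonal_commitment:
        return "Reserved Instances"
    if p.stateless_fault_tolerant_async:
        return "Spot Instances"
    # No long-term commitment and synchronous or bursting work.
    return "On-Demand Instances"
```

For example, a memory-constrained, stateless batch workload described by `Ec2Profile(memory_bound=True, stateless_fault_tolerant_async=True)` maps to memory-optimized instances purchased as Spot Instances. Treat the result as a starting point for load testing, not as a final answer.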

Level of effort for the implementation plan: 

To establish this best practice, you must be aware of your current compute characteristics and metrics. Gathering those metrics, establishing a baseline and then using those metrics to identify the ideal compute option is a low to moderate level of effort. This is best validated by load tests and experimentation.
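Establishing a baseline from gathered metrics can be sketched as follows. This is a minimal example using only the Python standard library. The function names, the metric names in the usage note, and the 80 percent threshold are assumptions for illustration, not values taken from AWS guidance.

```python
from statistics import mean, quantiles


def baseline(samples: list[float]) -> dict[str, float]:
    """Summarize utilization samples (in percent) as mean, p95, and max."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # With n=20 the last of the 19 cut points is the 95th percentile.
    # The inclusive method never extrapolates beyond the observed maximum.
    p95 = quantiles(samples, n=20, method="inclusive")[-1]
    return {"mean": mean(samples), "p95": p95, "max": max(samples)}


def constrained_metrics(
    metrics: dict[str, list[float]], threshold: float = 80.0
) -> list[str]:
    """Return the names of metrics whose p95 meets or exceeds the threshold.

    Assumption: a p95 at or above `threshold` percent marks a likely constraint.
    """
    return sorted(
        name for name, samples in metrics.items()
        if baseline(samples)["p95"] >= threshold
    )
```

Calling `constrained_metrics({"cpu": cpu_samples, "memory": memory_samples})` with your own recorded samples returns the metrics that look constrained, which you can then compare against the instance-selection questions in the implementation steps and confirm with load tests.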
