FSISUS15-BP01 Minimize the bit count while maintaining precision

FSISUS15: What is your testing process for workloads that require floating point precision?

Prescriptive guidance

Floating-point formats represent real numbers in a finite, fixed-width binary field. Narrower formats such as single precision (32 bit) reduce memory bandwidth and storage requirements compared to double precision (64 bit). Although double precision can produce more accurate results, single-precision calculations can be faster and thus reduce overall energy consumption for particular workloads. Determine which of your workloads can tolerate reduced precision by weighing accuracy, performance, and efficiency. Consider testing with a cluster of instances to see how well the workload performs at scale.
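
One way to start such a test is to run the same calculation at several precisions and compare each result against a double-precision reference. The following is a minimal sketch using only the Python standard library. The helper names (to_precision, compound, relative_error) and the sample figures (principal, rate, number of periods) are illustrative assumptions, not part of this guidance, and the rounding is emulated in software rather than run on reduced-precision hardware.

```python
import struct


def to_precision(x: float, fmt: str) -> float:
    """Round a Python float (a double) to the nearest value in `fmt`.

    `fmt` is a struct format code: 'e' = half (16 bit),
    'f' = single (32 bit), 'd' = double (64 bit).
    """
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def compound(principal: float, rate: float, periods: int, fmt: str) -> float:
    """Compound interest with every intermediate result rounded to `fmt`."""
    balance = to_precision(principal, fmt)
    factor = to_precision(1.0 + rate, fmt)
    for _ in range(periods):
        balance = to_precision(balance * factor, fmt)
    return balance


def relative_error(fmt: str, principal: float = 10_000.0,
                   rate: float = 0.0005, periods: int = 360) -> float:
    """Relative error of the `fmt` result against the double-precision result."""
    reference = compound(principal, rate, periods, "d")
    return abs(compound(principal, rate, periods, fmt) - reference) / reference


if __name__ == "__main__":
    for name, fmt in [("double", "d"), ("single", "f"), ("half", "e")]:
        print(f"{name:>6}: relative error = {relative_error(fmt):.3e}")
```

Comparing the reported errors against the tolerance your workload requires indicates the narrowest format that is still acceptable for that calculation. Note that the half-precision format overflows above 65504, so struct raises OverflowError for larger intermediate values.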

Implementation guidance:

For intensive financial simulations and calculations, test how many bits are required to achieve the precision your workload needs, and consider reducing the bit count by selecting a different floating-point format, such as bfloat16, which is supported by AWS Graviton.
Using quantization, you can represent numbers with lower bit-count integers or floating-point numbers without incurring a significant loss in accuracy. Specifically, you can reduce resource usage by replacing the usual single-precision floating-point (32 bit) parameters in your workload with (1) half-precision (16 bit) values, (2) bfloat16 values (16 bit, but with the same dynamic range as single precision), or (3) 8-bit integers.
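
The three options above can be compared with a short experiment. The following is a minimal sketch using only the Python standard library. All function names are hypothetical, bfloat16 is emulated by rounding a single-precision encoding to its upper 16 bits, and the 8-bit case assumes simple symmetric linear quantization with one scale per list, which is only one of several possible schemes.

```python
import struct


def to_float16(x: float) -> float:
    """Round to IEEE 754 half precision (5 exponent bits, 10 fraction bits).

    Raises OverflowError for magnitudes beyond the half-precision range.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bfloat16(x: float) -> float:
    """Emulate bfloat16 (8 exponent bits, 7 fraction bits).

    Keeps the upper 16 bits of the single-precision encoding, rounding to
    nearest with ties to even. NaN and infinity are not handled specially.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # drop the lower 16 bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def quantize_int8(values):
    """Symmetric linear quantization onto the integer range [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # avoid a zero scale
    return [round(v / scale) for v in values], scale


def dequantize_int8(quantized, scale):
    """Map quantized integers back to approximate real values."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    print(to_float16(0.1), to_bfloat16(0.1))
    # bfloat16 keeps the single-precision exponent range, so a value such
    # as 70000.0 is representable; half precision overflows above 65504.
    print(to_bfloat16(70000.0))
    ints, scale = quantize_int8([0.5, -1.25, 3.0])
    print(ints, dequantize_int8(ints, scale))
```

Half precision keeps more fraction bits than bfloat16 and is therefore more precise for values within its range, while bfloat16 trades those bits for range. Which trade-off is acceptable depends on the value distribution of the workload being tested.
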
Service recommendations: Use the following services to achieve your goal.

Test generative AI models with reduced precision (quantization) to confirm that accuracy is maintained while resource consumption is reduced.
Validate generative AI model performance with different floating-point precisions.
Use mixed-precision training for generative AI models to optimize resource usage.
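
A common reason mixed-precision training keeps a higher-precision copy of the weights is that small updates can be lost entirely when accumulated in a 16-bit format. The following is a minimal sketch of that effect using only the Python standard library. The function names, the starting weight of 1.0, and the update size are illustrative assumptions, and the loop is a stand-in for a training step rather than a real training framework.

```python
import struct


def round_to(x: float, fmt: str) -> float:
    """Round to a struct float format: 'e' = half, 'f' = single."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def apply_updates(steps: int, update: float, weight_fmt: str) -> float:
    """Add a half-precision update to a weight stored in `weight_fmt`."""
    weight = round_to(1.0, weight_fmt)
    step = round_to(update, "e")  # the update itself is computed in half precision
    for _ in range(steps):
        weight = round_to(weight + step, weight_fmt)
    return weight


if __name__ == "__main__":
    # Near 1.0 the spacing between half-precision values is about 0.001,
    # so an update of 0.0001 rounds away and the weight stays at 1.0.
    print("half-precision weight:  ", apply_updates(1000, 1e-4, "e"))
    # With a single-precision weight, the same updates accumulate.
    print("single-precision weight:", apply_updates(1000, 1e-4, "f"))
```

In this sketch the low-precision format is used for the update and the higher-precision format only for the stored weight, which mirrors the general idea of mixed precision: use fewer bits where the calculation tolerates it, and keep more bits where accumulation requires them.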
