Design Principles - Analytics Lens

Design Principles

When designing for analytics workloads in the cloud, a number of principles help you achieve performance efficiency:

  • Use data profiling to improve performance. Store prepared data in an environment suited to its data access and query retrieval patterns. Use business and application requirements to define performance and cost optimization goals. Published data should have service design goals (such as a data refresh rate). A data profile includes data statistics (sums, averages), column skew (a disproportionate frequency of one value), and missing data. These characteristics can affect query performance and partitioning strategies, and should be monitored.

  • Continuously optimize data storage. Data storage optimizations, especially compression, partitioning columns, distribution keys, and sort keys, need to be continuously evaluated and improved as query patterns change.
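
As a rough illustration of the data profile described in the first principle, the sketch below computes statistics, missing-value counts, and a simple skew indicator for each column of a small in-memory dataset. The function name `profile_column`, the sample rows, and the 0.5 skew threshold are all assumptions made for this example; a real workload would more likely run the equivalent aggregations inside its query engine or profiling tool.

```python
from collections import Counter
from statistics import mean

# Illustrative threshold (an assumption, not a recommended value): flag a column
# as skewed when one value accounts for more than half of its non-missing rows.
SKEW_THRESHOLD = 0.5


def profile_column(values):
    """Return a basic profile for one column; None marks a missing value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    top_value, top_count = counts.most_common(1)[0] if counts else (None, 0)
    top_share = top_count / len(present) if present else 0.0
    profile = {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "top_value": top_value,
        "top_share": top_share,
        "skewed": top_share > SKEW_THRESHOLD,
    }
    # Add sum and average only when every non-missing value is numeric.
    if present and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    ):
        profile["sum"] = sum(present)
        profile["mean"] = mean(present)
    return profile


# Made-up sample rows, for demonstration only.
rows = [
    {"region": "us-east-1", "amount": 10},
    {"region": "us-east-1", "amount": 20},
    {"region": "us-east-1", "amount": 30},
    {"region": "eu-west-1", "amount": 40},
    {"region": "us-east-1", "amount": None},
    {"region": None, "amount": 50},
]

profiles = {
    column: profile_column([row[column] for row in rows])
    for column in ("region", "amount")
}

for column, profile in profiles.items():
    print(column, profile)
```

In this sample, `region` is dominated by a single value, so it would make an uneven partitioning or distribution key, while `amount` has no dominant value. Recomputing such a profile on a schedule is one way to monitor these characteristics as data and query patterns change.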