1 – Monitor the health of the analytics application workload
How do you measure the health of your analytics workload? Data analytics workloads often involve multiple systems and process steps working in coordination. It is imperative that you monitor not only individual components but also the interaction of dependent processes to ensure a healthy data analytics workload.
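As a rough illustration of monitoring the interaction of dependent processes, the sketch below flags a step as unhealthy when it failed itself or when any step it depends on failed or has no recorded status. The `StepStatus` record, the step names, and the `unhealthy_steps` helper are hypothetical and stand in for whatever status data your orchestrator exposes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StepStatus:
    """Hypothetical status record for one pipeline step."""
    name: str
    succeeded: bool
    depends_on: List[str] = field(default_factory=list)


def unhealthy_steps(steps: List[StepStatus]) -> Dict[str, str]:
    """Map each unhealthy step name to the reason it is considered unhealthy."""
    by_name = {s.name: s for s in steps}
    problems: Dict[str, str] = {}
    for step in steps:
        if not step.succeeded:
            problems[step.name] = "step failed"
            continue
        # A successful step is still suspect if a direct upstream
        # dependency failed or never reported a status.
        for dep in step.depends_on:
            upstream = by_name.get(dep)
            if upstream is None:
                problems[step.name] = f"missing upstream status: {dep}"
                break
            if not upstream.succeeded:
                problems[step.name] = f"upstream failed: {dep}"
                break
    return problems


# Illustrative three-step pipeline (names are assumptions).
steps = [
    StepStatus("ingest", succeeded=True),
    StepStatus("transform", succeeded=False, depends_on=["ingest"]),
    StepStatus("report", succeeded=True, depends_on=["transform"]),
]
print(unhealthy_steps(steps))
# {'transform': 'step failed', 'report': 'upstream failed: transform'}
```

Here `report` completed without error, yet it is flagged because its input came from a failed `transform` step. This is the kind of problem that per-component monitoring alone would miss.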
ID | Priority | Best practice |
---|---|---|
☐ BP 1.1 | Required | Validate the data quality of source systems before transferring data for analytics. |
☐ BP 1.2 | Required | Monitor operational metrics of data processing jobs and the availability of source data. |
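A minimal sketch of the idea behind BP 1.1, assuming source records arrive as dictionaries: measure completeness of required fields and hold back the transfer when it falls below a threshold. The field names, the `validate_records` and `ready_for_transfer` helpers, and the 0.95 threshold are illustrative assumptions, not part of any AWS service or of the best practice itself.

```python
from typing import Any, Dict, Iterable, List


def validate_records(
    records: Iterable[Dict[str, Any]], required: List[str]
) -> Dict[str, Any]:
    """Count records that are missing any required field (None or empty string)."""
    total = 0
    invalid = 0
    for record in records:
        total += 1
        if any(record.get(name) in (None, "") for name in required):
            invalid += 1
    completeness = 1.0 if total == 0 else (total - invalid) / total
    return {"total": total, "invalid": invalid, "completeness": completeness}


def ready_for_transfer(report: Dict[str, Any], min_completeness: float = 0.95) -> bool:
    """Gate the transfer: require a non-empty batch that meets the threshold."""
    return report["total"] > 0 and report["completeness"] >= min_completeness


# Illustrative batch: one of four records lacks a customer_id.
batch = [
    {"order_id": 1, "customer_id": "c-1", "amount": 10.0},
    {"order_id": 2, "customer_id": "", "amount": 5.5},
    {"order_id": 3, "customer_id": "c-3", "amount": 0},
    {"order_id": 4, "customer_id": "c-4", "amount": 7.25},
]
report = validate_records(batch, required=["order_id", "customer_id", "amount"])
print(report)                      # {'total': 4, 'invalid': 1, 'completeness': 0.75}
print(ready_for_transfer(report))  # False
```

An empty batch is treated as not ready, which also gives a crude signal for the source-availability half of BP 1.2. In practice these counts would typically be published as metrics and alarmed on rather than printed.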
For more details, refer to the following information:
- AWS Big Data Blog: Monitor data pipelines in a serverless data lake
- AWS Compute Blog: Monitoring and troubleshooting serverless data analytics applications
- AWS Big Data Blog: Building a serverless data quality and analysis framework with Deequ and AWS Glue