Batch and ETL Job Monitoring - Analytics Lens

Many analytics applications include batch and ETL jobs that run complex calculations, aggregations, simulations, or forecasts, or that train machine learning models on huge datasets offline. These jobs can often take hours or even days.

These jobs should ordinarily be orchestrated and scheduled automatically. In addition to relying on compute and storage utilization metrics for batch and ETL jobs, instrument your job orchestration and scheduling systems to provide end-to-end metrics and alerts for each job. At a minimum, consider monitoring job execution time, adherence to job SLAs, and overall job compute utilization; this list is not exhaustive.
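
As a rough illustration of instrumenting a job at the orchestration layer, the following Python sketch wraps a job in a timer, records whether it succeeded, and raises an alert when the run fails or exceeds its SLA. Everything here is an assumption made for illustration: the names `JobRunMetrics` and `run_with_monitoring`, the `alert` callback, and the injectable `clock` are hypothetical and are not part of any particular scheduler or AWS service. In practice the metrics would typically be published to a monitoring system rather than returned to the caller.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class JobRunMetrics:
    """Hypothetical record of one job run (illustrative, not a real API)."""
    job_name: str
    duration_seconds: float
    succeeded: bool
    sla_seconds: float

    @property
    def sla_breached(self) -> bool:
        # A failed run, or one that ran longer than its SLA, counts as a breach.
        return (not self.succeeded) or self.duration_seconds > self.sla_seconds


def run_with_monitoring(
    job_name: str,
    job: Callable[[], None],
    sla_seconds: float,
    alert: Callable[[str], None] = print,
    clock: Callable[[], float] = time.monotonic,
) -> JobRunMetrics:
    """Run `job`, measure its wall-clock duration, and alert on an SLA breach."""
    start = clock()
    succeeded = True
    try:
        job()
    except Exception:
        # This sketch records the failure instead of re-raising it.
        succeeded = False
    duration = clock() - start

    metrics = JobRunMetrics(job_name, duration, succeeded, sla_seconds)
    if metrics.sla_breached:
        alert(
            f"ALERT: job '{job_name}' breached its SLA "
            f"(succeeded={succeeded}, duration={duration:.1f}s, "
            f"sla={sla_seconds:.1f}s)"
        )
    return metrics


if __name__ == "__main__":
    # Example: a trivial job with a one-hour SLA.
    result = run_with_monitoring("nightly_aggregation", lambda: None, sla_seconds=3600)
    print(result)
```

The `clock` parameter exists only so the timing logic can be exercised without waiting for a real long-running job. Compute utilization is not covered by this sketch and would come from the metrics of the underlying compute platform.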