Monitoring AWS Backup metrics with CloudWatch - AWS Backup

You can use CloudWatch to monitor AWS Backup metrics. The Backup namespace allows you to track the following metrics. AWS Backup emits updated metrics to CloudWatch every 5 minutes.

This page is a reference for monitoring AWS Backup with CloudWatch. To learn how to monitor a metric using CloudWatch, see the blog post Amazon CloudWatch Events and Metrics for AWS Backup, or see Focus on Metrics and Alarms in a Single AWS Service in the CloudWatch User Guide. To set alarms, see Using Amazon CloudWatch Alarms in the CloudWatch User Guide.

Category: Jobs
Metrics: Number of backup, restore, and copy jobs across each state, including CREATED, PENDING, RUNNING, ABORTED, COMPLETED, FAILED, and EXPIRED. Different job types have different available states.
Example dimensions: Resource type, vault name. The vault name of copy jobs is that of their destination vault.
Example use case: Monitor the number of failed backup jobs within one or more specific backup vaults. When there are more than five failed jobs within 1 hour, send an email or SMS using Amazon SNS or open a ticket to the engineering team to investigate.
Reporting criteria: There is a nonzero value

Category: Recovery points
Metrics: Number of warm and cold recovery points across each state: MODIFIED, COMPLETED, PARTIAL, EXPIRED, DELETED.
Example dimensions: Resource type, vault name.
Example use case: Track the number of deleted recovery points for your Amazon EBS volumes, and separately track the number of warm and cold recovery points in each backup vault.
Reporting criteria: There is a nonzero value
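
The Jobs use case above (more than five failed backup jobs in a vault within 1 hour, with an Amazon SNS notification) could be expressed as a CloudWatch alarm. The following is a minimal sketch that only builds the request parameters. The namespace string `AWS/Backup`, the dimension name `BackupVaultName`, the `Sum` statistic, and the alarm name are assumptions that do not come from this page, so verify them in the CloudWatch console before relying on them.

```python
from typing import Any, Dict

# Assumptions (not stated on this page): the CloudWatch namespace string
# and the name of the vault dimension. Check both in the CloudWatch console.
NAMESPACE = "AWS/Backup"
VAULT_DIMENSION = "BackupVaultName"


def failed_backup_jobs_alarm(vault_name: str, sns_topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for the CloudWatch PutMetricAlarm operation.

    The alarm is meant to fire when more than five backup jobs fail in the
    given vault within one hour.
    """
    return {
        "AlarmName": f"backup-failed-jobs-{vault_name}",  # hypothetical naming scheme
        "Namespace": NAMESPACE,
        "MetricName": "NumberOfBackupJobsFailed",
        "Dimensions": [{"Name": VAULT_DIMENSION, "Value": vault_name}],
        "Statistic": "Sum",          # assumed: add up the datapoints in the period
        "Period": 3600,              # one hour, in seconds
        "EvaluationPeriods": 1,
        "Threshold": 5.0,
        "ComparisonOperator": "GreaterThanThreshold",  # "more than five"
        # Metrics are reported only when nonzero, so treat gaps as healthy.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }
```

With boto3 installed and credentials configured, the parameters could be passed on as `boto3.client("cloudwatch").put_metric_alarm(**failed_backup_jobs_alarm(vault, topic_arn))`. Building the dictionary separately keeps the alarm definition easy to review and test without calling AWS.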

The following table lists all the metrics available to you.

Metric Description
NumberOfBackupJobsCreated The number of backup jobs that AWS Backup created.
NumberOfBackupJobsPending The number of backup jobs about to run in AWS Backup.
NumberOfBackupJobsRunning The number of backup jobs currently running in AWS Backup.
NumberOfBackupJobsAborted The number of backup jobs that a user cancelled.
NumberOfBackupJobsCompleted The number of backup jobs that AWS Backup finished.
NumberOfBackupJobsFailed The number of backup jobs that AWS Backup scheduled but did not start. Often caused by scheduling a backup job during or 4 hours before a database resource's maintenance window or automated backup window. AWS Backup will not perform your scheduled job to maintain your data integrity.
NumberOfBackupJobsExpired The number of backup jobs that AWS Backup attempted to delete based on your backup retention lifecycle, but could not delete. You are billed for the storage that expired backups consume and should delete them manually.
NumberOfCopyJobsCreated The number of cross-account and cross-Region copy jobs that AWS Backup created.
NumberOfCopyJobsRunning The number of cross-account and cross-Region copy jobs currently running in AWS Backup.
NumberOfCopyJobsCompleted The number of cross-account and cross-Region copy jobs that AWS Backup finished.
NumberOfCopyJobsFailed The number of cross-account and cross-Region copy jobs that AWS Backup attempted but could not complete.
NumberOfRestoreJobsPending The number of restore jobs about to run in AWS Backup.
NumberOfRestoreJobsRunning The number of restore jobs currently running in AWS Backup.
NumberOfRestoreJobsCompleted The number of restore jobs that AWS Backup finished.
NumberOfRestoreJobsFailed The number of restore jobs that AWS Backup attempted but could not complete.
NumberOfRecoveryPointsCompleted The number of recovery points that AWS Backup created.
NumberOfRecoveryPointsPartial The number of recovery points that AWS Backup started to create but could not finish. AWS Backup retries the process later, but because the retry occurs at a later time, it retains the partial recovery point.
NumberOfRecoveryPointsExpired The number of recovery points that AWS Backup attempted to delete based on your backup retention lifecycle, but could not delete. You are billed for the storage that expired backups consume and should delete them manually.
NumberOfRecoveryPointsDeleting The number of recovery points that AWS Backup is deleting.
NumberOfRecoveryPointsCold The number of recovery points that AWS Backup tiered to cold storage.
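
The metric names in the table follow the pattern `NumberOf<Category><State>`, and not every category has every state (for example, no pending copy jobs metric is listed). The sketch below encodes only the combinations listed in the table and rejects the rest. The helper name `metric_name` and the `LISTED_STATES` mapping are hypothetical and exist only for illustration.

```python
from typing import Dict, FrozenSet

# States that appear in the metric table for each category.
LISTED_STATES: Dict[str, FrozenSet[str]] = {
    "BackupJobs": frozenset(
        {"Created", "Pending", "Running", "Aborted", "Completed", "Failed", "Expired"}
    ),
    "CopyJobs": frozenset({"Created", "Running", "Completed", "Failed"}),
    "RestoreJobs": frozenset({"Pending", "Running", "Completed", "Failed"}),
    "RecoveryPoints": frozenset({"Completed", "Partial", "Expired", "Deleting", "Cold"}),
}


def metric_name(category: str, state: str) -> str:
    """Return the metric name for a category and state listed in the table.

    Raises ValueError for combinations that the table does not list.
    """
    if category not in LISTED_STATES:
        raise ValueError(f"unknown category: {category!r}")
    if state not in LISTED_STATES[category]:
        raise ValueError(f"{category} has no listed state {state!r}")
    return f"NumberOf{category}{state}"


# Every metric name from the table, rebuilt from the mapping.
ALL_METRICS = sorted(
    metric_name(category, state)
    for category, states in LISTED_STATES.items()
    for state in states
)
```

Validating names this way catches typos such as a request for `NumberOfCopyJobsPending` before they turn into an empty CloudWatch query.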

More dimensions are available beyond those listed in the table. To view all the dimensions of a metric, type the name of that metric into the Backup namespace of the Metrics section of the CloudWatch console.
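
The same lookup can be done programmatically with the CloudWatch ListMetrics operation, which returns each metric together with its dimensions. The sketch below builds the request parameters and collects the distinct dimension names from a response of that shape. The namespace string `AWS/Backup` and the sample response values are assumptions used for illustration and are not taken from this page.

```python
from typing import Any, Dict, List, Set


def list_metrics_params(metric_name: str) -> Dict[str, str]:
    """Build keyword arguments for the CloudWatch ListMetrics operation."""
    # Assumption: the namespace string as it appears in CloudWatch.
    return {"Namespace": "AWS/Backup", "MetricName": metric_name}


def dimension_names(list_metrics_response: Dict[str, Any]) -> List[str]:
    """Collect the distinct dimension names found in a ListMetrics response."""
    names: Set[str] = set()
    for metric in list_metrics_response.get("Metrics", []):
        for dimension in metric.get("Dimensions", []):
            names.add(dimension["Name"])
    return sorted(names)


# Hypothetical response, shaped like the output of ListMetrics.
sample_response = {
    "Metrics": [
        {
            "Namespace": "AWS/Backup",
            "MetricName": "NumberOfBackupJobsCompleted",
            "Dimensions": [{"Name": "BackupVaultName", "Value": "my-vault"}],
        },
        {
            "Namespace": "AWS/Backup",
            "MetricName": "NumberOfBackupJobsCompleted",
            "Dimensions": [{"Name": "ResourceType", "Value": "EBS"}],
        },
    ]
}

found = dimension_names(sample_response)
```

With boto3, a real response could come from `boto3.client("cloudwatch").list_metrics(**list_metrics_params(name))`. Results can span several pages, so a complete listing would also follow the `NextToken` value in each response.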

Differences with the AWS Backup dashboard

The AWS Backup console has its own dashboard, which you can view by choosing Dashboard in the navigation pane. This dashboard shows metrics for the last 24 hours. The CloudWatch dashboard shows metrics over a longer period of time. For specifics, see What is the retention period of all metrics? in the CloudWatch FAQ.

The AWS Backup dashboard also shows you metrics at a point in time, whereas CloudWatch shows you metrics over a period of time. For example, suppose that you have nine jobs completed and one job in progress over the last 4 hours. The AWS Backup dashboard would show you nine jobs completed and one job in progress. CloudWatch would show you 10 jobs in progress if you view the running jobs metrics for the last 4 hours, because each of the 10 jobs was running at some point during that period.
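
The difference between the two views can be illustrated with a small simulation of the example above. This is a simplified model, not how either dashboard is implemented: the `Job` class, the time values, and both counting functions are hypothetical, and the period view simply counts every job that was running at any moment in the window.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Job:
    started: float             # hours since the start of the window
    finished: Optional[float]  # None while the job is still running


def point_in_time(jobs: List[Job], now: float) -> Tuple[int, int]:
    """Return (running, completed) counts as of a single instant."""
    running = sum(
        1 for j in jobs
        if j.started <= now and (j.finished is None or j.finished > now)
    )
    completed = sum(1 for j in jobs if j.finished is not None and j.finished <= now)
    return running, completed


def running_during(jobs: List[Job], start: float, end: float) -> int:
    """Count jobs that were running at any moment between start and end."""
    return sum(
        1 for j in jobs
        if j.started <= end and (j.finished is None or j.finished >= start)
    )


# Nine short jobs that already finished, plus one that is still running.
jobs = [Job(started=0.4 * i, finished=0.4 * i + 0.3) for i in range(9)]
jobs.append(Job(started=3.8, finished=None))

snapshot = point_in_time(jobs, now=4.0)                # (1, 9)
over_window = running_during(jobs, start=0.0, end=4.0)  # 10
```

The snapshot matches what the AWS Backup dashboard would report (one job in progress, nine completed), while the count over the window matches the 10 running jobs that CloudWatch would show for the same 4 hours.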

We recommend that you use the dashboard that enables you to most easily detect potential issues.