Monitoring relational databases using DevOps Guru

DevOps Guru draws on two primary data sources to detect anomalies and generate insights for relational databases. For Amazon RDS and Amazon Redshift, it analyzes CloudWatch vended metrics for all instance types. For Amazon RDS, it also ingests Performance Insights data for the following engine types: RDS for PostgreSQL, Aurora PostgreSQL, and Aurora MySQL.

Monitoring database operations in Amazon RDS

This section includes specific information about use cases and metrics monitored in DevOps Guru for RDS, including data from CloudWatch vended metrics and Performance Insights. For more information about DevOps Guru for RDS, including key concepts, configurations, and benefits, see Working with anomalies in DevOps Guru for RDS.

Monitoring RDS using data from CloudWatch vended metrics

DevOps Guru can monitor every type of RDS instance by ingesting default CloudWatch metrics, such as CPU utilization and read and write operation latency. Because these metrics are vended by default, no further configuration is required to gain insights when you monitor your RDS instances with DevOps Guru. DevOps Guru automatically establishes a baseline for each metric from historical patterns and compares that baseline with real-time data to detect anomalies and potential issues in your database.
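The idea of comparing real-time data against a historical baseline can be illustrated with a minimal sketch. This is not the algorithm DevOps Guru uses; it is a simple z-score check, and the function name `find_anomalies`, the threshold of 3.0, and the sample values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_anomalies(history, recent, threshold=3.0):
    """Return (index, value) pairs from `recent` that deviate from the baseline.

    The baseline is the mean and sample standard deviation of `history`.
    Illustrative only: this is NOT how DevOps Guru builds its baseline.
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Flat history: any value that differs from it counts as a deviation.
        return [(i, v) for i, v in enumerate(recent) if v != baseline]
    return [
        (i, v)
        for i, v in enumerate(recent)
        if abs(v - baseline) / spread > threshold
    ]


# Hypothetical CPUUtilization samples (percent), one per 5-minute period.
history = [22.0, 25.0, 24.0, 23.0, 26.0, 24.0, 25.0, 23.0]
recent = [24.0, 27.0, 91.0]
print(find_anomalies(history, recent))  # prints [(2, 91.0)]
```

Only the last sample is flagged, because it lies far outside the spread of the historical values.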

The following table shows a list of potential reactive insights for Amazon RDS from CloudWatch vended metrics.

AWS resource monitored by DevOps Guru | Scenario that DevOps Guru identifies | CloudWatch metrics monitored
Amazon RDS (all instance types) | CPU or memory reaching limits | DBLoad, DBLoadCPU
RDS for PostgreSQL | High replication slot lag | OldestReplicationSlotLag

Additional CloudWatch vended metrics from Amazon RDS instances that DevOps Guru monitors:

  • CPUUtilization

  • DatabaseConnections

  • DiskQueueDepth

  • FailedSQLServerAgentJobsCount

  • ReadLatency

  • ReadThroughput

  • ReplicaLag

  • WriteLatency
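To inspect the same metrics yourself, you could query CloudWatch directly. The following is a sketch that assumes boto3 is installed and credentials are configured; the helper names `rds_metric_request` and `fetch_datapoints` and the instance identifier `example-db` are hypothetical. It uses the `AWS/RDS` namespace and the `DBInstanceIdentifier` dimension with the CloudWatch `get_metric_statistics` operation.

```python
from datetime import datetime, timedelta, timezone

# Metric names taken from the list above.
RDS_METRICS = [
    "CPUUtilization", "DatabaseConnections", "DiskQueueDepth",
    "FailedSQLServerAgentJobsCount", "ReadLatency", "ReadThroughput",
    "ReplicaLag", "WriteLatency",
]


def rds_metric_request(metric_name, db_instance_id, hours=3, period=300):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    if metric_name not in RDS_METRICS:
        raise ValueError(f"unexpected metric name: {metric_name}")
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period,            # seconds per datapoint
        "Statistics": ["Average"],
    }


def fetch_datapoints(request):
    """Call CloudWatch and return datapoints sorted by time (needs AWS access)."""
    import boto3  # imported lazily so the request builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**request)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


# "example-db" is a hypothetical instance identifier.
request = rds_metric_request("ReadLatency", "example-db")
print(request["Namespace"], request["MetricName"], request["Period"])
```

Calling `fetch_datapoints(request)` would return the average read latency for each five-minute period over the last three hours.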

Monitoring RDS using data from Performance Insights

For certain types of Amazon RDS instances, such as Aurora PostgreSQL, Aurora MySQL, and RDS for PostgreSQL, you unlock more capability from DevOps Guru monitoring by ensuring that Performance Insights is enabled on those instances.
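One possible way to check and enable Performance Insights programmatically is sketched below. It assumes boto3 with configured credentials; the helper names and the sample data are hypothetical, and the seven-day retention period is an assumed value. The `modify_db_instance` parameters and the `PerformanceInsightsEnabled` field of `describe_db_instances` are part of the Amazon RDS API.

```python
# Engine names as reported by the RDS API for the engine types named above.
SUPPORTED_ENGINES = {"postgres", "aurora-postgresql", "aurora-mysql"}


def instances_missing_pi(db_instances):
    """Return identifiers of supported-engine instances without Performance Insights.

    `db_instances` has the shape of describe_db_instances()["DBInstances"].
    """
    return [
        instance["DBInstanceIdentifier"]
        for instance in db_instances
        if instance["Engine"] in SUPPORTED_ENGINES
        and not instance.get("PerformanceInsightsEnabled", False)
    ]


def performance_insights_request(db_instance_id, retention_days=7):
    """Build keyword arguments for rds.modify_db_instance."""
    return {
        "DBInstanceIdentifier": db_instance_id,
        "EnablePerformanceInsights": True,
        "PerformanceInsightsRetentionPeriod": retention_days,  # assumed value
        "ApplyImmediately": True,
    }


def enable_performance_insights(db_instance_id):
    """Enable Performance Insights on one instance (needs AWS access)."""
    import boto3  # imported lazily so the helpers above work without boto3

    rds = boto3.client("rds")
    return rds.modify_db_instance(**performance_insights_request(db_instance_id))


# Hypothetical data in the shape returned by describe_db_instances.
sample = [
    {"DBInstanceIdentifier": "pg-1", "Engine": "postgres",
     "PerformanceInsightsEnabled": False},
    {"DBInstanceIdentifier": "aurora-1", "Engine": "aurora-mysql",
     "PerformanceInsightsEnabled": True},
    {"DBInstanceIdentifier": "mssql-1", "Engine": "sqlserver-se",
     "PerformanceInsightsEnabled": False},
]
print(instances_missing_pi(sample))  # prints ['pg-1']
```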

DevOps Guru provides reactive insights for a variety of situations, including the following scenarios:

  • Locking contention issue

  • Missing index

  • Misconfiguration of application pool

  • Suboptimal JDBC defaults
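Insights that DevOps Guru generates can also be retrieved through its API. The sketch below assumes boto3 with configured credentials and uses the `list_insights` operation of the `devops-guru` client; the helper names are hypothetical, and pagination through `NextToken` is omitted for brevity.

```python
def ongoing_insights_filter(insight_type):
    """Build the StatusFilter argument for list_insights."""
    if insight_type not in ("REACTIVE", "PROACTIVE"):
        raise ValueError("insight_type must be 'REACTIVE' or 'PROACTIVE'")
    return {"Ongoing": {"Type": insight_type}}


def list_ongoing_insights(insight_type="REACTIVE"):
    """Return (id, name, severity) for ongoing insights (needs AWS access).

    Only the first page of results is read; NextToken handling is left out.
    """
    import boto3  # imported lazily so the filter builder works without boto3

    client = boto3.client("devops-guru")
    response = client.list_insights(
        StatusFilter=ongoing_insights_filter(insight_type)
    )
    key = "ReactiveInsights" if insight_type == "REACTIVE" else "ProactiveInsights"
    return [
        (insight["Id"], insight["Name"], insight["Severity"])
        for insight in response.get(key, [])
    ]


print(ongoing_insights_filter("REACTIVE"))
```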

DevOps Guru provides proactive insights for a variety of situations, including the following scenarios:

AWS resource monitored by DevOps Guru | Scenario that DevOps Guru identifies to generate a proactive insight
Aurora MySQL | InnoDB history list growing too large, which can lead to degraded performance such as lengthy database shutdown time
Aurora MySQL | An increase in temporary tables created on disk, which can impact database performance
RDS for PostgreSQL, Aurora PostgreSQL | A connection that has been idle in transaction for too long, which can hold locks, block other queries, and prevent vacuum (including autovacuum) from cleaning up dead rows
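To see what the idle-in-transaction scenario looks like from the database side, you could query the PostgreSQL `pg_stat_activity` view yourself. This sketch is independent of DevOps Guru. It assumes a DB-API cursor from a driver that uses `%s` placeholders (for example psycopg); the function name and the 300-second threshold are hypothetical.

```python
# Sessions whose state has been 'idle in transaction' for longer than a threshold.
IDLE_IN_TRANSACTION_SQL = """
SELECT pid, usename, state_change, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND state_change < now() - make_interval(secs => %s)
ORDER BY state_change
"""


def find_idle_in_transaction(cursor, max_idle_seconds=300):
    """Run the query on an open DB-API cursor and return all matching rows."""
    cursor.execute(IDLE_IN_TRANSACTION_SQL, (max_idle_seconds,))
    return cursor.fetchall()
```

Each returned row identifies a session (`pid`) that may be holding locks, along with the last statement it ran.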

Monitoring database operations in Amazon Redshift

DevOps Guru can monitor your Amazon Redshift resources by ingesting default CloudWatch metrics, including CPU utilization and the percentage of disk space used. Because these metrics are vended by default, no further configuration is required for DevOps Guru to automatically monitor your Amazon Redshift resources. DevOps Guru establishes a baseline for each metric from historical patterns and compares that baseline with real-time data to detect anomalies.

Scenario that DevOps Guru identifies | CloudWatch metrics monitored
Detect high CPU utilization of an Amazon Redshift instance caused by factors such as cluster workload, skewed and unsorted data, or leader node tasks | CPUUtilization
Detect when an Amazon Redshift instance is running out of disk space due to issues with query processing, distribution and sort key, maintenance operations, or tombstone blocks | PercentageDiskSpaceUsed

Additional CloudWatch vended metrics from Amazon Redshift instances that DevOps Guru monitors:

  • DatabaseConnections

  • HealthStatus

  • MaintenanceMode

  • NumExceededSchemaQuotas

  • PercentageQuotaUsed

  • QueryDuration

  • QueryRuntimeBreakdown

  • ReadIOPS

  • ReadLatency

  • WLMQueueLength

  • WLMQueueWaitTime

  • WLMQueryDuration

  • WriteLatency
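Several Amazon Redshift metrics can be fetched in one call with the CloudWatch `get_metric_data` operation. The sketch below assumes boto3 with configured credentials; the helper names and the cluster identifier `example-cluster` are hypothetical. It uses only the `ClusterIdentifier` dimension, which is an assumption that fits cluster-level metrics such as `CPUUtilization` and `PercentageDiskSpaceUsed`; some of the metrics listed above may need additional dimensions.

```python
from datetime import datetime, timedelta, timezone


def redshift_metric_queries(cluster_id, metric_names, period=300):
    """Build the MetricDataQueries list for CloudWatch get_metric_data."""
    queries = []
    for index, name in enumerate(metric_names):
        queries.append({
            "Id": f"m{index}",  # query Ids must start with a lowercase letter
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Redshift",
                    "MetricName": name,
                    "Dimensions": [
                        {"Name": "ClusterIdentifier", "Value": cluster_id}
                    ],
                },
                "Period": period,  # seconds per datapoint
                "Stat": "Average",
            },
            "Label": name,
            "ReturnData": True,
        })
    return queries


def fetch_redshift_metrics(cluster_id, metric_names, hours=3):
    """Return {label: [(timestamp, value), ...]} (needs AWS access)."""
    import boto3  # imported lazily so the query builder works without boto3

    end = datetime.now(timezone.utc)
    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_data(
        MetricDataQueries=redshift_metric_queries(cluster_id, metric_names),
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
    )
    return {
        result["Label"]: list(zip(result["Timestamps"], result["Values"]))
        for result in response["MetricDataResults"]
    }


# "example-cluster" is a hypothetical cluster identifier.
queries = redshift_metric_queries(
    "example-cluster", ["CPUUtilization", "PercentageDiskSpaceUsed"]
)
print([query["Id"] for query in queries])  # prints ['m0', 'm1']
```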