Monitoring tools - AWS Prescriptive Guidance

Monitoring tools

This section discusses monitoring tools from Amazon and Oracle that you can use during the post-migration phase to maintain a reliable, highly available, performant, and cost-optimized database environment.

Amazon CloudWatch

Amazon CloudWatch is a monitoring and observability service that provides a unified view of operational health and visibility into your AWS resources, applications, and services, whether they run on AWS or on premises. You can use CloudWatch to detect anomalous behavior in your environments, set alarms, visualize logs and metrics side by side, take automated actions, troubleshoot issues, and discover insights to keep your applications running smoothly.

CloudWatch metrics resolution and retention can be pictured as a pyramid, as illustrated in the following diagram. The top level represents the most granular resolution (as fine as 1 second) but also the shortest retention of metrics. The further back you look into historical monitoring data, the less granular the data points become. For example, for the longest retention period (between 63 days and 15 months), the granularity is one hour, as shown in the bottom level of the pyramid.

Metrics retention and resolution in CloudWatch
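The pyramid can also be read as a lookup from the age of a data point to the finest period that is still available for it. The following minimal sketch assumes the retention schedule published in the CloudWatch documentation (sub-minute data for 3 hours, 1-minute data for 15 days, 5-minute data for 63 days, and 1-hour data for 455 days) and assumes that the metric was published at high resolution. Standard-resolution metrics start at 1 minute. The name `finest_period_seconds` is hypothetical and used only for illustration.

```python
from datetime import timedelta

# (maximum age of a data point, finest period in seconds still available at that age)
# Assumed from the CloudWatch retention schedule; verify against current documentation.
RETENTION_TIERS = [
    (timedelta(hours=3), 1),      # high-resolution (sub-minute) data points
    (timedelta(days=15), 60),     # 1-minute data points
    (timedelta(days=63), 300),    # 5-minute data points
    (timedelta(days=455), 3600),  # 1-hour data points (about 15 months)
]


def finest_period_seconds(age: timedelta):
    """Return the smallest period (seconds) available for data of this age, or None if expired."""
    for max_age, period in RETENTION_TIERS:
        if age <= max_age:
            return period
    return None


if __name__ == "__main__":
    for age in (timedelta(hours=1), timedelta(days=7), timedelta(days=200)):
        print(age, "->", finest_period_seconds(age))
```

A lookup like this can help you choose a valid period when you query older data, because requesting a finer period than the one retained returns no additional detail.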

As the following diagram shows, you can set up alarms for CloudWatch metrics. For example, you might create an alarm that is activated when the CPU utilization for an instance exceeds 70 percent.

Using CloudWatch to monitor Oracle Database on AWS

You can configure Amazon Simple Notification Service (Amazon SNS) to send an email or SMS message whenever the threshold is exceeded. Amazon SNS can also deliver notifications to other endpoints such as Amazon Simple Queue Service (Amazon SQS), AWS Lambda, or HTTP/HTTPS. For example, you might create an alarm that is activated if the total IOPS used exceeds 90 percent of the maximum that's configured for the instance. The alarm action might be a Lambda function that increases the amount of provisioned IOPS (PIOPS) when the alarm state is ALARM. For additional information, see the presentation Take a load off: Diagnose & resolve performance issues with Amazon RDS (AWS re:Invent 2023).
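As an illustration of the CPU utilization alarm described earlier, the following sketch builds the parameters for the CloudWatch `PutMetricAlarm` API and shows, in a comment, how they might be passed to the boto3 `put_metric_alarm` call. It assumes that the instance is an Amazon RDS DB instance (namespace `AWS/RDS`, metric `CPUUtilization`). The helper name, alarm name, instance identifier, SNS topic ARN, period, and number of evaluation periods are hypothetical values chosen for the example.

```python
def cpu_alarm_request(db_instance_id: str, sns_topic_arn: str,
                      threshold_percent: float = 70.0) -> dict:
    """Build PutMetricAlarm parameters for a CPU utilization alarm on an RDS DB instance."""
    return {
        "AlarmName": f"{db_instance_id}-cpu-high",  # hypothetical naming scheme
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,                # evaluate 5-minute averages (assumption)
        "EvaluationPeriods": 3,       # require 3 consecutive breaches (assumption)
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # SNS topic that sends the email or SMS message
    }


if __name__ == "__main__":
    request = cpu_alarm_request(
        "oracle-prod-1",                                    # hypothetical instance ID
        "arn:aws:sns:us-east-1:123456789012:dba-alerts",    # hypothetical topic ARN
    )
    print(request)
    # To create the alarm (requires boto3 and AWS credentials):
    #   import boto3
    #   boto3.client("cloudwatch").put_metric_alarm(**request)
```

Keeping the request as a plain dictionary makes it possible to review or unit test the alarm definition before anything is created in your account.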

Enhanced Monitoring

Some users who migrate from Oracle Exadata are used to having OS-level visibility into the physical devices that are mapped to their Oracle Automatic Storage Management (ASM) disk groups, and to viewing granular OS-level metrics such as huge pages, swap activity, and process/thread list details. Amazon CloudWatch doesn't provide that level of visibility, but Amazon RDS and Amazon Aurora offer Enhanced Monitoring, which provides granular, OS-level monitoring for your databases. By default, Enhanced Monitoring retains metrics for 30 days and samples them every minute, but both settings are configurable.

For more information, see the Monitoring OS metrics with Enhanced Monitoring sections of the Amazon RDS and Aurora documentation.
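To illustrate how the sampling frequency might be changed, the following sketch builds parameters for the Amazon RDS `ModifyDBInstance` API. It assumes that the allowed `MonitoringInterval` values are 0 (disabled), 1, 5, 10, 15, 30, and 60 seconds, and that an IAM role ARN must be supplied when monitoring is enabled. The helper name, instance identifier, and role ARN are hypothetical.

```python
# Allowed Enhanced Monitoring intervals in seconds; 0 turns the feature off.
# Assumed from the ModifyDBInstance API reference; verify before use.
VALID_MONITORING_INTERVALS = (0, 1, 5, 10, 15, 30, 60)


def enhanced_monitoring_request(db_instance_id: str, interval_seconds: int,
                                monitoring_role_arn: str) -> dict:
    """Build ModifyDBInstance parameters that set the Enhanced Monitoring interval."""
    if interval_seconds not in VALID_MONITORING_INTERVALS:
        raise ValueError(
            f"interval must be one of {VALID_MONITORING_INTERVALS}, got {interval_seconds}"
        )
    request = {
        "DBInstanceIdentifier": db_instance_id,
        "MonitoringInterval": interval_seconds,
        "ApplyImmediately": True,
    }
    if interval_seconds > 0:
        # Role that allows Amazon RDS to publish OS metrics to CloudWatch Logs.
        request["MonitoringRoleArn"] = monitoring_role_arn
    return request


if __name__ == "__main__":
    request = enhanced_monitoring_request(
        "oracle-prod-1",                                        # hypothetical instance ID
        15,
        "arn:aws:iam::123456789012:role/rds-monitoring-role",   # hypothetical role ARN
    )
    print(request)
    # To apply the change (requires boto3 and AWS credentials):
    #   import boto3
    #   boto3.client("rds").modify_db_instance(**request)
```

Retention is a separate setting. Enhanced Monitoring metrics are delivered to CloudWatch Logs, so retention is typically adjusted on the corresponding log group rather than on the DB instance.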

Note

Enhanced Monitoring doesn't currently support Oracle databases on Amazon EC2. For these databases, you can use third-party partner solutions or native solutions such as Oracle Enterprise Manager, as discussed in a later section.

Performance Insights

Both Amazon CloudWatch and Amazon RDS Enhanced Monitoring are great tools for instance-level and OS-level monitoring. However, these tools don't provide deep-dive performance diagnostics at the database engine level. Database engine metrics help DBAs identify database bottlenecks such as resource-intensive SQL queries and clearly visualize database load over time. In Amazon RDS and Amazon Aurora, the Performance Insights dashboard displays database load by using a metric named average active sessions (AAS).

In the following example, the monitored Amazon RDS instance has a maximum of two vCPUs. However, two major spikes exceed the number of vCPUs and could indicate a performance bottleneck. One spike represents a major CPU load, shown in green, and the other spike represents a major bottleneck in SQL statements, shown in red.

Using Performance Insights to monitor Oracle Database on AWS

Performance Insights provides this engine-level visibility by sampling database sessions every second, looking for active sessions and ignoring idle sessions. For each active session, Performance Insights collects the following:

  • SQL statements

  • Wait events such as CPU, I/O, locks, and commit log waits

  • Additional dimensions such as hosts and users

Based on this data, you can visualize your database workload and troubleshoot performance issues easily. You can also filter the activity by various dimensions such as hosts and users for additional root-cause analysis. Each database engine has its own set of supported dimensions.
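The sampling approach described above can be sketched in a few lines. The following example is an illustration of how an AAS value could be derived from per-second samples, not the implementation that Performance Insights uses. It assumes that each sample is a list with one wait-event name per active session (idle sessions are simply absent), and the function name and sample data are hypothetical.

```python
from collections import Counter


def average_active_sessions(samples):
    """Compute AAS and its per-wait-event breakdown from per-second samples.

    samples: list of samples; each sample is a list of wait-event names,
             one entry per session that was active at that instant.
    Returns (aas, {wait_event: average sessions in that state}).
    """
    if not samples:
        return 0.0, {}
    totals = Counter()
    for active_sessions in samples:
        totals.update(active_sessions)
    sample_count = len(samples)
    by_wait_event = {event: count / sample_count for event, count in totals.items()}
    aas = sum(totals.values()) / sample_count
    return aas, by_wait_event


if __name__ == "__main__":
    # Four one-second samples (illustrative data only).
    samples = [
        ["CPU", "CPU", "db file sequential read"],
        ["CPU"],
        [],  # no active sessions in this second
        ["CPU", "CPU", "enq: TX - row lock contention", "enq: TX - row lock contention"],
    ]
    vcpus = 2  # hypothetical instance size
    aas, by_wait_event = average_active_sessions(samples)
    print("AAS:", aas, "breakdown:", by_wait_event)
    print("Load exceeds vCPU count:", aas > vcpus)
```

Comparing the AAS value with the number of vCPUs mirrors how the dashboard is read: sustained load above the vCPU line suggests that sessions are waiting for resources.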

One of the key benefits of Performance Insights is that it doesn't rely on the Oracle Diagnostics Pack, so you can use it to monitor Oracle Database SE2 and other non-Enterprise editions running on Amazon RDS. For more information, see the Performance Insights sections of the Amazon RDS and Aurora documentation.

Note

Performance Insights doesn't currently support Oracle databases on Amazon EC2. For these databases, you can use third-party partner solutions or native solutions such as Oracle Enterprise Manager, as discussed in the next section.

Oracle Enterprise Manager

In some cases, Oracle Exadata users might prefer to work with Oracle Enterprise Manager (OEM). Amazon RDS supports OEM through the following options:

| Option | Option ID | Supported OEM releases | Supported Oracle Database releases |
| --- | --- | --- | --- |
| OEM Database Express | OEM | OEM Database Express 12c | Oracle Database 19c (non-CDB only) and Oracle Database 12c |
| OEM Management Agent | OEM_AGENT | OEM Cloud Control for 13c; OEM Cloud Control for 12c | Oracle Database 19c (non-CDB only) and Oracle Database 12c |