Amazon Neptune
User Guide (API Version 2017-11-29)

Monitoring Neptune with CloudWatch

Amazon Neptune and Amazon CloudWatch are integrated, so you can gather and analyze performance metrics. You can monitor these metrics using the CloudWatch console, the AWS Command Line Interface (AWS CLI), or the CloudWatch API.

CloudWatch also lets you set alarms so that you can be notified if a metric value breaches a threshold that you specify. You can even set up CloudWatch Events to take corrective action if a breach occurs. For more information about using CloudWatch and alarms, see the CloudWatch Documentation.
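To make the threshold idea concrete, the following Python sketch checks whether the most recent values of a metric all exceed a threshold. It is a simplified local model for illustration only, not CloudWatch's actual alarm evaluation; the function name `breaches_threshold` and the sample values are hypothetical.

```python
from typing import Sequence


def breaches_threshold(values: Sequence[float], threshold: float,
                       evaluation_periods: int) -> bool:
    """Return True if the last `evaluation_periods` values all exceed `threshold`."""
    if evaluation_periods <= 0 or len(values) < evaluation_periods:
        return False
    recent = values[-evaluation_periods:]
    return all(v > threshold for v in recent)


# Hypothetical CPUUtilization samples (percent), one per period, oldest first.
cpu_samples = [42.0, 55.5, 81.2, 90.4, 93.1]
print(breaches_threshold(cpu_samples, threshold=80.0, evaluation_periods=3))
```

Requiring several consecutive breaching periods, rather than a single one, is a common way to avoid notifications for short spikes.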

Viewing CloudWatch Data (Console)

To view CloudWatch data for a Neptune cluster (console)

  1. Sign in to the AWS Management Console and open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

  2. In the navigation pane, choose Metrics.

  3. In the All Metrics pane, choose Neptune, and then choose DBClusterIdentifier. Then in the upper pane, scroll down to view the full list of metrics for your cluster.

    The available Neptune metric options appear in the Viewing list.

To select or deselect an individual metric, in the results pane, select the check box next to the resource name and metric. Graphs showing the metrics for the selected items appear at the bottom of the console. To learn more about CloudWatch graphs, see Graph Metrics in the Amazon CloudWatch User Guide.

Viewing CloudWatch Data (AWS CLI)

To view CloudWatch data for a Neptune cluster (AWS CLI)

  1. Install the AWS CLI. For instructions, see the AWS Command Line Interface User Guide.

  2. Use the AWS CLI to fetch information. The relevant CloudWatch parameters for Neptune are listed in Neptune Metrics.

    The following example retrieves CloudWatch metrics for the number of Gremlin requests per second for the gremlin-cluster cluster.

    aws cloudwatch get-metric-statistics \
        --namespace AWS/Neptune --metric-name GremlinRequestsPerSec \
        --dimensions Name=DBClusterIdentifier,Value=gremlin-cluster \
        --start-time 2018-03-03T00:00:00Z --end-time 2018-03-04T00:00:00Z \
        --period 60 --statistics=Average
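The same query can be assembled in code. The following Python sketch builds the arguments from the CLI example as a plain dictionary and estimates how many datapoints the time window can contain. The helper names `build_metric_query` and `expected_datapoints` are hypothetical, and the sketch does not call AWS.

```python
from datetime import datetime, timezone


def build_metric_query(cluster_id, metric_name, start, end, period_seconds=60):
    """Assemble the same arguments the CLI example passes, as a plain dict."""
    return {
        "Namespace": "AWS/Neptune",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBClusterIdentifier", "Value": cluster_id}],
        "StartTime": start,
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Average"],
    }


def expected_datapoints(query):
    """Upper bound on datapoints: one per period in the time window."""
    window = (query["EndTime"] - query["StartTime"]).total_seconds()
    return int(window // query["Period"])


query = build_metric_query(
    "gremlin-cluster",
    "GremlinRequestsPerSec",
    datetime(2018, 3, 3, tzinfo=timezone.utc),
    datetime(2018, 3, 4, tzinfo=timezone.utc),
)
print(expected_datapoints(query))  # 86400 s / 60 s
# Assumption: with boto3 installed and credentials configured, the dict could
# be passed on as boto3.client("cloudwatch").get_metric_statistics(**query).
```

A one-day window at a 60-second period yields at most 1,440 datapoints, which is also the maximum that a single GetMetricStatistics call returns, so a longer window at this period would need several calls or a larger period.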

Viewing CloudWatch Data (API)

CloudWatch also supports a Query action, so you can request information programmatically. For more information, see the CloudWatch Query API documentation and Amazon CloudWatch API Reference.

When a CloudWatch action requires a parameter that is specific to Neptune monitoring, such as MetricName, use the values listed in Neptune Metrics.

The following example shows a low-level CloudWatch request, using the following parameters:

  • Statistics.member.1 = Average

  • Dimensions.member.1 = DBClusterIdentifier=gremlin-cluster

  • Namespace = AWS/Neptune

  • StartTime = 2018-03-03T00:00:00

  • EndTime = 2018-03-04T00:00:00

  • Period = 60

  • MetricName = GremlinRequestsPerSec

The CloudWatch request looks like the following. It only illustrates the form of the request; you must construct your own request based on your metrics and time frame.

http://monitoring.amazonaws.com/
    ?SignatureVersion=2
    &Action=GetMetricStatistics
    &Version=2010-08-01
    &StartTime=2018-03-03T00:00:00
    &EndTime=2018-03-04T00:00:00
    &Period=60
    &Statistics.member.1=Average
    &Dimensions.member.1=DBClusterIdentifier=gremlin-cluster
    &Namespace=AWS/Neptune
    &MetricName=GremlinRequestsPerSec
    &Timestamp=2018-03-04T17%3A48%3A21.746Z
    &AWSAccessKeyId=<AWS Access Key ID>
    &Signature=<Signature>
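As a rough illustration of how such a query string can be assembled, the following Python sketch URL-encodes the parameters with the standard library. It assumes the GetMetricStatistics action and the CloudWatch Query API convention of splitting each dimension into separate Name and Value members. It omits the signature parameters, so the result is not a valid signed request.

```python
from urllib.parse import urlencode

# Request parameters; the signing fields (Timestamp, AWSAccessKeyId,
# Signature) are left out of this sketch on purpose.
params = {
    "Action": "GetMetricStatistics",
    "Version": "2010-08-01",
    "Namespace": "AWS/Neptune",
    "MetricName": "GremlinRequestsPerSec",
    "Dimensions.member.1.Name": "DBClusterIdentifier",
    "Dimensions.member.1.Value": "gremlin-cluster",
    "StartTime": "2018-03-03T00:00:00",
    "EndTime": "2018-03-04T00:00:00",
    "Period": "60",
    "Statistics.member.1": "Average",
}

query_string = urlencode(params)  # percent-encodes "/" and ":" in the values
url = "http://monitoring.amazonaws.com/?" + query_string
print(url)
```

In practice an AWS SDK or the AWS CLI builds and signs the request for you, which is usually less error-prone than constructing it by hand.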

Neptune Metrics

The following metrics are available from Amazon Neptune. Neptune sends metrics to CloudWatch only when they have a non-zero value.

Note

For all Neptune metrics, the aggregation granularity is five minutes.
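Because metrics are sent only when they are non-zero, a query result can contain gaps. The following Python sketch shows one way to turn a sparse result into a regular five-minute series by filling the gaps with zero. Treating every gap as zero is an assumption (a gap can also mean that the instance was not running), and the function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta, timezone

PERIOD = timedelta(minutes=5)  # aggregation granularity stated in the note


def fill_missing_with_zero(datapoints, start, end, period=PERIOD):
    """Return one (timestamp, value) pair per period, using 0.0 where no datapoint exists."""
    filled = []
    current = start
    while current < end:
        filled.append((current, datapoints.get(current, 0.0)))
        current += period
    return filled


start = datetime(2018, 3, 3, 0, 0, tzinfo=timezone.utc)
end = start + timedelta(minutes=20)
# Hypothetical sparse result: only two of the four periods reported a value.
sparse = {start: 12.0, start + timedelta(minutes=10): 7.5}
series = fill_missing_with_zero(sparse, start, end)
print([value for _, value in series])  # [12.0, 0.0, 7.5, 0.0]
```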

Metric Description
CPUUtilization The percentage of CPU utilization.
ClusterReplicaLag For a read replica, the amount of lag when replicating updates from the primary instance, in milliseconds.
ClusterReplicaLagMaximum The maximum amount of lag between the primary instance and each Neptune DB instance in the DB cluster, in milliseconds.
ClusterReplicaLagMinimum The minimum amount of lag between the primary instance and each Neptune DB instance in the DB cluster, in milliseconds.
EngineUptime The amount of time that the instance has been running, in seconds.
FreeableMemory The amount of available random access memory, in bytes.
FreeLocalStorage The amount of storage available to each DB instance for temporary tables and logs, in bytes.

This value depends on the DB instance class (for pricing information, see the Amazon Neptune pricing page). You can increase the amount of free storage space for an instance by choosing a larger DB instance class.

GremlinHttp1xx Number of HTTP 1xx responses for the Gremlin endpoint per second.

We recommend that you use the new Http1xx combined metric instead.

GremlinHttp2xx Number of HTTP 2xx responses for the Gremlin endpoint per second.

We recommend that you use the new Http2xx combined metric instead.

GremlinHttp4xx Number of HTTP 4xx errors for the Gremlin endpoint per second.

We recommend that you use the new Http4xx combined metric instead.

GremlinHttp5xx Number of HTTP 5xx errors for the Gremlin endpoint per second.

We recommend that you use the new Http5xx combined metric instead.

GremlinErrors Number of errors in Gremlin traversals.
GremlinRequests Number of requests to Gremlin engine.
GremlinRequestsPerSec Number of requests to Gremlin engine per second.
GremlinWebSocketSuccess Number of successful WebSocket connections to the Gremlin endpoint per second.
GremlinWebSocketClientErrors Number of WebSocket client errors on the Gremlin endpoint per second.
GremlinWebSocketServerErrors Number of WebSocket server errors on the Gremlin endpoint per second.
GremlinWebSocketAvailableConnections Number of potential WebSocket connections currently available.
Http1xx Number of HTTP 1xx responses for the endpoint per second.
Http2xx Number of HTTP 2xx responses for the endpoint per second.
Http4xx Number of HTTP 4xx errors for the endpoint per second.
Http5xx Number of HTTP 5xx errors for the endpoint per second.
Http100 Number of HTTP 100 responses for the endpoint per second.

We recommend that you use the new Http1xx combined metric instead.

Http101 Number of HTTP 101 responses for the endpoint per second.

We recommend that you use the new Http1xx combined metric instead.

Http200 Number of HTTP 200 responses for the endpoint per second.

We recommend that you use the new Http2xx combined metric instead.

Http400 Number of HTTP 400 errors for the endpoint per second.

We recommend that you use the new Http4xx combined metric instead.

Http403 Number of HTTP 403 errors for the endpoint per second.

We recommend that you use the new Http4xx combined metric instead.

Http405 Number of HTTP 405 errors for the endpoint per second.

We recommend that you use the new Http4xx combined metric instead.

Http413 Number of HTTP 413 errors for the endpoint per second.

We recommend that you use the new Http4xx combined metric instead.

Http429 Number of HTTP 429 errors for the endpoint per second.

We recommend that you use the new Http4xx combined metric instead.

Http500 Number of HTTP 500 errors for the endpoint per second.

We recommend that you use the new Http5xx combined metric instead.

Http501 Number of HTTP 501 errors for the endpoint per second.

We recommend that you use the new Http5xx combined metric instead.

LoaderErrors Number of errors from Loader requests.
LoaderRequests Number of Loader Requests.
NetworkReceiveThroughput The incoming (Receive) network traffic on the DB instance, including both customer database traffic and Neptune traffic used for monitoring and replication, in bytes/second.
NetworkThroughput The amount of network throughput both received from and transmitted to clients by each instance in the Neptune DB cluster, in bytes per second. This throughput doesn't include network traffic between instances in the DB cluster and the cluster volume.
NetworkTransmitThroughput The outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Neptune traffic used for monitoring and replication, in bytes/second.
SparqlHttp1xx Number of HTTP 1xx responses for the SPARQL endpoint per second.

We recommend that you use the new Http1xx combined metric instead.

SparqlHttp2xx Number of HTTP 2xx responses for the SPARQL endpoint per second.

We recommend that you use the new Http2xx combined metric instead.

SparqlHttp4xx Number of HTTP 4xx errors for the SPARQL endpoint per second.

We recommend that you use the new Http4xx combined metric instead.

SparqlHttp5xx Number of HTTP 5xx errors for the SPARQL endpoint per second.

We recommend that you use the new Http5xx combined metric instead.

SparqlErrors Number of errors in the SPARQL queries.
SparqlRequests Number of requests to the SPARQL engine.
SparqlRequestsPerSec Number of requests to the SPARQL engine per second.
StatusErrors Number of errors from the status endpoint.
StatusRequests Number of requests to the status endpoint.
VolumeBytesUsed The amount of storage used by your Neptune DB instance, in bytes. This value affects the cost of the Neptune DB cluster.
VolumeReadIOPs The average number of billed read I/O operations from a cluster volume. Billed read operations are calculated at the cluster volume level, aggregated from all instances in the Neptune DB cluster, and then reported at 5-minute intervals.
VolumeWriteIOPs The average number of write disk I/O operations to the cluster volume, reported at 5-minute intervals.
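The recommendations in the table follow a simple naming pattern: an engine-specific or single-status-code HTTP metric maps to the combined metric for its status class. The following Python sketch expresses that pattern. The function name `combined_metric` is hypothetical, and the regular expression is inferred from the metric names listed in the table rather than taken from any Neptune API.

```python
import re

# Optional engine prefix, then "Http", the status class digit, and either
# two more digits (a single code) or the literal "xx" (a whole class).
_STATUS_METRIC = re.compile(r"^(?:Gremlin|Sparql)?Http(\d)(?:\d\d|xx)$")


def combined_metric(metric_name):
    """Map a per-code or per-engine HTTP metric name to its combined HttpNxx metric."""
    match = _STATUS_METRIC.match(metric_name)
    if match is None:
        return None  # not an HTTP status metric
    return f"Http{match.group(1)}xx"


print(combined_metric("Http403"))         # Http4xx
print(combined_metric("GremlinHttp5xx"))  # Http5xx
print(combined_metric("CPUUtilization"))  # None
```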

Neptune Dimensions

The metrics for Neptune are qualified by the values for the account, graph name, or operation. You can use the CloudWatch console to retrieve Neptune data along with any of the dimensions in the following table.

Dimension Description
DBClusterIdentifier Filters the data you request for a specific Neptune DB cluster.
DBClusterIdentifier, Role Filters the data you request for a specific Neptune DB cluster, aggregating the metric by instance role (WRITER/READER). For example, you can aggregate metrics for all READER instances that belong to a cluster.

DatabaseClass Filters the data you request for all instances in a database class. For example, you can aggregate metrics for all instances that belong to the database class db.r4.large.
DBClusterIdentifier, EngineName Filters the data by the cluster. The engine name for all Neptune instances is neptune.
EngineName The engine name for all Neptune instances is neptune.
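When you query metrics programmatically, dimensions are passed as name and value pairs. The following Python sketch builds such a list from the dimension names in the table. The helper name `neptune_dimensions` is hypothetical, and the sketch does not check that the combination you pass is one of the combinations listed above.

```python
def neptune_dimensions(cluster_id=None, role=None, database_class=None,
                       engine_name=None):
    """Build a CloudWatch-style Dimensions list, skipping dimensions that are not set."""
    pairs = [
        ("DBClusterIdentifier", cluster_id),
        ("Role", role),
        ("DatabaseClass", database_class),
        ("EngineName", engine_name),
    ]
    return [{"Name": name, "Value": value}
            for name, value in pairs if value is not None]


# All READER instances in the example cluster.
print(neptune_dimensions(cluster_id="gremlin-cluster", role="READER"))
```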