Amazon DocumentDB
Developer Guide

Best Practices for Amazon DocumentDB

Learn best practices for working with Amazon DocumentDB (with MongoDB compatibility). This section is continually updated as new best practices are identified.

Basic Operational Guidelines

The following are basic operational guidelines that everyone should follow when working with Amazon DocumentDB. The Amazon DocumentDB Service Level Agreement requires that you follow these guidelines.

  • Deploy a cluster consisting of two or more Amazon DocumentDB instances in two AWS Availability Zones. For production workloads, we recommend deploying a cluster consisting of three or more Amazon DocumentDB instances in three Availability Zones.

  • Connect to your Amazon DocumentDB cluster in replica set mode to minimize the impact of failover on your application.

  • Choose a driver read preference setting that maximizes read scaling while meeting your application's read consistency requirements. The secondaryPreferred read preference enables replica reads and frees up the primary instance to do more work.

  • Use the service within the stated service limits. For more information, see Amazon DocumentDB Limits.

  • Monitor your memory, CPU, and storage usage. To help you maintain system performance and availability, set up Amazon CloudWatch to notify you when usage patterns change or when you approach the capacity of your deployment.

  • Scale up your instances when you are approaching capacity limits. You should have some buffer in memory to accommodate unforeseen increases in demand from your applications.

  • Set your backup retention period to align with your recovery point objective.

  • Test failover for your cluster to understand how long the process takes for your use case.

  • Design your application to be resilient in the event of network and database errors. Use your driver's error mechanism to distinguish between transient errors and persistent errors. Retry transient errors using an exponential backoff mechanism when appropriate. Ensure that your application considers data consistency when implementing retry logic.
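
Two of the guidelines above, connecting in replica set mode with a secondaryPreferred read preference and retrying transient errors with exponential backoff, can be sketched as follows. This is a minimal illustration rather than a definitive implementation. The helper names, the TransientError class (a stand-in for whatever retryable error type your driver raises), the delay values, and the tls and retryWrites options are assumptions added for the example.

```python
import random
import time
from urllib.parse import quote_plus, urlencode


def build_uri(user, password, endpoint, port=27017):
    """Build a connection string for replica set mode with replica reads."""
    options = {
        "tls": "true",                           # assumption: TLS is enabled
        "replicaSet": "rs0",                     # connect in replica set mode
        "readPreference": "secondaryPreferred",  # send reads to replicas first
        "retryWrites": "false",                  # assumption: driver retries off
    }
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{endpoint}:{port}/?{urlencode(options)}"
    )


class TransientError(Exception):
    """Hypothetical stand-in for a driver's retryable error type."""


def retry_with_backoff(operation, attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying transient errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
```

Wrap only operations that are safe to repeat. A write that is retried after an ambiguous failure can be applied twice, which is why the guideline asks you to consider data consistency in your retry logic.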

Instance Memory Recommendations

A performance best practice for Amazon DocumentDB is to allocate enough RAM so that your working set resides in memory. To determine whether your working set resides in memory, check the BufferCacheHitRatio metric (using Amazon CloudWatch) while an instance is under load. The value of BufferCacheHitRatio should be as close to 100 percent as possible.

If scaling up to a DB instance class with more RAM results in a dramatic increase in BufferCacheHitRatio, your working set did not fit in memory. Continue to scale up until BufferCacheHitRatio no longer increases dramatically after a scaling operation. Roughly two-thirds of an instance's RAM is available for working set memory. For information about monitoring an instance's metrics, see Viewing CloudWatch Data.
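
The sizing reasoning above can be written as a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the 5-point gain used to decide what counts as a "dramatic" increase is an illustrative value, not a service-defined threshold.

```python
from fractions import Fraction

# "Roughly two-thirds" of instance RAM is available for the working set.
WORKING_SET_FRACTION = Fraction(2, 3)


def working_set_capacity_gib(instance_ram_gib):
    """Estimate how much of an instance's RAM can hold the working set."""
    return float(instance_ram_gib * WORKING_SET_FRACTION)


def should_keep_scaling(ratio_before, ratio_after, min_gain=5.0):
    """Return True if the last scale-up raised BufferCacheHitRatio sharply.

    Both ratios are percentages observed under load. min_gain is an assumed
    cutoff, in percentage points, for a "dramatic" increase.
    """
    return (ratio_after - ratio_before) >= min_gain
```

For example, an instance with 48 GiB of RAM would have about 32 GiB available for the working set under this estimate.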

Building Indexes

When importing data into Amazon DocumentDB, it is better to create your indexes before importing large datasets. You can use the Amazon DocumentDB Index Tool to extract indexes from a running MongoDB instance or mongodump directory, and create those indexes in an Amazon DocumentDB cluster. For more guidance on migrations, see Migrating to Amazon DocumentDB.

Security Best Practices

Use AWS Identity and Access Management (IAM) accounts to control access to Amazon DocumentDB actions and resources. There should be tight control on the ability to perform critical actions, such as creating, modifying, or deleting Amazon DocumentDB resources (clusters, instances, security groups, or parameter groups), and administrative actions like taking snapshots or restoring clusters.

  • Assign an individual IAM account to each person who manages Amazon DocumentDB resources. Do not use the AWS account root user to manage Amazon DocumentDB resources. Create an IAM user for everyone, including yourself.

  • Grant each user the minimum set of permissions required to perform their duties.

  • Use IAM groups to effectively manage permissions for multiple users. For more information about IAM, see the IAM User Guide. For information about IAM best practices, see IAM Best Practices.

  • Regularly rotate your IAM credentials.

  • Use Transport Layer Security (TLS) and encryption at rest to encrypt your data.

  • Regularly change your Amazon DocumentDB master user password using the AWS Management Console or the AWS Command Line Interface (AWS CLI), as follows:

    • Console: In the Amazon DocumentDB console, choose Clusters, select your cluster, and then choose Actions, Modify. Enter a new master password, and then apply the change.

    • AWS CLI: Use the modify-db-cluster operation and supply the new password with the --master-user-password option.
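
The password change can also be scripted. The following sketch assumes the AWS SDK for Python (Boto3), whose docdb client exposes a modify_db_cluster call that corresponds to the CLI operation above. The helper names and the cluster identifier are hypothetical, and the generated password is limited to letters and digits as a conservative assumption about allowed characters.

```python
import secrets
import string


def generate_password(length=32):
    """Generate a random password from letters and digits only."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def rotate_master_password(client, cluster_id, new_password):
    """Set a new master user password on a cluster.

    client is expected to be a Boto3 docdb client, for example
    boto3.client("docdb"). It is passed in so the function is easy to test.
    """
    return client.modify_db_cluster(
        DBClusterIdentifier=cluster_id,
        MasterUserPassword=new_password,
        ApplyImmediately=True,  # do not wait for the maintenance window
    )
```

A typical call might look like rotate_master_password(boto3.client("docdb"), "sample-cluster", generate_password()). Store the new password in your secrets manager before you apply it so that applications can pick it up.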

Cost Management

The following best practices can help you manage and minimize your costs when using Amazon DocumentDB. For pricing information, see Amazon DocumentDB (with MongoDB compatibility) pricing and Amazon DocumentDB (with MongoDB compatibility) FAQs.

  • Create billing alerts at thresholds of 50 percent and 75 percent of your expected bill for the month. For more information about creating billing alerts, see Creating a Billing Alarm.

  • For development and test scenarios, stop a cluster when it is no longer needed and start the cluster when development resumes. For more information, see Stopping and Starting an Amazon DocumentDB Cluster.

  • Amazon DocumentDB's architecture separates storage and compute, so even a single-instance cluster is highly durable. The cluster storage volume replicates data six ways across three Availability Zones, providing extremely high durability regardless of the number of instances in the cluster. A typical production cluster has three or more instances to provide high availability. However, you can optimize costs by using a single-instance development cluster when high availability is not required.
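
The first two practices above can be sketched in code: computing the 50 percent and 75 percent billing alert thresholds, and stopping or starting a development cluster. The function names are hypothetical. The stop_db_cluster and start_db_cluster calls assume a Boto3 docdb client is passed in.

```python
from decimal import Decimal


def billing_alert_thresholds(expected_monthly_bill):
    """Return the alert thresholds at 50% and 75% of the expected bill."""
    bill = Decimal(str(expected_monthly_bill))
    return [bill * Decimal("0.50"), bill * Decimal("0.75")]


def set_cluster_running(client, cluster_id, running):
    """Start the cluster if running is True, otherwise stop it.

    client is expected to be a Boto3 docdb client, for example
    boto3.client("docdb").
    """
    if running:
        return client.start_db_cluster(DBClusterIdentifier=cluster_id)
    return client.stop_db_cluster(DBClusterIdentifier=cluster_id)
```

With an expected bill of 400 (in your billing currency), the thresholds are 200 and 300.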

Using Metrics to Identify Performance Issues

To identify performance issues caused by insufficient resources and other common bottlenecks, you can monitor the metrics available for your Amazon DocumentDB cluster.

Viewing Performance Metrics

Monitor performance metrics on a regular basis to see the average, maximum, and minimum values for a variety of time ranges. This helps you identify when performance is degraded. You can also set Amazon CloudWatch alarms for particular metric thresholds so that you are alerted if they are reached.

To troubleshoot performance issues, it’s important to understand the baseline performance of the system. After you set up a new cluster and get it running with a typical workload, capture the average, maximum, and minimum values of all the performance metrics at different intervals (for example, 1 hour, 24 hours, 1 week, 2 weeks). This gives you an idea of what is normal. It helps to get comparisons for both peak and off-peak hours of operation. You can then use this information to identify when performance is dropping below standard levels.
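
Capturing a baseline as described above amounts to summarizing metric samples per time window and comparing later readings against that summary. The following is a minimal sketch. The function names, the window labels, and the 20 percent tolerance are assumptions for illustration, and the samples would come from whatever tool you use to export your metrics.

```python
from statistics import mean


def summarize(samples):
    """Return the average, maximum, and minimum of a list of samples."""
    return {
        "average": mean(samples),
        "maximum": max(samples),
        "minimum": min(samples),
    }


def build_baseline(samples_by_window):
    """Summarize samples for each window, for example "1h", "24h", "1w"."""
    return {window: summarize(values)
            for window, values in samples_by_window.items()}


def deviates_from_baseline(value, stats, tolerance=0.2):
    """Return True if value is further from the baseline average than the
    given fraction of that average (an assumed, adjustable tolerance)."""
    average = stats["average"]
    return abs(value - average) > tolerance * abs(average)
```

Keeping separate baselines for peak and off-peak hours avoids flagging normal daily variation as a regression.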

You can view performance metrics using the AWS Management Console or the AWS CLI.

Setting a CloudWatch Alarm

To set a CloudWatch alarm, see Using Amazon CloudWatch Alarms in the Amazon CloudWatch User Guide.
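
As an illustration, the parameters for an alarm on an instance's CPU utilization might be assembled as follows and passed to the CloudWatch put_metric_alarm operation. This is a sketch under stated assumptions: the function name, alarm name, period, and evaluation count are illustrative choices, the 80 percent default follows the CPU guidance later in this section, and the notification target is a hypothetical Amazon SNS topic ARN that you supply.

```python
def cpu_alarm_params(instance_id, sns_topic_arn, threshold=80.0):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"docdb-{instance_id}-cpu-high",  # assumed naming scheme
        "Namespace": "AWS/DocDB",
        "MetricName": "CPUUtilization",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": instance_id},
        ],
        "Statistic": "Average",
        "Period": 300,            # seconds per data point (assumed)
        "EvaluationPeriods": 3,   # consecutive periods over threshold (assumed)
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }
```

With Boto3, the result could be applied with boto3.client("cloudwatch").put_metric_alarm(**cpu_alarm_params(instance_id, topic_arn)).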

Evaluating Performance Metrics

An instance has several different categories of metrics. How you determine acceptable values depends on the metric.

CPU

  • CPU Utilization — The percentage of the computer processing capacity used.

Memory

  • Freeable Memory — How much RAM is available on the instance.

  • Swap Usage — How much swap space is used by the instance, in megabytes.

Input/output operations

  • Read IOPS, Write IOPS — The average number of disk read or write operations per second.

  • Read Latency, Write Latency — The average time for a read or write operation in milliseconds.

  • Read Throughput, Write Throughput — The average number of megabytes read from or written to disk per second.

  • Disk Queue Depth — The number of I/O operations that are waiting to be written to or read from disk.

Network traffic

  • Network Receive Throughput, Network Transmit Throughput — The rate of network traffic to and from the instance in megabytes per second.

Database connections

  • DB Connections — The number of client sessions that are connected to the instance.

Generally speaking, acceptable values for performance metrics depend on what your baseline looks like and what your application is doing. Investigate consistent or trending variances from your baseline.

The following are recommendations and advice about specific types of metrics:

  • High CPU consumption — High values for CPU consumption might be appropriate, provided that they are in keeping with your goals for your application (like throughput or concurrency) and are expected. If your CPU consumption is consistently over 80 percent, consider scaling up your instances.

  • High RAM consumption — If your FreeableMemory metric frequently dips below one-third of the total instance memory, consider scaling up your instances.

  • Swap usage — This metric should remain at or near zero. If your swap usage is significant, consider scaling up your instances.

  • Network traffic — For network traffic, talk with your system administrator to understand what the expected throughput is for your domain network and internet connection. Investigate network traffic if throughput is consistently lower than expected.

  • Database connections — Consider constraining database connections if you see high numbers of user connections together with decreases in instance performance and response time. The best number of user connections for your DB instance varies based on your instance class and the complexity of the operations being performed.

For issues with any performance metrics, one of the first things you can do to improve performance is tune the most used and most expensive queries to see if that lowers the pressure on system resources.

If your queries are tuned and an issue persists, consider upgrading your Amazon DocumentDB instance class to one with more of the resource (CPU, RAM, disk space, network bandwidth, I/O capacity) that is related to the issue you're experiencing.
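
The numeric recommendations above (CPU over 80 percent, FreeableMemory below one-third of total instance memory, and swap usage above zero) can be expressed as a simple check. This is a sketch, not a definitive rule set. The function name and the swap tolerance parameter are assumptions, and the inputs should be values that represent sustained behavior rather than a single momentary sample.

```python
def scaling_signals(cpu_percent, freeable_memory, total_memory,
                    swap_usage, swap_tolerance=0):
    """Return the resources that suggest scaling up the instance.

    freeable_memory and total_memory must use the same unit.
    swap_tolerance is an assumed allowance for "at or near zero".
    """
    signals = []
    if cpu_percent > 80:
        signals.append("cpu")
    # Below one-third of total memory, written without division.
    if freeable_memory * 3 < total_memory:
        signals.append("memory")
    if swap_usage > swap_tolerance:
        signals.append("swap")
    return signals
```

An empty result does not prove that the instance is healthy. It only means that none of these three thresholds was crossed, so you should still compare the values against your baseline.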

Tuning Queries

One of the best ways to improve cluster performance is to tune your most commonly used and most resource-intensive queries to make them less expensive to run.

You can use the explain command to examine the plan for a query. Use this information to modify the query or the underlying collection (for example, by adding an index) to improve query performance.
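
One way to act on explain output is to check whether the winning plan scans the whole collection. The sketch below walks an explain result and lists its stages. It assumes the output follows the MongoDB-style layout with queryPlanner.winningPlan and nested inputStage or inputStages entries. The function names are hypothetical, and the exact fields in your output can differ.

```python
def plan_stages(explain_output):
    """Collect the stage names found in the winning plan of an explain result."""
    stages = []
    pending = [explain_output.get("queryPlanner", {}).get("winningPlan", {})]
    while pending:
        node = pending.pop()
        if not isinstance(node, dict):
            continue
        if "stage" in node:
            stages.append(node["stage"])
        if "inputStage" in node:
            pending.append(node["inputStage"])
        pending.extend(node.get("inputStages", []))
    return stages


def uses_collection_scan(explain_output):
    """Return True if any stage in the winning plan is a collection scan."""
    return "COLLSCAN" in plan_stages(explain_output)
```

With PyMongo, an explain result can be obtained by calling explain() on a cursor, for example collection.find(query).explain(). A query that reports a collection scan on a large collection is a candidate for an index on the filtered fields.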

Working with Cluster Parameter Groups

We recommend that you try out cluster parameter group changes on a test cluster before applying the changes to your production clusters.

For information about backing up your cluster, see Backing Up and Restoring Amazon DocumentDB.