Assess query performance for migrating SQL Server databases to MongoDB Atlas on AWS - AWS Prescriptive Guidance

Assess query performance for migrating SQL Server databases to MongoDB Atlas on AWS

Created by Battulga Purevragchaa (AWS), Krishnakumar Sathyanarayana (PeerIslands US Inc), and Babu Srinivasan (MongoDB)

Environment: PoC or pilot

Source: Microsoft SQL Server

Target: MongoDB Atlas or MongoDB Enterprise Advanced

R Type: Replatform

Workload: Microsoft

Technologies: Databases; Migration

Summary

This pattern provides guidance for loading MongoDB with near real-world data and assessing MongoDB query performance that is as close to the production scenario as possible. The assessment provides input to help you plan your migration to MongoDB from a relational database. The pattern uses PeerIslands Test Data Generator and Performance Analyzer to test query performance.

This pattern is particularly useful for Microsoft SQL Server migration to MongoDB, because performing schema transformations and loading data from current SQL Server instances to MongoDB can be very complex. Instead, you can load near real-world data into MongoDB, understand MongoDB performance, and fine-tune the schema design before you start the actual migration.

Prerequisites and limitations

Prerequisites

  • An active AWS account

  • Familiarity with MongoDB Atlas

  • Target MongoDB schema

  • Typical query patterns

Limitations

  • Data load times and performance will be limited by the MongoDB cluster instance size. We recommend that you choose instances that are recommended for production use to understand real-world performance.

  • PeerIslands Test Data Generator and Performance Analyzer currently supports only online data loads and queries. Offline batch processing (for example, loading data into MongoDB by using Spark connectors) isn’t yet supported.

  • PeerIslands Test Data Generator and Performance Analyzer supports field relations within a collection. It doesn’t support relationships across collections.

Product editions

Architecture

Target technology stack

  • MongoDB Atlas or MongoDB Enterprise Advanced

Architecture

Architecture for assessing query performance for migrating SQL Server databases to MongoDB Atlas on AWS

PeerIslands Test Data Generator and Performance Analyzer is built by using Java and Angular, and stores its generated data on Amazon Elastic Block Store (Amazon EBS). The tool consists of two workflows: test data generation and performance testing.

  • In test data generation, you create a template, which is the JSON representation of the data model that has to be generated. After you create the template, you can generate the data in a target collection, as defined by the load generation configuration.

  • In performance testing, you create a profile. A profile is a multi-stage testing scenario where you can configure create, read, update, and delete (CRUD) operations, aggregation pipelines, the weightage for each operation, and the duration of each stage. After you create the profile, you can run performance testing on the target database based on the configuration.

PeerIslands Test Data Generator and Performance Analyzer stores its data on Amazon EBS, so you can connect Amazon EBS to MongoDB by using any MongoDB-supported connection mechanism, including peering, allow lists, and private endpoints. By default, the tool doesn’t include operational components; however, it can be configured with Amazon Managed Service for Prometheus, Amazon Managed Grafana, Amazon CloudWatch, and AWS Secrets Manager if required.

Tools

  • PeerIslands Test Data Generator and Performance Analyzer includes two components. The Test Data Generator component helps you generate highly customer-specific, real-world data based on your MongoDB schema. The tool is fully UI-driven with a rich data library and can be used to quickly generate billions of records on MongoDB. The tool also provides capabilities to implement relationships between fields in the MongoDB schema. The Performance Analyzer component helps you generate highly customer-specific queries and aggregations, and perform realistic performance testing on MongoDB. You can use the Performance Analyzer to test MongoDB performance with rich load profiles and parameterized queries for your specific use case.

Best practices

See the following resources:

Epics

TaskDescriptionSkills required

Understand the database footprint of the current SQL Server source.

Understand your current SQL Server footprint. This can be achieved by running queries against the INFORMATION schema of the database. Determine the number of tables and size of each table. Analyze the index associated with each table. For more information about SQL analysis, see the blog post SQL2Mongo: Data Migration Journey on the PeerIslands website.

DBA

Understand the source schema.

Determine the table schema and the business representation of the data (for example, zip codes, names, and currency). Use your existing entity relationship (ER) diagram or generate the ER diagram from the existing database. For more information, see the blog post SQL2Mongo: Data Migration Journey on the PeerIslands website.

DBA

Understand query patterns.

Document the top 10 SQL queries you use. You can use the performance_schema.events_statements_summary_by_digest tables that are available in the database to understand the top queries. For more information, see the blog post SQL2Mongo: Data Migration Journey on the PeerIslands website.

DBA

Understand SLA commitments.

Document the target service-level agreements (SLAs) for database operations. Typical measures include query latency and queries per second. The measures and their targets are typically available in non-functional requirements (NFR) documents.

DBA
TaskDescriptionSkills required

Define the target schema.

Define various options for the target MongoDB schema. For more information, see Schemas in the MongoDB Atlas documentation. Consider the best practices and design patterns based on the table relations. See Data Model Examples and Patterns in the MongoDB documentation for details.

MongoDB engineer

Define target query patterns.

Define MongoDB queries and aggregation pipelines. These queries are the equivalent of the top queries you captured for your SQL Server workload. To understand how to construct MongoDB aggregation pipelines, see the MongoDB documentation.

MongoDB engineer

Define the MongoDB instance type.

Determine the size of the instance that you plan to use for testing. For guidance, see the MongoDB documentation.

MongoDB engineer
TaskDescriptionSkills required

Set up the MongoDB Atlas cluster.

To set up a MongoDB cluster on AWS, follow the instructions in the MongoDB documentation.

MongoDB engineer

Create users in the target database.

Configure the MongoDB Atlas cluster for access and network security by following the instructions in the MongoDB documentation.

MongoDB engineer

Create appropriate roles in AWS and configure role-based access control for Atlas.

If required, set up additional users by following the instructions in the MongoDB documentation. Configure authentication and authorization through AWS roles.

MongoDB engineer

Set up Compass for MongoDB Atlas access.

Set up the MongoDB Compass GUI utility for ease of navigation and access.

MongoDB engineer
TaskDescriptionSkills required

Install Test Data Generator.

Install PeerIsland Test Data Generator in your environment.

MongoDB engineer

Configure Test Data Generator to generate the appropriate data.

Create a template by using the data library to generate specific data for each field in the MongoDB schema. For more information, see the MongoDB Data Generator & Perf. Analyzer video.

MongoDB engineer

Horizontally scale Test Data Generator to generate the required load.

Use the template you created to start the load generation against the target collection by configuring the required parallelism. Determine the time frames and scale to generate the necessary data.

MongoDB engineer

Validate the load in MongoDB Atlas.

Check the data loaded into MongoDB Atlas.

MongoDB engineer

Generate required indexes on MongoDB.

Define indexes as required, based on query patterns. For best practices, see the MongoDB documentation.

MongoDB engineer
TaskDescriptionSkills required

Set up load profiles in Performance Analyzer.

Create a performance testing profile in Performance Analyzer by configuring specific queries and their corresponding weightage, duration of the test run, and stages. For more information, see the MongoDB Data Generator & Perf. Analyzer video.

MongoDB engineer

Run performance testing.

Use the performance testing profile you created to start the test against the target collection by configuring the required parallelism. Horizontally scale the performance test tool to run queries against MongoDB Atlas.

MongoDB engineer

Record test results.

Record P95, P99 latency for the queries.

MongoDB engineer

Tune your schema and query patterns.

Modify indexes and query patterns to address any performance issues.

MongoDB engineer
TaskDescriptionSkills required

Shut down temporary AWS resources.

Delete all temporary resources that you used for Test Data Generator and Performance Analyzer.

AWS administrator

Update performance test results.

Understand MongoDB query performance and compare it against your SLAs. If necessary, fine-tune the MongoDB schema and rerun the process.

MongoDB engineer

Conclude the project.

Close out the project and provide feedback.

MongoDB engineer

Related resources

Additional resources: