Selecting a database for a SaaS application
For many multi-tenant SaaS applications, selecting an operational database can be distilled into a choice between relational and non-relational databases, or a combination of the two. To make your decision, consider these high-level application data requirements and characteristics:
- Data model of the application
- Access patterns for the data
- Database latency requirements
- Data integrity and transactional integrity requirements (atomicity, consistency, isolation, and durability, or ACID)
- Cross-Region availability and recovery requirements
The following table lists application data requirements and characteristics, and discusses them in the context of AWS database offerings: Aurora PostgreSQL-Compatible and Amazon RDS for PostgreSQL (relational), and Amazon DynamoDB (non-relational). You can reference this matrix when you’re trying to decide between relational and non-relational operational database offerings.
Databases | Data model | Access patterns | Latency requirements | Data and transactional integrity | Cross-Region availability and recovery |
---|---|---|---|---|---|
Relational (Aurora PostgreSQL-Compatible and Amazon RDS for PostgreSQL) | Relational or highly normalized. | Doesn’t have to be thoroughly planned beforehand. | Preferably higher latency tolerance; can achieve lower latencies by default with Aurora and by implementing read replicas, caching, and similar features. | High data and transactional integrity maintained by default. | In Amazon RDS, you can create a read replica for cross-Region scaling and failover. Aurora mostly automates this process. |
Non-relational (Amazon DynamoDB) | Usually denormalized. These databases take advantage of patterns for modeling many-to-many relationships, large items, and time series data. | All access patterns (queries) for data must be thoroughly understood before a data model is produced. | Very low latency with options such as Amazon DynamoDB Accelerator (DAX) able to improve performance even further. | Optional transactional integrity at the cost of performance. Data integrity concerns are shifted to the application. | Easy cross-Region recovery and active-active configuration with global tables. (ACID compliance is achievable only in a single AWS Region.) |
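The access-patterns column can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of this guide: `sqlite3` stands in for PostgreSQL so that the example runs locally, a plain dictionary stands in for a DynamoDB table, and the table names, sample rows, and the `TENANT#`/`ORDER#` key format are hypothetical.

```python
import sqlite3

# Relational side. sqlite3 is only a local stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tenants (tenant_id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        order_id    TEXT PRIMARY KEY,
        tenant_id   TEXT NOT NULL REFERENCES tenants (tenant_id),
        status      TEXT NOT NULL,
        total_cents INTEGER NOT NULL
    );
    """
)
# Hypothetical sample data: (order_id, tenant_id, status, total_cents).
orders = [("o1", "t1", "OPEN", 1200), ("o2", "t1", "SHIPPED", 800), ("o3", "t2", "OPEN", 500)]
conn.executemany("INSERT INTO tenants VALUES (?, ?)", [("t1", "Acme"), ("t2", "Globex")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)


def open_order_totals(db):
    """Answer a question nobody planned for: open-order value per tenant."""
    query = """
        SELECT t.name, SUM(o.total_cents)
        FROM orders AS o JOIN tenants AS t ON t.tenant_id = o.tenant_id
        WHERE o.status = 'OPEN'
        GROUP BY t.name
        ORDER BY t.name
    """
    return db.execute(query).fetchall()


# Key-value side. A dict keyed by (partition key, sort key) emulates item lookup;
# the key format below is a hypothetical convention, not a DynamoDB requirement.
def order_key(tenant_id, order_id):
    return (f"TENANT#{tenant_id}", f"ORDER#{order_id}")


items = {order_key(t, o): {"status": s, "total_cents": c} for (o, t, s, c) in orders}


def orders_for_tenant(table, tenant_id):
    """Planned access pattern: list one tenant's orders by partition key."""
    pk = f"TENANT#{tenant_id}"
    return sorted(sk for (item_pk, sk) in table if item_pk == pk)
```

In this sketch the relational schema answers the per-tenant open-order total without any up-front planning, whereas the key design on the key-value side serves only the lookup it was built for. Answering the open-order question there would typically require a secondary index or a full scan, which is why access patterns need to be known before the data model is produced.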
Some multi-tenant SaaS applications might have unique data models or special circumstances that are better served by databases not included in the previous table. For example, time series datasets, highly connected datasets, or maintaining a centralized transaction ledger might necessitate using a different type of database. Analyzing all possibilities is beyond the scope of this guide. For a comprehensive list of AWS database offerings and how they can fulfill different use cases at a high level, see the Database section of the Overview of Amazon Web Services whitepaper.
The remainder of this guide focuses on AWS relational database services that support PostgreSQL: Amazon RDS and Aurora PostgreSQL-Compatible. DynamoDB requires a different approach to optimize for SaaS applications, which is beyond the scope of this guide. For more information about DynamoDB, see the AWS blog post Partitioning Pooled Multi-Tenant SaaS Data with Amazon DynamoDB.
Choosing between Amazon RDS and Aurora
In most cases, we recommend using Aurora PostgreSQL-Compatible over Amazon RDS for PostgreSQL. The following table shows the factors that you should consider when deciding between these two options.
DBMS component | Amazon RDS for PostgreSQL | Aurora PostgreSQL-Compatible |
---|---|---|
Scalability | Replication lag of minutes, maximum of 5 read replicas | Replication lag under a minute (typically less than 1 second with global databases), maximum of 15 read replicas |
Crash recovery | Checkpoints 5 minutes apart (by default), can slow database performance | Asynchronous recovery with parallel threads for rapid recovery |
Failover | 60-120 seconds in addition to crash recovery time | Usually about 30 seconds (including crash recovery) |
Storage | Maximum IOPS of 256,000 | IOPS constrained only by Aurora instance size and capacity |
High availability and disaster recovery | Two Availability Zones with a standby instance, cross-Region failover to read replica or copied backups | Three Availability Zones by default, cross-Region failover with Aurora global databases, write forwarding across AWS Regions for active-active configurations |
Backup | During backup window, can impact performance | Automatic incremental backups, no performance impact |
Database instance classes | See list of Amazon RDS instance classes | See list of Aurora instance classes |
In all the categories described in the previous table, Aurora PostgreSQL-Compatible is usually the better option. However, Amazon RDS for PostgreSQL might still make sense for small to medium workloads, because it has a greater selection of instance classes that might provide a more cost-effective option at the expense of Aurora’s more robust feature set.
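As a rough aid for applying the comparison, the following Python sketch checks a workload against the Amazon RDS for PostgreSQL limits quoted in the table above. The `Workload` fields and function names are hypothetical, and the check covers only three rows of the table (read replicas, IOPS, and failover). It is a starting point for discussion, not a sizing tool.

```python
from __future__ import annotations

from dataclasses import dataclass

# Limits copied from the comparison table in this guide.
RDS_MAX_READ_REPLICAS = 5
RDS_MAX_IOPS = 256_000
# Lower bound of the 60-120 second failover range, before crash recovery time.
RDS_MIN_FAILOVER_SECONDS = 60


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of what a SaaS workload needs."""
    read_replicas: int          # number of read replicas required
    peak_iops: int              # peak storage IOPS required
    max_failover_seconds: int   # longest failover the workload can tolerate


def rds_blockers(workload: Workload) -> list[str]:
    """Return the table rows where Amazon RDS for PostgreSQL falls short."""
    blockers = []
    if workload.read_replicas > RDS_MAX_READ_REPLICAS:
        blockers.append("read replicas")
    if workload.peak_iops > RDS_MAX_IOPS:
        blockers.append("IOPS")
    if workload.max_failover_seconds < RDS_MIN_FAILOVER_SECONDS:
        blockers.append("failover time")
    return blockers


def suggest(workload: Workload) -> str:
    """Suggest an option based only on the three limits checked above."""
    if rds_blockers(workload):
        return "Aurora PostgreSQL-Compatible"
    return "Either option; compare instance class cost"
```

For example, `suggest(Workload(read_replicas=2, peak_iops=10_000, max_failover_seconds=300))` reports that either option fits, which matches the small to medium workload case where instance class cost becomes the deciding factor.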