Amazon Relational Database Service
User Guide (API Version 2014-10-31)

Preview: Amazon Aurora with PostgreSQL Compatibility

Amazon Relational Database Service (Amazon RDS) is previewing Aurora (PostgreSQL), that is, Amazon Aurora with PostgreSQL compatibility. Aurora is a fully managed, PostgreSQL- and MySQL-compatible relational database engine. It combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. For a complete overview of Aurora, see Overview of Amazon Aurora.

Aurora (PostgreSQL) is a drop-in replacement for PostgreSQL. The code, tools, and applications that you use today with your existing PostgreSQL databases can be used with Aurora (PostgreSQL).

To register for the Aurora (PostgreSQL) preview, go to the signup page at https://pages.awscloud.com/amazon-aurora-with-postgresql-compatibility-preview-form.html. When you're registered, you have access to the sandboxed AWS region where you can create your preview Aurora (PostgreSQL) instance. There is no charge for your preview database instances or for the storage used by your preview databases during the preview period.

The preview of Aurora (PostgreSQL) has the following limitations:

  • There are no availability or uptime commitments during the preview program.

  • The preview is compatible with PostgreSQL 9.6.2.

  • The preview currently supports the following DB instance classes: db.r3.8xlarge, db.r4.4xlarge, db.r4.8xlarge, and db.r4.16xlarge.

  • You can create up to three instances per account.

  • Read nodes are now supported.

  • Two-phase commit is not yet supported.

  • All extensions that are currently supported for PostgreSQL 9.6.2 are supported in the preview, with no exceptions.

  • Compared with PostgreSQL using Amazon EBS volumes on Amazon EC2, your performance should range from one to two times as fast.

Several Amazon Aurora features are not currently supported during this preview but will be available when Aurora (PostgreSQL) is generally available. The features include:

  • "Survivable" cache warming

  • DB Snapshot import from an Amazon RDS for PostgreSQL DB instance

Creating an Aurora (PostgreSQL) DB Cluster

An Aurora (PostgreSQL) DB cluster is made up of instances that are compatible with PostgreSQL and a cluster volume that represents data copied across three Availability Zones as a single, virtual volume. There are two types of instances in a DB cluster: a primary instance and Aurora Replicas.

The primary instance performs all of the data modifications to the DB cluster and also supports read workloads. Each DB cluster has one primary instance. An Aurora Replica supports only read workloads. Each DB cluster can have up to 15 Aurora Replicas. You can connect to any instance in the DB cluster using an endpoint address.

Following, you can learn how to create an Aurora (PostgreSQL) DB cluster and then add an Aurora Replica for that DB cluster. You must complete the tasks in the Setting Up for Amazon RDS section before you can create a DB cluster.

The following instructions describe how to create an Aurora (PostgreSQL) DB cluster using the AWS Management Console. For simple instructions on connecting to your Aurora DB cluster, see Connecting to an Amazon Aurora DB Cluster. For a detailed guide on connecting to an Aurora (PostgreSQL) DB cluster, see RDS Aurora Connectivity.

DB Cluster Prerequisites

The following are prerequisites to create a DB cluster.

VPC

An Aurora (PostgreSQL) DB cluster can only be created in an Amazon Virtual Private Cloud (Amazon VPC) with at least two subnets in at least two Availability Zones. By distributing your cluster instances across at least two Availability Zones, you ensure that there will be instances available in your DB cluster in the unlikely case of an Availability Zone failure. The cluster volume for your Aurora DB cluster will always span three Availability Zones to provide durable storage with less possibility of data loss.

If you are using the Amazon RDS console to create your Aurora (PostgreSQL) DB cluster, then you can have Amazon RDS automatically create a VPC for you. Alternatively, you can use an existing VPC or create a new VPC for your Aurora DB cluster. Your VPC must have at least two subnets for you to use it with an Aurora (PostgreSQL) DB cluster. For more information, see How to Create a VPC for Use with Amazon Aurora. For information on VPCs, see Amazon Virtual Private Cloud (VPCs) and Amazon RDS.

Note

You can use ClassicLink to communicate between an Amazon Aurora DB cluster and an EC2 instance that is not in a VPC. For more information, see A DB Instance in a VPC Accessed by an EC2 Instance Not in a VPC.

If you don't have a default VPC or you haven't created a VPC, you can have Amazon RDS automatically create a VPC for you when you create an Aurora (PostgreSQL) DB cluster using the RDS console. Otherwise, you must do the following:

  • Create a VPC with at least two subnets in at least two Availability Zones.

  • Specify a VPC security group that authorizes connections to your Aurora DB cluster. For information, see Working with a DB Instance in a VPC.

  • Specify an RDS DB subnet group that defines at least two subnets in the VPC that can be used by the Aurora DB cluster. For information, see the Working with DB Subnet Groups section in Amazon Virtual Private Cloud (VPCs) and Amazon RDS.

Additional Prerequisites

  • If you are connecting to AWS using AWS Identity and Access Management (IAM) credentials, your IAM account must have IAM policies that grant the permissions required to perform Amazon RDS operations. For more information, see Authentication and Access Control for Amazon RDS.

    If you are using an IAM account to access the Amazon Aurora console, you must first sign in to the AWS Management Console with your IAM account. Then you can go to the Amazon RDS console link that you were provided when you registered for the preview.

  • If you want to tailor the configuration parameters for your DB cluster, you must specify a DB parameter group with the required parameter settings. For information about creating or modifying a DB parameter group, see Working with DB Parameter Groups.

  • You must determine the TCP/IP port number you will specify for your DB cluster. The firewalls at some companies block connections to the default Aurora (PostgreSQL) port (5432). If your company firewall blocks the default port, choose another port for your DB cluster. All instances in a DB cluster use the same port.

Using the AWS Management Console to Launch an Aurora (PostgreSQL) DB Cluster and Create an Aurora Replica

Launching an Aurora DB Cluster

The following procedures describe how to use the AWS Management Console to launch an Aurora (PostgreSQL) DB cluster and how to create an Aurora Replica during the preview.

To launch an Aurora (PostgreSQL) DB cluster using the console

  1. Open the link to the Amazon RDS console that you were provided when you registered for the preview. Choose Get Started Now.

  2. In the top-right corner of the AWS Management Console, choose the region in which you want to create the DB cluster.

  3. In the navigation pane, choose Instances.

  4. Choose Launch DB Instance to start the Launch DB Instance wizard. The wizard opens on the Select Engine page.

  5. On the Select Engine page, choose Select for the Aurora (PostgreSQL) DB engine.

  6. On the Specify DB Details page, specify your DB cluster information. The following table shows settings for an Aurora (PostgreSQL) DB instance.

    For This Option Do This

    DB Instance Class

    Select a DB instance class that defines the processing and memory requirements for each instance in the DB cluster. For this preview, Aurora (PostgreSQL) supports the db.r4.4xlarge, db.r4.8xlarge, db.r4.16xlarge, and db.r3.8xlarge DB instance classes. For more information about DB instance class options, see DB Instance Class.

    Multi-AZ Deployment

    Determine if you want to create Aurora Replicas in other Availability Zones for failover support. If you select Create Replica in Different Zone, then Amazon RDS will create an Aurora Replica for you in a different Availability Zone than the primary instance for your DB cluster.

    For more information about multiple Availability Zones, see Regions and Availability Zones.

    DB Instance Identifier

    Type a name for the primary instance in your DB cluster. This identifier is used in the endpoint address for the primary instance of your DB cluster.

    The DB instance identifier has the following constraints:

    • It must contain from 1 to 63 alphanumeric characters or hyphens.

    • Its first character must be a letter.

    • It cannot end with a hyphen or contain two consecutive hyphens.

    • It must be unique for all DB instances for each AWS account in a region.

    Master Username

    Type a name, consisting of alphanumeric characters, to use as the master user name to log on to your DB cluster.

    Master Password

    Type a password that contains from 8 to 41 printable ASCII characters (excluding /, ", and @) for your master user password.

  7. Choose Next.

  8. On the Configure Advanced Settings page, you can customize additional settings for your Aurora (PostgreSQL) DB cluster. The following table shows the advanced settings for a DB cluster.

    For This Option Do This

    VPC

    Select the VPC that will host the DB cluster. Select Create a New VPC to have Amazon RDS create a VPC for you. For more information, see DB Cluster Prerequisites earlier in this topic.

    Subnet Group

    Select the DB subnet group to use for the DB cluster. Select Create a New DB Subnet Group to have Amazon RDS create a DB subnet group for you. For more information, see DB Cluster Prerequisites earlier in this topic.

    Publicly Accessible

    Select Yes to give the DB cluster a public IP address; otherwise, select No. The instances in your DB cluster can be a mix of both public and private DB instances. For more information about hiding instances from public access, see Hiding a DB Instance in a VPC from the Internet.

    Availability Zone

    Determine if you want to specify a particular Availability Zone. For more information about Availability Zones, see Regions and Availability Zones.

    VPC Security Group(s)

    Select one or more VPC security groups to secure network access to the DB cluster. Select Create a New VPC Security Group to have Amazon RDS create a VPC security group for you. For more information, see DB Cluster Prerequisites earlier in this topic.

    DB Cluster Identifier

    Type a name for your DB cluster that is unique for your account in the region you selected. This identifier is used in the cluster endpoint address for your DB cluster. For information on the cluster endpoint, see Aurora Endpoints.

    The DB cluster identifier has the following constraints:

    • It must contain from 1 to 63 alphanumeric characters or hyphens.

    • Its first character must be a letter.

    • It cannot end with a hyphen or contain two consecutive hyphens.

    • It must be unique for all DB clusters per AWS account, per region.

    Database Name

    Type a name of up to 8 alphanumeric characters for your default database. If you don't provide a name, Amazon RDS will not create a database on the DB cluster you are creating.

    To create additional databases, connect to the DB cluster and use the SQL command CREATE DATABASE. For more information about connecting to the DB cluster, see Connecting to an Amazon Aurora DB Cluster.

    Database Port

    Specify the port that applications and utilities will use to access the database. Aurora (PostgreSQL) DB clusters default to the default PostgreSQL port, 5432. The firewalls at some companies block connections to the default PostgreSQL port. If your company firewall blocks the default port, choose another port for the new DB cluster.

    Parameter Group

    Select a parameter group. Aurora has a default parameter group you can use, or you can create your own parameter group. For more information about parameter groups, see Working with DB Parameter Groups.

    Option Group

    Select an option group. Aurora has a default option group you can use, or you can create your own option group. For more information about option groups, see Working with Option Groups.

    Enable Encryption

    Select No for this preview. For more information, see Encrypting Amazon RDS Resources.

    Priority

    Choose a failover priority for the instance. If you don't select a value, the default is tier-1. This priority determines the order in which Aurora Replicas are promoted when recovering from a primary instance failure. For more information, see Fault Tolerance for an Aurora DB Cluster.

    Backup Retention Period

    Select the length of time, from 1 to 35 days, that Aurora will retain backup copies of the database. Backup copies can be used for point-in-time restores (PITR) of your database down to the second.

    Enable Enhanced Monitoring

    Choose Yes to enable gathering metrics in real time for the operating system that your DB cluster runs on. For more information, see Enhanced Monitoring.

    Granularity

    Only available if Enable Enhanced Monitoring is set to Yes. Set the interval, in seconds, between when metrics are collected for your DB cluster.

    Auto Minor Version Upgrade

    Select No for this preview.

    Maintenance Window

    Select the weekly time range during which system maintenance can occur.

  9. Choose Launch DB Instance to launch your Aurora (PostgreSQL) DB instance, and then choose Close to close the wizard.

    On the Amazon RDS console, the new DB instance appears in the list of DB instances. The DB instance has a status of creating until the DB instance is created and ready for use. When the status changes to available, you can connect to the primary instance for your DB cluster. Depending on the DB instance class and storage allocated, it can take several minutes for the new instance to be available.

    To view the newly created cluster, choose the Clusters view in the Amazon RDS console. For more information, see Viewing an Amazon Aurora DB Cluster.

    

    Note the port and the endpoint of the cluster. Use the endpoint and port of the cluster in your JDBC and ODBC connection strings for any application that performs write or read operations.

Creating an Aurora Replica Using the Console

After creating the primary instance for your Aurora DB cluster, you can add up to 15 Aurora Replicas by using the Create Aurora Replica wizard.

To create an Aurora Replica by using the AWS Management Console

  1. Open the link to the Amazon RDS console that you were provided when you registered for the preview.

  2. In the left navigation pane, choose Instances.

  3. Select the check box to the left of the primary instance for your Aurora DB cluster.

  4. Choose Instance Actions, and then choose Create Aurora Replica.

  5. On the Create Aurora Replica page, specify options for your Aurora Replica. The following table shows settings for an Aurora Replica.

    For This Option Do This

    DB Instance Class

    Select a DB instance class that defines the processing and memory requirements for the Aurora Replica. For the preview, Aurora (PostgreSQL) supports the db.r4.4xlarge, db.r4.8xlarge, db.r4.16xlarge, and db.r3.8xlarge DB instance classes. For more information about DB instance class options, see DB Instance Class.

    Aurora Replica Source

    Select the identifier of the primary instance that you want to create an Aurora Replica for.

    DB Instance Identifier

    Type a name for the instance that is unique for your account in the region you selected. You might choose to add some intelligence to the name such as including the region and DB engine you selected, for example aurora-read-instance1.

    Publicly Accessible

    Select Yes to give the Aurora Replica a public IP address; otherwise, select No. For more information about hiding Aurora Replicas from public access, see Hiding a DB Instance in a VPC from the Internet.

    Availability Zone

    Determine if you want to specify a particular Availability Zone. The list includes only those Availability Zones that are mapped by the DB subnet group you specified earlier. For more information about Availability Zones, see Regions and Availability Zones.

    Priority

    Choose a failover priority for the instance. If you don't select a value, the default is tier-1. This priority determines the order in which Aurora Replicas are promoted when recovering from a primary instance failure. For more information, see Fault Tolerance for an Aurora DB Cluster.

    Database Port

    The port for an Aurora Replica is the same as the port for the DB cluster.

    Auto Minor Version Upgrade

    Select No for this preview.

  6. Choose Create Aurora Replica to create the Aurora Replica.

Note the endpoint of the Aurora Replica. Use the endpoint of the Aurora Replica in your JDBC and ODBC connection strings for any application that performs only read operations.

Connecting to an Amazon Aurora DB Cluster

You can connect to an Aurora DB instance using the same tools that you use to connect to a PostgreSQL database. As part of this, you use the same public key for Secure Sockets Layer (SSL) connections. You can use the endpoint and port information from the primary instance or Aurora Replicas in your Amazon Aurora DB cluster in the connection string of any script, utility, or application that connects to a PostgreSQL DB instance. In the connection string, specify the DNS address from the primary instance or Aurora Replica endpoint as the host parameter, and specify the port number from the endpoint as the port parameter.

Once you have a connection to your Amazon Aurora DB cluster, you can run any SQL command that is compatible with PostgreSQL version 9.6.2.

For a helpful and detailed guide on connecting to an Amazon Aurora DB cluster, see RDS Aurora Connectivity.

In the details view for your DB cluster you can find the cluster endpoint. You use this endpoint in your PostgreSQL connection string. The endpoint is made up of the domain name and port for your DB cluster. For example, if an endpoint value is mycluster.cluster-123456789012.us-east-1.rds.amazonaws.com:5432, then you specify the following values in a PostgreSQL connection string:

  • For host or host name, specify mycluster.cluster-123456789012.us-east-1.rds.amazonaws.com

  • For port, specify 5432 or the port value you used when you created the DB cluster.

The cluster endpoint connects you to the primary instance for the DB cluster. You can perform both read and write operations using the cluster endpoint. Your DB cluster can also have up to 15 Aurora Replicas that support read-only access to the data in your DB cluster. The primary instance and each Aurora Replica have their own unique endpoints that are independent of the cluster endpoint. These unique endpoints let you connect directly to a specific DB instance in the cluster. The cluster endpoint always points to the primary instance. If the primary instance fails and is replaced, the cluster endpoint points to the new primary instance.
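
For example, the following minimal sketch connects through the cluster endpoint shown earlier with the standard PostgreSQL JDBC driver and runs a simple query. The database name, user, and password here are hypothetical placeholders; substitute the values you chose when you created the DB cluster.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ClusterEndpointConnect {
    public static void main(String[] args) throws Exception {
        // Host and port come from the cluster endpoint in the details view
        String url = "jdbc:postgresql://mycluster.cluster-123456789012.us-east-1.rds.amazonaws.com:5432/postgres";

        try (Connection conn = DriverManager.getConnection(url, "masteruser", "masterpassword");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT version()")) {
            rs.next();
            System.out.println(rs.getString(1)); // prints the PostgreSQL server version
        }
    }
}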



Troubleshooting Aurora (PostgreSQL) Connection Failures

Note

For a helpful and detailed guide on connecting to an Amazon Aurora DB cluster, see RDS Aurora Connectivity.

Common causes of connection failures to a new Aurora DB cluster are as follows:

  • The DB cluster was created using a VPC that doesn't allow connections from your device. To fix this failure, modify the VPC to allow connections from your device, or create a new VPC for your DB cluster that allows connections from your device. For an example, see Create a VPC and Subnets.

  • The DB cluster was created using a port value that your company's firewall rules block for devices in your company network. To fix this failure, recreate the instance with a different port.

Fast Failover Best Practices for Aurora (PostgreSQL)

There are several things you can do to make a failover perform faster with Aurora (PostgreSQL). This section discusses each of the following ways:

  • Aggressively set TCP keepalives to ensure that longer running queries that are waiting for a server response will be killed before the read timeout expires in the event of a failure.

  • Set the Java DNS caching timeouts aggressively to ensure the Aurora Read-Only Endpoint can properly cycle through read-only nodes on subsequent connection attempts.

  • Set the timeout variables used in the JDBC connection string as low as possible. Use separate connection objects for short and long running queries.

  • Use the provided read and write Aurora endpoints to establish a connection to the cluster.

  • Use RDS APIs to test application response on server side failures and use a packet dropping tool to test application response for client-side failures.

Setting TCP keepalives Parameters

The TCP keepalive process is simple: when you set up a TCP connection, you associate a set of timers. When the keepalive timer reaches zero, you send a keepalive probe packet. If you receive a reply to your keepalive probe, you can assume that the connection is still up and running.

Enabling TCP keepalive parameters and setting them aggressively ensures that if your client is no longer able to connect to the database, then any active connections are quickly closed. This action allows the application to react appropriately, such as by picking a new host to connect to.

The following TCP keepalive parameters should be set:

  • tcp_keepalive_time controls the time, in seconds, after which a keepalive packet is sent when no data has been sent by the socket (ACKs are not considered data). We recommend the following setting:

    tcp_keepalive_time = 1

  • tcp_keepalive_intvl controls the time, in seconds, between sending subsequent keepalive packets after the initial packet is sent (set using the tcp_keepalive_time parameter). We recommend the following setting:

    tcp_keepalive_intvl = 1

  • tcp_keepalive_probes is the number of unacknowledged keepalive probes that occur before the application is notified. We recommend the following setting:

    tcp_keepalive_probes = 5

These settings should notify the application within five seconds when the database stops responding. You can set a higher tcp_keepalive_probes value if keepalive packets are often dropped within the application's network. A higher value increases the time it takes to detect an actual failure, but helps reduce unneeded failovers in less reliable networks.

Setting TCP keepalive parameters on Linux

  1. When testing how to configure the TCP keepalive parameters, we recommend doing so via the command line with the following commands. Note that this suggested configuration is system wide, meaning that it affects all other applications that create sockets with the SO_KEEPALIVE option on.

    sudo sysctl net.ipv4.tcp_keepalive_time=1
    sudo sysctl net.ipv4.tcp_keepalive_intvl=1
    sudo sysctl net.ipv4.tcp_keepalive_probes=5
  2. Once you've found a configuration that works for your application, these settings must be persisted by adding the following lines (including any changes you made) to /etc/sysctl.conf:

    net.ipv4.tcp_keepalive_time = 1
    net.ipv4.tcp_keepalive_intvl = 1
    net.ipv4.tcp_keepalive_probes = 5

For information on setting TCP keepalive parameters on Windows, see Things You May Want to Know About TCP Keepalive.

Configuring Your Application for Fast Failover

This section discusses several Aurora (PostgreSQL) specific configuration changes you can make. Documentation for general setup and configuration of the JDBC driver is available from the PostgreSQL JDBC site.

Reducing DNS Cache Timeouts

When your application tries to establish a connection after a failover, the new Aurora (PostgreSQL) primary (read-write) instance is chosen from one of the read-only instances in the cluster. The new primary can be found using the Aurora reader endpoint before DNS updates have fully propagated. Setting the Java DNS TTL to a low value helps cycle between reader nodes on subsequent connection attempts.

// Sets internal TTL to match the Aurora RO Endpoint TTL
java.security.Security.setProperty("networkaddress.cache.ttl" , "1");
// If the lookup fails, default to something like small to retry
java.security.Security.setProperty("networkaddress.cache.negative.ttl" , "3");

Setting an Aurora (PostgreSQL) Connection String for Fast Failover

To make use of Aurora (PostgreSQL) fast failover, your application's connection string should have a list of hosts instead of just a single host. Here is an example connection string you could use to connect to an Aurora (PostgreSQL) cluster:

jdbc:postgresql://myauroracluster.cluster-c9bfei4hjlrd.us-east-1-beta.rds.amazonaws.com:5432,
myauroracluster.cluster-ro-c9bfei4hjlrd.us-east-1-beta.rds.amazonaws.com:5432
/postgres?user=<masteruser>&password=<masterpw>&loginTimeout=2
&connectTimeout=2&cancelSignalTimeout=2&socketTimeout=60
&tcpKeepAlive=true&targetServerType=master&loadBalanceHosts=true

For best availability and to avoid a dependency on the RDS API, the best option for connecting is to maintain a file with a host string that your application reads from when you establish a connection to the database. This host string would have all the Aurora endpoints available for the cluster. For more information about Aurora endpoints, see Aurora Endpoints. For example, you could store the endpoints in a file locally like the following:

myauroracluster.cluster-c9bfei4hjlrd.us-east-1-beta.rds.amazonaws.com:5432,
myauroracluster.cluster-ro-c9bfei4hjlrd.us-east-1-beta.rds.amazonaws.com:5432

Your application would read from this file to populate the host section of the JDBC connection string. Renaming the DB Cluster causes these endpoints to change; ensure that your application handles that event should it occur.
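
As a minimal sketch of that approach, the following reads the endpoint list from a local file and splices it into the host section of the JDBC URL. The file name cluster-endpoints.txt and the credential placeholders are hypothetical; use whatever resource file and credentials your application already manages.

import java.nio.file.Files;
import java.nio.file.Paths;

public class EndpointFileReader {
    public static String buildJdbcUrl() throws Exception {
        // Read the comma-separated endpoint list and strip any line breaks or spaces
        String hosts = new String(Files.readAllBytes(Paths.get("cluster-endpoints.txt")))
                .replaceAll("\\s+", "");

        return "jdbc:postgresql://" + hosts
                + "/postgres?user=<masteruser>&password=<masterpw>"
                + "&loginTimeout=2&connectTimeout=2&cancelSignalTimeout=2&socketTimeout=60"
                + "&tcpKeepAlive=true&targetServerType=master&loadBalanceHosts=true";
    }
}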

Another option is to use a list of DB instance nodes:

my-node1.cksc6xlmwcyw.us-east-1-beta.rds.amazonaws.com:5432,
my-node2.cksc6xlmwcyw.us-east-1-beta.rds.amazonaws.com:5432,
my-node3.cksc6xlmwcyw.us-east-1-beta.rds.amazonaws.com:5432,
my-node4.cksc6xlmwcyw.us-east-1-beta.rds.amazonaws.com:5432

The benefit of this approach is that the PostgreSQL JDBC connection driver will loop through all nodes on this list to find a valid connection, whereas when using the Aurora endpoints only two nodes will be tried per connection attempt. The downside of using DB instance nodes is that if you add or remove nodes from your cluster and the list of instance endpoints becomes stale, the connection driver may never find the correct host to connect to.

Setting the following parameters aggressively helps ensure that your application doesn't wait too long to connect to any one host.

  • loginTimeout - Controls how long your application waits to log in to the database after a socket connection has been established.

  • connectTimeout - Controls how long the socket waits to establish a connection to the database.

Other application parameters can be modified to speed up the connection process, depending on how aggressive you want your application to be.

  • cancelSignalTimeout - In some applications, you may want to send a "best effort" cancel signal on a query that has timed out. If this cancel signal is in your failover path, you should consider setting it aggressively to avoid sending this signal to a dead host.

  • socketTimeout - This parameter controls how long the socket waits for read operations. This parameter can be used as a global "query timeout" to ensure no query waits longer than this value. A good practice is to have one connection handler that runs short lived queries and sets this value lower, and to have another connection handler for long running queries with this value set much higher. Then, you can rely on TCP keepalive parameters to kill long running queries if the server goes down.

  • tcpKeepAlive - Enable this parameter to ensure the TCP keepalive parameters that you set are respected.

  • targetServerType - This parameter can be used to control whether the driver connects to a read (slave) or write (master) node. Possible values are: any, master, slave, and preferSlave. The preferSlave value attempts to establish a connection to a reader first but falls back and connects to the writer if no reader connection can be established.

  • loadBalanceHosts - When set to true, this parameter has the application connect to a random host chosen from a list of candidate hosts.

Other Options for Obtaining the Host String

You can get the host string from several sources, including the replica_host_status table and by using the Amazon RDS API.

Your application can connect to any DB instance in the DB cluster and query the replica_host_status table to determine which instance is the writer for the cluster, or to find any other reader nodes in the cluster. You can use this status table to reduce the amount of time it takes to find a host to connect to, though the replica_host_status table may show out-of-date or incomplete information in certain network failure scenarios.

A good way to ensure your application can find a node to connect to is to attempt to connect to the cluster writer endpoint and then the cluster reader endpoint until you can establish a readable connection. These endpoints do not change unless you rename your DB cluster, and thus can generally be left as static members of your application or stored in a resource file that your application reads from.

Once you establish a connection using one of these endpoints, you can query the status table to get information about the rest of the cluster. For example, the following command retrieves information from the replica_host_status table.

postgres=> select server_id, session_id, vdl, highest_lsn_rcvd, cur_replay_latency, now(), last_update_time from replica_host_status;

 server_id | session_id                           | vdl       | highest_lsn_rcvd | cur_replay_latency | now                           | last_update_time
-----------+--------------------------------------+-----------+------------------+--------------------+-------------------------------+------------------------
 mynode-1  | 3e3c5044-02e2-11e7-b70d-95172646d6ca | 594220999 | 594221001        | 201421             | 2017-03-07 19:50:24.695322+00 | 2017-03-07 19:50:23+00
 mynode-2  | 1efdd188-02e4-11e7-becd-f12d7c88a28a | 594220999 | 594221001        | 201350             | 2017-03-07 19:50:24.695322+00 | 2017-03-07 19:50:23+00
 mynode-3  | MASTER_SESSION_ID                    | 594220999 |                  |                    | 2017-03-07 19:50:24.695322+00 | 2017-03-07 19:50:23+00
(3 rows)

The hosts section of your connection string could start with both the writer and reader cluster endpoints:

myauroracluster.cluster-c9bfei4hjlrd.us-east-1-beta.rds.amazonaws.com:5432,
myauroracluster.cluster-ro-c9bfei4hjlrd.us-east-1-beta.rds.amazonaws.com:5432

In this scenario, your application would attempt to establish a connection to any node type, master or slave. Once connected, a good practice is to first examine the read-write status of the node by running the command SHOW transaction_read_only.

If the return value of the query is OFF, then you've successfully connected to the master node. If the return value is ON, and your application requires a read-write connection, you can then query the replica_host_status table to determine the server_id that has session_id='MASTER_SESSION_ID'. This query gives you the name of the master node. You can use this in conjunction with the 'endpointPostfix' described below.
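
The following is a minimal sketch of that check. The jdbcUrl value is a hypothetical placeholder for a connection string built as described earlier in this section.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class WriterCheck {
    // Returns the writer's server_id, or null if this connection is already read-write
    public static String findWriterIfReadOnly(String jdbcUrl) throws Exception {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement st = conn.createStatement()) {

            ResultSet rs = st.executeQuery("SHOW transaction_read_only");
            rs.next();
            if ("off".equalsIgnoreCase(rs.getString(1))) {
                return null; // connected to the master node
            }

            // Connected to a reader; ask the status table which node is the master
            ResultSet writer = st.executeQuery(
                "select server_id from replica_host_status where session_id = 'MASTER_SESSION_ID'");
            return writer.next() ? writer.getString(1) : null;
        }
    }
}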

One thing to watch out for is connecting to a replica whose data has become stale. When this happens, the replica_host_status table may show out-of-date information. You can set a threshold for staleness at the application level and check it by looking at the difference between the server time and the last_update_time. In general, your application should avoid flip-flopping between two hosts due to conflicting information in the replica_host_status table. That is, your application should err on the side of trying all known hosts first instead of blindly following the replica_host_status table.

Using the Amazon RDS API to Find the Host String

You can programmatically find the list of instances by using the AWS SDK for Java, specifically the DescribeDBClusters API. Here's a small example of how you might do this in Java 8:

AmazonRDS client = AmazonRDSClientBuilder.defaultClient();

DescribeDBClustersRequest request = new DescribeDBClustersRequest()
    .withDBClusterIdentifier(clusterName);
DescribeDBClustersResult result = client.describeDBClusters(request);
DBCluster singleClusterResult = result.getDBClusters().get(0);

String pgJDBCEndpointStr = singleClusterResult.getDBClusterMembers().stream()
    .sorted(Comparator.comparing(DBClusterMember::getIsClusterWriter)
        .reversed()) // This puts the writer at the front of the list
    .map(m -> m.getDBInstanceIdentifier() + endpointPostfix + ":" + singleClusterResult.getPort())
    .collect(Collectors.joining(","));

pgJDBCEndpointStr will contain a formatted list of endpoints, for example:

my-node1.cksc6xlmwcyw.us-east-1-beta.rds.amazonaws.com:5432,
my-node2.cksc6xlmwcyw.us-east-1-beta.rds.amazonaws.com:5432

The variable 'endpointPostfix' can be a constant that your application sets, or can be obtained by querying the DescribeDBInstances API for a single instance in your cluster. This value remains constant within a region and for an individual customer, so it would save an API call to simply keep this constant in a resource file that your application reads from. In the example above, it would be set to:

.cksc6xlmwcyw.us-east-1-beta.rds.amazonaws.com
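
If you choose to look the postfix up instead, the following minimal sketch derives it from the endpoint address returned by DescribeDBInstances. The instance identifier my-node1 is taken from the node list shown earlier and stands in for any instance in your cluster.

AmazonRDS rds = AmazonRDSClientBuilder.defaultClient();

DescribeDBInstancesRequest request = new DescribeDBInstancesRequest()
    .withDBInstanceIdentifier("my-node1"); // any instance in your cluster
DBInstance instance = rds.describeDBInstances(request).getDBInstances().get(0);

// The address looks like my-node1.cksc6xlmwcyw.us-east-1-beta.rds.amazonaws.com;
// the postfix is everything after the instance identifier
String address = instance.getEndpoint().getAddress();
String endpointPostfix = address.substring(instance.getDBInstanceIdentifier().length());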

For availability purposes, a good practice would be to default to using the Aurora Endpoints of your DB Cluster if the API is not responding, or taking too long to respond. The endpoints are guaranteed to be up to date within the time it takes to update the DNS record (typically less than 30 seconds). This again can be stored in a resource file that your application consumes.
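
One way to express that fallback, as a minimal sketch, is to bound the API call with short client timeouts and catch any failure. The cluster identifier, the timeout values, and the getEndpointsFromApi and getLocalEndpointList helpers are hypothetical; getEndpointsFromApi would wrap the DescribeDBClusters call shown earlier, and getLocalEndpointList would return the static cluster endpoints from your resource file.

// Short timeouts so a slow or unreachable RDS API doesn't delay connection setup
AmazonRDS rds = AmazonRDSClientBuilder.standard()
        .withClientConfiguration(new ClientConfiguration()
                .withConnectionTimeout(2000)
                .withRequestTimeout(2000))
        .build();

List<String> hosts;
try {
    hosts = getEndpointsFromApi(rds, "myauroracluster"); // DescribeDBClusters, as shown earlier
} catch (Exception e) {
    hosts = getLocalEndpointList(); // fall back to the static cluster endpoints
}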

Testing Failover

In all cases, you must have a DB cluster with two or more DB instances in it.

From the server side, certain APIs can cause an outage that can be used to test how your application responds:

  • FailoverDBCluster - Attempts to promote a new DB Instance in your DB Cluster to writer.

    public void causeFailover() {
        /*
         * See http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/basics.html
         * for more details on setting up an RDS client
         */
        final AmazonRDS rdsClient = AmazonRDSClientBuilder.defaultClient();

        FailoverDBClusterRequest request = new FailoverDBClusterRequest();
        request.setDBClusterIdentifier("cluster-identifier");
        rdsClient.failoverDBCluster(request);
    }
  • RebootDBInstance - Failover is not guaranteed in this API. However, it shuts down the database on the writer and can be used to test how your application responds to connections dropping. (Note that the ForceFailover parameter is not applicable for Aurora engines; use the FailoverDBCluster API instead.) For a sketch of this test, see the example after this list.

  • ModifyDBCluster - Modifying the Port will cause an outage when the nodes in the cluster begin listening on a new port. In general, your application can respond to this failure by ensuring that only your application controls port changes and can appropriately update the endpoints it depends on, by having someone manually update the port when modifications are made at the API level, or by querying the RDS API in your application to determine whether the port has changed.

  • ModifyDBInstance - Modifying the DBInstanceClass will cause an outage.

  • DeleteDBInstance - Deleting the master/writer will cause a new DB Instance to be promoted to writer in your DB Cluster.
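
The following is a minimal sketch of the RebootDBInstance test mentioned in the list above. The instance identifier is a hypothetical placeholder; rebooting the writer lets you observe how your application handles dropped connections.

public void causeWriterReboot() {
    final AmazonRDS rdsClient = AmazonRDSClientBuilder.defaultClient();

    RebootDBInstanceRequest request = new RebootDBInstanceRequest();
    request.setDBInstanceIdentifier("writer-instance-identifier");
    rdsClient.rebootDBInstance(request);
}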

From the application/client side, if you are using Linux, you can use iptables to test how the application responds to sudden packet drops based on port or host, or to TCP keepalive packets that are not sent or received.

Fast Failover Best Practices Example

The following code sample shows how an application might set up an Aurora (PostgreSQL) driver manager. The application would call getConnection(...) when it needs a connection. A call to that function can fail to find a valid host (such as when no writer is found but targetServerType was set to "master"), and the calling application should simply retry. This can easily be wrapped into a connection pooler to avoid pushing the retry behavior onto the application. Most connection poolers allow you to specify a JDBC connection string, so your application could call into getJdbcConnectionString(...) and pass that to the connection pooler to make use of faster failover on Aurora (PostgreSQL).

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.joda.time.Duration;

public class FastFailoverDriverManager {
    private static Duration LOGIN_TIMEOUT = Duration.standardSeconds(2);
    private static Duration CONNECT_TIMEOUT = Duration.standardSeconds(2);
    private static Duration CANCEL_SIGNAL_TIMEOUT = Duration.standardSeconds(1);
    private static Duration DEFAULT_SOCKET_TIMEOUT = Duration.standardSeconds(5);

    public FastFailoverDriverManager() {
        try {
            Class.forName("org.postgresql.Driver");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }

        /*
         * RO endpoint has a TTL of 1s, we should honor that here. Setting this aggressively makes sure that when
         * the PG JDBC driver creates a new connection, it will resolve a new different RO endpoint on subsequent attempts
         * (assuming there is > 1 read node in your cluster)
         */
        java.security.Security.setProperty("networkaddress.cache.ttl" , "1");
        // If the lookup fails, default to something like small to retry
        java.security.Security.setProperty("networkaddress.cache.negative.ttl" , "3");
    }

    public Connection getConnection(String targetServerType) throws SQLException {
        return getConnection(targetServerType, DEFAULT_SOCKET_TIMEOUT);
    }

    public Connection getConnection(String targetServerType, Duration queryTimeout) throws SQLException {
        Connection conn = DriverManager.getConnection(getJdbcConnectionString(targetServerType, queryTimeout));

        /*
         * A good practice is to set socket and statement timeout to be the same thing since both
         * the client AND server will kill the query at the same time, leaving no running queries
         * on the backend
         */
        Statement st = conn.createStatement();
        st.execute("set statement_timeout to " + queryTimeout.getMillis());
        st.close();

        return conn;
    }

    private static String urlFormat = "jdbc:postgresql://%s"
            + "/postgres"
            + "?user=%s"
            + "&password=%s"
            + "&loginTimeout=%d"
            + "&connectTimeout=%d"
            + "&cancelSignalTimeout=%d"
            + "&socketTimeout=%d"
            + "&targetServerType=%s"
            + "&tcpKeepAlive=true"
            + "&ssl=true"
            + "&loadBalanceHosts=true";

    public String getJdbcConnectionString(String targetServerType, Duration queryTimeout) {
        return String.format(urlFormat,
                getFormattedEndpointList(getLocalEndpointList()),
                CredentialManager.getUsername(),
                CredentialManager.getPassword(),
                LOGIN_TIMEOUT.getStandardSeconds(),
                CONNECT_TIMEOUT.getStandardSeconds(),
                CANCEL_SIGNAL_TIMEOUT.getStandardSeconds(),
                queryTimeout.getStandardSeconds(),
                targetServerType
        );
    }

    private List<String> getLocalEndpointList() {
        /*
         * As mentioned in the best practices doc, a good idea is to read a local resource file and parse the cluster endpoints.
         * For illustration purposes, the endpoint list is hardcoded here
         */
        List<String> newEndpointList = new ArrayList<>();
        newEndpointList.add("myauroracluster.cluster-c9bfei4hjlrd.us-east-1-beta.rds.amazonaws.com:5432");
        newEndpointList.add("myauroracluster.cluster-ro-c9bfei4hjlrd.us-east-1-beta.rds.amazonaws.com:5432");
        return newEndpointList;
    }

    private static String getFormattedEndpointList(List<String> endpoints) {
        return IntStream.range(0, endpoints.size())
                .mapToObj(i -> endpoints.get(i).toString())
                .collect(Collectors.joining(","));
    }
}