Using RDS Data API - Amazon Aurora

Using RDS Data API

By using RDS Data API (Data API), you can work with a web-services interface to your Aurora DB cluster. Data API doesn't require a persistent connection to the DB cluster. Instead, it provides a secure HTTP endpoint and integration with AWS SDKs. You can use the endpoint to run SQL statements without managing connections.

Users don't need to pass credentials with calls to Data API, because Data API uses database credentials stored in AWS Secrets Manager. To store credentials in Secrets Manager, users must be granted the appropriate permissions to use Secrets Manager, and also Data API. For more information about authorizing users, see Authorizing access to RDS Data API.

You can also use Data API to integrate Amazon Aurora with other AWS applications such as AWS Lambda, AWS AppSync, and AWS Cloud9. Data API provides a more secure way to use AWS Lambda: you can access your DB cluster without configuring the Lambda function to access resources in a virtual private cloud (VPC). For more information, see AWS Lambda, AWS AppSync, and AWS Cloud9.

You can enable Data API when you create the Aurora DB cluster. You can also modify the configuration later. For more information, see Enabling RDS Data API.

After you enable Data API, you can also use the query editor to run ad hoc queries without configuring a query tool to access Aurora in a VPC. For more information, see Using the Aurora query editor.

Region and version availability

For information about the Regions and engine versions available for Data API, see the following sections.

Cluster type – Region and version availability

  • Aurora PostgreSQL provisioned and Serverless v2 – Data API with Aurora PostgreSQL Serverless v2 and provisioned

  • Aurora PostgreSQL Serverless v1 – Data API with Aurora PostgreSQL Serverless v1

  • Aurora MySQL Serverless v1 – Data API with Aurora MySQL Serverless v1

Note

Currently, Data API isn't available for provisioned or Aurora Serverless v2 DB clusters that use the MySQL engine.

If you require cryptographic modules validated by FIPS 140-2 when accessing Data API through a command line interface or an API, use a FIPS endpoint. For more information about the available FIPS endpoints, see Federal Information Processing Standard (FIPS) 140-2.

Limitations with RDS Data API

RDS Data API (Data API) has the following limitations:

  • You can only execute Data API queries on writer instances in a DB cluster. However, writer instances can accept both write and read queries.

  • With Aurora global databases, you can enable Data API on both primary and secondary DB clusters. However, until a secondary cluster is promoted to be the primary, it has no writer instance. Thus, Data API queries that you send to the secondary fail. After a promoted secondary has an available writer instance, Data API queries on that DB instance should succeed.

  • Performance Insights doesn't support monitoring database queries that you make using Data API.

  • Data API isn't supported on T DB instance classes.

  • For Aurora Serverless v2 and provisioned DB clusters that use the PostgreSQL engine, RDS Data API doesn't support some data types. For the list of supported types, see Comparison of RDS Data API with Serverless v2 and provisioned, and Aurora Serverless v1.

  • For Aurora PostgreSQL version 14 and higher databases, Data API only supports scram-sha-256 for password encryption.

  • The response size limit is 1 MiB. If the call returns more than 1 MiB of response data, the call is terminated.

  • For Aurora Serverless v1, the maximum number of requests per second is 1,000. For all other supported databases, there is no limit.

  • The Data API size limit is 64 KB per row in the result set returned by the database. Make sure that each row in a result set is 64 KB or less.
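One way to stay under the 1 MiB response limit is to read a large result set in pages. The following Python sketch is illustrative only: run_sql is a hypothetical callable that wraps ExecuteStatement and returns the records list of the response, and the LIMIT/OFFSET approach and the default page size are assumptions, not something Data API prescribes.

```python
from typing import Callable, Iterator, List


def fetch_in_pages(
    run_sql: Callable[[str], List[list]],
    base_query: str,
    page_size: int = 500,
) -> Iterator[list]:
    """Yield rows from base_query one page at a time.

    run_sql is a hypothetical wrapper around ExecuteStatement that returns
    the "records" list of the response. base_query should include an
    ORDER BY clause so that consecutive pages don't overlap.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        # page_size and offset are integers, so formatting them into the
        # SQL text cannot inject anything.
        page = run_sql(f"{base_query} LIMIT {int(page_size)} OFFSET {int(offset)}")
        yield from page
        if len(page) < page_size:
            return  # a short page means the result set is exhausted
        offset += page_size
```

Choose a page size small enough that a full page of your widest rows stays well below 1 MiB. Paging doesn't help with the separate 64 KB per-row limit.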

Comparison of RDS Data API with Serverless v2 and provisioned, and Aurora Serverless v1

The most recent enhancements to RDS Data API make it available for clusters that use recent versions of the PostgreSQL engine. Those clusters could be configured to use Aurora Serverless v2, or provisioned instance classes such as db.r6g or db.r6i.

The following comparison describes differences between RDS Data API (Data API) with Aurora PostgreSQL Serverless v2 and provisioned DB clusters, and Data API with Aurora Serverless v1 DB clusters.

Maximum number of requests per second

  • Aurora PostgreSQL Serverless v2 and provisioned – Unlimited

  • Aurora Serverless v1 – 1,000

Enabling or disabling Data API on an existing database by using the RDS API or AWS CLI

  • Aurora PostgreSQL Serverless v2 and provisioned:

      • RDS API – Use the EnableHttpEndpoint and DisableHttpEndpoint operations.

      • AWS CLI – Use the enable-http-endpoint and disable-http-endpoint operations.

  • Aurora Serverless v1:

      • RDS API – Use the ModifyDBCluster operation, and specify true or false, as applicable, for the EnableHttpEndpoint parameter.

      • AWS CLI – Use the modify-db-cluster operation with the --enable-http-endpoint or --no-enable-http-endpoint option, as applicable.

CloudTrail events

  • Aurora PostgreSQL Serverless v2 and provisioned – Events from Data API calls are data events. These events are automatically excluded in a trail by default. For more information, see Including Data API events in an AWS CloudTrail trail.

  • Aurora Serverless v1 – Events from Data API calls are management events. These events are automatically included in a trail by default. For more information, see Excluding Data API events from an AWS CloudTrail trail (Aurora Serverless v1 only).

Multistatement support

  • Aurora PostgreSQL Serverless v2 and provisioned – Multistatements aren't supported. In this case, Data API throws ValidationException: Multistatements aren't supported.

  • Aurora Serverless v1 – For Aurora PostgreSQL, multistatements return only the first query response. For Aurora MySQL, multistatements aren't supported.

BatchExecuteStatement

  • Aurora PostgreSQL Serverless v2 and provisioned – The generated fields object in the update result is empty.

  • Aurora Serverless v1 – The generated fields object in the update result includes inserted values.

ExecuteSQL

  • Aurora PostgreSQL Serverless v2 and provisioned – Not supported

  • Aurora Serverless v1 – Deprecated
ExecuteStatement

For Aurora PostgreSQL Serverless v2 and provisioned DB clusters, ExecuteStatement doesn't support retrieving multidimensional array columns. In this case, Data API throws UnsupportedResultException.

Data API doesn't support some data types, such as geometric and monetary types. In this case, Data API throws UnsupportedResultException: The result contains the unsupported data type data_type.

Only the following types are supported:

  • BOOL

  • BYTEA

  • DATE

  • CIDR

  • DECIMAL, NUMERIC

  • ENUM

  • FLOAT8, DOUBLE PRECISION

  • INET

  • INT, INT4, SERIAL

  • INT2, SMALLINT, SMALLSERIAL

  • INT8, BIGINT, BIGSERIAL

  • JSONB, JSON

  • REAL, FLOAT

  • TEXT, CHAR(N), VARCHAR, NAME

  • TIME

  • TIMESTAMP

  • UUID

  • VECTOR

Only the following array types are supported:

  • BOOL[], BIT[]

  • DATE[]

  • DECIMAL[], NUMERIC[]

  • FLOAT8[], DOUBLE PRECISION[]

  • INT[], INT4[]

  • INT2[]

  • INT8[], BIGINT[]

  • JSON[]

  • REAL[], FLOAT[]

  • TEXT[], CHAR(N)[], VARCHAR[], NAME[]

  • TIME[]

  • TIMESTAMP[]

  • UUID[]

For Aurora Serverless v1 DB clusters, ExecuteStatement supports retrieving multidimensional array columns and all advanced data types.

Authorizing access to RDS Data API

Users can invoke RDS Data API (Data API) operations only if they are authorized to do so. You can give a user permission to use Data API by attaching an AWS Identity and Access Management (IAM) policy that defines their privileges. You can also attach the policy to a role if you're using IAM roles. An AWS managed policy, AmazonRDSDataFullAccess, includes permissions for Data API.

The AmazonRDSDataFullAccess policy also includes permissions for the user to get the value of a secret from AWS Secrets Manager. Users need to use Secrets Manager to store secrets that they can use in their calls to Data API. Using secrets means that users don't need to include database credentials for the resources that they target in their calls to Data API. Data API transparently calls Secrets Manager, which allows (or denies) the user's request for the secret. For information about setting up secrets to use with Data API, see Storing database credentials in AWS Secrets Manager.

The AmazonRDSDataFullAccess policy provides complete access (through Data API) to resources. You can narrow the scope by defining your own policies that specify the Amazon Resource Name (ARN) of a resource.

For example, the following policy shows the minimum required permissions for a user to access Data API for the DB cluster identified by its ARN. The policy includes the permissions needed to access Secrets Manager and to authorize the user on the DB cluster.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SecretsManagerDbCredentialsAccess",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": "arn:aws:secretsmanager:*:*:secret:rds-db-credentials/*"
        },
        {
            "Sid": "RDSDataServiceAccess",
            "Effect": "Allow",
            "Action": [
                "rds-data:BatchExecuteStatement",
                "rds-data:BeginTransaction",
                "rds-data:CommitTransaction",
                "rds-data:ExecuteStatement",
                "rds-data:RollbackTransaction"
            ],
            "Resource": "arn:aws:rds:us-east-2:111122223333:cluster:prod"
        }
    ]
}

We recommend that you use a specific ARN for the "Resource" element in your policy statements (as shown in the example) rather than a wildcard (*).
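If you generate policies for several clusters, a small helper can keep both statements scoped to specific ARNs. The following Python sketch is one possible approach: the function name data_api_policy and the example secret ARN are hypothetical, and the statement IDs and actions are copied from the policy shown earlier.

```python
import json


def data_api_policy(cluster_arn: str, secret_arn: str) -> dict:
    """Build an identity-based policy scoped to one cluster and one secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "SecretsManagerDbCredentialsAccess",
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": secret_arn,
            },
            {
                "Sid": "RDSDataServiceAccess",
                "Effect": "Allow",
                "Action": [
                    "rds-data:BatchExecuteStatement",
                    "rds-data:BeginTransaction",
                    "rds-data:CommitTransaction",
                    "rds-data:ExecuteStatement",
                    "rds-data:RollbackTransaction",
                ],
                "Resource": cluster_arn,
            },
        ],
    }


if __name__ == "__main__":
    policy = data_api_policy(
        "arn:aws:rds:us-east-2:111122223333:cluster:prod",
        # Hypothetical secret ARN, used only for illustration.
        "arn:aws:secretsmanager:us-east-2:111122223333:secret:rds-db-credentials/prod",
    )
    print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the IAM policy editor or passed to whatever tooling you use to manage policies.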

Working with tag-based authorization

RDS Data API (Data API) and Secrets Manager both support tag-based authorization. Tags are key-value pairs that label a resource, such as an RDS cluster, with an additional string value, for example:

  • environment:production

  • environment:development

You can apply tags to your resources for cost allocation, operations support, access control, and many other reasons. (If you don't already have tags on your resources and you want to apply them, you can learn more at Tagging Amazon RDS resources.) You can use the tags in your policy statements to limit access to the RDS clusters that are labeled with these tags. As an example, an Aurora DB cluster might have tags that identify its environment as either production or development.

The following example shows how you can use tags in your policy statements. This statement requires that both the cluster and the secret passed in the Data API request have an environment:production tag.

Here's how the policy is applied: When a user makes a call using Data API, the request is sent to the service. Data API first verifies that the cluster ARN passed in the request is tagged with environment:production. It then calls Secrets Manager to retrieve the value of the user's secret in the request. Secrets Manager also verifies that the user's secret is tagged with environment:production. If so, Data API then uses the retrieved value for the user's DB password. Finally, if that's also correct, the Data API request is invoked successfully for the user.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SecretsManagerDbCredentialsAccess",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": "arn:aws:secretsmanager:*:*:secret:rds-db-credentials/*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/environment": [
                        "production"
                    ]
                }
            }
        },
        {
            "Sid": "RDSDataServiceAccess",
            "Effect": "Allow",
            "Action": [
                "rds-data:*"
            ],
            "Resource": "arn:aws:rds:us-east-2:111122223333:cluster:*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/environment": [
                        "production"
                    ]
                }
            }
        }
    ]
}

The example shows separate actions for rds-data and secretsmanager for Data API and Secrets Manager. However, you can combine actions and define tag conditions in many different ways to support your specific use cases. For more information, see Using identity-based policies (IAM policies) for Secrets Manager.
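The two tag checks described above can be modeled in a few lines of Python. This is a toy model for reasoning about the policy, not how IAM evaluates requests internally; the function name and the tag dictionaries are hypothetical.

```python
def is_request_allowed(
    cluster_tags: dict,
    secret_tags: dict,
    required_key: str = "environment",
    required_value: str = "production",
) -> bool:
    """Toy model of the two StringEquals conditions in the example policy.

    The request passes only if BOTH the cluster and the secret carry the
    required tag. The comparison is exact and case-sensitive, like the
    StringEquals condition operator.
    """
    cluster_ok = cluster_tags.get(required_key) == required_value
    secret_ok = secret_tags.get(required_key) == required_value
    return cluster_ok and secret_ok


if __name__ == "__main__":
    prod = {"environment": "production"}
    dev = {"environment": "development"}
    print(is_request_allowed(prod, prod))  # both resources tagged production
    print(is_request_allowed(prod, dev))   # secret tagged development
```

The model makes one consequence of the policy easy to see: tagging the cluster alone isn't enough, because a production cluster paired with a development secret is still denied.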

In the "Condition" element of the policy, you can choose tag keys from among the following:

  • aws:TagKeys

  • aws:ResourceTag/${TagKey}

To learn more about resource tags and how to use aws:TagKeys, see Controlling access to AWS resources using resource tags.

Note

Both Data API and AWS Secrets Manager authorize users. If you don't have permissions for all actions defined in a policy, you get an AccessDeniedException error.

Storing database credentials in AWS Secrets Manager

When you call RDS Data API (Data API), you pass credentials for the Aurora DB cluster by using a secret in Secrets Manager. To pass credentials in this way, you specify the name of the secret or the Amazon Resource Name (ARN) of the secret.

To store DB cluster credentials in a secret
  1. Use Secrets Manager to create a secret that contains credentials for the Aurora DB cluster.

    For instructions, see Create a database secret in the AWS Secrets Manager User Guide.

  2. Use the Secrets Manager console to view the details for the secret you created, or run the aws secretsmanager describe-secret AWS CLI command.

    Note the name and ARN of the secret. You can use them in calls to Data API.
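The procedure above can also be scripted. The following Python sketch assumes a Secrets Manager client object such as the one returned by boto3.client("secretsmanager"), and it assumes that the secret stores the credentials under the username and password keys. The helper names are hypothetical.

```python
import json


def build_db_secret(username: str, password: str) -> str:
    """Return the JSON text to store as the secret value.

    Assumption: the username and password keys are the ones read from the
    secret. Check the secret structure for your engine before relying on it.
    """
    return json.dumps({"username": username, "password": password})


def store_db_secret(secrets_client, name: str, username: str, password: str) -> str:
    """Create the secret and return its ARN for use in Data API calls.

    secrets_client is expected to behave like a boto3 Secrets Manager
    client, whose create_secret call returns a dict that includes "ARN".
    """
    response = secrets_client.create_secret(
        Name=name,
        SecretString=build_db_secret(username, password),
    )
    return response["ARN"]
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, so you can test it without creating real secrets.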

For more information about using Secrets Manager, see the AWS Secrets Manager User Guide.

To understand how Amazon Aurora manages identity and access management, see How Amazon Aurora works with IAM.

For more information about creating an IAM policy, see Creating IAM Policies in the IAM User Guide. For information about adding an IAM policy to a user, see Adding and Removing IAM Identity Permissions in the IAM User Guide.

Enabling RDS Data API

To use RDS Data API (Data API), enable it for your Aurora DB cluster. You can enable Data API when you create or modify the DB cluster.

Note

For Aurora PostgreSQL, Data API is supported with Aurora Serverless v2, Aurora Serverless v1, and provisioned databases. For Aurora MySQL, Data API is only supported with Aurora Serverless v1 databases.

Enabling RDS Data API when you create a database

While you are creating a database that supports RDS Data API (Data API), you can enable this feature. The following procedures describe how to do so when you use the AWS Management Console, the AWS CLI, or the RDS API.

To enable Data API when you create a DB cluster, select the Enable the RDS Data API checkbox in the Connectivity section of the Create database page, as in the following screenshot.

The Connectivity section on the Create database page, with the Enable the RDS Data API checkbox selected.

For instructions on how to create an Aurora DB cluster that can use the RDS Data API, see the following:

To enable Data API while you're creating an Aurora DB cluster, run the create-db-cluster AWS CLI command with the --enable-http-endpoint option.

The following example creates an Aurora PostgreSQL DB cluster with Data API enabled.

For Linux, macOS, or Unix:

aws rds create-db-cluster \
    --db-cluster-identifier my-pg-cluster \
    --engine aurora-postgresql \
    --enable-http-endpoint

For Windows:

aws rds create-db-cluster ^
    --db-cluster-identifier my-pg-cluster ^
    --engine aurora-postgresql ^
    --enable-http-endpoint

To enable Data API while you're creating an Aurora DB cluster, use the CreateDBCluster operation with the value of the EnableHttpEndpoint parameter set to true.

Enabling RDS Data API on an existing database

You can modify a DB cluster that supports RDS Data API (Data API) to enable or disable this feature.

Enabling or disabling Data API (Aurora PostgreSQL Serverless v2 and provisioned)

Use the following procedures to enable or disable Data API on Aurora PostgreSQL Serverless v2 and provisioned databases. To enable or disable Data API on Aurora Serverless v1 databases, use the procedures in Enabling or disabling Data API (Aurora Serverless v1 only).

You can enable or disable Data API by using the RDS console for a DB cluster that supports this feature. To do so, open the cluster details page of the database on which you want to enable or disable Data API, and on the Connectivity & security tab, go to the RDS Data API section. This section displays the status of Data API, and allows you to enable or disable it.

The following screenshot shows that the RDS Data API isn't enabled.

The RDS Data API section on the Connectivity and security tab of the details page for a DB cluster. The status of Data API displays as disabled, and the Enable the RDS Data API button is present.

To enable or disable Data API on an existing database, run the enable-http-endpoint or disable-http-endpoint AWS CLI command, and specify the ARN of your DB cluster.

The following example enables Data API.

For Linux, macOS, or Unix:

aws rds enable-http-endpoint \
    --resource-arn cluster_arn

For Windows:

aws rds enable-http-endpoint ^
    --resource-arn cluster_arn

To enable or disable Data API on an existing database, use the EnableHttpEndpoint and DisableHttpEndpoint operations.

Enabling or disabling Data API (Aurora Serverless v1 only)

Use the following procedures to enable or disable Data API on existing Aurora Serverless v1 databases. To enable or disable Data API on Aurora PostgreSQL Serverless v2 and provisioned databases, use the procedures in Enabling or disabling Data API (Aurora PostgreSQL Serverless v2 and provisioned).

When you modify an Aurora Serverless v1 DB cluster, you enable Data API in the RDS console's Connectivity section.

The following screenshot shows the enabled Data API when modifying an Aurora DB cluster.

The Connectivity section on the Modify DB Cluster page, the Data API checkbox is selected.

For instructions on how to modify an Aurora Serverless v1 DB cluster, see Modifying an Aurora Serverless v1 DB cluster.

To enable or disable Data API, run the modify-db-cluster AWS CLI command with the --enable-http-endpoint or --no-enable-http-endpoint option, as applicable.

The following example enables Data API on sample-cluster.

For Linux, macOS, or Unix:

aws rds modify-db-cluster \
    --db-cluster-identifier sample-cluster \
    --enable-http-endpoint

For Windows:

aws rds modify-db-cluster ^
    --db-cluster-identifier sample-cluster ^
    --enable-http-endpoint

To enable or disable Data API, use the ModifyDBCluster operation, and set the value of EnableHttpEndpoint to true or false, as applicable.
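Because the two cluster families use different operations, a script that manages both needs to branch. The following Python sketch shows one way to do that. It assumes an RDS client object such as the one returned by boto3.client("rds"); the function name set_data_api and its arguments are hypothetical.

```python
def set_data_api(
    rds_client,
    *,
    enabled: bool,
    serverless_v1: bool,
    cluster_arn: str = "",
    cluster_id: str = "",
) -> None:
    """Enable or disable Data API with the operation that fits the cluster.

    Aurora Serverless v1 goes through ModifyDBCluster, which takes the
    cluster identifier. Aurora PostgreSQL Serverless v2 and provisioned
    clusters use EnableHttpEndpoint or DisableHttpEndpoint, which take
    the cluster ARN.
    """
    if serverless_v1:
        if not cluster_id:
            raise ValueError("cluster_id is required for Aurora Serverless v1")
        rds_client.modify_db_cluster(
            DBClusterIdentifier=cluster_id,
            EnableHttpEndpoint=enabled,
        )
        return
    if not cluster_arn:
        raise ValueError("cluster_arn is required for Serverless v2 and provisioned")
    if enabled:
        rds_client.enable_http_endpoint(ResourceArn=cluster_arn)
    else:
        rds_client.disable_http_endpoint(ResourceArn=cluster_arn)
```

For example, set_data_api(client, enabled=True, serverless_v1=True, cluster_id="sample-cluster") corresponds to the modify-db-cluster command shown earlier.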

Creating an Amazon VPC endpoint for RDS Data API (AWS PrivateLink)

Amazon VPC enables you to launch AWS resources, such as Aurora DB clusters and applications, into a virtual private cloud (VPC). AWS PrivateLink provides private connectivity between VPCs and AWS services with high security on the Amazon network. Using AWS PrivateLink, you can create Amazon VPC endpoints, which enable you to connect to services across different accounts and VPCs based on Amazon VPC. For more information about AWS PrivateLink, see VPC Endpoint Services (AWS PrivateLink) in the Amazon Virtual Private Cloud User Guide.

You can call RDS Data API (Data API) with Amazon VPC endpoints. Using an Amazon VPC endpoint keeps traffic between applications in your Amazon VPC and Data API in the AWS network, without using public IP addresses. Amazon VPC endpoints can help you meet compliance and regulatory requirements related to limiting public internet connectivity. For example, if you use an Amazon VPC endpoint, you can keep traffic between an application running on an Amazon EC2 instance and Data API in the VPCs that contain them.

After you create the Amazon VPC endpoint, you can start using it without making any code or configuration changes in your application.

To create an Amazon VPC endpoint for Data API
  1. Sign in to the AWS Management Console and open the Amazon VPC console at https://console.aws.amazon.com/vpc/.

  2. Choose Endpoints, and then choose Create Endpoint.

  3. On the Create Endpoint page, for Service category, choose AWS services. For Service Name, choose rds-data.

    Create an Amazon VPC endpoint for Data API
  4. For VPC, choose the VPC to create the endpoint in.

    Choose the VPC that contains the application that makes Data API calls.

  5. For Subnets, choose the subnet for each Availability Zone (AZ) used by the AWS service that is running your application.

    Choose subnets for the Amazon VPC endpoint

    To create an Amazon VPC endpoint, specify the private IP address range in which the endpoint will be accessible. To do this, choose the subnet for each Availability Zone. Doing so restricts the VPC endpoint to the private IP address range specific to each Availability Zone and also creates an Amazon VPC endpoint in each Availability Zone.

  6. For Enable DNS name, select Enable for this endpoint.

    Enable DNS name for the Amazon VPC endpoint

    Private DNS resolves the standard Data API DNS hostname (https://rds-data.region.amazonaws.com) to the private IP addresses associated with the DNS hostname specific to your Amazon VPC endpoint. As a result, you can access the Data API VPC endpoint using the AWS CLI or AWS SDKs without making any code or configuration changes to update Data API's endpoint URL.

  7. For Security group, choose a security group to associate with the Amazon VPC endpoint.

    Choose the security group that allows access to the AWS service that is running your application. For example, if an Amazon EC2 instance is running your application, choose the security group that allows access to the Amazon EC2 instance. The security group enables you to control the traffic to the Amazon VPC endpoint from resources in your VPC.

  8. For Policy, choose Full Access to allow anyone inside the Amazon VPC to access the Data API through this endpoint. Or choose Custom to specify a policy that limits access.

    If you choose Custom, enter the policy in the policy creation tool.

  9. Choose Create endpoint.

After the endpoint is created, choose the link in the AWS Management Console to view the endpoint details.

Link to the Amazon VPC endpoint details

The endpoint Details tab shows the DNS hostnames that were generated while creating the Amazon VPC endpoint.

You can use the standard endpoint (rds-data.region.amazonaws.com) or one of the VPC-specific endpoints to call the Data API within the Amazon VPC. The standard Data API endpoint automatically routes to the Amazon VPC endpoint. This routing occurs because the Private DNS hostname was enabled when the Amazon VPC endpoint was created.

When you use an Amazon VPC endpoint in a Data API call, all traffic between your application and Data API remains in the Amazon VPCs that contain them. You can use an Amazon VPC endpoint for any type of Data API call. For information about calling Data API, see Calling RDS Data API.
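The console steps above map onto a single CreateVpcEndpoint request. The following Python sketch only builds the request arguments. It assumes that the service name follows the usual com.amazonaws.region.service pattern (com.amazonaws.region.rds-data here), which you should confirm against the Service Name list in the console. The function name and the example IDs are hypothetical.

```python
from typing import Iterable


def rds_data_endpoint_request(
    region: str,
    vpc_id: str,
    subnet_ids: Iterable[str],
    security_group_ids: Iterable[str],
) -> dict:
    """Build keyword arguments for an interface VPC endpoint for Data API."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # Assumed naming pattern; verify the exact name for your Region.
        "ServiceName": f"com.amazonaws.{region}.rds-data",
        # One subnet per Availability Zone that your application uses.
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS lets the standard Data API hostname resolve to the
        # endpoint's private IP addresses.
        "PrivateDnsEnabled": True,
    }


if __name__ == "__main__":
    request = rds_data_endpoint_request(
        "us-east-1",
        "vpc-0123456789abcdef0",          # hypothetical VPC ID
        ["subnet-aaaa1111", "subnet-bbbb2222"],
        ["sg-0123456789abcdef0"],
    )
    print(request)
```

With boto3, the resulting dictionary could be passed as boto3.client("ec2").create_vpc_endpoint(**request). Omitting a policy document corresponds to choosing Full Access in the console.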

Calling RDS Data API

With RDS Data API (Data API) enabled on your Aurora DB cluster, you can run SQL statements on the Aurora DB cluster by using Data API or the AWS CLI. Data API supports the programming languages supported by the AWS SDKs. For more information, see Tools to build on AWS.

Data API operations reference

Data API provides the following operations to perform SQL statements.

Data API operation (AWS CLI command) – Description

  • ExecuteStatement (aws rds-data execute-statement) – Runs a SQL statement on a database.

  • BatchExecuteStatement (aws rds-data batch-execute-statement) – Runs a batch SQL statement over an array of data for bulk update and insert operations. You can run a data manipulation language (DML) statement with an array of parameter sets. A batch SQL statement can provide a significant performance improvement over individual insert and update statements.

You can use either operation to run individual SQL statements or to run transactions. For transactions, Data API provides the following operations.

Data API operation (AWS CLI command) – Description

  • BeginTransaction (aws rds-data begin-transaction) – Starts a SQL transaction.

  • CommitTransaction (aws rds-data commit-transaction) – Ends a SQL transaction and commits the changes.

  • RollbackTransaction (aws rds-data rollback-transaction) – Performs a rollback of a transaction.

The operations for performing SQL statements and supporting transactions have the following common Data API parameters and AWS CLI options. Some operations support other parameters or options.

Data API operation parameter (AWS CLI command option) – Required – Description

  • resourceArn (--resource-arn) – Yes – The Amazon Resource Name (ARN) of the Aurora DB cluster.

  • secretArn (--secret-arn) – Yes – The name or ARN of the secret that enables access to the DB cluster.

You can use parameters in Data API calls to ExecuteStatement and BatchExecuteStatement, and when you run the AWS CLI commands execute-statement and batch-execute-statement. To use a parameter, you specify a name-value pair in the SqlParameter data type. You specify the value with the Field data type. The following table maps Java Database Connectivity (JDBC) data types to the data types that you specify in Data API calls.

JDBC data type – Data API data type

  • INTEGER, TINYINT, SMALLINT, BIGINT – LONG (or STRING)

  • FLOAT, REAL, DOUBLE – DOUBLE

  • DECIMAL – STRING

  • BOOLEAN, BIT – BOOLEAN

  • BLOB, BINARY, LONGVARBINARY, VARBINARY – BLOB

  • CLOB – STRING

  • Other types (including types related to date and time) – STRING

Note

You can specify the LONG or STRING data type in your Data API call for LONG values returned by the database. We recommend that you do so to avoid losing precision for extremely large numbers, which can happen when you work with JavaScript.
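The mapping above can be applied mechanically when you build parameters in code. The following Python sketch converts native Python values into the Field shape that the AWS SDKs use (longValue, doubleValue, booleanValue, blobValue, stringValue, isNull). The helper names to_field and to_parameters are hypothetical.

```python
def to_field(value) -> dict:
    """Convert a Python value to a Data API Field dictionary."""
    if value is None:
        return {"isNull": True}
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return {"booleanValue": value}
    if isinstance(value, int):
        return {"longValue": value}
    if isinstance(value, float):
        return {"doubleValue": value}
    if isinstance(value, (bytes, bytearray)):
        return {"blobValue": bytes(value)}
    # Everything else, including dates and decimals, travels as a string.
    return {"stringValue": str(value)}


def to_parameters(values: dict) -> list:
    """Build a list of SqlParameter dictionaries from name/value pairs."""
    return [{"name": name, "value": to_field(v)} for name, v in values.items()]


if __name__ == "__main__":
    print(to_parameters({"id": 1, "val": "value1"}))
```

The resulting list can be passed as the parameters argument of ExecuteStatement. Values that need a type hint, such as dates, still need the typeHint element added separately.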

Certain types, such as DECIMAL and TIME, require a hint so that Data API passes String values to the database as the correct type. To use a hint, include values for typeHint in the SqlParameter data type. The possible values for typeHint are the following:

  • DATE – The corresponding String parameter value is sent as an object of DATE type to the database. The accepted format is YYYY-MM-DD.

  • DECIMAL – The corresponding String parameter value is sent as an object of DECIMAL type to the database.

  • JSON – The corresponding String parameter value is sent as an object of JSON type to the database.

  • TIME – The corresponding String parameter value is sent as an object of TIME type to the database. The accepted format is HH:MM:SS[.FFF].

  • TIMESTAMP – The corresponding String parameter value is sent as an object of TIMESTAMP type to the database. The accepted format is YYYY-MM-DD HH:MM:SS[.FFF].

  • UUID – The corresponding String parameter value is sent as an object of UUID type to the database.

    Note

    Currently, Data API doesn't support arrays of universally unique identifiers (UUIDs).

Note

For Amazon Aurora PostgreSQL, Data API always returns the Aurora PostgreSQL data type TIMESTAMPTZ in UTC time zone.
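The type hints listed above pair naturally with Python's standard date, time, decimal, and UUID types. The following sketch formats each value in the accepted string format and attaches the matching typeHint. The function name hinted_parameter is hypothetical, and for simplicity the sketch drops fractional seconds even though the formats allow them.

```python
import datetime
import uuid
from decimal import Decimal


def hinted_parameter(name: str, value) -> dict:
    """Build a SqlParameter whose string value carries a typeHint."""
    # datetime must be tested before date: datetime is a subclass of date.
    if isinstance(value, datetime.datetime):
        text, hint = value.strftime("%Y-%m-%d %H:%M:%S"), "TIMESTAMP"
    elif isinstance(value, datetime.date):
        text, hint = value.strftime("%Y-%m-%d"), "DATE"
    elif isinstance(value, datetime.time):
        text, hint = value.strftime("%H:%M:%S"), "TIME"
    elif isinstance(value, Decimal):
        # The "f" format avoids scientific notation such as 1E+2.
        text, hint = format(value, "f"), "DECIMAL"
    elif isinstance(value, uuid.UUID):
        text, hint = str(value), "UUID"
    else:
        raise TypeError(f"no typeHint mapping for {type(value).__name__}")
    return {"name": name, "value": {"stringValue": text}, "typeHint": hint}


if __name__ == "__main__":
    print(hinted_parameter("created", datetime.date(2024, 1, 31)))
    print(hinted_parameter("price", Decimal("19.99")))
```

JSON values aren't covered here because a JSON document is usually already a string in application code. In that case you would set typeHint to JSON directly.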

Calling RDS Data API with the AWS CLI

You can call RDS Data API (Data API) using the AWS CLI.

The following examples use the AWS CLI for Data API. For more information, see AWS CLI reference for the Data API.

In each example, replace the Amazon Resource Name (ARN) for the DB cluster with the ARN for your Aurora DB cluster. Also, replace the secret ARN with the ARN of the secret in Secrets Manager that allows access to the DB cluster.

Note

The AWS CLI can format responses in JSON.

Starting a SQL transaction

You can start a SQL transaction using the aws rds-data begin-transaction CLI command. The call returns a transaction identifier.

Important

Within Data API, a transaction times out if there are no calls that use its transaction ID in three minutes. If a transaction times out before it's committed, Data API rolls it back automatically.

MySQL data definition language (DDL) statements inside a transaction cause an implicit commit. We recommend that you run each MySQL DDL statement in a separate execute-statement command with the --continue-after-timeout option.

In addition to the common options, specify the --database option, which provides the name of the database.

For example, the following CLI command starts a SQL transaction.

For Linux, macOS, or Unix:

aws rds-data begin-transaction --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" \
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret"

For Windows:

aws rds-data begin-transaction --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" ^
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret"

The following is an example of the response.

{ "transactionId": "ABC1234567890xyz" }
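In application code, the begin, commit, and rollback calls are easier to keep paired if they sit behind a context manager. The following Python sketch assumes a client object that behaves like boto3.client("rds-data"). The transaction helper itself is hypothetical, not part of any SDK.

```python
from contextlib import contextmanager


@contextmanager
def transaction(client, resource_arn: str, secret_arn: str, database: str):
    """Begin a transaction, commit on success, roll back on any exception.

    Yields the transaction ID so that the caller can pass it to
    ExecuteStatement or BatchExecuteStatement calls inside the block.
    """
    common = {"resourceArn": resource_arn, "secretArn": secret_arn}
    tx_id = client.begin_transaction(database=database, **common)["transactionId"]
    try:
        yield tx_id
    except Exception:
        # Undo everything done under this transaction ID, then re-raise.
        client.rollback_transaction(transactionId=tx_id, **common)
        raise
    else:
        client.commit_transaction(transactionId=tx_id, **common)
```

Inside the with block, pass the yielded ID as transactionId on each statement. Keep the block short, because an idle transaction times out after three minutes.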

Running a SQL statement

You can run a SQL statement using the aws rds-data execute-statement CLI command.

You can run the SQL statement in a transaction by specifying the transaction identifier with the --transaction-id option. You can start a transaction using the aws rds-data begin-transaction CLI command. You can end and commit a transaction using the aws rds-data commit-transaction CLI command.

Important

If you don't specify the --transaction-id option, changes that result from the call are committed automatically.

In addition to the common options, specify the following options:

  • --sql (required) – A SQL statement to run on the DB cluster.

  • --transaction-id (optional) – The identifier of a transaction that was started using the begin-transaction CLI command. Specify the transaction ID of the transaction that you want to include the SQL statement in.

  • --parameters (optional) – The parameters for the SQL statement.

  • --include-result-metadata | --no-include-result-metadata (optional) – A value that indicates whether to include metadata in the results. The default is --no-include-result-metadata.

  • --database (optional) – The name of the database.

    The --database option might not work when you run a SQL statement after running --sql "use database_name;" in the previous request. We recommend that you use the --database option instead of running --sql "use database_name;" statements.

  • --continue-after-timeout | --no-continue-after-timeout (optional) – A value that indicates whether to continue running the statement after the call exceeds the Data API timeout interval of 45 seconds. The default is --no-continue-after-timeout.

    For data definition language (DDL) statements, we recommend continuing to run the statement after the call times out to avoid errors and the possibility of corrupted data structures.

  • --format-records-as "JSON"|"NONE" – An optional value that specifies whether to format the result set as a JSON string. The default is "NONE". For usage information about processing JSON result sets, see Processing RDS Data API query results in JSON format.
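The CLI options above have direct counterparts in the SDK request for ExecuteStatement. The following Python sketch assembles those arguments with the same defaults as the CLI. The function name execute_statement_args is hypothetical; the dictionary keys are the request parameter names used by the AWS SDKs.

```python
from typing import Optional


def execute_statement_args(
    resource_arn: str,
    secret_arn: str,
    sql: str,
    *,
    database: Optional[str] = None,
    transaction_id: Optional[str] = None,
    parameters: Optional[list] = None,
    include_result_metadata: bool = False,
    continue_after_timeout: bool = False,
    format_records_as: str = "NONE",
) -> dict:
    """Build keyword arguments for an ExecuteStatement call."""
    if format_records_as not in ("JSON", "NONE"):
        raise ValueError('format_records_as must be "JSON" or "NONE"')
    args = {
        "resourceArn": resource_arn,
        "secretArn": secret_arn,
        "sql": sql,
        "includeResultMetadata": include_result_metadata,
        "continueAfterTimeout": continue_after_timeout,
        "formatRecordsAs": format_records_as,
    }
    # Optional values are left out entirely rather than sent as None.
    if database is not None:
        args["database"] = database
    if transaction_id is not None:
        args["transactionId"] = transaction_id
    if parameters is not None:
        args["parameters"] = parameters
    return args
```

With boto3, the result could be used as boto3.client("rds-data").execute_statement(**args). For a DDL statement, pass continue_after_timeout=True so that the statement keeps running if the call exceeds the 45-second timeout.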

The DB cluster returns a response for the call.

Note

The response size limit is 1 MiB. If the call returns more than 1 MiB of response data, the call is terminated.

For Aurora Serverless v1, the maximum number of requests per second is 1,000. For all other supported databases, there is no limit.

For example, the following CLI command runs a single SQL statement and omits the metadata in the results (the default).

For Linux, macOS, or Unix:

aws rds-data execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" \
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" \
    --sql "select * from mytable"

For Windows:

aws rds-data execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" ^
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" ^
    --sql "select * from mytable"

The following is an example of the response.

{
    "numberOfRecordsUpdated": 0,
    "records": [
        [ { "longValue": 1 }, { "stringValue": "ValueOne" } ],
        [ { "longValue": 2 }, { "stringValue": "ValueTwo" } ],
        [ { "longValue": 3 }, { "stringValue": "ValueThree" } ]
    ]
}
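Each cell in the response is a one-key object such as longValue or stringValue, so most callers unwrap the records before using them. The following Python sketch shows one way to do that for scalar fields. The helper names are hypothetical, and array values (arrayValue) are deliberately not handled.

```python
SCALAR_KEYS = ("longValue", "stringValue", "doubleValue", "booleanValue", "blobValue")


def field_value(field: dict):
    """Unwrap a single Field dictionary into a plain Python value."""
    if field.get("isNull"):
        return None
    for key in SCALAR_KEYS:
        if key in field:
            return field[key]
    raise ValueError(f"unsupported field shape: {sorted(field)}")


def decode_records(response: dict) -> list:
    """Turn the "records" list of a response into a list of tuples."""
    return [
        tuple(field_value(field) for field in row)
        for row in response.get("records", [])
    ]


if __name__ == "__main__":
    response = {
        "numberOfRecordsUpdated": 0,
        "records": [
            [{"longValue": 1}, {"stringValue": "ValueOne"}],
            [{"longValue": 2}, {"stringValue": "ValueTwo"}],
            [{"longValue": 3}, {"stringValue": "ValueThree"}],
        ],
    }
    for row in decode_records(response):
        print(row)
```

Column names aren't part of this response because metadata is omitted by default. To get them, request the result metadata or use the JSON record format.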

The following CLI command runs a single SQL statement in a transaction by specifying the --transaction-id option.

For Linux, macOS, or Unix:

aws rds-data execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" \
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" \
    --sql "update mytable set quantity=5 where id=201" --transaction-id "ABC1234567890xyz"

For Windows:

aws rds-data execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" ^
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" ^
    --sql "update mytable set quantity=5 where id=201" --transaction-id "ABC1234567890xyz"

The following is an example of the response.

{ "numberOfRecordsUpdated": 1 }

The following CLI command runs a single SQL statement with parameters.

For Linux, macOS, or Unix:

aws rds-data execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" \
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" \
    --sql "insert into mytable values (:id, :val)" --parameters "[{\"name\": \"id\", \"value\": {\"longValue\": 1}},{\"name\": \"val\", \"value\": {\"stringValue\": \"value1\"}}]"

For Windows:

aws rds-data execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" ^
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" ^
    --sql "insert into mytable values (:id, :val)" --parameters "[{\"name\": \"id\", \"value\": {\"longValue\": 1}},{\"name\": \"val\", \"value\": {\"stringValue\": \"value1\"}}]"

The following is an example of the response.

{ "numberOfRecordsUpdated": 1 }

The following CLI command runs a data definition language (DDL) SQL statement. The DDL statement renames column job to column role.

Important

For DDL statements, we recommend continuing to run the statement after the call times out. When a DDL statement terminates before it is finished running, it can result in errors and possibly corrupted data structures. To continue running a statement after a call exceeds the RDS Data API timeout interval of 45 seconds, specify the --continue-after-timeout option.

For Linux, macOS, or Unix:

aws rds-data execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" \
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" \
    --sql "alter table mytable change column job role varchar(100)" --continue-after-timeout

For Windows:

aws rds-data execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" ^
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" ^
    --sql "alter table mytable change column job role varchar(100)" --continue-after-timeout

The following is an example of the response.

{
    "generatedFields": [],
    "numberOfRecordsUpdated": 0
}

Note

The generatedFields data isn't supported by Aurora PostgreSQL. To get the values of generated fields, use the RETURNING clause. For more information, see Returning data from modified rows in the PostgreSQL documentation.

Running a batch SQL statement over an array of data

You can run a batch SQL statement over an array of data by using the aws rds-data batch-execute-statement CLI command. You can use this command to perform a bulk import or update operation.

You can run the SQL statement in a transaction by specifying the transaction identifier with the --transaction-id option. You can start a transaction by using the aws rds-data begin-transaction CLI command. You can end and commit a transaction by using the aws rds-data commit-transaction CLI command.

Important

If you don't specify the --transaction-id option, changes that result from the call are committed automatically.

In addition to the common options, specify the following options:

  • --sql (required) – A SQL statement to run on the DB cluster.

    Tip

    For MySQL-compatible statements, don't include a semicolon at the end of the --sql parameter. A trailing semicolon might cause a syntax error.

  • --transaction-id (optional) – The identifier of a transaction that was started using the begin-transaction CLI command. Specify the transaction ID of the transaction that you want to include the SQL statement in.

  • --parameter-sets (optional) – The parameter sets for the batch operation.

  • --database (optional) – The name of the database.

The DB cluster returns a response to the call.

Note

There isn't a fixed upper limit on the number of parameter sets. However, the maximum size of the HTTP request submitted through Data API is 4 MiB. If the request exceeds this limit, Data API returns an error and doesn't process the request. This 4 MiB limit includes the size of the HTTP headers and the JSON notation in the request. Thus, the number of parameter sets that you can include depends on a combination of factors, such as the size of the SQL statement and the size of each parameter set.

The response size limit is 1 MiB. If the call returns more than 1 MiB of response data, the call is terminated.

For Aurora Serverless v1, the maximum number of requests per second is 1,000. For all other supported databases, there is no limit.
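
Because the 4 MiB request limit depends on the combined size of the SQL statement and the parameter sets, a large import usually has to be split into several calls. The following Python sketch shows one way to do that. The helper names chunk_parameter_sets and make_set are hypothetical, the byte budget is an assumption chosen to leave headroom for the HTTP headers and the SQL text, and the size estimate is approximate because it measures only the JSON encoding of each parameter set.

```python
import json


def chunk_parameter_sets(parameter_sets, max_bytes=3 * 1024 * 1024):
    """Yield lists of parameter sets whose JSON encoding stays within max_bytes.

    max_bytes is an assumed budget below the 4 MiB request limit; it is an
    estimate, not an exact measure of the final HTTP request size.
    """
    batch, batch_bytes = [], 0
    for param_set in parameter_sets:
        size = len(json.dumps(param_set).encode("utf-8"))
        if size > max_bytes:
            raise ValueError("a single parameter set exceeds the budget")
        # Start a new batch when adding this set would exceed the budget.
        if batch and batch_bytes + size > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(param_set)
        batch_bytes += size
    if batch:
        yield batch


def make_set(i):
    """Build one parameter set for 'insert into mytable values (:id, :val)'."""
    return [
        {"name": "id", "value": {"longValue": i}},
        {"name": "val", "value": {"stringValue": f"Value{i}"}},
    ]


all_sets = [make_set(i) for i in range(10)]
one_set_bytes = len(json.dumps(make_set(0)).encode("utf-8"))

# A tiny budget, used here only to make the splitting visible:
# room for three parameter sets per batch.
batches = list(chunk_parameter_sets(all_sets, max_bytes=3 * one_set_bytes))
print([len(b) for b in batches])  # prints [3, 3, 3, 1]
```

Each batch would then be passed to its own batch-execute-statement call. Without a shared transaction ID, each of those calls is committed separately.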

For example, the following CLI command runs a batch SQL statement over an array of data with a parameter set.

For Linux, macOS, or Unix:

aws rds-data batch-execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" \
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" \
    --sql "insert into mytable values (:id, :val)" \
    --parameter-sets "[[{\"name\": \"id\", \"value\": {\"longValue\": 1}},{\"name\": \"val\", \"value\": {\"stringValue\": \"ValueOne\"}}], [{\"name\": \"id\", \"value\": {\"longValue\": 2}},{\"name\": \"val\", \"value\": {\"stringValue\": \"ValueTwo\"}}], [{\"name\": \"id\", \"value\": {\"longValue\": 3}},{\"name\": \"val\", \"value\": {\"stringValue\": \"ValueThree\"}}]]"

For Windows:

aws rds-data batch-execute-statement --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" ^
    --database "mydb" --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" ^
    --sql "insert into mytable values (:id, :val)" ^
    --parameter-sets "[[{\"name\": \"id\", \"value\": {\"longValue\": 1}},{\"name\": \"val\", \"value\": {\"stringValue\": \"ValueOne\"}}], [{\"name\": \"id\", \"value\": {\"longValue\": 2}},{\"name\": \"val\", \"value\": {\"stringValue\": \"ValueTwo\"}}], [{\"name\": \"id\", \"value\": {\"longValue\": 3}},{\"name\": \"val\", \"value\": {\"stringValue\": \"ValueThree\"}}]]"

Note

Don't include line breaks in the --parameter-sets option.

Committing a SQL transaction

Using the aws rds-data commit-transaction CLI command, you can end a SQL transaction that you started with aws rds-data begin-transaction and commit the changes.

In addition to the common options, specify the following option:

  • --transaction-id (required) – The identifier of a transaction that was started using the begin-transaction CLI command. Specify the transaction ID of the transaction that you want to end and commit.

For example, the following CLI command ends a SQL transaction and commits the changes.

For Linux, macOS, or Unix:

aws rds-data commit-transaction --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" \
    --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" \
    --transaction-id "ABC1234567890xyz"

For Windows:

aws rds-data commit-transaction --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" ^
    --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" ^
    --transaction-id "ABC1234567890xyz"

The following is an example of the response.

{ "transactionStatus": "Transaction Committed" }

Rolling back a SQL transaction

Using the aws rds-data rollback-transaction CLI command, you can roll back a SQL transaction that you started with aws rds-data begin-transaction. Rolling back a transaction cancels its changes.

Important

If the transaction ID has expired, the transaction was rolled back automatically. In this case, an aws rds-data rollback-transaction command that specifies the expired transaction ID returns an error.

In addition to the common options, specify the following option:

  • --transaction-id (required) – The identifier of a transaction that was started using the begin-transaction CLI command. Specify the transaction ID of the transaction that you want to roll back.

For example, the following AWS CLI command rolls back a SQL transaction.

For Linux, macOS, or Unix:

aws rds-data rollback-transaction --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" \
    --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" \
    --transaction-id "ABC1234567890xyz"

For Windows:

aws rds-data rollback-transaction --resource-arn "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster" ^
    --secret-arn "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret" ^
    --transaction-id "ABC1234567890xyz"

The following is an example of the response.

{ "transactionStatus": "Rollback Complete" }

Calling RDS Data API from a Python application

You can call RDS Data API (Data API) from a Python application.

The following examples use the AWS SDK for Python (Boto3). For more information about Boto3, see the AWS SDK for Python (Boto3) documentation.

In each example, replace the DB cluster's Amazon Resource Name (ARN) with the ARN for your Aurora DB cluster. Also, replace the secret ARN with the ARN of the secret in Secrets Manager that allows access to the DB cluster.

Running a SQL query

You can run a SELECT statement and fetch the results with a Python application.

The following example runs a SQL query.

import boto3

rdsData = boto3.client('rds-data')

cluster_arn = 'arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster'
secret_arn = 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret'

response1 = rdsData.execute_statement(
    resourceArn = cluster_arn,
    secretArn = secret_arn,
    database = 'mydb',
    sql = 'select * from employees limit 3')

print(response1['records'])

The following is an example of the output.

[
    [
        { 'longValue': 1 },
        { 'stringValue': 'ROSALEZ' },
        { 'stringValue': 'ALEJANDRO' },
        { 'stringValue': '2016-02-15 04:34:33.0' }
    ],
    [
        { 'longValue': 1 },
        { 'stringValue': 'DOE' },
        { 'stringValue': 'JANE' },
        { 'stringValue': '2014-05-09 04:34:33.0' }
    ],
    [
        { 'longValue': 1 },
        { 'stringValue': 'STILES' },
        { 'stringValue': 'JOHN' },
        { 'stringValue': '2017-09-20 04:34:33.0' }
    ]
]
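
Each record in the response is a list of single-key field objects, so application code often flattens the records before using them. The following is a minimal sketch of that step, not part of Data API itself. The helper names field_value and rows_as_dicts are hypothetical, the sample column labels are invented for illustration, and the sketch assumes that the call was made with includeResultMetadata set to True so that the response contains columnMetadata.

```python
def field_value(field):
    """Return the plain Python value held by one Data API field object."""
    if field.get("isNull"):
        return None
    # A field carries one typed key, such as longValue or stringValue.
    for key in ("longValue", "stringValue", "doubleValue", "booleanValue", "blobValue"):
        if key in field:
            return field[key]
    raise ValueError(f"unsupported field: {field!r}")


def rows_as_dicts(response):
    """Pair each record with the column labels from columnMetadata."""
    labels = [column["label"] for column in response["columnMetadata"]]
    return [
        dict(zip(labels, (field_value(field) for field in record)))
        for record in response["records"]
    ]


# Hand-written sample shaped like an ExecuteStatement response
# (the column labels are assumptions made for this sketch).
sample_response = {
    "columnMetadata": [{"label": "id"}, {"label": "last_name"}],
    "records": [
        [{"longValue": 1}, {"stringValue": "ROSALEZ"}],
        [{"longValue": 2}, {"isNull": True}],
    ],
}

rows = rows_as_dicts(sample_response)
print(rows)
```

Array values (arrayValue) are left out of this sketch to keep it short.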

Running a DML SQL statement

You can run a data manipulation language (DML) statement to insert, update, or delete data in your database. You can also use parameters in DML statements.

Important

If a call isn't part of a transaction because it doesn't include the transactionId parameter, changes that result from the call are committed automatically.

The following example runs an insert SQL statement and uses parameters.

import boto3

cluster_arn = 'arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster'
secret_arn = 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret'

rdsData = boto3.client('rds-data')

param1 = {'name':'firstname', 'value':{'stringValue': 'JACKSON'}}
param2 = {'name':'lastname', 'value':{'stringValue': 'MATEO'}}
paramSet = [param1, param2]

response2 = rdsData.execute_statement(
    resourceArn=cluster_arn,
    secretArn=secret_arn,
    database='mydb',
    sql='insert into employees(first_name, last_name) VALUES(:firstname, :lastname)',
    parameters = paramSet)

print(response2["numberOfRecordsUpdated"])
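
Writing each parameter object by hand becomes repetitive when a statement has many parameters. The following sketch builds the parameters list from an ordinary dictionary. The helper names to_field and build_parameters are hypothetical, and the type mapping shown is one reasonable choice rather than a rule defined by Data API.

```python
def to_field(value):
    """Map a Python value to a Data API field object."""
    if value is None:
        return {"isNull": True}
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"booleanValue": value}
    if isinstance(value, int):
        return {"longValue": value}
    if isinstance(value, float):
        return {"doubleValue": value}
    if isinstance(value, bytes):
        return {"blobValue": value}
    # Fall back to the string form for everything else.
    return {"stringValue": str(value)}


def build_parameters(values):
    """Turn {'name': value, ...} into the list expected by the parameters argument."""
    return [{"name": name, "value": to_field(value)} for name, value in values.items()]


params = build_parameters({"firstname": "JACKSON", "lastname": "MATEO"})
print(params)
```

The resulting list has the same shape as paramSet in the preceding example and could be passed as the parameters argument of execute_statement.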

Running a SQL transaction

You can start a SQL transaction, run one or more SQL statements, and then commit the changes with a Python application.

Important

A transaction times out if there are no calls that use its transaction ID in three minutes. If a transaction times out before it's committed, it's rolled back automatically.

If you don't specify a transaction ID, changes that result from the call are committed automatically.

The following example runs a SQL transaction that inserts a row in a table.

import boto3

rdsData = boto3.client('rds-data')

cluster_arn = 'arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster'
secret_arn = 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret'

# Start the transaction and keep its ID for the following calls.
tr = rdsData.begin_transaction(
    resourceArn = cluster_arn,
    secretArn = secret_arn,
    database = 'mydb')

# The SQL text uses double quotes so that the quoted SQL literals don't end the Python string.
response3 = rdsData.execute_statement(
    resourceArn = cluster_arn,
    secretArn = secret_arn,
    database = 'mydb',
    sql = "insert into employees(first_name, last_name) values('XIULAN', 'WANG')",
    transactionId = tr['transactionId'])

cr = rdsData.commit_transaction(
    resourceArn = cluster_arn,
    secretArn = secret_arn,
    transactionId = tr['transactionId'])

print(cr['transactionStatus'])              # Transaction Committed
print(response3['numberOfRecordsUpdated'])  # 1

Note

If you run a data definition language (DDL) statement, we recommend continuing to run the statement after the call times out. When a DDL statement terminates before it is finished running, it can result in errors and possibly corrupted data structures. To continue running a statement after a call exceeds the RDS Data API timeout interval of 45 seconds, set the continueAfterTimeout parameter to true.
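
The begin, run, and commit sequence can be wrapped so that a failure in any statement rolls the transaction back. The following is a sketch of that pattern under stated assumptions: data_api_transaction is a hypothetical helper, and StubClient is a stand-in that only records calls so that the example runs without AWS access. In an application, you would pass boto3.client('rds-data') instead.

```python
from contextlib import contextmanager


@contextmanager
def data_api_transaction(client, resource_arn, secret_arn, database):
    """Begin a transaction, commit on success, roll back on any exception."""
    tx = client.begin_transaction(
        resourceArn=resource_arn, secretArn=secret_arn, database=database
    )
    tx_id = tx["transactionId"]
    try:
        yield tx_id
    except Exception:
        client.rollback_transaction(
            resourceArn=resource_arn, secretArn=secret_arn, transactionId=tx_id
        )
        raise
    else:
        client.commit_transaction(
            resourceArn=resource_arn, secretArn=secret_arn, transactionId=tx_id
        )


class StubClient:
    """Stand-in for boto3.client('rds-data') that only records the calls made."""

    def __init__(self):
        self.calls = []

    def begin_transaction(self, **kwargs):
        self.calls.append("begin")
        return {"transactionId": "ABC1234567890xyz"}

    def execute_statement(self, **kwargs):
        self.calls.append("execute")
        return {"numberOfRecordsUpdated": 1}

    def commit_transaction(self, **kwargs):
        self.calls.append("commit")
        return {"transactionStatus": "Transaction Committed"}

    def rollback_transaction(self, **kwargs):
        self.calls.append("rollback")
        return {"transactionStatus": "Rollback Complete"}


client = StubClient()
with data_api_transaction(client, "cluster-arn", "secret-arn", "mydb") as tx_id:
    client.execute_statement(
        resourceArn="cluster-arn",
        secretArn="secret-arn",
        database="mydb",
        sql="insert into employees(first_name, last_name) values('XIULAN', 'WANG')",
        transactionId=tx_id,
    )
print(client.calls)  # prints ['begin', 'execute', 'commit']
```

If the transaction has already timed out, it was rolled back automatically, and the explicit rollback call itself returns an error. An application might choose to catch that error separately.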

Calling RDS Data API from a Java application

You can call RDS Data API (Data API) from a Java application.

The following examples use the AWS SDK for Java. For more information, see the AWS SDK for Java Developer Guide.

In each example, replace the DB cluster's Amazon Resource Name (ARN) with the ARN for your Aurora DB cluster. Also, replace the secret ARN with the ARN of the secret in Secrets Manager that allows access to the DB cluster.

Running a SQL query

You can run a SELECT statement and fetch the results with a Java application.

The following example runs a SQL query.

package com.amazonaws.rdsdata.examples;

import com.amazonaws.services.rdsdata.AWSRDSData;
import com.amazonaws.services.rdsdata.AWSRDSDataClient;
import com.amazonaws.services.rdsdata.model.ExecuteStatementRequest;
import com.amazonaws.services.rdsdata.model.ExecuteStatementResult;
import com.amazonaws.services.rdsdata.model.Field;

import java.util.List;

public class FetchResultsExample {
    public static final String RESOURCE_ARN = "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster";
    public static final String SECRET_ARN = "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret";

    public static void main(String[] args) {
        AWSRDSData rdsData = AWSRDSDataClient.builder().build();

        ExecuteStatementRequest request = new ExecuteStatementRequest()
            .withResourceArn(RESOURCE_ARN)
            .withSecretArn(SECRET_ARN)
            .withDatabase("mydb")
            .withSql("select * from mytable");

        ExecuteStatementResult result = rdsData.executeStatement(request);

        // Each record is a list of fields, one per column.
        for (List<Field> fields: result.getRecords()) {
            String stringValue = fields.get(0).getStringValue();
            long numberValue = fields.get(1).getLongValue();

            System.out.println(String.format("Fetched row: string = %s, number = %d", stringValue, numberValue));
        }
    }
}

Running a SQL transaction

You can start a SQL transaction, run one or more SQL statements, and then commit the changes with a Java application.

Important

A transaction times out if there are no calls that use its transaction ID in three minutes. If a transaction times out before it's committed, it's rolled back automatically.

If you don't specify a transaction ID, changes that result from the call are committed automatically.

The following example runs a SQL transaction.

package com.amazonaws.rdsdata.examples;

import com.amazonaws.services.rdsdata.AWSRDSData;
import com.amazonaws.services.rdsdata.AWSRDSDataClient;
import com.amazonaws.services.rdsdata.model.BeginTransactionRequest;
import com.amazonaws.services.rdsdata.model.BeginTransactionResult;
import com.amazonaws.services.rdsdata.model.CommitTransactionRequest;
import com.amazonaws.services.rdsdata.model.ExecuteStatementRequest;

public class TransactionExample {
    public static final String RESOURCE_ARN = "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster";
    public static final String SECRET_ARN = "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret";

    public static void main(String[] args) {
        AWSRDSData rdsData = AWSRDSDataClient.builder().build();

        // Start the transaction and keep its ID for the following calls.
        BeginTransactionRequest beginTransactionRequest = new BeginTransactionRequest()
            .withResourceArn(RESOURCE_ARN)
            .withSecretArn(SECRET_ARN)
            .withDatabase("mydb");
        BeginTransactionResult beginTransactionResult = rdsData.beginTransaction(beginTransactionRequest);
        String transactionId = beginTransactionResult.getTransactionId();

        ExecuteStatementRequest executeStatementRequest = new ExecuteStatementRequest()
            .withTransactionId(transactionId)
            .withResourceArn(RESOURCE_ARN)
            .withSecretArn(SECRET_ARN)
            .withSql("INSERT INTO test_table VALUES ('hello world!')");
        rdsData.executeStatement(executeStatementRequest);

        CommitTransactionRequest commitTransactionRequest = new CommitTransactionRequest()
            .withTransactionId(transactionId)
            .withResourceArn(RESOURCE_ARN)
            .withSecretArn(SECRET_ARN);
        rdsData.commitTransaction(commitTransactionRequest);
    }
}

Note

If you run a data definition language (DDL) statement, we recommend continuing to run the statement after the call times out. When a DDL statement terminates before it is finished running, it can result in errors and possibly corrupted data structures. To continue running a statement after a call exceeds the RDS Data API timeout interval of 45 seconds, set the continueAfterTimeout parameter to true.

Running a batch SQL operation

You can run bulk insert and update operations over an array of data with a Java application. To do so, run a DML statement with an array of parameter sets.

Important

If you don't specify a transaction ID, changes that result from the call are committed automatically.

The following example runs a batch insert operation.

package com.amazonaws.rdsdata.examples;

import com.amazonaws.services.rdsdata.AWSRDSData;
import com.amazonaws.services.rdsdata.AWSRDSDataClient;
import com.amazonaws.services.rdsdata.model.BatchExecuteStatementRequest;
import com.amazonaws.services.rdsdata.model.Field;
import com.amazonaws.services.rdsdata.model.SqlParameter;

import java.util.Arrays;

public class BatchExecuteExample {
    public static final String RESOURCE_ARN = "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster";
    public static final String SECRET_ARN = "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret";

    public static void main(String[] args) {
        AWSRDSData rdsData = AWSRDSDataClient.builder().build();

        // Each inner list is one parameter set, that is, one row to insert.
        BatchExecuteStatementRequest request = new BatchExecuteStatementRequest()
            .withDatabase("test")
            .withResourceArn(RESOURCE_ARN)
            .withSecretArn(SECRET_ARN)
            .withSql("INSERT INTO test_table2 VALUES (:string, :number)")
            .withParameterSets(Arrays.asList(
                Arrays.asList(
                    new SqlParameter().withName("string").withValue(new Field().withStringValue("Hello")),
                    new SqlParameter().withName("number").withValue(new Field().withLongValue(1L))
                ),
                Arrays.asList(
                    new SqlParameter().withName("string").withValue(new Field().withStringValue("World")),
                    new SqlParameter().withName("number").withValue(new Field().withLongValue(2L))
                )
            ));

        rdsData.batchExecuteStatement(request);
    }
}

Controlling Data API timeout behavior

All calls to Data API are synchronous. Suppose that you perform a Data API operation that runs a SQL statement such as INSERT or CREATE TABLE. If the Data API call returns successfully, the SQL processing is finished when the call returns.

By default, Data API cancels an operation and returns a timeout error if the operation doesn't finish processing within 45 seconds. In that case, the data isn't inserted, the table isn't created, and so on.

You can use Data API to perform long-running operations that can't complete within 45 seconds. If you expect that an operation such as a bulk INSERT or a DDL operation on a large table takes longer than 45 seconds, you can specify the continueAfterTimeout parameter for the ExecuteStatement operation. Your application still receives the timeout error. However, the operation continues running and isn't cancelled. For an example, see Running a SQL transaction.
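
As a sketch of how the continueAfterTimeout parameter might be passed from Python, the following helper forwards it to execute_statement. The name run_long_statement is hypothetical, and RecordingClient is a stand-in that records the request so that the example runs without AWS access. In an application, you would pass boto3.client('rds-data') instead.

```python
def run_long_statement(client, resource_arn, secret_arn, database, sql):
    """Run a statement that is allowed to keep running after the call times out."""
    return client.execute_statement(
        resourceArn=resource_arn,
        secretArn=secret_arn,
        database=database,
        sql=sql,
        continueAfterTimeout=True,
    )


class RecordingClient:
    """Stand-in for boto3.client('rds-data') that stores the last request."""

    def __init__(self):
        self.last_request = None

    def execute_statement(self, **kwargs):
        self.last_request = kwargs
        return {"numberOfRecordsUpdated": 0}


client = RecordingClient()
run_long_statement(
    client,
    "arn:aws:rds:us-east-1:123456789012:cluster:mydbcluster",
    "arn:aws:secretsmanager:us-east-1:123456789012:secret:mysecret",
    "mydb",
    "alter table mytable change column job role varchar(100)",
)
print(client.last_request["continueAfterTimeout"])  # prints True
```

Because the application still receives the timeout error for such a call, the calling code needs its own way to confirm later that the statement finished, for example by querying the affected table.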

If the AWS SDK for your programming language has its own timeout period for API calls or HTTP socket connections, make sure that all such timeout periods are more than 45 seconds. For some SDKs, the timeout period is less than 45 seconds by default. We recommend setting any SDK-specific or client-specific timeout periods to at least one minute. Doing so avoids the possibility that your application receives a timeout error while the Data API operation still completes successfully. That way, you can be sure whether to retry the operation or not.

For example, suppose that the SDK returns a timeout error to your application, but the Data API operation still completes within the Data API timeout interval. In that case, retrying the operation might insert duplicate data or otherwise produce incorrect results. The SDK might retry the operation automatically, causing incorrect data without any action from your application.

The timeout interval is especially important for the AWS SDK for Java 2.x. In that SDK, the API call timeout and the HTTP socket timeout are both 30 seconds by default. Here is an example of setting those timeouts to a higher value:

public RdsDataClient createRdsDataClient() {
    return RdsDataClient.builder()
        .region(Region.US_EAST_1) // Change this to your desired Region
        .overrideConfiguration(createOverrideConfiguration())
        .httpClientBuilder(createHttpClientBuilder())
        .credentialsProvider(defaultCredentialsProvider()) // Change this to your desired credentials provider
        .build();
}

private static ClientOverrideConfiguration createOverrideConfiguration() {
    return ClientOverrideConfiguration.builder()
        .apiCallTimeout(Duration.ofSeconds(60))
        .build();
}

// Returns the SDK's generic builder type so that you can substitute another HTTP client.
private SdkHttpClient.Builder<?> createHttpClientBuilder() {
    return ApacheHttpClient.builder() // Change this to your desired HttpClient
        .socketTimeout(Duration.ofSeconds(60));
}

Here is an equivalent example using the asynchronous data client:

public static RdsDataAsyncClient createRdsDataAsyncClient() {
    return RdsDataAsyncClient.builder()
        .region(Region.US_EAST_1) // Change this to your desired Region
        .overrideConfiguration(createOverrideConfiguration())
        .httpClientBuilder(createHttpClientBuilder()) // Apply the HTTP read timeout defined below
        .credentialsProvider(defaultCredentialsProvider()) // Change this to your desired credentials provider
        .build();
}

private static ClientOverrideConfiguration createOverrideConfiguration() {
    return ClientOverrideConfiguration.builder()
        .apiCallAttemptTimeout(Duration.ofSeconds(60))
        .build();
}

private static SdkAsyncHttpClient.Builder<?> createHttpClientBuilder() {
    return NettyNioAsyncHttpClient.builder() // Change this to your desired AsyncHttpClient
        .readTimeout(Duration.ofSeconds(60));
}
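
The same idea can be applied in Python through the botocore client configuration. The following is a sketch only: the names TIMEOUT_SETTINGS and make_rds_data_client are hypothetical, the timeout values are illustrative, and turning off automatic retries is a design choice of this sketch (made so that a statement that isn't idempotent isn't replayed by the SDK), not a requirement of Data API.

```python
# Keyword arguments intended for botocore.config.Config.
TIMEOUT_SETTINGS = {
    "connect_timeout": 10,  # seconds allowed to establish the connection
    "read_timeout": 60,     # seconds to wait for a response; longer than the 45-second Data API timeout
    "retries": {"total_max_attempts": 1},  # one attempt in total, so no automatic retry
}


def make_rds_data_client(region_name="us-east-1"):
    """Create an rds-data client that uses the settings above."""
    # Imported here so that the settings can be inspected without the SDK installed.
    import boto3
    from botocore.config import Config

    return boto3.client(
        "rds-data",
        region_name=region_name,
        config=Config(**TIMEOUT_SETTINGS),
    )


print(TIMEOUT_SETTINGS["read_timeout"])  # prints 60
```

With retries turned off in the SDK, the application decides whether a failed call is safe to repeat.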

Using the Java client library for RDS Data API

You can download and use a Java client library for RDS Data API (Data API). This Java client library provides an alternative way to use Data API. Using this library, you can map your client-side classes to Data API requests and responses. This mapping support can ease integration with some specific Java types, such as Date, Time, and BigDecimal.

Downloading the Java client library for Data API

The Data API Java client library is open source and is available on GitHub at the following location:

https://github.com/awslabs/rds-data-api-client-library-java

You can build the library manually from the source files, but the best practice is to consume the library using Apache Maven dependency management. Add the following dependency to your Maven POM file.

For version 2.x, which is compatible with AWS SDK 2.x, use the following:

<dependency>
    <groupId>software.amazon.rdsdata</groupId>
    <artifactId>rds-data-api-client-library-java</artifactId>
    <version>2.0.0</version>
</dependency>

For version 1.x, which is compatible with AWS SDK 1.x, use the following:

<dependency>
    <groupId>software.amazon.rdsdata</groupId>
    <artifactId>rds-data-api-client-library-java</artifactId>
    <version>1.0.8</version>
</dependency>

Java client library examples

Following, you can find some common examples of using the Data API Java client library. These examples assume that you have a table accounts with two columns: accountId and name. You also have the following data transfer object (DTO).

public class Account {
    int accountId;
    String name;
    // constructor, getters, and setters omitted
}

The client library enables you to pass DTOs as input parameters. The following example shows how DTOs are mapped to input parameter sets.

var account1 = new Account(1, "John");
var account2 = new Account(2, "Mary");
client.forSql("INSERT INTO accounts(accountId, name) VALUES(:accountId, :name)")
    .withParamSets(account1, account2)
    .execute();

In some cases, it's easier to work with simple values as input parameters. You can do so with the following syntax.

client.forSql("INSERT INTO accounts(accountId, name) VALUES(:accountId, :name)")
    .withParameter("accountId", 3)
    .withParameter("name", "Zhang")
    .execute();

The following is another example that works with simple values as input parameters.

client.forSql("INSERT INTO accounts(accountId, name) VALUES(?, ?)", 4, "Carlos")
    .execute();

The client library provides automatic mapping to DTOs when a result is returned. The following examples show how the result is mapped to your DTOs.

List<Account> accounts = client.forSql("SELECT * FROM accounts")
    .execute()
    .mapToList(Account.class);

Account account = client.forSql("SELECT * FROM accounts WHERE accountId = 1")
    .execute()
    .mapToSingle(Account.class);

In many cases, the database result set contains only a single value. In order to simplify retrieving such results, the client library offers the following API:

int numberOfAccounts = client.forSql("SELECT COUNT(*) FROM accounts")
    .execute()
    .singleValue(Integer.class);

Note

The mapToList function converts a SQL result set into a user-defined object list. We don't support using the .withFormatRecordsAs(RecordsFormatType.JSON) statement in an ExecuteStatement call for the Java client library, because it serves the same purpose. For more information, see Processing RDS Data API query results in JSON format.

Processing RDS Data API query results in JSON format

When you call the ExecuteStatement operation, you can choose to have the query results returned as a string in JSON format. That way, you can use your programming language's JSON parsing capabilities to interpret and reformat the result set. Doing so can help to avoid writing extra code to loop through the result set and interpret each column value.

To request the result set in JSON format, you pass the optional formatRecordsAs parameter with a value of JSON. The JSON-formatted result set is returned in the formattedRecords field of the ExecuteStatementResponse structure.

The BatchExecuteStatement action doesn't return a result set. Thus, the JSON option doesn't apply to that action.

To customize the keys in the JSON hash structure, define column aliases in the result set. You can do so by using the AS clause in the column list of your SQL query.

You might use the JSON capability to make the result set easier to read and map its contents to language-specific frameworks. Because the volume of the ASCII-encoded result set is larger than the default representation, you might choose the default representation for queries that return large numbers of rows or large column values that consume more memory than is available to your application.

Retrieving query results in JSON format

To receive the result set as a JSON string, include .withFormatRecordsAs(RecordsFormatType.JSON) in the ExecuteStatement call. The return value comes back as a JSON string in the formattedRecords field. In this case, the columnMetadata is null. The column labels are the keys of the object that represents each row. These column names are repeated for each row in the result set. The column values are quoted strings, numeric values, or special values representing true, false, or null. Column metadata such as length constraints and the precise type for numbers and strings isn't preserved in the JSON response.

If you omit the .withFormatRecordsAs() call or specify a parameter of NONE, the result set is returned in binary format using the Records and columnMetadata fields.
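
In Python, the same request can be sketched by passing formatRecordsAs='JSON' to execute_statement and parsing the formattedRecords string with the standard json module. The helper name query_as_json is hypothetical, and StubClient is a stand-in that returns a canned response so that the example runs without AWS access. In an application, you would pass boto3.client('rds-data') instead.

```python
import json


def query_as_json(client, resource_arn, secret_arn, database, sql):
    """Run a query with JSON-formatted records and return them as a list of dicts."""
    response = client.execute_statement(
        resourceArn=resource_arn,
        secretArn=secret_arn,
        database=database,
        sql=sql,
        formatRecordsAs="JSON",
    )
    # formattedRecords is a JSON string: an array with one object per row.
    return json.loads(response["formattedRecords"])


class StubClient:
    """Stand-in for boto3.client('rds-data') that returns a canned response."""

    def execute_statement(self, **kwargs):
        assert kwargs["formatRecordsAs"] == "JSON"
        return {"formattedRecords": '[{"id": 0, "name": "Jane"}, {"id": 1, "name": null}]'}


rows = query_as_json(
    StubClient(), "cluster-arn", "secret-arn", "mydb", "select id, name from sample_names"
)
print(rows)  # prints [{'id': 0, 'name': 'Jane'}, {'id': 1, 'name': None}]
```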

Data Type Mapping

The SQL values in the result set are mapped to a smaller set of JSON types. The values are represented in JSON as strings, numbers, and some special constants such as true, false, and null. You can convert these values into variables in your application, using strong or weak typing as appropriate for your programming language.

JDBC data type                                          JSON data type
------------------------------------------------------  --------------
INTEGER, TINYINT, SMALLINT, BIGINT                      Number by default. String if the LongReturnType option is set to STRING.
FLOAT, REAL, DOUBLE                                     Number
DECIMAL                                                 String by default. Number if the DecimalReturnType option is set to DOUBLE_OR_LONG.
STRING                                                  String
BOOLEAN, BIT                                            Boolean
BLOB, BINARY, VARBINARY, LONGVARBINARY                  String in base64 encoding.
CLOB                                                    String
ARRAY                                                   Array
NULL                                                    null
Other types (including types related to date and time)  String
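
Because DECIMAL values arrive as strings by default and binary values arrive as base64 strings, an application typically converts those columns after parsing the JSON. The following sketch shows one way to do that. The helper name decode_row, the column names, and the sample formattedRecords string are all invented for illustration.

```python
import base64
import json
from decimal import Decimal


def decode_row(row, decimal_columns=(), binary_columns=()):
    """Convert selected columns of one parsed JSON row to richer Python types."""
    decoded = dict(row)
    for name in decimal_columns:
        if decoded.get(name) is not None:
            decoded[name] = Decimal(decoded[name])  # DECIMAL is a string by default
    for name in binary_columns:
        if decoded.get(name) is not None:
            decoded[name] = base64.b64decode(decoded[name])  # binary types are base64 strings
    return decoded


# Hand-written sample shaped like a formattedRecords value.
formatted_records = '[{"id": 1, "price": "19.99", "thumbnail": "aGVsbG8=", "note": null}]'

rows = [
    decode_row(row, decimal_columns=["price"], binary_columns=["thumbnail"])
    for row in json.loads(formatted_records)
]
print(rows[0])
```

The application has to know which columns to convert, because column metadata such as the precise numeric type isn't preserved in the JSON response.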

Troubleshooting

The JSON response is limited to 10 megabytes. If the response is larger than this limit, your program receives a BadRequestException error. In this case, you can resolve the error using one of the following techniques:

  • Reduce the number of rows in the result set. To do so, add a LIMIT clause. You might split a large result set into multiple smaller ones by submitting several queries with LIMIT and OFFSET clauses.

    If the result set includes rows that are filtered out by application logic, you can remove those rows from the result set by adding more conditions in the WHERE clause.

  • Reduce the number of columns in the result set. To do so, remove items from the select list of the query.

  • Shorten the column labels by using column aliases in the query. Each column name is repeated in the JSON string for each row in the result set. Thus, a query result with long column names and many rows could exceed the size limit. In particular, use column aliases for complicated expressions to avoid having the entire expression repeated in the JSON string.

  • Although with SQL you can use column aliases to produce a result set having more than one column with the same name, duplicate key names aren't allowed in JSON. The RDS Data API returns an error if you request the result set in JSON format and more than one column has the same name. Thus, make sure that all the column labels have unique names.
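
The first technique, splitting a large result set with LIMIT and OFFSET, can be sketched in Python as follows. The helper name fetch_in_pages is hypothetical, and StubClient is a stand-in that slices an in-memory list so that the example runs without AWS access. The sketch assumes that the base query has a stable ORDER BY, because paging with OFFSET can otherwise skip or repeat rows.

```python
import json
import re


def fetch_in_pages(client, base_sql, page_size, **call_args):
    """Yield rows from base_sql one page at a time by using LIMIT and OFFSET."""
    offset = 0
    while True:
        sql = f"{base_sql} limit {int(page_size)} offset {int(offset)}"
        response = client.execute_statement(sql=sql, formatRecordsAs="JSON", **call_args)
        page = json.loads(response["formattedRecords"])
        yield from page
        # A short page means that there are no more rows to fetch.
        if len(page) < page_size:
            break
        offset += page_size


class StubClient:
    """Stand-in for boto3.client('rds-data') that pages over an in-memory list."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = 0

    def execute_statement(self, sql, **kwargs):
        self.calls += 1
        limit, offset = map(int, re.search(r"limit (\d+) offset (\d+)$", sql).groups())
        return {"formattedRecords": json.dumps(self.rows[offset:offset + limit])}


data = [{"id": i, "name": name} for i, name in enumerate(["Jane", "Mohan", "Maria", "Bruce", "Jasmine"])]
client = StubClient(data)
rows = list(fetch_in_pages(client, "select id, name from sample_names order by id", page_size=2))
print(len(rows), client.calls)  # prints 5 3
```

When the row count is an exact multiple of the page size, this sketch issues one extra call that returns an empty page.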

Examples

The following Java examples show how to call ExecuteStatement with the response as a JSON-formatted string, then interpret the result set. Substitute the appropriate values for the databaseName, secretStoreArn, and clusterArn parameters.

The following Java example demonstrates a query that returns a decimal numeric value in the result set. The example requests the response in JSON format by calling withFormatRecordsAs.

This example works with the following schema and sample data:

create table test_simplified_json (a float);
insert into test_simplified_json values(10.0);

public void JSON_result_set_demo() {
    var sql = "select * from test_simplified_json";
    var request = new ExecuteStatementRequest()
        .withDatabase(databaseName)
        .withSecretArn(secretStoreArn)
        .withResourceArn(clusterArn)
        .withSql(sql)
        .withFormatRecordsAs(RecordsFormatType.JSON);
    var result = rdsdataClient.executeStatement(request);
}

The value of the formattedRecords field from the preceding program is:

[{"a":10.0}]

The Records and ColumnMetadata fields in the response are both null, due to the presence of the JSON result set.

The following Java example demonstrates a query that returns an integer numeric value in the result set. The example calls getFormattedRecords to return only the JSON-formatted string and ignore the other response fields that are blank or null. The example deserializes the result into a structure representing a list of records. Each record has fields whose names correspond to the column aliases from the result set. This technique simplifies the code that parses the result set. Your application doesn't have to loop through the rows and columns of the result set and convert each value to the appropriate type.

This example works with the following schema and sample data:

create table test_simplified_json (a int);
insert into test_simplified_json values(17);

/* Each record has a field 'a' corresponding to the column
   labelled 'a' in the result set. */
private static class Record {
    public int a;
}

public void JSON_deserialization_demo() throws IOException {
    var sql = "select * from test_simplified_json";
    var request = new ExecuteStatementRequest()
        .withDatabase(databaseName)
        .withSecretArn(secretStoreArn)
        .withResourceArn(clusterArn)
        .withSql(sql)
        .withFormatRecordsAs(RecordsFormatType.JSON);
    var result = rdsdataClient.executeStatement(request)
        .getFormattedRecords();

    /* Turn the result set into a Java object, a list of records. */
    var recordsList = new ObjectMapper().readValue(
        result, new TypeReference<List<Record>>() { });
}

The value of the formattedRecords field from the preceding program is:

[{"a":17}]

To retrieve the a column of result row 0, the application would refer to recordsList.get(0).a.
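The same deserialization technique can be sketched in Python with a dataclass in place of the Java Record class. The names Record and parse_records are hypothetical, and the input string is the formattedRecords value shown earlier.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    a: int  # matches the column labelled 'a' in the result set


def parse_records(formatted_records: str) -> List[Record]:
    """Turn the formattedRecords JSON string into a list of Record objects."""
    return [Record(**row) for row in json.loads(formatted_records)]


records_list = parse_records('[{"a":17}]')
print(records_list[0].a)  # the 'a' column of result row 0
```

Because each row is unpacked by key, a result set that gains a column the dataclass doesn't declare raises a TypeError instead of being silently ignored.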

In contrast, the following Java example shows the kind of code that's required to construct a data structure holding the result set when you don't use the JSON format. In this case, each row of the result set contains fields with information about a single user. Building a data structure to represent the result set requires looping through the rows. For each row, the code retrieves the value of each field, performs an appropriate type conversion, and assigns the result to the corresponding field in the object representing the row. Then the code adds the object representing each user to the data structure representing the entire result set. If the query was changed to reorder, add, or remove fields in the result set, the application code would have to change also.

/* Verbose result-parsing code that doesn't use the JSON result set format */
for (var row: response.getRecords()) {
    var user = User.builder()
        .userId(row.get(0).getLongValue())
        .firstName(row.get(1).getStringValue())
        .lastName(row.get(2).getStringValue())
        .dob(Instant.parse(row.get(3).getStringValue()))
        .build();
    result.add(user);
}
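The following Python sketch shows the equivalent manual work for a response that doesn't use the JSON format. It assumes that the request set includeResultMetadata, so that the response carries columnMetadata with a label for each column. The sample_response dictionary is made-up data shaped like such a response, and the helper names are hypothetical. Array values aren't handled in this sketch.

```python
def field_value(field):
    """Extract the Python value from one Data API field object."""
    if field.get("isNull"):
        return None
    for key in ("booleanValue", "longValue", "doubleValue", "stringValue", "blobValue"):
        if key in field:
            return field[key]
    raise ValueError(f"unsupported field: {field!r}")


def rows_as_dicts(response):
    """Pair each field in each row with its column label."""
    labels = [column["label"] for column in response["columnMetadata"]]
    return [
        {label: field_value(field) for label, field in zip(labels, row)}
        for row in response["records"]
    ]


# Hypothetical response data, for illustration only.
sample_response = {
    "columnMetadata": [{"label": "id"}, {"label": "name"}],
    "records": [
        [{"longValue": 0}, {"stringValue": "Jane"}],
        [{"longValue": 1}, {"isNull": True}],
    ],
}
print(rows_as_dicts(sample_response))
```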

The following sample values show the values of the formattedRecords field for result sets with different numbers of columns, column aliases, and column data types.

If the result set includes multiple rows, each row is represented as an object that is an array element. Each column in the result set becomes a key in the object. The keys are repeated for each row in the result set. Thus, for result sets consisting of many rows and columns, you might need to define short column aliases to avoid exceeding the length limit for the entire response.

This example works with the following schema and sample data:

create table sample_names (id int, name varchar(128));
insert into sample_names values
  (0, 'Jane'), (1, 'Mohan'), (2, 'Maria'), (3, 'Bruce'), (4, 'Jasmine');

[{"id":0,"name":"Jane"},{"id":1,"name":"Mohan"},
 {"id":2,"name":"Maria"},{"id":3,"name":"Bruce"},{"id":4,"name":"Jasmine"}]
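To see how repeated keys affect the size of the response, the following Python sketch encodes the same five rows twice, once with long keys and once with short ones. The long key names are hypothetical, and the sizes are those of the local JSON encoding, which only approximates what Data API returns.

```python
import json

names = ["Jane", "Mohan", "Maria", "Bruce", "Jasmine"]


def encoded_size(id_key, name_key):
    """Return the byte length of the rows encoded as compact JSON."""
    rows = [{id_key: i, name_key: name} for i, name in enumerate(names)]
    return len(json.dumps(rows, separators=(",", ":")).encode("utf-8"))


long_size = encoded_size("customer_identifier", "customer_display_name")
short_size = encoded_size("id", "name")

# Every extra character in a key is paid for once per row.
print(long_size, short_size)
```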

If a column in the result set is defined as an expression, the text of the expression becomes the JSON key. Thus, it's typically convenient to define a descriptive column alias for each expression in the select list of the query. For example, the following query includes expressions such as function calls and arithmetic operations in its select list.

select count(*), max(id), 4+7 from sample_names;

Those expressions are passed through to the JSON result set as keys.

[{"count(*)":5,"max(id)":4,"4+7":11}]

Adding AS clauses that define descriptive column aliases makes the keys simpler to interpret in the JSON result set.

select count(*) as rows, max(id) as largest_id, 4+7 as addition_result from sample_names;

With the revised SQL query, the column labels defined by the AS clauses are used as the key names.

[{"rows":5,"largest_id":4,"addition_result":11}]
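The difference matters most in the code that reads the result. The following Python sketch parses both of the preceding JSON strings and reads the same value from each.

```python
import json

without_aliases = json.loads('[{"count(*)":5,"max(id)":4,"4+7":11}]')
with_aliases = json.loads('[{"rows":5,"largest_id":4,"addition_result":11}]')

# The expression text works as a key, but the key changes whenever
# the expression in the select list changes.
print(without_aliases[0]["max(id)"])

# An alias gives the application a stable, readable key.
print(with_aliases[0]["largest_id"])
```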

The value for each key-value pair in the JSON string can be a quoted string. The string might contain Unicode characters. If the string contains escape sequences or the " or \ characters, those characters are preceded by backslash escape characters. The following examples of JSON strings demonstrate these possibilities. For example, the string_with_escape_sequences result contains the special characters backspace, newline, carriage return, tab, form feed, and \, followed by a single quotation mark, which isn't escaped.

[{"quoted_string":"hello"}]
[{"unicode_string":"邓不利多"}]
[{"string_with_escape_sequences":"\b \n \r \t \f \\ '"}]
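A JSON parser reverses this escaping. The following Python sketch, which uses only the standard json module, parses two of the preceding strings. The variable names are for illustration.

```python
import json

# Raw string, so the backslashes reach the JSON parser unchanged.
formatted_records = r'''[{"string_with_escape_sequences":"\b \n \r \t \f \\ '"}]'''
value = json.loads(formatted_records)[0]["string_with_escape_sequences"]

# Each JSON escape sequence is now a single character.
print(repr(value))
print(len(value))

# Unicode characters need no special handling.
unicode_value = json.loads('[{"unicode_string":"邓不利多"}]')[0]["unicode_string"]
print(unicode_value)
```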

The value for each key-value pair in the JSON string can also represent a number. The number might be an integer, a floating-point value, a negative value, or a value represented as exponential notation. The following examples of JSON strings demonstrate these possibilities.

[{"integer_value":17}]
[{"float_value":10.0}]
[{"negative_value":-9223372036854775808,"positive_value":9223372036854775807}]
[{"very_small_floating_point_value":4.9E-324,"very_large_floating_point_value":1.7976931348623157E308}]

Boolean and null values are represented with the unquoted special keywords true, false, and null. The following examples of JSON strings demonstrate these possibilities.

[{"boolean_value_1":true,"boolean_value_2":false}]
[{"unknown_value":null}]
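How these values arrive in an application depends on the JSON parser. The following Python sketch shows the mapping performed by the standard json module for some of the numeric, Boolean, and null samples. Other languages can differ. For example, JavaScript's JSON.parse represents every number as a double-precision value, so 64-bit integers near the limits shown earlier lose precision there.

```python
import json

integers = json.loads(
    '[{"negative_value":-9223372036854775808,'
    '"positive_value":9223372036854775807}]'
)[0]
floats = json.loads(
    '[{"float_value":10.0,"very_small_floating_point_value":4.9E-324}]'
)[0]
flags = json.loads(
    '[{"boolean_value_1":true,"boolean_value_2":false,"unknown_value":null}]'
)[0]

# Integers keep their exact value, because Python integers are unbounded.
print(integers["positive_value"] == 2**63 - 1)

# A value written with a decimal point or an exponent becomes a float.
print(type(floats["float_value"]).__name__)

# true, false, and null become True, False, and None.
print(flags)
```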

If you select a value of a BLOB type, the result is represented in the JSON string as a base64-encoded value. To convert the value back to its original representation, you can use the appropriate decoding function in your application's language. For example, in Java you call the function Base64.getDecoder().decode(). The following sample output shows the result of selecting a BLOB value of hello world and returning the result set as a JSON string.

[{"blob_column":"aGVsbG8gd29ybGQ="}]
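In Python, the equivalent decoding step can be sketched with the standard base64 module, applied to the preceding sample output.

```python
import base64
import json

row = json.loads('[{"blob_column":"aGVsbG8gd29ybGQ="}]')[0]

# The JSON value is base64 text. Decoding yields the original bytes.
blob_bytes = base64.b64decode(row["blob_column"])
print(blob_bytes)  # b'hello world'
```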

The following Python example shows how to access the values from the result of a call to the Python execute_statement function. The result set is a string value in the field response['formattedRecords']. The code turns the JSON string into a data structure by calling the json.loads function. Then each row of the result set is a list element within the data structure, and within each row you can refer to each field of the result set by name.

import json

result = json.loads(response['formattedRecords'])
print(result[0]["id"])

The following JavaScript example shows how to access the values from the result of a call to the JavaScript executeStatement function. The result set is a string value in the field response.formattedRecords. The code turns the JSON string into a data structure by calling the JSON.parse function. Then each row of the result set is an array element within the data structure, and within each row you can refer to each field of the result set by name.

<script>
  const result = JSON.parse(response.formattedRecords);
  document.getElementById("display").innerHTML = result[0].id;
</script>

Troubleshooting RDS Data API issues

Use the following sections, titled with common error messages, to help troubleshoot problems that you have with RDS Data API (Data API).

Transaction <transaction_ID> is not found

In this case, the transaction ID specified in a Data API call wasn't found. The cause for this issue is appended to the error message, and is one of the following:

  • Transaction may be expired.

    Make sure that each transactional call runs within three minutes of the previous one.

    It's also possible that the specified transaction ID wasn't created by a BeginTransaction call. Make sure that your call has a valid transaction ID.

  • One previous call resulted in a termination of your transaction.

    The transaction was already ended by your CommitTransaction or RollbackTransaction call.

  • Transaction has been aborted due to an error from a previous call.

    Check whether your previous calls have thrown any exceptions.

For information about running transactions, see Calling RDS Data API.
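One way to avoid most of these causes is to keep the whole lifetime of a transaction ID inside a single function, so that the ID always comes from BeginTransaction and is never used after a commit or rollback. The following Python sketch assumes a client that behaves like the boto3 rds-data client. The helper name run_in_transaction is hypothetical. The sketch doesn't enforce the three-minute limit between calls, which remains the caller's responsibility.

```python
def run_in_transaction(client, cluster_arn, secret_arn, database, statements):
    """Run SQL statements in one transaction and roll back if one fails."""
    common = {"resourceArn": cluster_arn, "secretArn": secret_arn}
    transaction_id = client.begin_transaction(database=database, **common)["transactionId"]
    try:
        for sql in statements:
            client.execute_statement(
                database=database, sql=sql, transactionId=transaction_id, **common
            )
    except Exception:
        try:
            client.rollback_transaction(transactionId=transaction_id, **common)
        except Exception:
            # The transaction might already have been ended by the error.
            pass
        raise  # surface the original error to the caller
    return client.commit_transaction(transactionId=transaction_id, **common)["transactionStatus"]
```

The function returns the status string from CommitTransaction, and re-raises the first exception from any statement so that the caller can inspect it.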

Packet for query is too large

In this case, a row in the result set was too large. The Data API size limit is 64 KB per row in the result set returned by the database.

To solve this issue, make sure that each row in a result set is 64 KB or less.

Database response exceeded size limit

In this case, the result set returned by the database was too large. The Data API limit is 1 MiB for the result set returned by the database.

To solve this issue, make sure that calls to Data API return 1 MiB of data or less. If you need to return more than 1 MiB, you can use multiple ExecuteStatement calls with the LIMIT clause in your query.

For more information about the LIMIT clause, see SELECT syntax in the MySQL documentation.
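The following Python sketch shows one way to split a large result across multiple ExecuteStatement calls by using LIMIT with OFFSET. It assumes a client that behaves like the boto3 rds-data client, and a base query that ends with an ORDER BY clause so that page boundaries stay stable between calls. The helper name fetch_in_pages and the default page size are hypothetical. Choose a page size small enough that each page stays under 1 MiB.

```python
import json


def fetch_in_pages(client, cluster_arn, secret_arn, database, base_sql, page_size=1000):
    """Yield rows of base_sql, using one ExecuteStatement call per page."""
    if not isinstance(page_size, int) or page_size <= 0:
        raise ValueError("page_size must be a positive integer")
    offset = 0
    while True:
        response = client.execute_statement(
            resourceArn=cluster_arn,
            secretArn=secret_arn,
            database=database,
            # Only validated integers are formatted into the SQL text.
            sql=f"{base_sql} LIMIT {page_size} OFFSET {offset}",
            formatRecordsAs="JSON",
        )
        rows = json.loads(response["formattedRecords"])
        yield from rows
        if len(rows) < page_size:
            break  # a short page means there are no more rows
        offset += page_size
```

Because the function is a generator, the application can process each row as it arrives instead of holding the full result set in memory.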

HttpEndpoint is not enabled for cluster <cluster_ID>

Check the following potential causes for this issue:

  • The Aurora DB cluster doesn't support Data API. For example, for Aurora MySQL, you can use Data API only with Aurora Serverless v1. For information about the types of DB clusters RDS Data API supports, see Region and version availability.

  • Data API isn't enabled for the Aurora DB cluster. To use Data API with an Aurora DB cluster, Data API must be enabled for the DB cluster. For information about enabling Data API, see Enabling RDS Data API.

  • The DB cluster was renamed after Data API was enabled for it. In that case, turn off Data API for that cluster and then enable it again.

  • The ARN you specified doesn't precisely match the ARN of the cluster. Check that the ARN returned from another source or constructed by program logic matches the ARN of the cluster exactly. For example, make sure that the ARN you use has the correct letter case for all alphabetic characters.
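Two of the preceding checks can be sketched in Python. The sketch assumes a client that behaves like the boto3 rds client, whose describe_db_clusters response includes the DBClusterArn and HttpEndpointEnabled fields. The helper name check_data_api_cluster and its messages are hypothetical. It doesn't detect an unsupported cluster type or a renamed cluster.

```python
def check_data_api_cluster(rds_client, cluster_id, arn_in_use):
    """Return a list of problems found for the cluster, or an empty list."""
    response = rds_client.describe_db_clusters(DBClusterIdentifier=cluster_id)
    cluster = response["DBClusters"][0]
    problems = []

    if not cluster.get("HttpEndpointEnabled", False):
        problems.append("Data API isn't enabled for the cluster")

    actual_arn = cluster["DBClusterArn"]
    if arn_in_use != actual_arn:
        if arn_in_use.lower() == actual_arn.lower():
            problems.append("The ARN differs from the cluster ARN only in letter case")
        else:
            problems.append("The ARN doesn't match the cluster ARN")
    return problems
```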