Attaching a data source in AWS AppSync
Data sources are resources in your AWS account that GraphQL APIs can interact with. AWS AppSync supports a
multitude of data sources like AWS Lambda, Amazon DynamoDB, relational databases (Amazon Aurora Serverless), Amazon OpenSearch Service, and
HTTP endpoints. An AWS AppSync API can be configured to interact with multiple data sources, enabling you to
aggregate data in a single location. AWS AppSync can use existing AWS resources from your account or provision
DynamoDB tables on your behalf from a schema definition.
The following section will show you how to attach a data source to your GraphQL API.
Types of data sources
Now that you have created a schema in the AWS AppSync console, you can attach a data source to it. When
you initially create an API, there's an option to provision an Amazon DynamoDB table during the creation of the
predefined schema. However, we won't be covering that option in this section. You can see an example of this in
the Launching a schema section.
Instead, we'll be looking at all of the data sources AWS AppSync supports. There are many factors that go
into picking the right solution for your application. The sections below will provide some additional context
for each data source. For general information about data sources, see Data sources.
Amazon DynamoDB
Amazon DynamoDB is one of AWS' main storage solutions for scalable applications. The core component of DynamoDB
is the table, which is simply a collection of data. You will typically create tables based on entities
like Book or Author. Table entry information is stored as items, which are groups of fields that are
unique to each entry. A full item represents a row/record in the database. For example, an item for a
Book entry might include title and author along with their values. The individual fields like the
title and author are called attributes, which are akin to column values in relational databases.
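To make the table, item, and attribute vocabulary concrete, here is a minimal TypeScript sketch that models a Book table in memory. The BookItem shape, the id key, and the InMemoryTable helper are assumptions made for illustration; they are not part of the DynamoDB API.

```typescript
// Hypothetical item shape: each property is one attribute of a Book item.
interface BookItem {
  id: string; // assumed partition key that uniquely identifies the item
  title: string;
  author: string;
}

// In-memory stand-in for a table: a collection of items keyed by partition key.
class InMemoryTable<T extends { id: string }> {
  private items: { [id: string]: T } = {};

  // Writing an item with an existing key replaces the stored item.
  putItem(item: T): void {
    this.items[item.id] = item;
  }

  // Looks up a single item by its key.
  getItem(id: string): T | undefined {
    return this.items[id];
  }

  // Returns every item in the table.
  scan(): T[] {
    return Object.keys(this.items).map((id) => this.items[id] as T);
  }
}

const books = new InMemoryTable<BookItem>();
books.putItem({ id: "1", title: "Example Title", author: "Example Author" });
```

A real table persists items and enforces its key schema on the service side; the sketch only shows how items and attributes relate to the table.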
As you can guess, tables will be used to store data from your application. AWS AppSync allows you to hook up
your DynamoDB tables to your GraphQL API to manipulate data. Take this use case from the Front-end web and mobile blog. This
application lets users sign up for a social media app. Users can join groups and upload posts that are
broadcast to other users subscribed to the group. The application stores user, post, and user group
information in DynamoDB. The GraphQL API (managed by AWS AppSync) interfaces with the DynamoDB table. When a user makes
a change in the system that will be reflected on the front-end, the GraphQL API retrieves these changes and
broadcasts them to other users in real time.
AWS Lambda
Lambda is an event-driven service that automatically provisions the resources needed to run code in
response to an event. Lambda uses functions, which bundle the code, dependencies, and configuration
needed to run your logic. Functions automatically execute when they detect a trigger, an activity that
invokes your function. A trigger could be an application making an API call, an AWS service in your
account spinning up a resource, and so on. When triggered, functions process events, which are JSON documents containing the data to process.
Lambda is good for running code without having to provision the resources to run it. Take this use
case from the Front-end web and mobile blog. This use case is
a bit similar to the one showcased in the DynamoDB section. In this application, the GraphQL API is responsible
for defining the operations for things like adding posts (mutations) and fetching that data (queries). To
implement the functionality of their operations (e.g., getPost(id: String!): Post,
getPostsByAuthor(author: String!): [Post]), they use Lambda functions to process
inbound requests. Under Option 2: AWS AppSync with Lambda resolver, they use
the AWS AppSync service to maintain their schema and link a Lambda data source to one of the operations. When the
operation is called, Lambda interfaces with the Amazon RDS proxy to perform the business logic on the
database.
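As a rough sketch of how one Lambda function could serve both operations, the routing logic below switches on the GraphQL field name. It assumes the event shape of a direct Lambda resolver (arguments and info.fieldName); the Post type and the in-memory posts array are hypothetical stand-ins for the database behind the Amazon RDS proxy.

```typescript
// Minimal shape of the event AppSync passes to the function
// (only the fields this sketch reads).
interface AppSyncEvent {
  arguments: { [key: string]: string };
  info: { fieldName: string };
}

interface Post {
  id: string;
  author: string;
  title: string;
}

// Hypothetical in-memory data standing in for the real database.
const posts: Post[] = [
  { id: "1", author: "alice", title: "First post" },
  { id: "2", author: "bob", title: "Second post" },
  { id: "3", author: "alice", title: "Third post" },
];

// Routes each GraphQL field to the code that implements it.
// A Lambda handler would typically be an async wrapper around this function.
function resolveField(event: AppSyncEvent): Post | Post[] | null {
  switch (event.info.fieldName) {
    case "getPost": {
      for (const post of posts) {
        if (post.id === event.arguments["id"]) return post;
      }
      return null; // no post with that id
    }
    case "getPostsByAuthor":
      return posts.filter((post) => post.author === event.arguments["author"]);
    default:
      throw new Error("Unhandled field: " + event.info.fieldName);
  }
}
```

Keeping the routing in a plain function makes it easy to exercise without deploying anything.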
Amazon RDS
Amazon RDS lets you quickly build and configure relational databases. In Amazon RDS, you'll create a generic
database instance that will serve as the isolated database environment
in the cloud. In this instance, you'll use a DB engine, which is the actual
RDBMS software (PostgreSQL, MySQL, etc.). The service offloads much of the backend work by providing
scalability using AWS' infrastructure, security services such as patching and encryption, and lowered
administrative costs for deployments.
Take the same use case from the Lambda section. Option 3: AWS AppSync with Amazon RDS resolver links the GraphQL API in AWS AppSync to Amazon RDS
directly. Using a data API, they associate the database with the GraphQL API. A resolver is attached to a field
(usually a query, mutation, or subscription) and implements the SQL statements needed to access the
database. When a request calling the field is made by the client, the resolver executes the statements
and returns the response.
Amazon EventBridge
In EventBridge, you'll create event buses, which are pipelines that receive
events from services or applications you attach (the event source) and
process them based on a set of rules. An event is some state change in
an execution environment, while a rule is a set of filters for events. A
rule follows an event pattern, or metadata of an event's state change
(id, Region, account number, ARN(s), etc.). When an event matches the event pattern, EventBridge will send the
event across the pipeline to the destination service (target) and
trigger the action specified in the rule.
EventBridge is good for routing state-changing operations to some other service. Take this use case from the Front-end web and mobile blog. The example depicts an e-commerce solution that
has several teams maintaining different services. One of these services provides order updates to the
customer at each step of the delivery (order placed, in progress, shipped, delivered, etc.) on the
front-end. However, the front-end team managing this service doesn't have direct access to the ordering
system data as that's maintained by a separate backend team. The backend team's ordering system is also
described as a black box, so it's hard to glean information about the way they're structuring their data.
However, the backend team did set up a system that published order data through an event bus managed by
EventBridge. To access the data coming from the event bus and route it to the front-end, the front-end team created
a new target pointing to their GraphQL API sitting in AWS AppSync. They also created a rule to only send data
relevant to the order update. When an update is made, the data from the event bus is sent to the GraphQL
API. The schema in the API processes the data, then passes it to the front-end.
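The idea of a rule filtering events by pattern can be sketched as follows. This is a deliberately simplified matcher that only checks exact values for source, detail-type, and flat detail fields; actual EventBridge patterns support more operators. The event values and the orderUpdateRule pattern are hypothetical.

```typescript
interface OrderEvent {
  source: string;
  "detail-type": string;
  detail: { [key: string]: string };
}

// Each pattern field lists the values that are allowed for that event field.
interface EventPattern {
  source?: string[];
  "detail-type"?: string[];
  detail?: { [key: string]: string[] };
}

// True when no constraint is given, or the value is one of the allowed values.
function allows(allowed: string[] | undefined, value: string | undefined): boolean {
  if (allowed === undefined) return true;
  return value !== undefined && allowed.indexOf(value) !== -1;
}

// An event matches only if every constrained field matches.
function matchesPattern(pattern: EventPattern, event: OrderEvent): boolean {
  if (!allows(pattern.source, event.source)) return false;
  if (!allows(pattern["detail-type"], event["detail-type"])) return false;
  const detailPattern = pattern.detail || {};
  return Object.keys(detailPattern).every((key) =>
    allows(detailPattern[key], event.detail[key])
  );
}

// Hypothetical rule: forward only order status updates from the ordering system.
const orderUpdateRule: EventPattern = {
  source: ["orders.system"],
  "detail-type": ["Order Status Update"],
};

const shippedEvent: OrderEvent = {
  source: "orders.system",
  "detail-type": "Order Status Update",
  detail: { orderId: "123", status: "SHIPPED" },
};
```

Here matchesPattern(orderUpdateRule, shippedEvent) evaluates to true, so the event would be forwarded to the target.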
None data sources
If you aren't planning on using a data source, you can set it to none. A none data source, while
still explicitly categorized as a data source, isn't a storage medium. Typically, a
resolver will invoke one or more data sources at some point to process the request. However, there are
situations where you may not need to manipulate a data source. Setting the data source to none
will run the request, skip the data invocation step, then run the response.
Take the same use case from
the EventBridge section. In the schema, the mutation processes the status update, then sends it out to subscribers.
Recalling how resolvers work, there's usually at least one data source invocation. However, the data in this
scenario was already sent automatically by the event bus. This means there's no need for the mutation to
perform a data source invocation; the order status can simply be handled locally. The mutation is set to
none, which acts as a pass-through value with no data source invocation. The schema is then
populated with the data, which is sent out to subscribers.
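The pass-through behavior can be sketched with a pair of resolver handlers. The sketch assumes the convention that a none data source request returns a payload and that the payload is what the response handler later sees as ctx.result; the NoneContext type and the orderId and status arguments are hypothetical. In an actual resolver file, request and response would be exported.

```typescript
// Minimal stand-in for the resolver context; only the fields used here.
interface NoneContext {
  args: { orderId: string; status: string };
  result?: unknown;
}

// Request handler: nothing is fetched; the arguments are forwarded as the payload.
function request(ctx: NoneContext) {
  return { payload: ctx.args };
}

// Response handler: for a none data source, ctx.result holds the forwarded payload.
function response(ctx: NoneContext) {
  return ctx.result;
}
```

Because no data source is invoked, the value returned to the client (and to subscribers) is exactly what the mutation received.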
OpenSearch
Amazon OpenSearch Service is a suite of tools to implement full-text searching, data visualization, and logging. You can
use this service to query the structured data you've uploaded.
In this service, you'll create instances of OpenSearch. These are called nodes. In a node, you'll be adding at least one index.
Indices conceptually are a bit like tables in relational databases. (However, OpenSearch isn't ACID
compliant, so it shouldn't be used that way). You'll populate your index with data that you upload to the
OpenSearch service. When your data is uploaded, it will be indexed in one or more shards that exist in the
index. A shard is like a partition of your index that contains some of your
data and can be queried separately from other shards. Once uploaded, your data will be structured as JSON
files called documents. You can then query the node for data in the
document.
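As an illustration of querying an index through a resolver, the sketch below builds a search request and unwraps the matching documents. It assumes a hypothetical index named post with a title field and a keyword argument; the SearchContext type is a simplified stand-in for the resolver context. In an actual resolver file, request and response would be exported.

```typescript
// Simplified resolver context: the search argument and the raw search result.
interface SearchContext {
  args: { keyword: string };
  result?: { hits: { hits: Array<{ _source: unknown }> } };
}

// Builds a search request against the hypothetical "post" index.
function request(ctx: SearchContext) {
  return {
    operation: "GET",
    path: "/post/_search",
    params: {
      body: {
        from: 0,
        size: 50,
        query: { match: { title: ctx.args.keyword } },
      },
    },
  };
}

// Returns only the stored documents, dropping the search metadata around them.
function response(ctx: SearchContext) {
  const hits = ctx.result ? ctx.result.hits.hits : [];
  return hits.map((hit) => hit._source);
}
```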
HTTP endpoints
You can use HTTP endpoints as data sources. AWS AppSync can send requests to the endpoints with the
relevant information like params and payload. The HTTP response will be exposed to the resolver, which will
return the final response after it finishes its operation(s).
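A minimal sketch of that request and response flow is shown below. The /posts/ resource path, the id argument, and the HttpContext type are assumptions for illustration; the endpoint's base URL is configured on the data source itself. A real resolver would normally report failures through the AppSync utilities rather than a plain Error, and request and response would be exported.

```typescript
// Simplified resolver context: the field argument and the raw HTTP result.
interface HttpContext {
  args: { id: string };
  result?: { statusCode: number; body: string };
}

// Describes the HTTP call relative to the endpoint set on the data source.
function request(ctx: HttpContext) {
  return {
    method: "GET",
    resourcePath: "/posts/" + ctx.args.id, // hypothetical path
    params: {
      headers: { "Content-Type": "application/json" },
    },
  };
}

// Checks the status code, then parses the JSON body into the field's result.
function response(ctx: HttpContext) {
  const result = ctx.result;
  if (!result || result.statusCode !== 200) {
    throw new Error("Request failed with status " + (result ? result.statusCode : "unknown"));
  }
  return JSON.parse(result.body);
}
```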
Adding a data source
If you created a data source, you can link it to the AWS AppSync service and, more specifically, the
API.
- Console
1. Sign in to the AWS Management Console and open the AppSync console.
2. Choose your API in the Dashboard.
3. In the Sidebar, choose Data Sources.
4. Choose Create data source.
5. Give your data source a name. You can also give it a description, but that's optional.
6. Choose your Data source type.
7. For DynamoDB, you'll have to choose your Region, then the table in the Region. You can
   dictate interaction rules with your table by choosing to make a new generic table role or
   importing an existing role for the table. You can enable versioning,
   which can automatically create versions of data for each request when multiple clients are
   trying to update data at the same time. Versioning is used to keep and maintain multiple
   variants of data for conflict detection and resolution purposes. You can also enable
   automatic schema generation, which takes your data source and generates some of the CRUD,
   List, and Query operations needed to access it in your schema.
   For OpenSearch, you'll have to choose your Region, then the domain (cluster) in the
   Region. You can dictate interaction rules with your domain by choosing to make a new
   role or importing an existing role.
   For Lambda, you'll have to choose your Region, then the ARN of the Lambda function in
   the Region. You can dictate interaction rules with your Lambda function by choosing to
   make a new role or importing an existing role.
   For HTTP, you'll have to enter your HTTP endpoint.
   For EventBridge, you'll have to choose your Region, then the event bus in the Region.
   You can dictate interaction rules with your event bus by choosing to make a new
   role or importing an existing role.
   For RDS, you'll have to choose your Region, then the secret store (username and
   password), database name, and schema.
   For none, you will add a data source with no actual data source. This is for handling
   resolvers locally rather than through an actual data source.
   If you're importing existing roles, they need a trust policy. For more information,
   see the IAM trust policy.
8. Choose Create.
Alternatively, if you're creating a DynamoDB data source, you can go to the Schema page in the console, choose Create
Resources at the top of the page, then fill out a predefined model to convert
into a table. In this option, you will fill out or import the base type, configure the basic
table data including the partition key, and review the schema changes.
- CLI
Create your data source by running the create-data-source command.
You'll need to enter a few parameters for this particular command:
1. The api-id of your API.
2. The name of your data source.
3. The type of data source. Depending on the data source type you choose, you
   may need to enter a service-role-arn and a type-specific -config option
   (for example, --dynamodb-config).
An example command may look like this:
aws appsync create-data-source --api-id abcdefghijklmnopqrstuvwxyz --name data_source_name --type data_source_type --service-role-arn arn:aws:iam::107289374856:role/role_name --[data_source_type]-config {params}
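To show how the placeholders in that command fit together, here is a small TypeScript sketch that assembles the argument list for a DynamoDB data source. The buildCreateDataSourceArgs helper is hypothetical, and the table name Books and Region us-west-2 are made-up values; only a subset of the data source types is modeled.

```typescript
// Subset of the data source types accepted by --type.
type DataSourceType = "AMAZON_DYNAMODB" | "AWS_LAMBDA" | "HTTP" | "NONE";

interface DataSourceOptions {
  apiId: string;
  name: string;
  type: DataSourceType;
  serviceRoleArn?: string;
  // Type-specific option name (for example "dynamodb-config") and its JSON value.
  config?: { flag: string; value: object };
}

// Hypothetical helper: returns the arguments to pass to the aws executable.
function buildCreateDataSourceArgs(opts: DataSourceOptions): string[] {
  const args = [
    "appsync", "create-data-source",
    "--api-id", opts.apiId,
    "--name", opts.name,
    "--type", opts.type,
  ];
  if (opts.serviceRoleArn) args.push("--service-role-arn", opts.serviceRoleArn);
  if (opts.config) args.push("--" + opts.config.flag, JSON.stringify(opts.config.value));
  return args;
}

const ddbArgs = buildCreateDataSourceArgs({
  apiId: "abcdefghijklmnopqrstuvwxyz",
  name: "data_source_name",
  type: "AMAZON_DYNAMODB",
  serviceRoleArn: "arn:aws:iam::107289374856:role/role_name",
  config: { flag: "dynamodb-config", value: { tableName: "Books", awsRegion: "us-west-2" } },
});
```

A none data source needs neither a role nor a config option, so only the first three parameters would be supplied in that case.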
- CDK
Before you use the CDK, we recommend reviewing the CDK's official documentation along
with AWS AppSync's CDK reference.
The steps listed below will only show a general example of the snippet used to add a particular
resource. This is not meant to be a working solution in your
production code. We also assume you already have a working app.
To add your particular data source, you'll need to add the construct to your stack file. A list of
data source types can be found here:
1. In general, you may have to add the import directive to the service you're using. For
   example, it may follow the forms:
import * as x from 'x'; // import wildcard as the 'x' keyword from 'x-service'
import {a, b, ...} from 'c'; // import {specific constructs} from 'c-service'
   For example, here's how you could import the AWS AppSync and DynamoDB services:
import * as appsync from 'aws-cdk-lib/aws-appsync';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
2. Some services like RDS require some additional setup in the stack file before creating the
   data source (e.g., VPC creation, roles, and access credentials). Consult the examples in the
   relevant CDK pages for more information.
3. For most data sources, especially AWS services, you'll be creating a new instance of the
   data source in your stack file. Typically, this will look like the following:
const add_data_source_func = new service_scope.resource_name(scope: Construct, id: string, props: data_source_props);
For example, here's an Amazon DynamoDB table:
const add_ddb_table = new dynamodb.Table(this, 'Table_ID', {
  partitionKey: {
    name: 'id',
    type: dynamodb.AttributeType.STRING,
  },
  sortKey: {
    // Placeholder name; the sort key must use a different attribute than the partition key.
    name: 'sort_id',
    type: dynamodb.AttributeType.STRING,
  },
  tableClass: dynamodb.TableClass.STANDARD,
});
Most data sources will have at least one required prop (denoted without a ? symbol).
Consult the CDK documentation to see which props are needed.
4. Next, you need to link the data source to the GraphQL API. The recommended method is to add
   it when you make a function for your pipeline resolver. For instance, the snippet below is a
   function that scans all elements in a DynamoDB table:
const add_func = new appsync.AppsyncFunction(this, 'func_ID', {
  name: 'func_name_in_console',
  // The api prop is required; add_api is the GraphqlApi construct created earlier.
  api: add_api,
  dataSource: add_api.addDynamoDbDataSource('data_source_name_in_console', add_ddb_table),
  code: appsync.Code.fromInline(`
    export function request(ctx) {
      return { operation: 'Scan' };
    }
    export function response(ctx) {
      return ctx.result.items;
    }
  `),
  runtime: appsync.FunctionRuntime.JS_1_0_0,
});
In the dataSource prop, you can call the GraphQL API (add_api) and use one of its built-in
methods (addDynamoDbDataSource) to make the association between the table and the GraphQL API.
The arguments are the name of this link that will exist in the AWS AppSync console
(data_source_name_in_console in this example) and the table construct (add_ddb_table). More on
this topic will be revealed in the next section when you start making resolvers.
There are alternative methods for linking a data source. You could technically add
api to the props list in the table function. For example, here's the snippet
from step 3 but with an api prop containing a GraphQL API:
const add_api = new appsync.GraphqlApi(this, 'API_ID', {
...
});
const add_ddb_table = new dynamodb.Table(this, 'Table_ID', {
...
api: add_api
});
Alternatively, you can call the GraphqlApi construct separately:
const add_api = new appsync.GraphqlApi(this, 'API_ID', {
...
});
const add_ddb_table = new dynamodb.Table(this, 'Table_ID', {
...
});
const link_data_source = add_api.addDynamoDbDataSource('data_source_name_in_console', add_ddb_table);
We recommend only creating the association in the function's props. Otherwise, you'll either
have to link your resolver function to the data source manually in the AWS AppSync console (if you
want to keep using the console value data_source_name_in_console) or create a
separate association in the function under another name like data_source_name_in_console_2.
This is due to limitations in how the props process information.
You'll have to redeploy the app to see your changes.
IAM trust policy
If you’re using an existing IAM role for your data source, you need to grant that role the appropriate
permissions to perform operations on your AWS resource, such as PutItem on an Amazon DynamoDB
table. You also need to modify the trust policy on that role to allow AWS AppSync to use it for resource
access as shown in the following example policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "appsync.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
You can also add conditions to your trust policy to limit access to the data source as desired.
Currently, SourceArn and SourceAccount keys can be used in these conditions. For
example, the following policy limits access to your data source to the account 123456789012:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "appsync.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "123456789012"
        }
      }
    }
  ]
}
Alternatively, you can limit access to a data source to a specific API, such as
abcdefghijklmnopq, using the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "appsync.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:appsync:us-west-2:123456789012:apis/abcdefghijklmnopq"
        }
      }
    }
  ]
}
You can limit access to all AWS AppSync APIs in a specific Region, such as us-east-1, using
the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "appsync.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:appsync:us-east-1:123456789012:apis/*"
        }
      }
    }
  ]
}
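The policy variants above differ only in their Condition block, which the following TypeScript sketch makes explicit. The buildTrustPolicy helper and its option names are hypothetical; it simply assembles the same JSON documents shown in this section.

```typescript
interface TrustPolicyOptions {
  sourceAccount?: string; // restricts use of the role to one account
  sourceArn?: string;     // restricts use of the role to one API or an ARN pattern
}

// Hypothetical helper: builds a trust policy that lets AWS AppSync assume the role.
function buildTrustPolicy(opts: TrustPolicyOptions = {}) {
  const condition: { [operator: string]: { [key: string]: string } } = {};
  if (opts.sourceAccount) {
    condition["StringEquals"] = { "aws:SourceAccount": opts.sourceAccount };
  }
  if (opts.sourceArn) {
    condition["ArnEquals"] = { "aws:SourceArn": opts.sourceArn };
  }

  const statement: { [key: string]: unknown } = {
    Effect: "Allow",
    Principal: { Service: "appsync.amazonaws.com" },
    Action: "sts:AssumeRole",
  };
  // Only attach a Condition block when at least one restriction was requested.
  if (Object.keys(condition).length > 0) {
    statement["Condition"] = condition;
  }
  return { Version: "2012-10-17", Statement: [statement] };
}

// Policy limited to a single API, matching the second conditional example above.
const apiScopedPolicy = buildTrustPolicy({
  sourceArn: "arn:aws:appsync:us-west-2:123456789012:apis/abcdefghijklmnopq",
});
```

Calling JSON.stringify(apiScopedPolicy, null, 2) would produce a document you could paste into the role's trust relationship.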
In the next section (Configuring Resolvers), we'll add our resolver business logic and attach it to the fields in our
schema to process the data in our data source.
For more information regarding role policy configuration, see Modifying a role in the
IAM User Guide.
For more information regarding cross-account access of AWS Lambda resolvers for AWS AppSync, see Building cross-account
AWS Lambda resolvers for AWS AppSync.