Caching and compression - AWS AppSync

Caching and compression

AWS AppSync's server-side data caching capabilities make data available in a high-speed, in-memory cache, improving performance and decreasing latency. This reduces the need to directly access data sources. Caching is available for both unit and pipeline resolvers.

AWS AppSync also allows you to compress API responses so that payload content loads and downloads faster. This can reduce the strain on your applications and may also reduce your data transfer charges. Compression behavior is configurable.

Refer to this section for help defining the desired behavior of server-side caching and compression in your AWS AppSync API.

Instance Types

AWS AppSync hosts Amazon ElastiCache for Redis instances in the same AWS account and AWS Region as your AWS AppSync API.

The following ElastiCache for Redis instance types are available:


  • 1 vCPU, 1.5 GiB RAM, low to moderate network performance

  • 2 vCPU, 3 GiB RAM, low to moderate network performance

  • 2 vCPU, 12.3 GiB RAM, up to 10 Gigabit network performance

  • 4 vCPU, 25.05 GiB RAM, up to 10 Gigabit network performance

  • 8 vCPU, 50.47 GiB RAM, up to 10 Gigabit network performance

  • 16 vCPU, 101.38 GiB RAM, up to 10 Gigabit network performance

  • 32 vCPU, 203.26 GiB RAM, 10 Gigabit network performance (not available in all Regions)

  • 48 vCPU, 317.77 GiB RAM, 10 Gigabit network performance


Historically, you selected a specific instance type (such as t2.medium). As of July 2020, these legacy instance types continue to be available, but their use is deprecated and discouraged. We recommend that you use the generic instance types described here.

Caching behavior

The following are the behaviors related to caching:


No server-side caching.

Full request caching

If the data is not in the cache, it is retrieved from the data source and populates the cache until the time to live (TTL) expiration. All subsequent requests to your API are returned from the cache. This means that data sources aren't contacted directly unless the TTL expires. In this setting, we use the contents of the $context.arguments and $context.identity maps as caching keys.
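The read-through behavior described above can be illustrated with a small simulation. This is only a sketch of the idea, not how AWS AppSync is implemented: the class name `FullRequestCache`, the JSON-based key serialization, and the injectable `clock` are all assumptions made for the example.

```python
import json
import time


class FullRequestCache:
    """Toy read-through cache keyed on a request's arguments and identity."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}  # cache key -> (expiry time, cached value)

    @staticmethod
    def make_key(arguments, identity):
        # Serialize both maps deterministically so equal maps give equal keys.
        return json.dumps({"arguments": arguments, "identity": identity},
                          sort_keys=True)

    def resolve(self, arguments, identity, fetch):
        key = self.make_key(arguments, identity)
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # served from the cache; the data source is not contacted
        value = fetch(arguments)  # cache miss or expired entry: go to the data source
        self.entries[key] = (now + self.ttl_seconds, value)
        return value
```

In this sketch, two requests share an entry only when both their arguments and their identity match, and the data source is contacted again once the entry's TTL has passed.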

Per-resolver caching

With this setting, each resolver must be explicitly opted in for it to cache responses. You can specify a TTL and caching keys on the resolver. Caching keys that you can specify are the top-level maps $context.arguments, $context.source, and $context.identity, and/or string fields from these maps. The TTL value is mandatory, but the caching keys are optional. If you don't specify any caching keys, the defaults are the contents of the $context.arguments, $context.source, and $context.identity maps.

For example, you could use the following combinations:

  • $context.arguments and $context.source

  • $context.arguments and $context.identity.sub

  • $context.arguments.id or $context.arguments.InputType.id

  • $context.source.id and $context.identity.sub

  • $context.identity.claims.username

When you specify only a TTL and no caching keys, the behavior of the resolver is the same as full request caching.
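To see why the choice of caching keys matters, the following sketch builds a cache key from a list of `$context` paths. The helper names `lookup` and `cache_key` and the tuple-of-JSON-strings key format are assumptions for illustration; the actual key format used by AWS AppSync is not described in this section.

```python
import json


def lookup(context, path):
    """Follow a dotted path such as '$context.arguments.id' through nested dicts."""
    parts = path.lstrip("$").split(".")
    if parts[0] != "context":
        raise ValueError(f"caching key must start with $context: {path}")
    value = context
    for part in parts[1:]:
        value = value[part]
    return value


def cache_key(context, caching_keys):
    """Build a key from the selected values, in the order the keys are listed."""
    return tuple(json.dumps(lookup(context, key), sort_keys=True)
                 for key in caching_keys)
```

With only `$context.arguments` as the caching key, two different users who send the same arguments map to the same entry. Adding `$context.identity.sub` gives each user a separate entry.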

Cache time to live

This setting defines the amount of time to store cached entries in memory. The maximum TTL is 3,600 seconds (1 hour), after which entries are automatically deleted.

Cache encryption

Cache encryption comes in the following two flavors. These are similar to the settings that ElastiCache for Redis allows. You can enable the encryption settings only when first enabling caching for your AWS AppSync API.

  • Encryption in transit – Requests between AWS AppSync, the cache, and data sources (except insecure HTTP data sources) are encrypted at the network level. Because there is some processing needed to encrypt and decrypt the data at the endpoints, in-transit encryption can impact performance.

  • Encryption at rest – Data saved to disk from memory during swap operations is encrypted at the cache instance. This setting also impacts performance.

To invalidate cache entries, you can make a flush cache API call using either the AWS AppSync console or the AWS Command Line Interface (AWS CLI).

For more information, see the ApiCache data type in the AWS AppSync API Reference.
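As a rough sketch of how these settings fit together, the helper below assembles and validates the parameters for a cache. The function name `build_api_cache_params` is hypothetical. The field names are assumed to follow the ApiCache data type, and the instance type value `"SMALL"` used in the usage note is likewise an assumption, so check both against the API reference before relying on them.

```python
MAX_TTL_SECONDS = 3600  # maximum TTL stated in this section

# Assumed values for the caching behavior field.
BEHAVIORS = ("FULL_REQUEST_CACHING", "PER_RESOLVER_CACHING")


def build_api_cache_params(api_id, ttl, behavior, instance_type,
                           transit_encryption=False, at_rest_encryption=False):
    """Validate cache settings and return them as a request-style dict."""
    if behavior not in BEHAVIORS:
        raise ValueError(f"unknown caching behavior: {behavior}")
    if not 1 <= ttl <= MAX_TTL_SECONDS:
        raise ValueError(f"ttl must be between 1 and {MAX_TTL_SECONDS} seconds")
    return {
        "apiId": api_id,
        "ttl": ttl,
        "apiCachingBehavior": behavior,
        "type": instance_type,
        # Encryption can be enabled only when caching is first enabled.
        "transitEncryptionEnabled": transit_encryption,
        "atRestEncryptionEnabled": at_rest_encryption,
    }
```

For example, `build_api_cache_params("abc123", 300, "PER_RESOLVER_CACHING", "SMALL")` (with a made-up API ID) returns a dict that could then be passed to an SDK call such as boto3's `create_api_cache`, assuming the SDK is installed and configured.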

Cache eviction

When you set up AWS AppSync's server-side caching, you can configure a maximum TTL. This value defines the amount of time that cached entries are stored in memory. When you must remove specific entries from your cache (for example, when the data in your data sources has changed and a cache entry is now stale), you can use AWS AppSync's evictFromApiCache extensions utility in your resolver's request or response mapping template. To evict an item from the cache, you must know its key. For this reason, if you must evict items dynamically, we recommend using per-resolver caching and explicitly defining the caching keys used to add entries to your cache.

Evicting a cache entry

To evict an item from the cache, use the evictFromApiCache extensions utility. Specify the type name and field name, then provide an object of key-value items to build the key of the entry that you want to evict. In the object, each key represents a valid entry from the $context object that is used in the cached resolver's cachingKeys list. Each value is the actual value used to build that part of the cache key. You must put the items in the object in the same order as the caching keys in the cached resolver's cachingKeys list.
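The ordering requirement is easy to get wrong, so here is a minimal sketch of a helper that derives the eviction object from a resolver's caching keys. The function `build_eviction_keys` is hypothetical and runs outside AWS AppSync; it only mirrors the rule that the object's entries follow the order of the cachingKeys list, with the leading `$` dropped from each key name.

```python
def build_eviction_keys(caching_keys, values):
    """Return an eviction object ordered like the resolver's cachingKeys list.

    caching_keys: for example ["$context.identity.username", "$context.arguments.id"]
    values: maps each caching key to the value of the entry to evict
    """
    eviction = {}
    for key in caching_keys:  # iterate in the resolver's order, not the caller's
        if key not in values:
            raise KeyError(f"missing value for caching key {key}")
        # Entries are named without the leading "$", e.g. "context.arguments.id".
        eviction[key.lstrip("$")] = values[key]
    return eviction
```

Python dicts preserve insertion order, so the returned object lists its entries in the same order as `caching_keys` regardless of how `values` was built.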

For example, see the following schema:

type Note {
  id: ID!
  title: String
  content: String!
}

type Query {
  getNote(id: ID!): Note
}

type Mutation {
  updateNote(id: ID!, content: String!): Note
}

In this example, you can enable per-resolver caching, then enable it for the getNote query. Then, you can configure the caching key to consist of [$context.arguments.id].

When you try to get a Note, AWS AppSync builds the cache key from the id argument of the getNote query and uses it to perform a lookup in its server-side cache.

When you update a Note, you must evict the entry for the specific note to make sure that the next request fetches it from the backend data source. One way to do this is in the request mapping template of the updateNote mutation.

The following example shows one way to handle the eviction using this method:

## Build the eviction key in the same order as the resolver's cachingKeys list.
#set($cachingKeys = {})
$util.qr($cachingKeys.put("context.arguments.id", $context.arguments.id))
$extensions.evictFromApiCache("Query", "getNote", $cachingKeys)
{
  "version" : "2017-02-28",
  "operation" : "UpdateItem",
  "key" : {
    "id" : $util.dynamodb.toDynamoDBJson($context.arguments.id)
  },
  "update" : {
    "expression" : "SET #content = :content",
    "expressionNames" : {
      "#content" : "content"
    },
    "expressionValues" : {
      ":content" : $util.dynamodb.toDynamoDBJson($context.arguments.content)
    }
  }
}

Alternatively, you can also handle the eviction in the response mapping template:

#set($cachingKeys = {})
$util.qr($cachingKeys.put("context.arguments.id", $context.arguments.id))
$extensions.evictFromApiCache("Query", "getNote", $cachingKeys)
$util.toJson($context.result)

When the updateNote mutation is processed, AWS AppSync tries to evict the entry. If an entry is successfully cleared, the response contains an apiCacheEntriesDeleted value in the extensions object that shows how many entries were deleted:

"extensions": { "apiCacheEntriesDeleted": 1}
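A client or test can read this value from the response body. The sketch below assumes the standard GraphQL response shape, with `extensions` as a top-level field next to `data`; the function name `entries_deleted` is made up for the example.

```python
import json


def entries_deleted(response_body):
    """Return apiCacheEntriesDeleted from a GraphQL response body, or 0 if absent."""
    payload = json.loads(response_body)
    extensions = payload.get("extensions") or {}
    return int(extensions.get("apiCacheEntriesDeleted", 0))
```

A result of 0 here means only that the field was missing from the response, for example because no matching entry was in the cache.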

Evicting a cache entry based on identity

You can create caching keys based on multiple values from the $context object.

For example, take the following schema that uses Amazon Cognito user pools as the default auth mode and is backed by an Amazon DynamoDB data source:

type Note {
  id: ID! # a slug; e.g.: "my-first-note-on-graphql"
  title: String
  content: String!
}

type Query {
  getNote(id: ID!): Note
}

type Mutation {
  updateNote(id: ID!, content: String!): Note
}

The Note object types are saved in a DynamoDB table. The table has a composite primary key that uses the Amazon Cognito user name as the partition key and the id (a slug) of the Note as the sort key. This is a multi-tenant system that allows multiple users to host and update their private Note objects, which are never shared.

Because this is a read-heavy system, the getNote query is cached using per-resolver caching, with the caching key composed of [$context.identity.username, $context.arguments.id]. When a Note is updated, you can evict the entry for that specific Note. You must add the components in the object in the same order that they are specified in your resolver's cachingKeys list.

The following example shows this:

## Add the key components in the same order as the resolver's cachingKeys list.
#set($cachingKeys = {})
$util.qr($cachingKeys.put("context.identity.username", $context.identity.username))
$util.qr($cachingKeys.put("context.arguments.id", $context.arguments.id))
$extensions.evictFromApiCache("Query", "getNote", $cachingKeys)
{
  "version" : "2017-02-28",
  "operation" : "UpdateItem",
  "key" : {
    "username" : $util.dynamodb.toDynamoDBJson($context.identity.username),
    "slug" : $util.dynamodb.toDynamoDBJson($context.arguments.id)
  },
  "update" : {
    "expression" : "SET #content = :content",
    "expressionNames" : {
      "#content" : "content"
    },
    "expressionValues" : {
      ":content" : $util.dynamodb.toDynamoDBJson($context.arguments.content)
    }
  }
}

A backend system can also update the Note and evict the entry. For example, take this mutation:

type Mutation { updateNoteFromBackend(id: ID!, content: String!, username: ID!): Note @aws_iam }

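You can still evict the entry, but the caller's identity is no longer the note owner's. Instead, add the components of the caching key to the cachingKeys object explicitly, taking the user name from the mutation's username argument.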

In the following example, the eviction occurs in the response mapping template of the resolver:

#set($cachingKeys = {})
$util.qr($cachingKeys.put("context.identity.username", $context.arguments.username))
$util.qr($cachingKeys.put("context.arguments.id", $context.arguments.id))
$extensions.evictFromApiCache("Query", "getNote", $cachingKeys)
$util.toJson($context.result)

In cases where your backend data has been updated outside of AWS AppSync, you can evict an item from the cache by calling a mutation that uses a NONE data source.

Compressing API responses

AWS AppSync allows clients to request compressed payloads. When a request indicates that compressed content is preferred, AWS AppSync compresses the API response before returning it. Compressed responses download faster, and your data transfer charges may be reduced as well.


Compression is available on all new APIs created after June 1st, 2020.

AWS AppSync can compress GraphQL query payloads between 1,000 and 10,000,000 bytes in size. To enable compression, a client must send the Accept-Encoding header with the value gzip or br. Compression can be verified by checking the Content-Encoding header's value in the response (gzip or br).
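The header negotiation can be sketched with the Python standard library. This is a simplified model rather than AWS AppSync's implementation: the function names are made up, the size limits are assumed to be inclusive, quality values such as `q=0` are ignored, and only gzip is handled because Brotli is not part of the standard library.

```python
import gzip

MIN_BYTES = 1_000        # smallest payload eligible for compression (assumed inclusive)
MAX_BYTES = 10_000_000   # largest payload eligible for compression (assumed inclusive)


def accepts(accept_encoding, coding):
    """True if the Accept-Encoding header value lists the given coding."""
    listed = [part.split(";")[0].strip().lower()
              for part in accept_encoding.split(",")]
    return coding in listed


def maybe_compress(payload, accept_encoding):
    """Toy server side: return (body, response headers)."""
    if MIN_BYTES <= len(payload) <= MAX_BYTES and accepts(accept_encoding, "gzip"):
        return gzip.compress(payload), {"Content-Encoding": "gzip"}
    return payload, {}  # too small, too large, or gzip not requested


def decode(body, headers):
    """Client side: undo gzip when Content-Encoding says the body is compressed."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(body)
    return body
```

A client that checks `Content-Encoding` before decoding, as `decode` does, handles both compressed and uncompressed responses, which matters because small payloads come back uncompressed even when the request asks for gzip.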

The query explorer in the AWS AppSync console automatically sets the header value in the request by default. If you execute a query that has a large enough response, compression can be confirmed using your browser developer tools.