Change data capture streams (Preview)
Important
This feature is provided as an AWS Preview and is subject to change. For more
information, see section 2, Betas and Previews, in the AWS Service Terms.
Before general availability, we will add new operation types ("op": "u" for
updates) to your stream payload. To ensure your application handles these changes without
modification, treat any unrecognized op value as an upsert by applying the
after payload. See Understanding CDC records for details.
Amazon Aurora DSQL change data capture (CDC) streams deliver committed database changes in near real time directly to Amazon Kinesis Data Streams. Aurora DSQL delivers each committed row-level change as a structured JSON record to a Kinesis data stream that you configure.
CDC is useful when you want to:
- Keep downstream systems in sync – Replicate changes to a search index, cache, data warehouse, or analytics system without batch jobs.
- Build event-driven architectures – Trigger workflows, notifications, or microservice actions in response to database changes.
- Maintain an audit trail – Capture every committed change for compliance, debugging, or historical analysis.
- Decouple producers from consumers – Let the database focus on transactions while downstream systems process changes at their own pace.
How it works
Aurora DSQL reads committed transactions, formats each row change as a structured JSON record,
and delivers it to a Kinesis data stream that you configure. CDC automatically captures every
INSERT, UPDATE, and DELETE across all user tables in
the cluster. Apply filtering logic in your downstream apps by using the
source.schema and source.table fields in each CDC record to
focus on the tables or changes your app needs.
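As a sketch of that filtering logic, the following Python snippet checks the source.schema and source.table fields of a decoded CDC record against an allowlist. It assumes each Kinesis record's data payload is a JSON-encoded CDC record with the field names described on this page; the table names are hypothetical.

```python
import json

# Hypothetical tables this consumer cares about.
WATCHED = {("public", "orders"), ("public", "order_items")}

def is_relevant(record_data: bytes) -> bool:
    """Return True if the CDC record targets a watched table."""
    cdc = json.loads(record_data)
    source = cdc.get("source", {})
    return (source.get("schema"), source.get("table")) in WATCHED

sample = json.dumps(
    {"op": "c", "source": {"schema": "public", "table": "orders"}}
).encode()
print(is_relevant(sample))  # True
```

Filtering on the consumer side like this keeps the stream configuration simple: every table is captured, and each downstream app decides what to keep.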
CDC streams are fully managed. Aurora DSQL manages all infrastructure required to capture
change events, monitors stream health, and reports the status through the
GetStream API operation and CloudWatch metrics.
CDC streams use a bring-your-own-target model. You create and manage the Kinesis data
stream in your account, and Aurora DSQL assumes an IAM role that you configure to write CDC
records on your behalf. You're responsible for the target's capacity, encryption,
and retention settings. For the latest supported targets, see the
TargetDefinition parameter in CreateStream in the
Amazon Aurora DSQL API Reference. For a complete list of CDC stream API operations, see the
Amazon Aurora DSQL API
Reference.
Ordering and delivery semantics
Delivery guarantees
Aurora DSQL CDC guarantees at-least-once delivery: every committed change reaches the target, but
Aurora DSQL can deliver a record more than one time. Design your app to handle duplicates. You can
identify a duplicate by comparing source.ts_ns and the primary key
values: a duplicate has the same values as the original delivery.
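One way to apply that rule is to build a deduplication key from the commit timestamp and the primary key values. This is a minimal sketch assuming the CDC record is a dict with the field names described on this page and that the row image appears under "after" (or "before" for deletes); the pk_columns list is supplied by your app.

```python
def dedup_key(cdc: dict, pk_columns: list) -> tuple:
    """Build a deduplication key from source.ts_ns plus the primary key.

    Two deliveries of the same change produce the same key.
    """
    row = cdc.get("after") or cdc.get("before") or {}
    return (cdc["source"]["ts_ns"], *(row[c] for c in pk_columns))

seen = set()

def is_duplicate(cdc: dict, pk_columns: list) -> bool:
    """Return True if this record was already processed; record it otherwise."""
    key = dedup_key(cdc, pk_columns)
    if key in seen:
        return True
    seen.add(key)
    return False
```

In production you would bound the `seen` set (for example, with a TTL keyed store) rather than keep it in memory indefinitely.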
Ordering
CDC streams use UNORDERED mode. In practice, records arrive in approximate
commit order because Aurora DSQL reads and publishes changes sequentially. However, Aurora DSQL
doesn't guarantee strict ordering. Specifically:
- Aurora DSQL can deliver records from different transactions in any order.
- Records for the same primary key from different transactions can arrive out of commit order.
- Records from a single transaction can interleave with records from other transactions. Use the source.txId field to group records by transaction when your workflow requires it.
Each CDC record includes a source.ts_ns field that contains the transaction
commit timestamp in nanoseconds. Use this field to establish commit order on the receiving
side.
Consumer strategies
Because records can arrive out of commit order and can appear more than one time, your app must account for both conditions.
Important
Define a primary key on all tables that participate in CDC. Without a primary key, your app can't deduplicate records or correlate deletes with the affected row.
Last-writer-wins (materialized views, caches)
Track the highest source.ts_ns value per primary key. Discard any record
with a source.ts_ns less than or equal to the tracked value. This filters
both duplicates and out-of-order records, keeping the most recent state for each key.
When you process a delete (op: "d"), store a tombstone for the primary key
that preserves the source.ts_ns value instead of removing the entry. The
tombstone ensures that an insert or update with an earlier source.ts_ns
that arrives after the delete doesn't incorrectly restore the row.
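The last-writer-wins logic above can be sketched as a small apply loop. This assumes the record shape described on this page, and it simplifies primary-key extraction to a single hypothetical "id" column.

```python
# Sentinel stored in place of a deleted row so late-arriving older
# records can't resurrect it.
TOMBSTONE = object()

state = {}      # pk -> current row image, or TOMBSTONE
latest_ts = {}  # pk -> highest source.ts_ns applied for that pk

def apply(cdc: dict) -> None:
    """Apply one CDC record with last-writer-wins semantics."""
    ts = cdc["source"]["ts_ns"]
    row = cdc.get("after") or cdc.get("before") or {}
    pk = row["id"]
    # Discard duplicates and out-of-order records for this key.
    if ts <= latest_ts.get(pk, -1):
        return
    latest_ts[pk] = ts
    if cdc["op"] == "d":
        state[pk] = TOMBSTONE  # keep a tombstone instead of removing the entry
    else:
        state[pk] = cdc["after"]
```

Because the timestamp check runs before the write, a duplicate delivery and a stale pre-delete update are both rejected by the same comparison.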
Every-change processing (audit logging, event sourcing)
Remove duplicates by comparing source.ts_ns combined with the primary key
values. Buffer incoming records and sort by source.ts_ns before processing to
reconstruct commit order.
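A batch-oriented sketch of this strategy, under the same assumptions as above (record shape per this page, a single hypothetical "id" primary-key column), sorts a buffered batch by source.ts_ns and drops duplicates:

```python
def ordered_unique(batch: list) -> list:
    """Sort a batch of CDC records into commit order and drop duplicates."""
    seen = set()
    out = []
    for cdc in sorted(batch, key=lambda r: r["source"]["ts_ns"]):
        row = cdc.get("after") or cdc.get("before") or {}
        key = (cdc["source"]["ts_ns"], row.get("id"))
        if key in seen:
            continue  # same timestamp and primary key: a duplicate delivery
        seen.add(key)
        out.append(cdc)
    return out
```

The buffer window you choose trades latency for ordering accuracy; records that arrive after their batch was flushed still need the timestamp check at write time.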
Multi-Region CDC stream configuration
A CDC stream is a regional resource. Each stream belongs to a single AWS Region and delivers changes to a Kinesis data stream in the same Region. On a multi-Region cluster, a CDC stream in any one Region captures committed writes from all Regions in the cluster. This means you only need one stream to capture every change, regardless of where the write originated. To deliver CDC records in more than one Region, create a separate stream in each Region. Each stream independently captures the full set of committed changes across the cluster.
All resources—the Aurora DSQL cluster, Kinesis data stream, IAM service role, and calling principal—must be in the same AWS account and Region.
Processing CDC records downstream
After CDC records arrive in your Kinesis data stream, you can process them directly or route them to other destinations by using AWS integration services. The following table summarizes common processing patterns.
| Pattern | How it works |
|---|---|
| Direct consumption | Read records from Kinesis by using the Amazon Kinesis Client Library (KCL), the AWS SDK, or a Kinesis Data Streams consumer. See Developing KCL consumers in the Amazon Kinesis Data Streams Developer Guide. |
| AWS Lambda | Configure a Lambda function as an event source for your Kinesis data stream to process each batch of CDC records as they arrive. See Using AWS Lambda with Amazon Kinesis in the AWS Lambda Developer Guide. |
| Amazon Data Firehose | Deliver CDC records from Kinesis to Amazon S3, Amazon Redshift, Amazon OpenSearch Service, or other destinations for analytics and archival. See Sending data to a delivery stream in the Amazon Data Firehose Developer Guide. |
| Self-managed consumers | Run Apache Kafka Connect with the Kinesis source connector, Apache Flink, or other stream processing frameworks to transform and route records. For Apache Flink on AWS, see Configuring app input in the Amazon Managed Service for Apache Flink Developer Guide. |
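For the Lambda pattern, a handler receives CDC records base64-encoded inside the standard Kinesis event structure. This is a minimal sketch; the print statement stands in for your routing logic, and the handler returns the processed count.

```python
import base64
import json

def handler(event, context):
    """Process a batch of CDC records delivered by a Kinesis event source."""
    for record in event["Records"]:
        # Kinesis event records carry the payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        cdc = json.loads(payload)
        src = cdc["source"]
        print(f'{cdc["op"]} on {src["schema"]}.{src["table"]} at ts_ns={src["ts_ns"]}')
    return len(event["Records"])
```

With an event source mapping, Lambda handles checkpointing and retries for you; your handler only needs to stay idempotent, per the consumer strategies above.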
Each CDC record includes fields such as source.schema,
source.table, and op that you can use to route and filter records
in your processing logic. For the full record schema, see
Understanding CDC records.