Exploring Real-Time Streaming for Retrieval Augmented Generation in GenerativeAI

Publication date: August 12, 2024 (Diagram history)

This architecture demonstrates the integration of streaming data services on AWS with Retrieval Augmented Generation (RAG) in generative AI applications.

Data sources for change data capture (CDC) include on-premises or AWS databases such as Oracle, SQL Server, MySQL, PostgreSQL, Amazon Aurora, and Amazon RDS, all funneling data into your Retrieval Augmented Generation (RAG) model.
AWS Database Migration Service and Amazon MSK Connect with the Debezium connector support both one-time database migration and continuous data replication. They capture changes from source databases, stream them, and apply them to the target in the same order in which they were captured.
Using AWS Database Migration Service and Amazon MSK Connect with the Debezium connector, you can stream data to Amazon Kinesis Data Streams or Amazon Managed Streaming for Apache Kafka (Amazon MSK), which collect and process large data streams.
With AWS Glue Spark Streaming and Amazon Managed Service for Apache Flink, you can build specialized data processing pipelines that meet your data consumption requirements.
AWS Lambda, Amazon Kinesis Data Firehose, and Amazon MSK Connect serve as data sink services that transfer source data directly into destinations such as a data lake or Amazon Redshift.
Amazon Aurora PostgreSQL with pgvector, Amazon OpenSearch Service, and Amazon DocumentDB store and index vector embeddings, supporting streamlined data retrieval, vector representation management, scalability, and real-time inference.
Amazon SageMaker and Amazon Bedrock offer the means to discover pertinent information within a corpus, conduct similarity searches on vectorized domain-specific datasets, and use this data as input for generation models.
To preserve user profiles and conversation history, Amazon DocumentDB, Amazon DynamoDB, and Amazon MemoryDB are suitable options.
Use Amazon Redshift to persist data, thereby augmenting the data inputs for the generative AI RAG model.
For details on how RAG applications work, refer to Retrieval-Augmented Generation (RAG).
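
The steps above note that captured changes are applied to the target in the same order in which they were captured. The following is a minimal sketch of that idea in plain Python, not a description of how AWS Database Migration Service or Debezium implement it. The ChangeEvent class, its sequence field, and the apply_changes function are hypothetical names introduced only for illustration, and a dictionary stands in for the target store.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical CDC record; real connectors emit richer payloads."""
    sequence: int                 # assumed position of the change in the source log
    op: str                       # "insert", "update", or "delete"
    key: str                      # primary key of the changed row
    value: Optional[dict] = None  # new row contents (None for deletes)


def apply_changes(target: Dict[str, dict], events: List[ChangeEvent]) -> Dict[str, dict]:
    """Apply change events to the target in capture order."""
    # Sorting by sequence restores capture order even if events arrive shuffled.
    for event in sorted(events, key=lambda e: e.sequence):
        if event.op in ("insert", "update"):
            target[event.key] = event.value
        elif event.op == "delete":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
    return target


# Events deliberately listed out of order to show why ordering matters.
events = [
    ChangeEvent(3, "delete", "42"),
    ChangeEvent(1, "insert", "42", {"name": "a"}),
    ChangeEvent(2, "update", "42", {"name": "b"}),
    ChangeEvent(4, "insert", "7", {"name": "c"}),
]
replica = apply_changes({}, events)
print(replica)  # {'7': {'name': 'c'}}
```

If the delete were applied before the insert and update, row 42 would incorrectly survive in the target, which is why replication preserves capture order.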
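
The retrieval and generation steps above (similarity search over vectorized data, then using the results and conversation history as model input) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions: an in-memory dictionary stands in for a vector store such as pgvector or OpenSearch, the vectors are hand-written placeholders for embeddings that an embedding model would produce, and the function names top_k and build_prompt are hypothetical. No AWS API is called.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical in-memory stand-in for a vector store:
# maps a document id to (embedding, text).
VectorStore = Dict[str, Tuple[Sequence[float], str]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], store: VectorStore, k: int = 2) -> List[str]:
    """Return the texts of the k stored items most similar to the query."""
    ranked = sorted(
        store.values(),
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    return [text for _, text in ranked[:k]]


def build_prompt(question: str, chunks: List[str], history: List[Tuple[str, str]]) -> str:
    """Combine retrieved context, prior turns, and the new question."""
    lines = ["Context:"]
    lines += [f"- {chunk}" for chunk in chunks]
    lines.append("Conversation so far:")
    lines += [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {question}")
    return "\n".join(lines)


# Placeholder 2-dimensional "embeddings"; real embeddings have many more dimensions.
store: VectorStore = {
    "doc-a": ([1.0, 0.0], "orders policy"),
    "doc-b": ([0.0, 1.0], "returns policy"),
    "doc-c": ([0.9, 0.1], "shipping policy"),
}
chunks = top_k([1.0, 0.0], store, k=2)
prompt = build_prompt("When will my order ship?", chunks, [("user", "hi"), ("assistant", "hello")])
print(prompt)
```

In a deployed system, the linear scan in top_k would typically be replaced by the vector store's own index, and the prompt would be sent to a generation model rather than printed.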

Download editable diagram

To customize this reference architecture diagram based on your business needs, download the ZIP file, which contains an editable PowerPoint file.

Create a free AWS account

Sign up for an AWS account. New accounts include 12 months of AWS Free Tier access, including the use of Amazon EC2, Amazon S3, and Amazon DynamoDB.

Contributors

Contributors to this reference architecture diagram include:

Jatinder Singh (jsinghtq@), Senior Technical Account Manager
Manpreet Kour (mkour@), Senior Technical Account Manager
Ali Alemi (alialem@), Senior WW SSA Streaming

Diagram history

To be notified about updates to this reference architecture diagram, subscribe to the RSS feed.

Change	Description	Date
Initial publication	Reference architecture diagram first published.	August 22, 2024

Note

To subscribe to RSS updates, you must have an RSS plugin enabled for the browser you are using.
