Develop advanced generative AI chat-based assistants by using RAG and ReAct prompting
Created by Praveen Kumar Jeyarajan (AWS), Jundong Qiao (AWS), Kara Yang (AWS), Kiowa Jackson (AWS), Noah Hamilton (AWS), and Shuai Cao (AWS)
Code repository: genai-bedrock-chatbot | Environment: PoC or pilot | Technologies: Machine learning & AI; Databases; DevOps; Serverless |
AWS services: Amazon Bedrock; Amazon ECS; Amazon Kendra; AWS Lambda |
Summary
A typical corporation has 70 percent of its data trapped in siloed systems. You can use generative AI-powered chat-based assistants to unlock insights and relationships between these data silos through natural language interactions. To get the most out of generative AI, the outputs must be trustworthy, accurate, and inclusive of the available corporate data. Successful chat-based assistants depend on the following:
Generative AI models (such as Anthropic Claude 2)
Data source vectorization
Advanced reasoning techniques, such as the ReAct framework, for prompting the model
This pattern provides approaches for retrieving data from data sources such as Amazon Simple Storage Service (Amazon S3) buckets, AWS Glue, and Amazon Relational Database Service (Amazon RDS). Value is gained from that data by interleaving Retrieval Augmented Generation (RAG) with chain-of-thought methods. The results support complex chat-based assistant conversations that draw on the entirety of your corporation's stored data.
This pattern uses Amazon SageMaker manuals and pricing data tables as an example to explore the capabilities of a generative AI chat-based assistant. You will build a chat-based assistant that helps customers evaluate the SageMaker service by answering questions about pricing and the service's capabilities. The solution uses the Streamlit library for building the frontend application and the LangChain framework for developing the application backend powered by a large language model (LLM).
Inquiries to the chat-based assistant are met with an initial intent classification for routing to one of three possible workflows. The most sophisticated workflow combines general advisory guidance with complex pricing analysis. You can adapt the pattern to suit enterprise, corporate, and industrial use cases.
Prerequisites and limitations
Prerequisites
AWS Command Line Interface (AWS CLI) installed and configured
AWS Cloud Development Kit (AWS CDK) Toolkit 2.114.1 or later installed and configured
Basic familiarity with Python and AWS CDK
Git installed
Docker installed
Python 3.11 or later installed and configured (for more information, see the Tools section)
An active AWS account bootstrapped by using AWS CDK
Amazon Titan and Anthropic Claude model access enabled in the Amazon Bedrock service
AWS security credentials, including AWS_ACCESS_KEY_ID, correctly configured in your terminal environment
Limitations
LangChain doesn't support streaming for every LLM. The Anthropic Claude models are supported, but models from AI21 Labs are not. (A minimal streaming sketch follows this list.)
This solution is deployed to a single AWS account.
This solution can be deployed only in AWS Regions where Amazon Bedrock and Amazon Kendra are available. For information about availability, see the documentation for Amazon Bedrock and Amazon Kendra.
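To illustrate the streaming limitation, here is a minimal sketch (not part of the pattern's code) that streams tokens from a Claude model through LangChain's Amazon Bedrock integration. The model ID, prompt, and token limit are assumptions:

```python
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_community.llms import Bedrock

# Claude models stream through LangChain; swapping in an AI21 Labs model ID
# here would not stream, per the limitation above.
llm = Bedrock(
    model_id="anthropic.claude-v2",  # assumption: the Claude 2 model used in this pattern
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
    model_kwargs={"max_tokens_to_sample": 256},
)

# Tokens are printed to stdout as they arrive.
llm.invoke("In two sentences, what is Amazon SageMaker?")
```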
Product versions
Python version 3.11 or later
Streamlit version 1.30.0 or later
Streamlit-chat version 0.1.1 or later
LangChain version 0.1.12 or later
AWS CDK version 2.132.1 or later
Architecture
Target technology stack
Amazon Athena
Amazon Bedrock
Amazon Elastic Container Service (Amazon ECS)
AWS Glue
AWS Lambda
Amazon S3
Amazon Kendra
Elastic Load Balancing
Target architecture
The AWS CDK code deploys all the resources that are required to set up the chat-based assistant application in an AWS account. The chat-based assistant application shown in the following diagram is designed to answer SageMaker-related queries from users. Users connect through an Application Load Balancer to a VPC that contains an Amazon ECS cluster, which hosts the Streamlit application. An orchestration Lambda function connects to the application. S3 bucket data sources provide data to the Lambda function through Amazon Kendra and AWS Glue. The Lambda function connects to Amazon Bedrock to answer queries (questions) from chat-based assistant users.
The orchestration Lambda function sends the LLM prompt request to the Amazon Bedrock model (Claude 2).
Amazon Bedrock sends the LLM response back to the orchestration Lambda function.
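This prompt/response exchange can be pictured with a minimal boto3 sketch. It is an illustration, not the pattern's actual Lambda code; the helper name `ask_claude` and the token limit are assumptions:

```python
import json

import boto3

# Assumption: the function's execution role allows bedrock:InvokeModel.
bedrock_runtime = boto3.client("bedrock-runtime")

def ask_claude(prompt: str) -> str:
    """Send a prompt to Claude 2 through Amazon Bedrock and return the completion."""
    body = json.dumps({
        # Claude 2's text-completion API expects the Human/Assistant format.
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": 512,
    })
    response = bedrock_runtime.invoke_model(modelId="anthropic.claude-v2", body=body)
    return json.loads(response["body"].read())["completion"]
```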
Logic flow within the orchestration Lambda function
When a user asks a question through the Streamlit application, the application invokes the orchestration Lambda function directly. The following diagram shows the logic flow when the Lambda function is invoked.
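From the Streamlit side, that direct invocation is a synchronous AWS Lambda call. A minimal sketch follows; the function name and payload shape are assumptions, not the pattern's actual interface:

```python
import json

import boto3

lambda_client = boto3.client("lambda")

def ask_assistant(question: str) -> dict:
    """Invoke the orchestration Lambda function synchronously and return its response."""
    response = lambda_client.invoke(
        FunctionName="<orchestration-function-name>",  # placeholder for the deployed function
        InvocationType="RequestResponse",              # wait for the answer
        Payload=json.dumps({"query": question}),
    )
    return json.loads(response["Payload"].read())
```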
Step 1 – The input query (question) is classified into one of three intents (a minimal classification sketch follows this step):
General SageMaker guidance questions
General SageMaker pricing (training/inference) questions
Complex questions related to SageMaker and pricing
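A minimal sketch of that classification step follows. The prompt wording and intent labels are assumptions (the pattern's actual prompt is in the code repository), and `ask_claude` is the hypothetical Bedrock helper sketched earlier:

```python
INTENTS = {"GUIDANCE", "PRICING", "COMPLEX"}

CLASSIFY_PROMPT = """Classify the user question into exactly one of these intents:
GUIDANCE - general SageMaker guidance questions
PRICING - general SageMaker pricing (training/inference) questions
COMPLEX - complex questions involving both SageMaker guidance and pricing

Question: {query}

Respond with only the intent label."""

def classify_intent(query: str) -> str:
    """Route the input query to one of the three workflows."""
    label = ask_claude(CLASSIFY_PROMPT.format(query=query)).strip()
    return label if label in INTENTS else "COMPLEX"  # fall back to the agent workflow
```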
Step 2 – The input query initiates one of three services (see the agent sketch after this step):
RAG Retrieval service, which retrieves relevant context from the Amazon Kendra vector database and calls the LLM through Amazon Bedrock to summarize the retrieved context as the response.
Database Query service, which uses the LLM, database metadata, and sample rows from relevant tables to convert the input query into a SQL query. Database Query service runs the SQL query against the SageMaker pricing database through Amazon Athena and summarizes the query results as the response.
In-context ReAct Agent service, which breaks down the input query into multiple steps before providing a response. The agent uses RAG Retrieval service and Database Query service as tools to retrieve relevant information during the reasoning process. After the reasoning and action processes are complete, the agent generates the final answer as the response.
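As a rough illustration of the third service, the following LangChain sketch wires the two other services into a ReAct-style agent. The Kendra index ID is a placeholder, `database_query_service` is a hypothetical stand-in for the text-to-SQL service, and the tool descriptions are assumptions:

```python
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chains import RetrievalQA
from langchain_community.llms import Bedrock
from langchain_community.retrievers import AmazonKendraRetriever

llm = Bedrock(model_id="anthropic.claude-v2")  # assumption: Claude 2

# RAG Retrieval service: fetch context from the Kendra index and summarize it.
rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=AmazonKendraRetriever(index_id="<kendra-index-id>"),
)

def database_query_service(question: str) -> str:
    """Hypothetical stand-in: convert the question to SQL and run it
    against the SageMaker pricing database through Amazon Athena."""
    ...

tools = [
    Tool(name="rag_retrieval", func=rag_chain.run,
         description="Answers general SageMaker guidance questions."),
    Tool(name="database_query", func=database_query_service,
         description="Answers SageMaker pricing questions from the pricing tables."),
]

# ZERO_SHOT_REACT_DESCRIPTION interleaves reasoning steps with tool calls (ReAct).
agent = initialize_agent(tools=tools, llm=llm,
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)

answer = agent.run("Which training instance is cheapest, and what can it run?")
```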
Step 3 – The response from the orchestration Lambda function is sent to the Streamlit application as output.
Tools
AWS services
Amazon Athena is an interactive query service that helps you analyze data directly in Amazon Simple Storage Service (Amazon S3) by using standard SQL.
Amazon Bedrock is a fully managed service that makes high-performing foundation models (FMs) from leading AI startups and Amazon available for your use through a unified API.
AWS Cloud Development Kit (AWS CDK) is a software development framework that helps you define and provision AWS Cloud infrastructure in code.
AWS Command Line Interface (AWS CLI) is an open-source tool that helps you interact with AWS services through commands in your command-line shell.
Amazon Elastic Container Service (Amazon ECS) is a fast and scalable container management service that helps you run, stop, and manage containers on a cluster.
AWS Glue is a fully managed extract, transform, and load (ETL) service. It helps you reliably categorize, clean, enrich, and move data between data stores and data streams. This pattern uses an AWS Glue crawler and an AWS Glue Data Catalog table.
Amazon Kendra is an intelligent search service that uses natural language processing and advanced machine learning algorithms to return specific answers to search questions from your data.
AWS Lambda is a compute service that helps you run code without needing to provision or manage servers. It runs your code only when needed and scales automatically, so you pay only for the compute time that you use.
Amazon Simple Storage Service (Amazon S3) is a cloud-based object storage service that helps you store, protect, and retrieve any amount of data.
Elastic Load Balancing (ELB) distributes incoming application or network traffic across multiple targets. For example, you can distribute traffic across Amazon Elastic Compute Cloud (Amazon EC2) instances, containers, and IP addresses in one or more Availability Zones.
Code repository
The code for this pattern is available in the GitHub genai-bedrock-chatbot repository.
The code repository contains the following files and folders:
assets folder – The static assets, such as the architecture diagram and the public dataset
code/lambda-container folder – The Python code that is run in the Lambda function
code/streamlit-app folder – The Python code that is run as the container image in Amazon ECS
tests folder – The Python files that are run to unit test the AWS CDK constructs
code/code_stack.py – The AWS CDK construct Python file used to create AWS resources
app.py – The AWS CDK stack Python file used to deploy AWS resources in the target AWS account
requirements.txt – The list of all Python dependencies that must be installed for AWS CDK
requirements-dev.txt – The list of all Python dependencies that must be installed for AWS CDK to run the unit-test suite
cdk.json – The input file to provide values that are required to spin up resources
Note: The AWS CDK code uses L3 (layer 3) constructs and AWS Identity and Access Management (IAM) policies managed by AWS for deploying the solution.
Best practices
The code example provided here is for a proof-of-concept (PoC) or pilot demo only. If you want to take the code to production, be sure to follow these best practices:
Set up monitoring and alerting for the Lambda function. For more information, see Monitoring and troubleshooting Lambda functions. For general best practices when working with Lambda functions, see the AWS documentation.
Epics
Task | Description | Skills required |
---|---|---|
Export variables for the account and AWS Region where the stack will be deployed. | To provide AWS credentials for AWS CDK by using environment variables, run the following commands: `export CDK_DEFAULT_ACCOUNT=<12-digit-account-ID>` and `export CDK_DEFAULT_REGION=<Region>`. | DevOps engineer, AWS DevOps |
Set up the AWS CLI profile. | To set up the AWS CLI profile for the account, follow the instructions in the AWS documentation. | DevOps engineer, AWS DevOps |
Task | Description | Skills required |
---|---|---|
Clone the repo on your local machine. | To clone the repository, run `git clone <repository-URL>` in your terminal, replacing `<repository-URL>` with the URL of the genai-bedrock-chatbot GitHub repository. | DevOps engineer, AWS DevOps |
Set up the Python virtual environment and install required dependencies. | To set up the Python virtual environment, run `python3 -m venv .venv` and then `source .venv/bin/activate`. To install the required dependencies, run `pip install -r requirements.txt`. | DevOps engineer, AWS DevOps |
Set up the AWS CDK environment and synthesize the AWS CDK code. | With the virtual environment active and the dependencies installed, run `cdk synth` to synthesize the AWS CDK code into an AWS CloudFormation template. | DevOps engineer, AWS DevOps |
Task | Description | Skills required |
---|---|---|
Provision Claude model access. | To enable Anthropic Claude model access for your AWS account, follow the instructions in the Amazon Bedrock documentation. | AWS DevOps |
Deploy resources in the account. | To deploy resources in the AWS account by using the AWS CDK, run `cdk deploy` and wait for the deployment to finish. Upon successful deployment, you can access the chat-based assistant application by using the URL provided in the CloudFormation Outputs section. | AWS DevOps, DevOps engineer |
Run the AWS Glue crawler and create the Data Catalog table. | An AWS Glue crawler is used to keep the data schema dynamic. The solution creates and updates partitions in the AWS Glue Data Catalog table by running the crawler on demand. After the CSV dataset files are copied into the S3 bucket, run the AWS Glue crawler on demand (from the AWS Glue console or programmatically, as in the sketch after this table) to create the Data Catalog table schema for testing. Note: The AWS CDK code configures the AWS Glue crawler to run on demand, but you can also schedule it to run periodically. | DevOps engineer, AWS DevOps |
Initiate document indexing. | After the files are copied into the S3 bucket, use Amazon Kendra to crawl and index them by starting an on-demand sync of the Amazon Kendra data source (from the Amazon Kendra console or programmatically, as in the sketch after this table). Note: The AWS CDK code configures the Amazon Kendra index sync to run on demand, but you can also run it periodically by using the Schedule parameter. | AWS DevOps, DevOps engineer |
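The on-demand crawler run and Kendra sync in the preceding two tasks can also be started programmatically. The following is a minimal boto3 sketch; the crawler name, index ID, and data source ID are placeholders for the values that the AWS CDK stack creates:

```python
import boto3

glue = boto3.client("glue")
kendra = boto3.client("kendra")

# Run the AWS Glue crawler on demand to (re)build the Data Catalog table schema.
glue.start_crawler(Name="<glue-crawler-name>")

# Start an on-demand sync so Amazon Kendra crawls and indexes the S3 documents.
kendra.start_data_source_sync_job(
    Id="<kendra-data-source-id>",
    IndexId="<kendra-index-id>",
)
```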
Task | Description | Skills required |
---|---|---|
Remove the AWS resources. | After you test the solution, run `cdk destroy` to clean up the resources. | DevOps engineer, AWS DevOps |
Troubleshooting
Issue | Solution |
---|---|
AWS CDK returns errors. | For help with AWS CDK issues, see Troubleshooting common AWS CDK issues. |
Related resources
Additional information
AWS CDK commands
When working with AWS CDK, keep in mind the following useful commands:
`cdk ls` – Lists all stacks in the app
`cdk synth` – Emits the synthesized AWS CloudFormation template
`cdk deploy` – Deploys the stack to your default AWS account and Region
`cdk diff` – Compares the deployed stack with the current state
`cdk docs` – Opens the AWS CDK documentation
`cdk destroy` – Deletes the CloudFormation stack and removes the deployed AWS resources