Using Aurora PostgreSQL as a Knowledge Base for Amazon Bedrock
Starting with Aurora PostgreSQL versions 15.4, 14.9, 13.12, and 12.16, you can use an Aurora PostgreSQL DB cluster as a Knowledge Base for Amazon Bedrock. For more information, see Create a vector store in Amazon Aurora. A Knowledge Base automatically takes unstructured text data stored in an Amazon S3 bucket, converts it to text chunks and vectors, and stores them in a PostgreSQL database. In generative AI applications, you can use Agents for Amazon Bedrock to query the data stored in the Knowledge Base and use the results of those queries to augment the answers provided by foundation models. This workflow is called Retrieval Augmented Generation (RAG). For more information on RAG, see Retrieval Augmented Generation (RAG).
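To make the RAG workflow described above concrete, the following is a minimal sketch, assuming toy three-dimensional vectors in place of real embeddings. It ranks stored chunks by cosine similarity to a query vector and places the best matches in front of the question. The names cosine_similarity, retrieve, build_prompt, and STORE are hypothetical and are not part of any Amazon Bedrock API; in a real Knowledge Base, embedding, storage, and retrieval are handled by Bedrock and the pgvector-backed table.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vector, store, k=2):
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda row: cosine_similarity(query_vector, row["embedding"]),
        reverse=True,
    )
    return [row["chunks"] for row in ranked[:k]]

def build_prompt(question, context_chunks):
    """Augment the question with the retrieved chunks."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Use the following context to answer.\n{context}\n\nQuestion: {question}"

# Toy in-memory stand-in for the vector table (hypothetical data).
STORE = [
    {"chunks": "Aurora supports pgvector.", "embedding": [1.0, 0.0, 0.0]},
    {"chunks": "S3 stores objects.", "embedding": [0.0, 1.0, 0.0]},
    {"chunks": "HNSW is an index type.", "embedding": [0.9, 0.1, 0.0]},
]
```

The prompt returned by build_prompt is what would be sent to a foundation model in place of the bare question.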
Prerequisites
Familiarize yourself with the following prerequisites for using an Aurora PostgreSQL DB cluster as a Knowledge Base for Amazon Bedrock. At a high level, you need to configure the following services for use with Bedrock:
An Aurora PostgreSQL DB cluster running one of the following versions:
15.4 and higher versions
14.9 and higher versions
13.12 and higher versions
12.16 and higher versions
Note
You must enable the pgvector extension in your target database and use version 0.5.0 or higher. For more information, see pgvector v0.5.0 with HNSW indexing.
Data API
A user managed in Secrets Manager. For more information, see Password management with Amazon Aurora and AWS Secrets Manager.
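The version requirements above can be checked programmatically. The following is a minimal sketch, assuming engine versions are reported as "major.minor" strings and pgvector versions as dotted numbers such as "0.5.0". The helper names are hypothetical, and treating major versions newer than 15 as satisfying "15.4 and higher versions" is an assumption of this sketch rather than something this page states.

```python
# Minimum Aurora PostgreSQL minor version for each listed major version.
MIN_AURORA_MINOR = {15: 4, 14: 9, 13: 12, 12: 16}
# Minimum pgvector version (the first version with HNSW indexing).
MIN_PGVECTOR = (0, 5, 0)

def parse_version(text):
    """Turn a dotted version string such as '15.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))

def aurora_version_supported(version):
    """Check a 'major.minor' engine version against the listed minimums."""
    major, minor = parse_version(version)[:2]
    if major not in MIN_AURORA_MINOR:
        # Assumption: only major versions newer than those listed qualify.
        return major > max(MIN_AURORA_MINOR)
    return minor >= MIN_AURORA_MINOR[major]

def pgvector_version_supported(extversion):
    """Check a pgvector version string against the 0.5.0 minimum."""
    parts = parse_version(extversion)
    # Pad short versions such as '0.5' so tuple comparison behaves as expected.
    parts = parts + (0,) * (3 - len(parts))
    return parts >= MIN_PGVECTOR
```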
Preparing Aurora PostgreSQL to be used as a Knowledge Base for Amazon Bedrock
Follow these steps to create and configure an Aurora PostgreSQL DB cluster for use as a Knowledge Base for Amazon Bedrock.
Create an Aurora PostgreSQL DB cluster. For more information, see Creating and connecting to an Aurora PostgreSQL DB cluster.
Enable Data API while creating the Aurora PostgreSQL DB cluster. For more information on the supported versions, see Using RDS Data API.
Note the Amazon Resource Name (ARN) of the Aurora PostgreSQL DB cluster for use in Amazon Bedrock. For more information, see Amazon Resource Names (ARNs).
Log in to the database as your master user and set up pgvector. Use the following command if the extension is not installed:
CREATE EXTENSION IF NOT EXISTS vector;
Use pgvector 0.5.0 or higher, which supports HNSW indexing. For more information, see pgvector v0.5.0 with HNSW indexing. Use the following command to check the installed version of pgvector:
SELECT extversion FROM pg_extension WHERE extname='vector';
Create a specific schema that Bedrock can use to query the data. Use the following command to create a schema:
CREATE SCHEMA bedrock_integration;
Create a new role that Bedrock can use to query the database. Use the following command to create a new role:
CREATE ROLE bedrock_user WITH PASSWORD 'password' LOGIN;
Note
Make a note of this password. You use the same password when you create the Secrets Manager secret.
Grant bedrock_user permission to manage the bedrock_integration schema so that it can create tables and indexes in that schema.
GRANT ALL ON SCHEMA bedrock_integration TO bedrock_user;
Log in as bedrock_user and create a table in the bedrock_integration schema.
CREATE TABLE bedrock_integration.bedrock_kb (id uuid PRIMARY KEY, embedding vector(1536), chunks text, metadata json);
We recommend that you create an index with the cosine operator, which Bedrock can use to query the data.
CREATE INDEX on bedrock_integration.bedrock_kb USING hnsw (embedding vector_cosine_ops);
Create an AWS Secrets Manager database secret. For more information, see AWS Secrets Manager database secret.
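As an illustration of the final step, the following sketch builds the JSON body for such a secret from the role created earlier. The key names username and password are an assumption based on the usual shape of a Secrets Manager database secret, and build_db_secret is a hypothetical helper. The password must match the one set in the CREATE ROLE statement.

```python
import json

def build_db_secret(username, password):
    """Return the secret body as a JSON string (key names are an assumption)."""
    if not password:
        raise ValueError("password must not be empty")
    return json.dumps({"username": username, "password": password})

# Example with placeholder values; never hard-code a real password.
secret_string = build_db_secret("bedrock_user", "password")
```

The resulting string is the kind of value you would supply as the secret value when creating the secret, for example through the Secrets Manager console.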
Creating a knowledge base in the Bedrock console
While preparing Aurora PostgreSQL to be used as the vector store for a Knowledge Base, gather the following details, which you need to supply to the Amazon Bedrock console:
Amazon Aurora DB cluster ARN
Secret ARN
Database name (e.g. postgres)
Table name: We recommend that you provide a schema-qualified name, for example bedrock_integration.bedrock_kb, which refers to the bedrock_kb table in the bedrock_integration schema.
Table fields:
ID (id)
Text chunks (chunks)
Vector embedding (embedding)
Metadata (metadata)
With these details, you can create a knowledge base in the Bedrock console. For more information, see Create a vector store in Amazon Aurora.
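The same details can also be gathered in one structure. The following sketch collects them into a dictionary whose key names follow the RDS storage configuration of the Bedrock CreateKnowledgeBase API; treat those key names as an assumption to verify against the current API reference. The helper name and the ARNs in the example are hypothetical placeholders.

```python
def build_rds_storage_configuration(cluster_arn, secret_arn, database_name, table_name):
    """Collect the Aurora details into one storage configuration dictionary."""
    return {
        "type": "RDS",
        "rdsConfiguration": {
            "resourceArn": cluster_arn,          # Amazon Aurora DB cluster ARN
            "credentialsSecretArn": secret_arn,  # Secret ARN
            "databaseName": database_name,
            "tableName": table_name,             # schema-qualified table name
            "fieldMapping": {
                "primaryKeyField": "id",
                "textField": "chunks",
                "vectorField": "embedding",
                "metadataField": "metadata",
            },
        },
    }

# Placeholder ARNs for illustration only.
config = build_rds_storage_configuration(
    "arn:aws:rds:us-east-1:111122223333:cluster:example-cluster",
    "arn:aws:secretsmanager:us-east-1:111122223333:secret:example-secret",
    "postgres",
    "bedrock_integration.bedrock_kb",
)
```

The field mapping mirrors the column names of the bedrock_kb table created earlier, so a table with different column names would need a different mapping.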
After Aurora is added as a knowledge base, you can ingest your data sources into it. For more information, see Ingest your data sources into the knowledge base.
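After ingestion, one way to spot-check the table is a nearest-neighbor query ordered by cosine distance (the pgvector <=> operator), which is the operator class that the recommended vector_cosine_ops index serves. The following sketch only builds the SQL text and the vector literal. The helper names are hypothetical, the %s placeholder assumes a psycopg-style driver, and this is not necessarily the query that Bedrock itself issues.

```python
def to_vector_literal(values):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

def nearest_chunks_query(table="bedrock_integration.bedrock_kb", limit=5):
    """Build a parameterized query; %s stands for the query vector literal.

    The table name is interpolated directly, so it must come from trusted
    configuration and never from user input.
    """
    return (
        f"SELECT id, chunks, metadata FROM {table} "
        f"ORDER BY embedding <=> %s::vector LIMIT {int(limit)}"
    )

# Example: the query vector must have the same dimension as the embedding
# column (1536 in the table above); three values are shown only for brevity.
sql = nearest_chunks_query(limit=3)
param = to_vector_literal([0.1, 0.2, 1])
```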