
Vector database options

AWS offers a diverse range of vector database solutions to support different use cases and requirements in generative AI applications. These options can be broadly categorized into individual database services and managed service offerings, each with distinct characteristics and advantages. Understanding these options is crucial for organizations looking to implement vector search capabilities effectively while maintaining optimal performance, scalability, and cost efficiency.

For more information about vector database solutions, see the following sections:

Individual vector database options
Managed service option

Individual vector database options

The individual vector database options on AWS include Amazon Kendra, Amazon OpenSearch Service, and Amazon RDS for PostgreSQL with pgvector. (An open-source extension, pgvector adds the ability to store and search machine learning (ML)-generated vector embeddings.) These solutions offer different approaches to vector search, allowing organizations to choose based on their existing infrastructure, technical requirements, and specific use cases.

Amazon Kendra

Amazon Kendra is an enterprise-grade intelligent search service that uses natural language processing and advanced machine learning algorithms to return specific answers to search questions from your data. Amazon Kendra simplifies the implementation of search functionality, making it an effective backend solution for generative AI applications.

Other key features of Amazon Kendra include the following:

Native connections to over 40 data sources
Built-in data preparation capabilities
Quick setup that doesn't require deep technical expertise

Benefits of Amazon Kendra include the following:

Automated data processing (chunking, ingestion, retrieval)
Powerful customization options
Simple programmatic access through the AWS SDK for Python (Boto3)

For more information, see Benefits of Amazon Kendra in the Amazon Kendra Developer Guide.
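As a rough illustration of the programmatic access mentioned above, the following sketch calls the Amazon Kendra Retrieve API through a Boto3 client and joins the returned passages into a context string for a prompt. The helper names (retrieve_passages, build_context), the index ID, and the sample question are hypothetical, and the sketch assumes that an index already exists and that AWS credentials are configured.

```python
from typing import Any, Dict, List


def retrieve_passages(kendra_client: Any, index_id: str, question: str,
                      page_size: int = 5) -> List[Dict[str, str]]:
    """Return title, text, and URI for the passages that Kendra retrieves.

    kendra_client is expected to be boto3.client("kendra"); it is passed in
    so that the function can be exercised without AWS access.
    """
    response = kendra_client.retrieve(
        IndexId=index_id,
        QueryText=question,
        PageSize=page_size,
    )
    return [
        {
            "title": item.get("DocumentTitle", ""),
            "text": item.get("Content", ""),
            "uri": item.get("DocumentURI", ""),
        }
        for item in response.get("ResultItems", [])
    ]


def build_context(passages: List[Dict[str, str]]) -> str:
    """Join retrieved passages into one context string for a prompt."""
    return "\n\n".join(f"[{p['title']}] {p['text']}" for p in passages)


# Example usage (hypothetical index ID; requires AWS credentials):
#   import boto3
#   client = boto3.client("kendra")
#   passages = retrieve_passages(client, "YOUR-INDEX-ID", "How do I reset my password?")
#   print(build_context(passages))
```

Passing the client in as a parameter, rather than creating it inside the function, keeps the Region and credential choices with the caller.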

Amazon OpenSearch Service

Amazon OpenSearch Service is a managed service that helps you deploy, operate, and scale OpenSearch clusters in the AWS Cloud.

Core capabilities of OpenSearch Service include the following:

Open-source search and analytics engine
Distributed architecture
Real-time data processing

Some advantages of using OpenSearch Service include the following:

Horizontal scalability
RESTful API support
Support for structured and unstructured data
Real-time data analysis
Suitable for various deployment sizes

For more information, see Features of Amazon OpenSearch Service in the OpenSearch Service Developer Guide.
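To make the vector search use case more concrete, the following sketch builds the JSON bodies for an OpenSearch k-NN index and a k-NN query, which could then be sent through the REST API. The field name, dimension, and helper names are assumptions chosen for illustration. The sketch only constructs request bodies and doesn't contact a domain.

```python
from typing import Any, Dict, List


def knn_index_body(field: str, dimension: int) -> Dict[str, Any]:
    """Body for creating an index with one knn_vector field and a text field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {"type": "knn_vector", "dimension": dimension},
                "text": {"type": "text"},
            }
        },
    }


def knn_query_body(field: str, vector: List[float], k: int) -> Dict[str, Any]:
    """Body for a search that returns the k nearest neighbors of vector."""
    return {
        "size": k,
        "query": {"knn": {field: {"vector": vector, "k": k}}},
    }


# Example usage (hypothetical index name "docs"):
#   PUT  /docs          with knn_index_body("embedding", 3)
#   POST /docs/_search  with knn_query_body("embedding", [0.1, 0.2, 0.3], 2)
```

How these requests are authenticated depends on the access policy of the domain, so that part is left out of the sketch.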

Amazon RDS for PostgreSQL with pgvector

Amazon RDS for PostgreSQL with pgvector combines the AWS managed relational database service with pgvector, an open-source vector extension for PostgreSQL. This combination enables organizations to store and query high-dimensional vectors while retaining the managed features of Amazon RDS. The solution is particularly suitable for generative AI applications that require real-time vector operations without the overhead of managing database infrastructure.

Key benefits of Amazon RDS for PostgreSQL with pgvector include the following:

High availability
Automatic failover
Cost-effective (pay-per-use)
Built-in monitoring
Real-time vector data integration

For more information, see Advantages of Amazon RDS in the Amazon Relational Database Service User Guide.
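As a minimal sketch of how vectors might be stored and queried with pgvector, the following example defines setup statements and a similarity query that uses the pgvector cosine distance operator (<=>). The table name, column names, three-dimensional embeddings, and helper functions are hypothetical. The query uses %s placeholders, which assumes a DB-API driver such as psycopg.

```python
from typing import Any, List, Sequence, Tuple

# Hypothetical schema: a documents table with a 3-dimensional embedding.
SETUP_SQL = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    "CREATE TABLE IF NOT EXISTS documents ("
    "id bigserial PRIMARY KEY, content text, embedding vector(3))",
]

# <=> is the pgvector cosine distance operator; smaller means more similar.
SEARCH_SQL = (
    "SELECT id, content FROM documents "
    "ORDER BY embedding <=> %s::vector LIMIT %s"
)


def to_vector_literal(values: Sequence[float]) -> str:
    """Format numbers as a pgvector text literal, for example [1.0,2.5,0.0]."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def nearest_documents(cursor: Any, query_embedding: Sequence[float],
                      limit: int = 5) -> List[Tuple]:
    """Run the similarity query on an open DB-API cursor and return the rows."""
    cursor.execute(SEARCH_SQL, (to_vector_literal(query_embedding), limit))
    return cursor.fetchall()


# Example usage (assumes an open connection conn from a driver such as psycopg):
#   with conn.cursor() as cur:
#       for statement in SETUP_SQL:
#           cur.execute(statement)
#       rows = nearest_documents(cur, [0.1, 0.2, 0.3], limit=3)
```

In practice the embedding dimension would match the embedding model in use. The value 3 is used here only to keep the example short.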

Managed service option

Amazon Bedrock Knowledge Bases is the AWS fully managed approach to vector database implementation. The service's flexibility in storage options, combined with its automated management features, makes it particularly valuable for organizations seeking to implement Retrieval Augmented Generation (RAG) without managing complex infrastructure.

With Amazon Bedrock Knowledge Bases, you can create, maintain, and query knowledge bases that enhance your foundation models using RAG. This service simplifies the complex process of implementing RAG by managing the entire data ingestion, vectorization, and retrieval pipeline.
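As an illustration of querying a knowledge base, the following sketch calls the Retrieve API through a Boto3 bedrock-agent-runtime client and returns the text and relevance score of each retrieved chunk. The helper name, knowledge base ID, and sample question are hypothetical, and the sketch assumes that the knowledge base has already been created and synced.

```python
from typing import Any, Dict, List


def retrieve_chunks(agent_runtime_client: Any, knowledge_base_id: str,
                    question: str, top_k: int = 5) -> List[Dict[str, Any]]:
    """Return the text and score of the chunks retrieved for a question.

    agent_runtime_client is expected to be
    boto3.client("bedrock-agent-runtime"); it is passed in so that the
    function can be exercised without AWS access.
    """
    response = agent_runtime_client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": question},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    )
    return [
        {
            "text": result.get("content", {}).get("text", ""),
            "score": result.get("score"),
        }
        for result in response.get("retrievalResults", [])
    ]


# Example usage (hypothetical knowledge base ID; requires AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-agent-runtime")
#   for chunk in retrieve_chunks(client, "YOUR-KB-ID", "What is our refund policy?"):
#       print(chunk["score"], chunk["text"][:80])
```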

Key benefits of Amazon Bedrock Knowledge Bases include the following:

Simplified data processing

Automatic data ingestion and chunking
Built-in text extraction from multiple file formats
Managed vector embeddings generation
Automatic metadata extraction and indexing

Streamlined RAG implementation

Pre-configured retrieval strategies
Automatic context window optimization
Built-in relevancy tuning
Semantic search capabilities out of the box

Security and governance

Integrated AWS Identity and Access Management (IAM) controls
Data encryption at rest and in transit
VPC support
Audit logging with AWS CloudTrail

Amazon Bedrock Knowledge Bases supports multiple vector store options. The following list provides an overview of each option's key features:

Amazon Aurora PostgreSQL with pgvector
- PostgreSQL-compatible vector storage
- Integrated with existing Aurora databases
- Cost-effective for smaller deployments
- Good for hybrid structured and unstructured data
Amazon Neptune Analytics
- Graph-based vector search
- Combines relationship data with vectors
- Ideal for connected data use cases
- Advanced query capabilities
Amazon OpenSearch Serverless
- Fully managed serverless experience
- Automatic scaling based on workload
- Built-in k-NN capabilities
- Cost-effective for varying workloads
Pinecone
- Purpose-built vector database
- High performance at scale
- Advanced similarity search features
- Managed through the Amazon Bedrock console
Redis Enterprise Cloud
- In-memory vector search capabilities
- Low-latency performance
- Real-time vector search
- Integrated caching capabilities

When choosing a vector store that's supported by Amazon Bedrock Knowledge Bases, consider the following key characteristics of each option:

Aurora PostgreSQL – Relational data with vector capabilities
Neptune Analytics – Graph-based knowledge representations
OpenSearch Serverless – Search and analytics focus
Pinecone – Pure vector search performance
Redis Enterprise Cloud – Real-time and low-latency needs

Each implementation offers the following unique advantages:

Aurora PostgreSQL – Best for applications needing both traditional SQL and vector capabilities
Neptune Analytics – Ideal for complex relationship-based queries and knowledge graphs
OpenSearch Serverless – Strong in full-text search and analytics
Pinecone – Optimized for pure vector operations
Redis Enterprise Cloud – Best for real-time applications

Consider the following key points when you select a vector store for your RAG solution:

Scalability – Ability to handle large and growing datasets efficiently.
Query performance – Fast and efficient nearest neighbor search capabilities.
Data ingestion – Fit with existing data model requirements, support for diverse data formats, and ease of ingestion.
Filtering and ranking – Advanced filtering and ranking mechanisms for retrieved results.
Integration – Seamless integration with other systems and tools through APIs or protocols.
Persistence and durability – Suitable persistence and durability options (in-memory or disk-based).
Concurrency and consistency – Efficient handling of concurrent access and data consistency.
Licensing and cost – Evaluation of licensing model, upfront and ongoing costs, and vendor lock-in.
Community and support – Vibrant community and comprehensive documentation.
Security and compliance – Adherence to necessary security and compliance requirements.
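One way to apply the considerations above is a simple weighted scoring matrix. The following sketch is only an illustration of that idea: the criterion keys, weights, ratings, and candidate names are all hypothetical placeholders that you would replace with your own assessment, and they say nothing about how any real vector store performs.

```python
from typing import Dict, List, Tuple


def weighted_score(ratings: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of ratings; criteria missing from ratings count as 0."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(w * ratings.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total_weight


def rank_candidates(candidates: Dict[str, Dict[str, float]],
                    weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (candidate, score) pairs sorted from highest to lowest score."""
    scores = {name: weighted_score(r, weights) for name, r in candidates.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical weights and 1-5 ratings, for illustration only.
example_weights = {"scalability": 2.0, "query_performance": 1.0}
example_candidates = {
    "candidate_a": {"scalability": 5.0, "query_performance": 2.0},
    "candidate_b": {"scalability": 3.0, "query_performance": 5.0},
}
ranking = rank_candidates(example_candidates, example_weights)
```

A matrix like this doesn't replace testing with your own data, but it can make the trade-offs between criteria explicit before you run a proof of concept.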
