Writing best practices to optimize RAG applications
Ivan Cui and Samantha Stuart, Amazon Web Services
July 2025
Large language models (LLMs) have revolutionized the field of artificial intelligence with
their remarkable ability to understand and generate human-like text. However, they face a
significant limitation: they can only work with knowledge contained in their training data.
This is where Retrieval Augmented Generation (RAG) comes in: RAG supplements the model's built-in knowledge by retrieving relevant information from an external knowledge base and supplying it to the model as context at query time.
How can you optimize content for retrieval in a RAG-based application? This guide provides best practices to help you optimize the formatting and writing style of text-based content in the knowledge base. Optimizing the content enhances the context that helps RAG applications understand task-specific information more accurately. When the system retrieves highly relevant and accurate content, the quality of the LLM's response improves.

Optimizing the context delivery process at a system level is called context engineering, and it is an essential part of agentic RAG architectures. In agentic RAG, one or more additional LLMs reason and act on incoming requests before the RAG workflow runs, which enables a multi-step information delivery process. As RAG architectures grow increasingly complex, optimizing the source content remains the most direct way to deliver clear context to LLMs. These best practices are designed to help you maximize your organization's investment in a RAG application.
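To make the role of source content concrete, the following minimal sketch shows how a RAG application typically retrieves knowledge base chunks and assembles them into the context that the LLM receives. The knowledge base contents, the keyword-overlap scoring function, and the prompt format are simplified placeholders for illustration only; they are not the implementation of any particular AWS service or RAG framework.

```python
# Minimal sketch of the retrieve-then-generate flow in a RAG application.
# A real system would use an embedding model, a vector database, and an
# LLM call; here those steps are replaced with toy stand-ins.

from dataclasses import dataclass


@dataclass
class Chunk:
    source: str
    text: str


def score(query: str, chunk: Chunk) -> float:
    """Toy relevance score: fraction of query words that appear in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.text.lower().split())
    return len(query_words & chunk_words) / max(len(query_words), 1)


def retrieve(query: str, knowledge_base: list[Chunk], top_k: int = 2) -> list[Chunk]:
    """Return the top_k chunks most relevant to the query."""
    return sorted(knowledge_base, key=lambda c: score(query, c), reverse=True)[:top_k]


def build_prompt(query: str, chunks: list[Chunk]) -> str:
    """Assemble retrieved chunks into the context that is passed to the LLM."""
    context = "\n\n".join(f"[{c.source}]\n{c.text}" for c in chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


knowledge_base = [
    Chunk("faq.md", "Refunds are issued within 5 business days of an approved return."),
    Chunk("policy.md", "Returns are accepted within 30 days of purchase with a receipt."),
]

query = "How long do refunds take?"
prompt = build_prompt(query, retrieve(query, knowledge_base))
print(prompt)  # In a real application, this prompt is sent to the LLM.
```

Because the retrieved chunks are inserted into the prompt verbatim, the clarity and structure of the source documents directly determine the quality of the context the model sees, which is what the rest of this guide addresses.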
Intended audience
This guide is intended for AI engineers, data scientists, data engineers, or software developers who are building LLM applications with one or more RAG components. To understand the concepts and recommendations in this guide, you should be familiar with vector databases and prompts for LLMs.
Objectives
The recommendations in this guide can help you achieve the following:
- Improve the accuracy and relevancy of responses generated by RAG applications by providing well-structured and semantically rich source documents that are optimized for token usage and minimal redundancy.
- Help RAG applications better understand domain-specific knowledge and context by providing clear definitions and explanations within source documents.
- Facilitate easier maintenance and knowledge base updates for RAG applications by adhering to consistent formatting and structuring guidelines across source documents.
- Improve the scalability of RAG solutions by breaking down large, monolithic documents into smaller, self-contained units that can be efficiently indexed and retrieved (see the sketch after this list).
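The last objective, breaking documents into smaller self-contained units, is often implemented as heading-based chunking. The sketch below splits a Markdown document on headings and keeps each heading with its body so that a chunk remains understandable when retrieved on its own. The function name, the chunk dictionary fields, and the sample document are illustrative assumptions, not part of any specific chunking library.

```python
# Minimal sketch of splitting a large Markdown document into smaller,
# self-contained chunks keyed by their headings.

import re


def split_by_heading(markdown_text: str) -> list[dict]:
    """Split a Markdown document on level-1 and level-2 headings.

    Each chunk keeps its heading so it stays self-contained when it is
    indexed and retrieved independently of the rest of the document.
    """
    chunks = []
    current_heading = "Introduction"
    current_lines: list[str] = []

    for line in markdown_text.splitlines():
        if re.match(r"^#{1,2}\s", line):
            # A new section starts: flush the previous one, if any.
            if current_lines:
                chunks.append({"heading": current_heading,
                               "text": "\n".join(current_lines).strip()})
            current_heading = line.lstrip("# ").strip()
            current_lines = []
        else:
            current_lines.append(line)

    if current_lines:
        chunks.append({"heading": current_heading,
                       "text": "\n".join(current_lines).strip()})
    return chunks


document = """# Returns policy
Items can be returned within 30 days of purchase.

## Refunds
Refunds are issued within 5 business days of an approved return.
"""

for chunk in split_by_heading(document):
    print(chunk["heading"], "->", chunk["text"])
```

Chunking strategies vary by content type and retrieval setup; the point of the sketch is that each retrievable unit should carry enough surrounding context (here, its heading) to be meaningful on its own.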