/rih-TREE-vul awg-MEN-ted jen-uh-RAY-shun/
A technique that gives AI access to external documents before generating a response, dramatically reducing hallucination and enabling domain-specific answers.
Retrieval-Augmented Generation (RAG) is one of the most practical architectures for making AI useful in business. Instead of relying solely on what the model memorized during training, RAG first searches your documents, finds relevant passages, and feeds them into the prompt as context. The AI then generates its answer grounded in your actual data.
RAG solves the two biggest problems with vanilla AI: hallucination (because answers are grounded in real documents) and knowledge currency (because you can update documents without retraining). It's how companies build AI assistants that know about their specific products, policies, and processes.
The RAG pipeline is: embed your documents → store vectors in a database → when a user asks a question, embed the question → find similar documents → include them in the prompt → generate answer. It sounds complex but modern tools make it surprisingly accessible.
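The pipeline above can be sketched in a few lines of Python. This is a toy illustration, not a production setup: the bag-of-words "embedding," the in-memory list standing in for a vector database, and the sample documents are all simplifications invented for the example (real systems use a learned embedding model and a dedicated vector store).

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; real systems use a learned embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Step 1-2: embed documents and store the vectors (our "vector database").
documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The Pro plan includes priority support and unlimited seats.",
    "Offices are closed on national holidays.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(question, k=1):
    # Step 3-4: embed the question, then rank documents by similarity.
    q = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question):
    # Step 5: include the retrieved passages in the prompt.
    # Step 6 (generation) would hand this prompt to an LLM.
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("What is the refund policy?"))
```

Swap in a real embedding model and a vector database and the shape of the code barely changes, which is why the pattern is so accessible.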
When building any AI system that needs to answer questions about specific documents, products, or knowledge that isn't in the model's training data.
RAG is how you turn a general-purpose AI into a domain expert. It's the architecture behind most useful enterprise AI chatbots.
RAG = Retrieve, Augment, Generate. The AI does its homework before speaking.