What RAG does
Retrieval-Augmented Generation splits documents into chunks, embeds each chunk, stores the embeddings in a vector store, retrieves the chunks closest to the query embedding by similarity, and injects them into the prompt. The LLM generates a response using the retrieved context.
RAG is simple, well-understood, and works well for internal knowledge bases where provenance tracking is optional. It reduces hallucination by providing relevant context, but does not eliminate it. Retrieved chunks can be ambiguous, and the LLM can still generate claims not directly supported by the retrieved material.
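The pipeline above can be sketched in a few lines. This is a toy illustration, not a reference implementation: a bag-of-words counter stands in for a real embedding model, and an in-memory list stands in for a vector store. All function names and the sample documents are invented for the example.

```python
import math
from collections import Counter

def chunk(text, size=40):
    # Split a document into fixed-size word chunks.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Similarity between two "embeddings".
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, store, k=2):
    # Return the k chunks closest to the query by similarity.
    q = embed(query)
    return sorted(store, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query, chunks):
    # Inject the retrieved chunks into the prompt sent to the LLM.
    context = "\n---\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using the context."

docs = ["The billing API rate limit is 100 requests per minute.",
        "Support tickets are triaged within one business day."]
store = [c for d in docs for c in chunk(d)]
query = "What is the billing rate limit?"
prompt = build_prompt(query, retrieve(query, store))
```

Note that nothing here verifies the retrieved chunks: whatever scores highest is injected, which is exactly why ambiguous chunks can slip through.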
What grounded retrieval does
Grounded retrieval retrieves candidate evidence, checks relevance against the query, verifies provenance, and returns structured output with source links. Every claim in the output is traceable to a specific source. The output is auditable: you can check why the system said what it said.
Grounded retrieval is more complex than RAG. It requires a relevance verification step, provenance tracking, and structured output handling. The tradeoff is higher reliability and auditability.
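The extra steps can be sketched as follows, under the same toy assumptions as before (bag-of-words scoring instead of a real model, an in-memory corpus instead of a store). The relevance threshold, the `Evidence` record shape, and the example URLs are all assumptions made for illustration; the point is that every returned item passes a relevance check and carries its provenance.

```python
import math
from collections import Counter
from dataclasses import dataclass

@dataclass
class Evidence:
    text: str
    source: str   # provenance: where this chunk came from
    score: float  # relevance of the chunk to the query

def embed(text):
    # Stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def grounded_retrieve(query, corpus, threshold=0.3):
    # corpus: list of (text, source_url) pairs. Only candidates that pass
    # the relevance check are returned, each with its source attached.
    q = embed(query)
    candidates = [Evidence(text, src, cosine(q, embed(text)))
                  for text, src in corpus]
    passed = [e for e in candidates if e.score >= threshold]
    # Structured, auditable output: every item links back to a source.
    return {"query": query,
            "evidence": [{"text": e.text, "source": e.source,
                          "score": round(e.score, 3)} for e in passed]}

corpus = [("The billing API rate limit is 100 requests per minute.",
           "https://docs.example.com/billing"),
          ("Support tickets are triaged within one business day.",
           "https://docs.example.com/support")]
result = grounded_retrieve("What is the billing rate limit?", corpus)
```

Unlike the RAG sketch, a low-scoring candidate is dropped rather than injected, and the caller can inspect the score and source behind every piece of evidence the system used.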