A complete field manual for Retrieval-Augmented Generation. Build LLM systems that answer from your own data — covering data prep, chunking, embeddings, vector indexes, retrieval, reranking, generation, and honest evaluation, with runnable code and real measurements at every step.
Most teams building with large language models hit the same wall within a week: the model is brilliant in general and useless on your data. It has never read your documentation, your tickets, your contracts, your codebase. Retrieval-Augmented Generation — RAG — is how you close that gap, by fetching the right pieces of your data at query time and handing them to the model as evidence. This series is a field manual for building those systems properly, end to end.
A working manual, not a survey. Every chapter takes one stage of a RAG system and goes as deep as the stage deserves — the honest trade-offs, the failure modes nobody warns you about, runnable code, and real measurements on public datasets. It is written so a developer who has never built with LLMs can follow it from the first page, and so an engineer who has shipped RAG to production still finds material they did not have. The prose carries both; there are no "beginner boxes" and no "advanced sidebars." You read until you have what you need.
Before the chapters, the whole picture in one diagram. Every chapter in this series lives somewhere on this path.
Wave one — the chapters published here — is the core of RAG, enough to build a real system and understand every decision inside it. It runs in four groups:
A later wave goes into the specialised verticals — multi-modal, code, and graph RAG — plus production hardening, security, tooling, real case studies, and a full capstone build.
Comfort reading Python is enough. Examples lean on Python with brief TypeScript where it matters; the ideas transfer to any stack. If you have never written a prompt or called a model API, the Prompt Engineering series is the right place to start, then come back here. If you are interested in giving these systems the ability to act rather than just answer, the companion Agentic SDLC field manual picks up where this one leaves off.
The reason most RAG systems disappoint is not the model and not the vector database. It is the quiet decisions — how you split a document, which neighbours you fetch, what you do when the evidence is thin — that nobody benchmarks. This series benchmarks them.
Start with Chapter 00 — Why RAG, and why it didn't die with long context.
Sign in to join the discussion and post comments.
Sign in