LangChain is one of the most widely used frameworks for stringing prompts, models, retrievers, and tools into reliable pipelines. Used well, it removes a lot of glue code. Used badly, it adds a confusing layer on top of an already-confusing problem. This tutorial covers what LangChain actually gives you, the parts worth learning, and when plain Python is the better tool.
If you have built more than a handful of prompts, you have probably written the same code several times: a function that loads a template, fills in variables, calls a model, parses the output, retries on failure, and logs everything. LangChain's pitch is that this is plumbing — and it can be replaced with a small, composable vocabulary of prompt templates, chains, and runnables that snap together with a pipe operator.
This tutorial gives you a clear-eyed tour: the core primitives, the composition pattern, an end-to-end RAG pipeline, and an honest discussion of when LangChain is the right tool and when it adds more weight than it saves.
LangChain's modern API ("LangChain Expression Language" or LCEL) is built around three primitives:
Runnables: objects that expose invoke / stream / batch methods.
The pipe operator |: two runnables piped together produce a new runnable. The output of the first becomes the input of the second.
Chains: just compositions of runnables.
The pipe operator is the centre of the API. Once you internalise that everything is a runnable and the pipe means "send the output here", LangChain becomes mostly autocompleted Lego.
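To make the "send the output here" idea concrete, here is a toy model of the composition pattern in plain Python. It is not LangChain's implementation: ToyRunnable and the three stub steps are hypothetical names invented for this sketch, and the real Runnable interface does considerably more (streaming, batching, async, configuration).

```python
from typing import Any, Callable


class ToyRunnable:
    """Hypothetical stand-in for a runnable: wraps one function of one input."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "ToyRunnable") -> "ToyRunnable":
        # Piping two runnables yields a new runnable that feeds the
        # output of the first into the second.
        return ToyRunnable(lambda value: other.invoke(self.invoke(value)))


# Stub steps standing in for a prompt template, a model, and a parser.
fill_template = ToyRunnable(lambda topic: f"Tell me a fact about {topic}.")
fake_model = ToyRunnable(lambda prompt: f"MODEL REPLY TO: {prompt}")
parse = ToyRunnable(lambda reply: {"answer": reply})

chain = fill_template | fake_model | parse
result = chain.invoke("otters")
print(result)
```

Because the result of a pipe is itself a runnable, a chain of any length can be passed around, stored, or piped again exactly like a single step.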
The pipe operator | sends the output of one runnable into the next.
Hand-rolled glue
import json
import openai

# vector_store, SYSTEM_PROMPT, and MODEL are assumed to be defined elsewhere.
def answer(question):
    docs = vector_store.search(question, k=4)
    ctx = "\n\n".join(d.text for d in docs)
    prompt = SYSTEM_PROMPT.replace("{ctx}", ctx)\
                          .replace("{q}", question)
    raw = openai.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    ).choices[0].message.content
    try:
        return json.loads(raw)
    except Exception:
        # silently swallow, return None
        return None
Works fine on day one. By day fifty you have written four versions of this function across the codebase, each with subtly different retry, logging, and parsing behaviour. Nothing streams. Switching models means editing five files.
Same pipeline, LCEL style
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI
from langchain_core.runnables import RunnableParallel

# vector_store is assumed to be an existing LangChain vector store.
retriever = vector_store.as_retriever(search_kwargs={"k": 4})
model = ChatOpenAI(model="gpt-x-class", temperature=0)
parser = JsonOutputParser()

# Doubled braces are literal braces; single braces are template slots.
prompt = ChatPromptTemplate.from_messages([
    ("system",
     "Answer ONLY from the context. Reply as JSON: "
     "{{\"answer\": str, \"citations\": [str]}}.\n\n"
     "Context:\n{context}"),
    ("user", "{question}"),
])

def format_docs(docs):
    # Prefix each document with its id so the model can cite it.
    return "\n\n".join(f"[{d.metadata['id']}] {d.page_content}"
                       for d in docs)

chain = (
    RunnableParallel(
        context=retriever | format_docs,  # retrieve, then flatten to text
        question=lambda x: x,             # pass the question through unchanged
    )
    | prompt
    | model
    | parser
)

result = chain.invoke("What is our wholesale return policy?")
The whole pipeline is one composable object. .invoke, .stream, .batch, and async variants come for free. Swapping models or retrievers is a one-line change. Logging, tracing, and retries are configurable in one place.
Write the ChatPromptTemplate first. The slot names you choose flow through the entire chain.
Compose with |. Read pipelines left-to-right like a Unix pipe. If a step needs multiple inputs (RAG needs both context and question), use RunnableParallel to produce a dict.
Use .stream for perceived latency. The LCEL composition supports streaming end-to-end with no extra code.
When NOT to use LangChain: If your project has one or two prompts, switches model providers rarely, and doesn't need RAG or agents, plain Python with a small prompt-template helper is usually clearer and easier to debug. LangChain is most valuable when you have multiple pipelines, multiple model providers, or need streaming, batching, and tracing across the whole system.
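For the small-project case, a prompt-template helper can be a few lines of standard-library Python. The sketch below is one possible shape rather than a recommended design: render_prompt, ask, and fake_model are hypothetical names, and the call_model parameter stands in for whatever client function your model provider gives you.

```python
import json
from string import Template
from typing import Callable, Optional

PROMPT = Template(
    "Answer ONLY from the context. Reply as JSON.\n\n"
    "Context:\n$context\n\nQuestion: $question"
)


def render_prompt(template: Template, **values: str) -> str:
    # substitute() raises KeyError on a missing slot instead of
    # silently leaving a placeholder in the prompt.
    return template.substitute(**values)


def ask(call_model: Callable[[str], str], context: str, question: str) -> Optional[dict]:
    prompt = render_prompt(PROMPT, context=context, question=question)
    raw = call_model(prompt)
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None  # the caller decides whether to retry or report the failure


# Stub model so the sketch runs without network access.
def fake_model(prompt: str) -> str:
    return json.dumps({"answer": "30 days", "citations": ["policy-1"]})


result = ask(fake_model, context="[policy-1] Returns within 30 days.", question="Return window?")
print(result)
```

Passing the model call in as a plain function keeps the helper testable and makes a provider switch a change at the call site, which covers much of what a one- or two-prompt project needs.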
Rebuild your simplest existing prompt as an LCEL chain: template → model → parser. Compare the line count, readability, and ease of switching models against your previous implementation.
Wire a parallel step into the chain — for example, fetch retrieved context and the user's previous answer at the same time using RunnableParallel. Pipe both into the prompt template. This pattern unlocks most of LangChain's real power.
Add a tracing or callback handler that logs every prompt that hits the model. Run the chain on five questions. Inspect the logged prompts — you will almost certainly find one surprise. That is exactly the point of tracing.