In this project you will build an AI-powered news summariser using retrieval-augmented generation (RAG) concepts. The goal is grounded, cited summaries that stay on topic and leave far less room for hallucination. The deliverable is a working prompt-based summariser plus a small evaluation harness you can use to check whether each summary is faithful to its sources.
Ask a chat model to "summarise today's news" and you get confident text that sounds right and is often wrong in the details: half-remembered events and fabricated dates, delivered in a smug, authoritative tone. The fix is not a smarter model; it is a different architecture. Retrieval-augmented generation (RAG) means: first retrieve the actual source articles, then augment the prompt with those articles, then generate the summary strictly from them.
You can do this without a database or vector store. A prompt-only RAG is a great starting point and is enough for most personal projects.
RAG has three stages. Retrieval pulls the documents that are likely to contain the answer. Augmentation injects those documents into the prompt as context. Generation produces a response that must be grounded in the injected context.
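The three stages can be sketched in a few lines of Python. This is a minimal illustration rather than a production design: the function names `retrieve`, `augment`, and `generate` are hypothetical, retrieval is a toy keyword-overlap ranking, and the model call is passed in as a plain callable because the tutorial does not prescribe a particular API.

```python
from typing import Callable


def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Toy retrieval: rank documents by how many query words they contain."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(terms & set(text.lower().split()))
        scored.append((overlap, doc_id, text))
    # Highest overlap first; ties broken by ID so the order is stable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(doc_id, text) for overlap, doc_id, text in scored[:k] if overlap > 0]


def augment(question: str, docs: list[tuple[str, str]]) -> str:
    """Inject the retrieved documents into the prompt as labelled context."""
    context = "\n\n".join(f"[{doc_id}]\n{text}" for doc_id, text in docs)
    return (
        "Answer using ONLY the context below and cite the IDs you used.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def generate(prompt: str, llm: Callable[[str], str]) -> str:
    """Hand the augmented prompt to whatever model function you supply."""
    return llm(prompt)


corpus = {
    "SRC-1": "Drivers announced a rail strike over pay.",
    "SRC-2": "The council approved a new cycle lane.",
}
docs = retrieve("rail strike pay", corpus)
prompt = augment("What is happening with the rail strike?", docs)
print(prompt)
```

In the prompt-only workflow used in this project, you play the role of `retrieve` by pasting articles in yourself, and the chat window plays the role of `generate`.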
Memory-only summary
Summarise the latest news on UK rail strikes.
The model will produce a confident summary built from whatever it remembers from training data plus pattern-matched plausibility. It will likely invent specific union names, dates, or numbers. There is no way to check the claims, and the model has no idea what is "latest" beyond its training cutoff. Confident hallucinations are the worst kind.
The retrieval step is up to you — copy-paste from news sites, use an RSS reader, paste in URLs that a tool with browsing capability can fetch, or run a small Python script. The point is that the model never works from memory; it always works from text you handed it.
Step 1 — Source intake prompt
You are a news researcher. I will paste 5–10 articles below,
separated by lines of "=====". For each article, do four things:
1) Assign it a short ID: SRC-1, SRC-2, …
2) Note the publication, the date, and the headline.
3) In ≤4 sentences, extract the facts only — no interpretation,
no opinion, no editorial framing.
4) Flag any claim that appears in only one source (i.e. uncorroborated).
Return as a structured Markdown table.
Articles:
"""
=====
{paste article 1 — full text, with attribution}
=====
{paste article 2}
=====
...
"""
You now have a clean facts table. This is the corpus the rest of the chain works from. It is also the answer to the hardest RAG question: "what did we actually feed the model?"
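If you would rather assemble the intake prompt from files than paste by hand, something like the following could work. It is a sketch under assumptions: `build_intake_prompt` is a hypothetical helper, and the instruction text is whatever you copy from Step 1. It also rejects an article that contains the separator line itself, since that would blur where one article ends and the next begins.

```python
SEPARATOR = "====="


def build_intake_prompt(instructions: str, articles: list[str]) -> str:
    """Join articles with separator lines and wrap them in triple quotes."""
    cleaned = [text.strip() for text in articles if text.strip()]
    if not cleaned:
        raise ValueError("need at least one non-empty article")
    for text in cleaned:
        if any(line.strip() == SEPARATOR for line in text.splitlines()):
            raise ValueError("an article contains the separator line itself")
    # A separator line goes before each article, as in the Step 1 layout.
    body = "\n".join(f"{SEPARATOR}\n{text}" for text in cleaned)
    return f'{instructions.strip()}\n\nArticles:\n"""\n{body}\n"""'


demo = build_intake_prompt(
    "You are a news researcher. (paste the rest of the Step 1 instructions here)",
    ["First article text, with attribution.", "Second article text."],
)
print(demo)
```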
Step 2 — Augmented summary prompt
You are a careful news writer. Using ONLY the facts in the table
below, write a 250-word summary of the story.
Rules:
- Every factual sentence must end with a citation like [SRC-2] or
[SRC-2, SRC-5] when multiple sources confirm.
- If a fact is corroborated by 2+ sources, you may state it
confidently. If it appears in only 1 source, qualify it
("according to {publication}…").
- Do not introduce any fact not in the table. If something feels
missing, say "I don't have a source for that yet."
- Tone: neutral, plain language, no hype, no editorialising.
- Structure: 1 lead paragraph (what + when + who), 1 paragraph on
context (why this is happening), 1 paragraph on what's next.
Facts table:
"""
{paste output of step 1}
"""
Sample output: "Rail union ASLEF announced a 48-hour walkout starting Tuesday in response to a stalled pay offer [SRC-1, SRC-3]. According to the Times, the operators have invited mediators to a Thursday meeting [SRC-2]." Citations make every claim auditable.
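Because the citation format is fixed, a quick mechanical check can catch missing or misnumbered citations before the model-based audit. The sketch below is one possible approach, not part of the prompt chain itself: `find_citation_problems` is a hypothetical helper, and its sentence splitter is deliberately naive, so abbreviations such as "U.K." followed by a space will be split incorrectly.

```python
import re

# Matches a citation such as [SRC-2] or [SRC-2, SRC-5] at the end of a sentence.
CITATION_AT_END = re.compile(r"\[(SRC-\d+(?:,\s*SRC-\d+)*)\][.!?]?$")


def find_citation_problems(summary: str, known_ids: set[str]) -> list[str]:
    """Flag sentences with no trailing citation or with unknown source IDs."""
    problems = []
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", summary) if s.strip()]
    for number, sentence in enumerate(sentences, start=1):
        match = CITATION_AT_END.search(sentence)
        if match is None:
            problems.append(f"sentence {number}: no citation at end")
            continue
        cited = {part.strip() for part in match.group(1).split(",")}
        unknown = sorted(cited - known_ids)
        if unknown:
            problems.append(f"sentence {number}: unknown source {', '.join(unknown)}")
    return problems


summary = (
    "Rail union ASLEF announced a 48-hour walkout starting Tuesday in response "
    "to a stalled pay offer [SRC-1, SRC-3]. According to the Times, the operators "
    "have invited mediators to a Thursday meeting [SRC-2]."
)
print(find_citation_problems(summary, {"SRC-1", "SRC-2", "SRC-3"}))
```

The check only confirms that citations exist and point at real IDs. Whether the cited source actually supports the claim is a separate question that a script like this cannot answer.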
Step 3 — Faithfulness check prompt
You are now a fact-checker. Compare the summary against the
facts table.
For each sentence in the summary:
- find the source(s) it cites
- check whether the claim is actually present in those sources
- classify as: SUPPORTED / PARTIALLY-SUPPORTED / UNSUPPORTED / EDITORIAL
Return a table:
| Sentence # | Claim | Cited sources | Classification | Note |
If any sentence is UNSUPPORTED, propose a rewritten version that
either (a) cites the correct source, (b) qualifies the claim, or
(c) removes the sentence entirely.
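This is the "evaluation harness": a built-in audit that catches drift. Run it after step 2. If anything is UNSUPPORTED, regenerate the summary with the rewritten sentences merged in, then run the check again until nothing is flagged. The prompt only proposes rewrites for UNSUPPORTED sentences, so read the PARTIALLY-SUPPORTED and EDITORIAL rows yourself and decide whether they need the same treatment.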
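The audit comes back as a Markdown table, so a small parser can pull out the rows that need attention. This is a rough sketch: `parse_verdicts` and `flagged` are hypothetical names, the column headers are assumed to match the Step 3 prompt exactly, and a claim that itself contains a `|` character would break the simple split.

```python
def parse_verdicts(table: str) -> list[dict[str, str]]:
    """Turn a Markdown table into one dict per data row, keyed by header."""
    rows = []
    header = None
    for line in table.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # the |---|---| divider row under the header
        rows.append(dict(zip(header, cells)))
    return rows


def flagged(rows: list[dict[str, str]], labels: tuple[str, ...] = ("UNSUPPORTED",)) -> list[dict[str, str]]:
    """Return the rows whose classification is one of the given labels."""
    return [row for row in rows if row.get("Classification", "").upper() in labels]


audit = """
| Sentence # | Claim | Cited sources | Classification | Note |
|---|---|---|---|---|
| 1 | Walkout announced | SRC-1, SRC-3 | SUPPORTED | ok |
| 2 | Mediators invited | SRC-2 | UNSUPPORTED | not in SRC-2 |
"""
for row in flagged(parse_verdicts(audit)):
    print(row["Sentence #"], row["Note"])
```

Passing `labels=("UNSUPPORTED", "PARTIALLY-SUPPORTED", "EDITORIAL")` widens the net if you want to review everything that is not fully supported.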
Step 4 — Audience-shaped output prompt
Take the verified summary above and produce three audience-shaped
versions:
1) "TL;DR" — 3 bullets, the busy commuter version, ≤60 words total
2) "Briefing" — 250 words, for someone catching up after a week
off, citations preserved
3) "Deep-dive" — 600 words, includes the "what's contested" and
"what to watch next" sections, citations preserved
Do not introduce any facts beyond the verified summary and the facts table.
Citations must remain accurate.
You now have three outputs from the same source set — a flexible package you could send to a newsletter, a Slack channel, or a personal daily digest.
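Two of the Step 4 rules are easy to check mechanically: the length limits, and the rule that no version cites a source outside the verified set. The following is a sketch with assumptions: `check_version` is a hypothetical helper, words are counted by splitting on whitespace, and treating 250 and 600 as upper bounds for the Briefing and Deep-dive is an interpretation, since the prompt only gives an explicit cap for the TL;DR.

```python
import re


def cited_ids(text: str) -> set[str]:
    """Collect every source ID of the form SRC-<number> mentioned in the text."""
    return set(re.findall(r"SRC-\d+", text))


def check_version(name: str, text: str, allowed_ids: set[str], max_words: int) -> list[str]:
    """Report length overruns and citations that fall outside the allowed set."""
    problems = []
    word_count = len(text.split())
    if word_count > max_words:
        problems.append(f"{name}: {word_count} words, limit is {max_words}")
    extra = sorted(cited_ids(text) - allowed_ids)
    if extra:
        problems.append(f"{name}: cites sources not in the verified set: {', '.join(extra)}")
    return problems


verified = "Drivers announced a 48-hour walkout [SRC-1, SRC-3]."
tldr = "- 48-hour walkout announced [SRC-1]\n- Pay offer stalled [SRC-4]"
print(check_version("TL;DR", tldr, cited_ids(verified), max_words=60))
```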
Tip: If you want to scale this up, replace the manual "paste articles" step with a small Python script that pulls RSS feeds, dedupes by URL, and writes a daily folder of articles. The four prompts above stay identical; only the retrieval automation changes.
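A retrieval script along those lines might look like the sketch below, using only the Python standard library. Treat it as a starting point under stated assumptions: the function names and the `SRC-n.txt` file layout are hypothetical, the parser handles plain RSS 2.0 `<item>` elements only (Atom feeds use different tags), and the feed XML is passed in as a string so that you can fetch it however you like, for example with `urllib.request.urlopen`.

```python
import xml.etree.ElementTree as ET
from datetime import date
from pathlib import Path


def parse_rss_items(rss_xml: str) -> list[dict[str, str]]:
    """Extract title, link and description from each RSS 2.0 <item>."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items


def dedupe_by_url(items: list[dict[str, str]]) -> list[dict[str, str]]:
    """Keep the first item for each URL, ignoring a trailing slash."""
    seen = set()
    unique = []
    for item in items:
        url = item["link"].rstrip("/")
        if not url or url in seen:
            continue
        seen.add(url)
        unique.append(item)
    return unique


def write_daily_folder(items: list[dict[str, str]], base_dir: Path, day: date) -> Path:
    """Write one text file per item into a folder named after the date."""
    folder = base_dir / day.isoformat()
    folder.mkdir(parents=True, exist_ok=True)
    for number, item in enumerate(items, start=1):
        path = folder / f"SRC-{number}.txt"
        path.write_text(
            f"{item['title']}\n{item['link']}\n\n{item['description']}\n",
            encoding="utf-8",
        )
    return folder
```

Calling `write_daily_folder(dedupe_by_url(parse_rss_items(xml_text)), Path("articles"), date.today())` would produce a dated folder ready to paste into Step 1. Note that an RSS description is often only a teaser, so you may still need to fetch the full article text from each link.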
Pick a news story you care about. Find five articles from different outlets. Run the full chain and read the verified summary. Compare it to a memory-only summary you produce with the prompt:
Summarise X for me.
The difference is the entire argument for RAG.
Deliberately include one source with a strong editorial slant. Run the chain. Does the verified summary preserve neutrality? If not, harden the rules in step 2 (e.g., "Adjectives only allowed if they appear in 2+ sources") and rerun.
Add a "perspective" pass: "Given the facts table, list the two main competing interpretations of this story and which sources support each. Do not invent interpretations." Notice how this turns the summariser into a tool for honest disagreement instead of false neutrality.