Most of what's written about agentic SDLC is about the exciting part: building features faster, fixing bugs in minutes, shipping more. The unglamorous truth is that most software engineering is maintenance (roughly 70% by common estimates), and maintenance is where the practice is either proven or quietly broken. Six months in, half the files in your codebase have been touched by an agent. The agent that wrote them is gone; more precisely, it's a fresh session every time, with no memory of what it did last March. Maintenance is now a problem of continuity without memory, and the practices that solve it look meaningfully different from human-only maintenance.
This chapter is about the long tail: months and years after the first agent-assisted commit, when the romance is gone and you just need the system to keep working.
A human engineer who joined six months ago has internalised a lot. They know that the billing module is more careful than it looks because of an incident in 2024. They know not to refactor the legacy auth module because the rewrite was attempted twice and failed. They know which colleague wrote which module, and who to ask. This tacit knowledge is the connective tissue of a codebase.
An agent has none of it. A fresh session has read what you put in front of it and nothing else. The agent that fixed yesterday's bug doesn't know about today's bug; the agent fixing today's bug doesn't know what yesterday's said about the same module. Every session is day one.
The maintenance question is: how do you compensate? You can't keep the agent in the office over coffee. You have to write down everything that would otherwise live in tacit memory, and you have to write it in a place the agent will actually read.
You met the project prompt in Ch. 04. It's load-bearing for maintenance. Treat it as a living document, not a write-once setup file. Every recurring correction goes into it. Every "the agent didn't know X about our codebase" moment becomes a line in the file.
A project prompt that survives a year typically has these sections:
The changelog meant here is not the one generated from commits. It is a human-curated record of why things changed, at the level of weeks or releases rather than commits. The agent reads it when investigating "when did this start happening" or "is this related to that recent refactor."
Each entry covers a small batch of related changes with one or two sentences of context. "Switched checkout to read currency from user.preferences.currency instead of session.currency — fixed EU-pricing bug, see decision log 2026-05-12." Six months from now, when an agent investigates an odd currency behavior, the trail is immediate.
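If you want the curated changelog to be searchable by an agent or a script, one option is to keep entries as structured records. The sketch below is only an illustration under assumptions: `ChangelogEntry`, its fields, and `entries_touching` are hypothetical names, and the entry date is assumed to match the decision log date cited in the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class ChangelogEntry:
    """One curated entry: a small batch of related changes plus the reason."""
    when: date
    summary: str                         # one or two sentences of context
    areas: Tuple[str, ...] = ()          # affected modules (hypothetical field)
    decision_ref: Optional[str] = None   # pointer into the decision log, if any


def entries_touching(entries, keyword, since=None):
    """Return entries that mention `keyword`, newest first.

    `since` optionally drops entries older than the given date.
    """
    needle = keyword.lower()
    hits = [
        e for e in entries
        if (since is None or e.when >= since)
        and (needle in e.summary.lower()
             or any(needle in area.lower() for area in e.areas))
    ]
    return sorted(hits, key=lambda e: e.when, reverse=True)


CHANGELOG = [
    ChangelogEntry(
        # Assumption: the entry date matches the decision log date it cites.
        when=date(2026, 5, 12),
        summary=("Switched checkout to read currency from "
                 "user.preferences.currency instead of session.currency; "
                 "fixed EU-pricing bug."),
        areas=("checkout",),
        decision_ref="2026-05-12",
    ),
]
```

A session investigating odd currency behavior could call `entries_touching(CHANGELOG, "currency")` and follow `decision_ref` into the decision log. A plain text file works just as well; the point is that the "why" is recorded somewhere the agent will read.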
The decision log is a set of Architecture Decision Records, but lightweight: five-to-ten-line entries, each explaining a non-obvious choice. Not every decision belongs here; only the ones that future-you (or a future agent) will want to know about.
Three short sections cover most of what's needed: context (what triggered the decision), decision (what was chosen), and why not (what alternative was rejected and why). Five minutes to write; decades of value in a long-lived codebase.
Use the "would I want to know this in two years" filter to decide what gets a decision log entry. Trivia about typography in the marketing site: no. The fact that you can't lazy-load the dashboard chart library because of a known incompatibility with its SSR mode: yes.
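The three-section shape (context, decision, why not) is small enough to capture in a few lines of code if you prefer generating entries to hand-formatting them. This is a minimal sketch, not a standard: the `Decision` class, the `render` function, and the output layout are assumptions, and the example entry's date and wording are placeholders built around the chart-library case.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Decision:
    """A lightweight decision log entry."""
    when: date
    title: str
    context: str    # what triggered the decision
    decision: str   # what was chosen
    why_not: str    # which alternative was rejected, and why


def render(entry: Decision) -> str:
    """Render one entry as plain text for a file the agent will read."""
    return "\n".join([
        f"{entry.when.isoformat()}: {entry.title}",
        f"Context: {entry.context}",
        f"Decision: {entry.decision}",
        f"Why not: {entry.why_not}",
    ])


# Placeholder entry: the date and phrasing are illustrative only.
EXAMPLE = Decision(
    when=date(2026, 1, 15),
    title="Dashboard chart library is loaded eagerly",
    context="Lazy-loading the dashboard chart library was considered.",
    decision="Keep the chart library in the eagerly loaded bundle.",
    why_not="Lazy-loading hits a known incompatibility with SSR mode.",
)
```

Calling `print(render(EXAMPLE))` produces a four-line entry, comfortably inside the five-to-ten-line budget.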
Drift is the accumulation of agent-made changes that don't match the conventions you set out, not flagrantly but a degree at a time. Naming conventions slip. New files appear in odd places. The "we always use repositories for DB access" rule has six exceptions now, all introduced in the last quarter. None of them was flagged at PR review because each one was small.
The defenses, in order of effectiveness:
Drift is reversible if caught early. Codebases that don't catch drift end up with the same problem human-only codebases get — death by a thousand small inconsistencies — just faster.
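A convention like "we always use repositories for DB access" can often be checked mechanically, which catches the small exceptions that slip past review. The following is a rough sketch for a Python codebase, under loud assumptions: the `repositories` directory name, the `DB_MODULES` set, and both function names are hypothetical, and a real project would substitute its own layout and drivers.

```python
import ast
from pathlib import PurePosixPath
from typing import Dict, Set

# Assumption: these are the DB libraries the project uses.
DB_MODULES = {"sqlite3", "psycopg2", "sqlalchemy"}
# Assumption: repositories live in a directory with this name.
ALLOWED_DIR = "repositories"


def db_imports(source: str) -> Set[str]:
    """Return the top-level DB modules that `source` imports directly."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in DB_MODULES:
                found.add(top)
    return found


def find_drift(files: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each offending path to the DB modules it imports.

    `files` maps a POSIX-style relative path to that file's source text.
    Files under the allowed directory are skipped.
    """
    violations = {}
    for path, source in files.items():
        if ALLOWED_DIR in PurePosixPath(path).parts:
            continue
        hits = db_imports(source)
        if hits:
            violations[path] = hits
    return violations
```

Run in CI against changed files, a check like this turns "six exceptions in a quarter" into one failed build on the first exception. It only sees direct imports, so it is a tripwire rather than a proof of compliance.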
You're handed a bug in a module no one on the team wrote. The original author was an agent in a session six months ago. Git blame points to a commit by your CI bot. The PR description is two sentences. What now?
The pattern that works has four steps:
Total time on a typical bug: forty minutes. Without the external memory systems, the same investigation would take hours, and the chance of accidentally re-breaking something else is much higher.
Sometimes you inherit a codebase you've never seen, written largely by agents over months, with weak documentation and no decision log. Maintenance starts with archaeology: building the missing memory before you can safely change anything.
The workflow:
This is a week of work for a medium codebase, more for a large one. It feels like overhead until the first time you have to fix something — at which point the maintenance debt you would have paid is gone.
A specific maintenance pattern worth naming: agents tend to write code that's slightly more verbose than it needs to be. Helper functions for things that could be inlined. Comments explaining the obvious. None of this is a bug, but cumulatively it makes the code wordier than a senior engineer would write.
The pattern: periodic compression passes on modules the agent has touched several times. Ask another agent session to "tighten this — same behavior, fewer lines, no comments that just describe what the code does." Review the diff. Usually 10–25% smaller, often more readable, no behavioral change.
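When reviewing the diff from a compression pass, two cheap numbers help: how much smaller the file got, and whether its public surface changed. The sketch below assumes Python source and uses hypothetical helper names (`public_names`, `code_lines`, `compression_report`). An unchanged set of public names is a sanity check only; it does not prove behavior is unchanged, so your tests still have to pass.

```python
import ast


def public_names(source):
    """Top-level function and class names that don't start with an underscore."""
    tree = ast.parse(source)
    return {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and not node.name.startswith("_")
    }


def code_lines(source):
    """Count non-blank lines (comments are counted, blank lines are not)."""
    return sum(1 for line in source.splitlines() if line.strip())


def compression_report(before, after):
    """Summarize a compression pass from the before and after source text."""
    lines_before = code_lines(before)
    lines_after = code_lines(after)
    if lines_before:
        reduction = round(100 * (lines_before - lines_after) / lines_before, 1)
    else:
        reduction = 0.0
    return {
        "lines_before": lines_before,
        "lines_after": lines_after,
        "reduction_pct": reduction,
        # True when no public function or class was added, removed, or renamed.
        "api_unchanged": public_names(before) == public_names(after),
    }
```

A report with `api_unchanged` set to `False` is a prompt to look harder at the diff, not an automatic rejection: removing an unneeded public helper may be exactly what you asked for.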
The reverse is also worth doing occasionally — when the code is too dense and missing the comments that explain why. A pass to add those comments, based on the decision log and changelog, pays for itself the next time anyone reads the file.
Maintenance practices fail not because individuals don't care but because the team doesn't have a shared agreement about who owns the external memory. Three questions worth answering explicitly:
The answers don't have to be elaborate. They have to be agreed.
If you don't have a project prompt for any project, create one for your largest active project. Spend an hour. Use the section list earlier. You'll be surprised how much tacit knowledge you write down without effort.
Look back at the last three significant non-obvious decisions you made on the project. Write decision-log entries for them, retroactively. Five minutes each. Then commit to writing one prospectively, the next time you make a decision that future-you will want to remember.
Audit one module that's been touched by agents at least five times. Read git log for the last six months on it. Has anything drifted from the conventions in your project prompt? If yes, plan a corrective sweep. If no, you've earned the right to be smug for one day.
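For the audit exercise, the six-month history of a module can be pulled with `git log` and split by author. This is a sketch under assumptions: the function names are hypothetical, and `bot_authors` stands in for whatever author name your CI bot or agent commits under, which you have to supply yourself.

```python
import subprocess

# Tab-separated fields: commit hash, author name, author date, subject.
LOG_FORMAT = "%H%x09%an%x09%ad%x09%s"


def parse_log(output):
    """Parse `git log` output produced with LOG_FORMAT into a list of dicts."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, author, day, subject = line.split("\t", 3)
        commits.append(
            {"sha": sha, "author": author, "date": day, "subject": subject}
        )
    return commits


def module_history(path, since="6 months ago", repo="."):
    """Return commits touching `path` since the given date, newest first."""
    result = subprocess.run(
        ["git", "log", f"--since={since}", f"--format={LOG_FORMAT}",
         "--date=short", "--", path],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)


def agent_commits(commits, bot_authors):
    """Keep only the commits whose author is in `bot_authors`."""
    return [c for c in commits if c["author"] in bot_authors]
```

If `agent_commits(module_history("path/to/module"), {"your-bot-name"})` returns five or more commits, the module qualifies for the audit; read those subjects against the conventions in your project prompt.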
Next chapter: Patterns — the recurring shapes of agentic systems that work, look-like-work, and fail interestingly.