Throwback Thursday: Extreme Programming in Today's World

Jul 2, 2026 · 7 min read

I have been rereading Extreme Programming Explained.[1]

It came out in 1999. Before the cloud. Before the phone in your pocket. Long before a model could write a function while you made coffee.

So why open it now?

Because the book is not really about 1999.

It is about what happens when the cost of change goes down, and what you have to do to keep it down.

That used to be a question for teams of humans. Now you have a teammate that types faster than all of them combined.

The old book reads like it was waiting for this.

Why we still care about this old book

Most programming books age badly. They tie their advice to a language, a framework, a tool. The tool dies. The advice dies with it.

Extreme Programming did something different.

It made a bet about people and change. The bet was that software does not fail because the code is hard. It fails because requirements move, and our process punishes us for letting them move.

That bet did not depend on Java or pair programming stations or index cards on a wall.

It depended on one number.

The cost of change

For decades, everyone agreed on a curve.

A mistake caught while you are still writing requirements costs almost nothing. The same mistake caught in design costs more. In coding, more again. In testing, more still. In production, it costs a fortune.[2]

The cost of change rose over time. Exponentially. So the safe move was to decide everything up front, before the cost got out of hand.

Kent Beck looked at that curve and asked the dangerous question.

What if it is not a law of nature?

What if good practices could flatten it? Tests, small steps, constant integration, code you can actually read. What if you could make change cheap enough, late enough, that you no longer had to guess the whole system on day one?

That was the heresy. The cost of change is not fixed. It is something you engineer.

NOTE

This is the part that matters for AI. An agent makes change fast. Fast is not the same as cheap. If a one-line edit can quietly break four other files, speed just helps you reach the disaster sooner. Cheap change is what you were buying all along. AI raised the stakes on getting it.

The whole point of XP was to keep that curve flat. Everything else in the book is in service of that one goal.

Good practices are the flat curve

You do not flatten the cost of change with a wish.

You flatten it with habits that sound boring until you skip them.

Small commits, so a mistake is small. Continuous integration, so two changes meet today instead of next month. Names that say what they mean. Code simple enough that the next person does not have to reverse-engineer it.

None of that was new advice in 1999. It is not new now.

What changed is who else reads your code.

An agent works the same code you do. It reads names to guess intent. It greps for patterns and trusts them. It edits the call site it can see and assumes the rest match.

Good practices were how a tired human stayed sane.

Now they are how the agent stays correct.

A messy codebase does not slow AI down. It speeds up the mess. The model picks up your worst habit and applies it everywhere, at machine speed, with full confidence.

The discipline XP asked for kept change cheap. Tidiness was a side effect.

That job did not go away. It got a much faster apprentice.

Should we still care about TDD

Test-driven development is the part of XP people love to argue about.

Write the test first. Watch it fail. Write the smallest code that passes. Clean it up. Repeat.

The usual complaint, in 2026, goes like this.

AI writes the code in seconds. Why slow down to write a failing test first? Just generate the feature and move on.

I understand the pull. I think it is exactly backwards.

The test was never mostly about catching bugs.

A test is a specification you can execute. It says, in code, what “working” means. Before you write a line, you have already answered the only question that matters: how will I know this is done?

That question got more important the moment a machine started writing the code.

Why TDD helps AI more than it hurts

An agent is fast, tireless, and confidently wrong on a regular basis.

You cannot fix the confidently-wrong part by reading every line it writes. There is too much, and it all looks plausible. Plausible is the whole problem.

What you can do is give it a target it cannot argue with.

A failing test is that target.

The test is written by you, or written first and reviewed by you, in the language of “what should be true.” Now the agent is not free-associating toward something that looks like code. It is working against a verifier. Red, then green. The bar is not “looks right.” The bar is “passes.”

NOTE

This flips the loop. Without tests, you review the AI’s output by reading it and hoping. With tests, the AI reviews its own output by running it, and only brings you work that already clears the bar. You move from inspecting code to inspecting intent.

Without tests: review by reading.

You read every line and hope. Plausible code passes. So do its bugs.

With tests: review by running.

The suite clears the bar first. You inspect intent, not every line.

So the “TDD slows me down” argument has it inside out.

Tests do not slow the AI down. They are the reason you can let it run fast at all.

The test is the leash and the goal at the same time. Without one, you are not going faster. You are just generating unverified code more quickly, and calling the part where it breaks in production “later.”

TDD pays for itself with humans. With agents, it is closer to required.

How to use AI to refactor passing tests

This is my favorite move, and it is the cleanest payoff of the whole idea.

Get the tests green.

A green test suite is a contract. It says: this is what the system does, and here is proof. As long as those tests stay green, behavior has not changed. That is the definition of refactoring. Change the shape, keep the behavior.

Which means a green suite is exactly the safety net an agent needs to be let loose.

The workflow is simple.

  1. Write the tests. Make them pass, even with ugly code.
  2. Hand the agent the ugly code and the green suite.
  3. Ask it to refactor: simplify, rename, split, deduplicate. Anything it wants.
  4. The rule is the tests stay green. If a test goes red, the refactor is wrong, not the test.

```javascript
function total(items) {
  return items
    .filter((item) => item.active)
    .reduce((sum, item) => sum + item.price * item.qty, 0);
}
```

That is where it ends up. Where it starts: index math, null paranoia, a temp for every step.

Now the thing you feared about AI becomes the thing you want from it.

Aggressive, large-scale rewrites are dangerous by hand because you cannot hold the whole system in your head. The agent cannot either. But the test suite can. It does not need to understand the code. It only needs to notice when the behavior moved.

So you point the model at a thousand lines of mess and say: make this better, do not change what it does. The tests tell you, in seconds, whether it kept its promise.

That is the loop XP was reaching for in 1999, with a faster pair partner than Beck could have imagined.

Tests pin the behavior. The agent reshapes the code. The curve stays flat.

The old book was about now

Extreme Programming reads like a period piece. The vocabulary is dated. Some of the rituals feel quaint.

But strip the rituals and what is left is one idea.

Keep change cheap, and you can keep deciding late, keep responding, keep shipping without fear.

That idea did not need an update.

It needed a faster teammate to make it urgent.

We got one.

So embrace change. Just keep paying the small, boring price that keeps change cheap. Tests first. Thin slices. Names that mean something. A green suite you trust.

The book told us how to do that more than twenty-five years ago.

We just finally have a reason to listen.

Footnotes

  1. Kent Beck, Extreme Programming Explained: Embrace Change.

  2. The exponential cost-of-change curve is usually traced to Barry Boehm’s Software Engineering Economics (1981). XP’s whole argument is a challenge to it.