esteban@devtrillo:~/blog/writing$
← cd -
$ cat lets-build-codebases-that-ai-loves.md

let's build codebases that AI loves

Jun 30, 2026·6 min read
$ tail -f lets-build-codebases-that-ai-loves/readers
connecting…

I have been rereading A Philosophy of Software Design.1

One idea keeps following me into every AI coding session:

The best modules are deep.

That sounds backwards if you judge code by file size.

A deep module has a small interface and a lot of behavior behind it. A shallow module has a big interface and not much behind it. The deep one gives you more than it asks you to understand. The shallow one makes you learn its details anyway.

This mattered before AI.

It matters more now.

Agents cannot hold a codebase in their head. They do not have one. They have a context window, a grep command, and whatever clues we left in the code.

If you want a codebase that AI loves, write fewer shallow modules.

A shallow module makes you pay twice

Here is a call, on its own, the way you meet it while scanning a file:

formatDate(post.pubDate, "", false, true);

Tell me what it does.

Not the date. The rest. What is the empty string for? What does false turn off? What does true turn on?

You cannot.

I could not either, and I wrote it. I had to open the function to find out:

function formatDate(
date: Date,
locale: string,
includeTime: boolean,
useUtc: boolean,
) {
// 12 lines of formatting
}

Tap to decode each argument.

Three policy questions. None of them answered at the call site. All of them handed back to me, every time I call the function.

That is the tax a shallow module charges. It looks cheap inside its file. It is expensive everywhere else.

NOTE

An agent does not read the way you just did. It can chase the definition down, but every hop costs context it would rather spend on your task. So it guesses what "", false, true meant. When the guess is wrong, it edits the wrong call site with full confidence.

A deep module hides the boring choices

The fix is not clever. It is just honest:

function formatPostDate(date: Date) {
return new Intl.DateTimeFormat("en", {
year: "numeric",
month: "long",
day: "numeric",
timeZone: "UTC",
}).format(date);
}

Now the call says what it means:

formatPostDate(post.pubDate);

The date policy lives in one place. The call site reads like the product. The next person never asks whether blog dates include the time, because the module already answered.

That is depth.

Not more code. More decision behind each line of interface.

NOTE

A deep module is not a big module. It is a module where the caller gets to know less.

Deep modules are better prompts

Every public function is a prompt to whoever calls it next.

That might be you in two months. A teammate who never opened the file. An agent editing the repo at 1 a.m. because you asked it to add RSS while you made coffee.

A shallow interface is a bad prompt. Too much to infer. What does true switch on? What does false skip?

A deep interface answers before you ask. Tap to deepen it — watch what stays and what vanishes:

The shape of the code carries the intent. The agent guesses less. Grep gets easier. Review gets easier.

This is the part of AI coding people keep skipping. They obsess over the prompt in the chat box. Then they hand the model a codebase full of vague names, boolean flags, and helpers that barely help.

Your codebase is part of the prompt.

The largest part. And the part you can edit ahead of time.

Shallow modules create context debt

There is a cost under all of this. It deserves a name.

You read twelve files before you can safely touch one line. A helper exposes five knobs because one caller might want them someday. The real rule lives in three call sites instead of inside the module.

That is context debt. The amount you have to understand before you are allowed to change anything.

Humans feel it as drag.

Agents fail from it.

The agent misses the rule no one wrote down. It updates four call sites and leaves the fifth. It bolts another option onto the shallow helper because the old one almost fit. Now the next task starts from a worse interface.

This is how AI makes a messy codebase messier.

It does not invent the mess. It scales the mess that is already there.

The test I use now

When I add a module, or review one, I ask one question:

Does this let the caller know less?

If the answer is yes, the module is getting deeper.

A good module hides a policy:

getPublishedPosts();

A weak module exports the ingredients and makes every caller rebuild it:

getPosts().filter((post) => post.published && post.pubDate <= new Date());

A good module gives the domain a name:

subscribeToLaunchEmails(email);

A weak module makes the caller speak database:

insertSubscriber(email, "launch", false);

A good module removes a choice most callers should not have.

A weak module adds a parameter, because that was the fastest way to avoid naming the real thing.

What this looks like in a big codebase

In a small repo, a shallow module is a paper cut.

In a big one, it is the whole disease.

A formatDate with three mystery flags is annoying in a side project. In a codebase with two thousand files, it is called in ninety places. Each call leaks the same three policy questions. Each call is a place the next change can go wrong.

Multiply that by every helper someone wrote in a hurry.

That is the real shape of a large system. Not one big mess. Thousands of small ones, each cheap to make and expensive to live with.

The picture is the whole argument. A wall of small boxes is more surface, more edges, more places to be wrong. Consolidate them and the caller has three names to learn instead of ninety boxes to track.

A human survives this with tribal knowledge. You learn that true means UTC. You learn which call sites lie. You learn who to ask.

An agent has none of that.

It does not know the meeting where you agreed dates are always UTC. It cannot ask the person who wrote the flag. It sees ninety call sites and a context window that fits maybe ten.

So it samples. It guesses. It edits the calls it can see and trusts the pattern for the rest.

In a small codebase that guess is usually fine.

In a big one it is how a one-line change becomes a three-day incident.

The fix does not scale by adding rules. It scales by removing them. Every shallow module you make deep is ninety call sites that stop asking questions. Every policy you name once is a thing the agent no longer has to infer across a system it cannot hold.

Depth is not a nicety in a big codebase.

It is the only thing that keeps the codebase changeable at all.

Write for the smallest context window

Ousterhout was writing for humans. The advice still lands.

A deep module is one fewer thing a reader has to keep in memory. That was already good software design. Now the reader may have a literal context window, and every leaked detail competes with the task you wanted done.

So do the boring work.

Name the thing the product actually means. Keep the rule in one place. Delete the flags that make callers remember policy. Hide the details most callers should never see.

The prize is not prettier code.

The prize is a codebase a human can change without loading the whole system into their head. And one an agent can change without guessing what "", false, true was supposed to mean.

That is the codebase I want more of.

Footnotes

  1. John Ousterhout, A Philosophy of Software Design.