Your Agent Doesn't Need a Smarter Model
A smart model with bad context guesses. A cheap model with great context delivers. That's the whole argument — and it's why I now spend more time writing CLAUDE.md files and organizing context folders than I do picking which tier to dispatch a task to.
Part 1 was about matching the right model to the task. This one is about the thing that determines whether any model produces useful output: what it knows before you ask it to do anything.
CLAUDE.md Is the Cheapest Leverage You Have
The CLAUDE.md file sits in your project root and gets loaded automatically at the start of every Claude Code conversation. If you're not maintaining it, you're paying a hidden tax on every single prompt — either in wasted tokens re-explaining things, or in outputs that don't match your project's conventions.
A focused CLAUDE.md should include your stack versions (Rails 8.1, PostgreSQL 17, Ruby 3.4.9), your coding conventions, your testing approach, and the patterns your team follows. It shouldn't include your life story, a full project history, or documentation that belongs in a wiki. Every line in that file gets sent as context with every prompt. Make each one count.
Before I cleaned mine up, I'd spend the first two or three messages of every session correcting the model — wrong Rails version, outdated syntax, test patterns that didn't match my setup. After tightening it, the first response is usually already in the right ballpark. Fewer tokens spent on corrections, fewer round trips to get to useful output.
One warning worth stating plainly: don't put secrets, private keys, unannounced roadmap items, or anything you wouldn't paste into a public Gist inside CLAUDE.md. It gets loaded into every conversation, which means every sub-agent, every skill, and every /compact summary the platform ever generates. If your project needs credentials for local development, reference the file they live in; don't copy the values.
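One way to enforce that rule is a quick scan before committing the file. The sketch below is a heuristic, not a real secret scanner: the pattern list and the function name `find_suspect_lines` are assumptions made for illustration, and it will miss anything that doesn't match these three shapes.

```python
import re

# Hypothetical patterns; tune them to the credentials your project actually uses.
SECRET_PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "inline credential": re.compile(
        r"(?i)\b(password|secret|token|api[_-]?key)\b\s*[:=]\s*\S{8,}"
    ),
}


def find_suspect_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that looks like a secret."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits
```

Run it against the contents of CLAUDE.md (for example, `find_suspect_lines(Path("CLAUDE.md").read_text())`) and treat any hit as a prompt to move the value into a file the model never sees.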
What Sub-Agents Inherit (and What They Don't)
When your main agent spawns a sub-agent — say, to review a specific file or run a focused check — that sub-agent doesn't inherit the parent's full context window. It starts fresh. The main thing it gets automatically is your CLAUDE.md.
That means the orchestrator (your main agent) is responsible for passing the right information to each sub-agent. If the orchestrator doesn't explicitly share the relevant context — the diff, the file list, the conventions — the sub-agent is working blind, burning tokens trying to rediscover things the parent already knew. CLAUDE.md is the one piece of context every sub-agent gets for free; everything else has to be deliberately assembled and passed along.
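To make "deliberately assembled and passed along" concrete, here is a minimal sketch of a brief the orchestrator could build for each sub-agent. The `SubAgentBrief` class and its section labels are hypothetical, not part of Claude Code; the point is only that the diff, the file list, and the conventions travel explicitly in the prompt.

```python
from dataclasses import dataclass, field


@dataclass
class SubAgentBrief:
    """Everything the orchestrator hands to one sub-agent (hypothetical structure)."""
    task: str
    diff: str = ""
    files: list[str] = field(default_factory=list)
    conventions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Build the prompt text; empty sections are omitted."""
        sections = [f"Task:\n{self.task}"]
        if self.files:
            sections.append("Changed files:\n" + "\n".join(f"- {f}" for f in self.files))
        if self.conventions:
            sections.append("Conventions:\n" + "\n".join(f"- {c}" for c in self.conventions))
        if self.diff:
            sections.append("Diff:\n" + self.diff)
        return "\n\n".join(sections)
```

Leaving out empty sections keeps the brief short, which matters when every sub-agent starts from a fresh context window.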
The Right Context for the Right Agent
When I started working with multiple agents — like the PR review skill from Part 1 — I realized each one needs its own slice of context. Not the whole project. Not every file. Just the specific information relevant to what it's doing.
The first layer is a shared context: the basic knowledge any agent needs about the current task. For a PR review, that's the diff, the changed file list, the ticket information from GitHub. This gets fetched once into a shared directory so agents aren't each making redundant API calls. On a mid-sized PR with six reviewers, fetching context once instead of six times saves roughly five gh round-trips and somewhere around 20,000 tokens of duplicated diff payload.
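The fetch-once idea can be sketched as a small cache in front of the GitHub CLI. This assumes `gh` is installed and authenticated; the file names `diff.patch` and `pr.json` and the function `fetch_shared_context` are illustrative choices, not something the PR review skill prescribes.

```python
import subprocess
from pathlib import Path
from typing import Callable


def run_gh(args: list[str]) -> str:
    """Run a GitHub CLI command and return its stdout."""
    result = subprocess.run(["gh", *args], capture_output=True, text=True, check=True)
    return result.stdout


def fetch_shared_context(
    pr_number: int,
    shared_dir: Path,
    run: Callable[[list[str]], str] = run_gh,
) -> dict[str, Path]:
    """Fetch the PR diff and metadata once; later calls reuse the cached files."""
    shared_dir.mkdir(parents=True, exist_ok=True)
    commands = {
        "diff.patch": ["pr", "diff", str(pr_number)],
        "pr.json": ["pr", "view", str(pr_number), "--json", "title,body,files"],
    }
    paths = {}
    for filename, args in commands.items():
        path = shared_dir / filename
        if not path.exists():  # only the first caller pays for the round trip
            path.write_text(run(args))
        paths[filename] = path
    return paths
```

Each of the six reviewers can call the same function and read from the returned paths; only the first call reaches GitHub. As a sanity check on the numbers above, five avoided copies of a diff of around 4,000 tokens comes to about 20,000 tokens (the 4,000 is inferred from the totals, not measured).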
The second layer is where it gets harder: per-agent context. Each agent gets a knowledge base tailored to its area. The backend agent gets context about the application's architecture, service patterns, and API conventions. The frontend agent gets component structure, state management patterns, and styling rules. The testing agent gets factory patterns, shared examples, RSpec configuration, and the project's coverage expectations. The database agent gets migration conventions — whether the project uses a specific directory structure, whether there are separate files for complex queries, how multi-step migrations are handled.
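A per-agent split like this can be written down as a plain mapping from agent to context files. The folder layout and file names below are hypothetical stand-ins for the four agents just described; your own buckets will differ.

```python
from pathlib import Path

# Hypothetical layout: one folder of notes per agent under a context root.
AGENT_CONTEXT = {
    "backend": ["architecture.md", "service_patterns.md", "api_conventions.md"],
    "frontend": ["components.md", "state_management.md", "styling.md"],
    "testing": ["factories.md", "shared_examples.md", "rspec_config.md", "coverage.md"],
    "database": ["migration_conventions.md", "complex_queries.md"],
}


def load_agent_context(agent: str, root: Path) -> str:
    """Concatenate the context files for one agent, skipping any that are missing."""
    parts = []
    for name in AGENT_CONTEXT[agent]:
        path = root / agent / name
        if path.exists():
            parts.append(f"## {name}\n{path.read_text().strip()}")
    return "\n\n".join(parts)
```

Keeping the mapping in one place makes the human decision visible: anyone can read which knowledge belongs to which agent and challenge it.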
Building this per-agent context isn't something you can fully hand off to even a capable model. It's not that it's impossible to automate — it's that classifying and identifying which piece of project knowledge belongs to which agent requires judgment. A human needs to look at the codebase, understand how the team works, and split the context into the right buckets. The model can help gather the information, but the structure is a human decision.
Once you've drawn those boundaries, there are tools to keep each bucket filled with up-to-date information. One that works well for me is Context7. It gives each agent access to current documentation for the frameworks the project uses. The agents don't just know my project's conventions — they also know the base structure of Rails 8.1, what patterns it encourages, and what the framework provides out of the box. That combination of project-specific context plus framework knowledge is what makes the suggestions relevant instead of generic.
A fair warning on Context7: it only has docs for libraries it has indexed. Niche gems, private SDKs, and pre-release versions won't be there, and the agent will confidently fill the gap with plausible-sounding nonsense if you don't tell it to stop. When I use Context7-backed agents on a less-popular gem, I always double-check the generated code against the gem's actual README.
Organize Agents by Area, Not by Language
My first attempt at organizing agents was by language: one for Ruby, one for JavaScript, one for CSS. It felt logical.
It didn't work. Too much overlap. The Ruby agent and the JavaScript agent both needed rules about naming conventions, error handling, and test coverage. The same steps showed up in multiple agent configurations. I was duplicating rules and maintaining three agents that were 60% identical.
When I switched to organizing by area — one agent for code review, one for testing, one for database architecture, one for security — the duplication dropped immediately. Each agent's context stayed focused on its responsibility, not on a language boundary. A code review agent needs to know about readability, naming, and design patterns regardless of whether the file is .rb or .tsx.
The bonus: an area-based agent can serve double duty. The same backend architecture agent that reviews a PR can also give its opinion on a proposed new feature or suggest design patterns for a new service. It's not locked into "review mode" — it has broad enough context about the area to be useful across different types of questions.
The gotcha: area boundaries drift. A code-review agent slowly accumulates opinions about testing, security, and naming conventions because reviews touch all of them. Six months in, it's a grab bag with no clear expertise. Audit your agent catalog every quarter or so — pull up each agent's config, ask yourself whether it's still about one thing, and carve out anything that's drifted into a sibling's territory. Drift is the tax you pay for the clarity of area-based organization, and it's still cheaper than maintaining three language-agents that were 60% identical anyway.
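Part of that quarterly audit can be mechanical. The sketch below treats every non-empty line of an agent config as one rule and flags pairs of agents that share too many of them. It is a crude proxy: exact-line matching misses reworded rules, and the 0.3 threshold is an arbitrary assumption.

```python
from itertools import combinations


def rule_set(config_text: str) -> set[str]:
    """Treat each non-empty line as one rule, normalised for comparison."""
    return {line.strip().lower() for line in config_text.splitlines() if line.strip()}


def overlap(a: str, b: str) -> float:
    """Jaccard similarity of two configs: shared rules / all distinct rules."""
    rules_a, rules_b = rule_set(a), rule_set(b)
    union = rules_a | rules_b
    return len(rules_a & rules_b) / len(union) if union else 0.0


def drifted_pairs(configs: dict[str, str], threshold: float = 0.3) -> list[tuple[str, str, float]]:
    """Return agent pairs whose configs overlap at or above the threshold."""
    flagged = []
    for (name_a, text_a), (name_b, text_b) in combinations(sorted(configs.items()), 2):
        score = overlap(text_a, text_b)
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return flagged
```

A flagged pair is a starting point for the human review, not a verdict. Some shared rules are legitimate, and those may belong in CLAUDE.md instead of in either agent.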
A Colleague's Experiment: Nightly PR Reviews
One idea that stuck with me came from a colleague. He set up a scheduled task that reviews the repository's pull requests every night. The model reads through the day's PRs, builds context from them, and surfaces issues or patterns it notices.
The concept is solid, and it could be a great way to incrementally build up the per-agent context files I described above — every night, the model learns a little more about how the project works.
Where he hit a wall was the structure and assignment layer — which agent gets which context, what each one is responsible for, and how the output gets organized. Without that, the reviews were unfocused.
A second problem is token consumption. Not every PR has a well-written description. If the model is reviewing PRs with vague titles and empty descriptions, it's either burning tokens trying to infer what happened from the diff alone, or worse, building context from assumptions that aren't accurate. That's how you end up with a context file that actively misleads future agents.
It's an experiment I want to run, but with guardrails: token budgets per review, a minimum description quality before the model engages, and manual review of whatever context it generates. Automation without oversight in this area could cost more than it saves.
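The first two guardrails are easy to sketch as a gate that runs before the model sees a PR. Everything here is an assumption made for illustration: the thresholds, the `PullRequest` shape, and the four-characters-per-token estimate, which is a rough rule of thumb and not a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    title: str
    description: str
    diff: str


# Hypothetical thresholds; pick values that fit your budget and your team's PR habits.
MIN_DESCRIPTION_WORDS = 20
MAX_REVIEW_TOKENS = 8_000


def estimate_tokens(text: str) -> int:
    """Very rough estimate (about four characters per token); not a real tokenizer."""
    return len(text) // 4


def review_decision(pr: PullRequest) -> str:
    """Return 'review' or a 'skip: ...' reason, so nothing runs without passing the gate."""
    if len(pr.description.split()) < MIN_DESCRIPTION_WORDS:
        return "skip: description too thin"
    if estimate_tokens(pr.diff) > MAX_REVIEW_TOKENS:
        return "skip: over token budget"
    return "review"
```

The third guardrail stays manual by design: whatever context the nightly run produces would go to a staging file that a person reads before it reaches any agent.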
The Human Isn't Going Anywhere
Everything I've described here (context files, agent knowledge bases, area-based organization) is a simplified version of what tools like OpenClaw are doing at full scale: routing tasks to specialized agents with tailored context. My version is not fully automated, and it still needs a human steering it.
On a big project at a big company, the human isn't coming out of the loop. The models are good at the mechanical parts: reading diffs, checking conventions, surfacing patterns. The judgment calls — what's safe to ship, what needs a second look, whether a refactor is worth the risk right now — those still need a person.
Here's a concrete example of why context matters. Take an application that uses multi-tenancy with a specific migration structure — one directory for schema changes, another for data backfills, with tenant-scoped models that follow a particular setup pattern. Without context, if you ask an agent "I need a new model related to X, and we need to update the data," it has to figure out the migration structure, the tenancy setup, and the model conventions from scratch. It burns tokens searching, guessing, and often getting it wrong on the first pass.
With the right context — the agent already knows that the existing model is tenant-scoped, that migrations need to be split into schema and data steps, and where each one goes — it suggests the correct structure on the first try. I tested this side by side. Without context, the agent made three attempts before landing on something usable. With context, it was right on the first pass. Roughly 4x fewer tokens, and no back-and-forth.
As a software developer, I'm no longer writing huge amounts of code by hand. But I'm not just using AI to write it for me either. I'm building the system that enables AI to build software well — the context, the structure, the agent organization. That's the real work now. Not throwing more tokens at better models, but giving the right model the right information at the right time.
What's Next
Part 3 is the hands-on one. I'll open up my actual .claude/ folder, walk through the agent configs and skills I reach for every day, and show the ones I've already deleted. No more theory — just my real setup, file by file. Coming soon.