The 30 Principles for Agentic Engineering — Part 3: The Harness
Part 3 of the 30-principle reference. The kernel sets the spine. The lifecycle moves work through it. This part is the actual configuration of the harness underneath both — how CLAUDE.md, hooks, skills, subagents, and plugins should be set up so the previous fourteen principles stay cheap to run.
Six principles. All operational. Spend a quiet afternoon on these once and you'll save weeks downstream.
Principle 15 — CLAUDE.md under 200 lines, always
Statement. The root CLAUDE.md must stay under 200 lines and stay stable. Anything deep, topic-specific, or path-scoped belongs in rules/<topic>.md or a skill.
Why it matters. A 2,000-line CLAUDE.md is the most common form of harness pollution — it consumes 30–40% of every context window with content the agent doesn't need for most tasks. Boris Cherny's published guidance is unambiguous on the size cap; Jose Parreo Garcia's "You probably don't understand Claude Code memory" walks through the mechanism. The root CLAUDE.md is an index, not an encyclopedia.
Tomorrow morning.
- Count your CLAUDE.md lines. If it's over 200, move topics to rules/*.md.
- Treat the root file as a table of contents pointing at depth.
- Quarterly: audit for stale content, move to rules or delete.
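The line count in the first step is easy to automate. Below is a minimal sketch, assuming the root file sits at the repository root as CLAUDE.md; the helper names (count_lines, check_claude_md) are hypothetical and not part of any tool.

```python
from __future__ import annotations

from pathlib import Path

MAX_LINES = 200  # the cap stated in Principle 15


def count_lines(text: str) -> int:
    """Return the number of lines in a file's text."""
    return len(text.splitlines())


def check_claude_md(path: Path, limit: int = MAX_LINES) -> tuple[int, bool]:
    """Return (line_count, within_limit) for the file at `path`."""
    n = count_lines(path.read_text(encoding="utf-8"))
    return n, n < limit  # "under 200" means strictly fewer than the limit


if __name__ == "__main__":
    target = Path("CLAUDE.md")  # assumed location: repository root
    if target.exists():
        n, ok = check_claude_md(target)
        verdict = "ok" if ok else "over budget, move topics to rules/"
        print(f"{target}: {n} lines ({verdict})")
    else:
        print(f"{target} not found")
```

Running a check like this in CI is one way to keep the cap from eroding between quarterly audits.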
Principle 16 — Hooks for invariants that have caused real incidents
Statement. Use hooks for rules that must fire — secret scanning, dangerous-bash blocks, formatting. Reserve hooks for rules that have caused incidents. Don't over-hook.
Why it matters. Hooks are deterministic: a hook that exits with code 2 blocks the agent's action. That property is valuable exactly where determinism matters, on rules with a measurable failure history. Hooks for soft preferences are the hook-hammer anti-pattern: they bloat the harness and slow every turn for nothing. Keep hooks scarce and meaningful.
Tomorrow morning.
- List incidents your team has had: forgotten tests, leaked secret, wrong package manager.
- Add one hook per incident class: pre-commit-secret-scan, block-dangerous-bash, block-wrong-pm.
- Add allowManagedHooksOnly: true in managed settings so developers can't disable them.
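To make the exit-2 behaviour concrete, here is a rough sketch of a block-dangerous-bash style hook. It assumes the hook receives a JSON payload on stdin with a tool_name field and a tool_input.command field; check that shape against the hooks documentation for your Claude Code version. The rule list and the pnpm assumption are hypothetical placeholders for your own incident history.

```python
import json
import re
import sys
from typing import Optional

# Hypothetical rules: (regex, reason). Derive yours from real incidents.
RULES = [
    (r"\brm\s+-(rf|fr)\b", "recursive force delete"),
    (r"\bgit\s+push\b.*--force\b", "force push"),
    (r"^\s*npm\s+(install|i)\b", "wrong package manager (repo assumed to use pnpm)"),
]


def block_reason(payload: dict) -> Optional[str]:
    """Return a reason if this tool call should be blocked, else None."""
    if payload.get("tool_name") != "Bash":
        return None  # only shell commands are inspected
    command = payload.get("tool_input", {}).get("command", "")
    for pattern, reason in RULES:
        if re.search(pattern, command):
            return reason
    return None


def main(stream=sys.stdin) -> int:
    """Read one hook payload and return the process exit code."""
    reason = block_reason(json.load(stream))
    if reason:
        print(f"Blocked: {reason}", file=sys.stderr)
        return 2  # exit code 2 is the blocking signal described above
    return 0


# When installed as a hook script, finish the file with:
#     sys.exit(main())
```

Keeping the decision in a pure function (block_reason) means each rule can be unit-tested against the incident that motivated it.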
Principle 17 — Skills auto-invoke based on description matching
Statement. A skill's description: field is its activation phrase. Vague descriptions mean the skill won't trigger. Broad descriptions mean the wrong skill triggers. Write the description precisely.
Why it matters. "Skill bit-rot" is the most common reason teams ship a marketplace of skills nobody ever uses. The skill is technically correct; the description doesn't match how anyone phrases the prompt. Writing the description is half the work of writing the skill. The right test is to write the user phrase first and check whether your description matches it.
Tomorrow morning.
- Audit .claude/skills/*/SKILL.md frontmatter.
- For each skill, ask: "What user phrase would cause this to fire?" Update the description.
- Test by typing that phrase as a prompt and watching whether the skill triggers.
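The audit step can be partly scripted. The sketch below assumes each SKILL.md opens with `---` delimited frontmatter containing a single-line description: field. The 8-word threshold and the keyword_overlap helper are hypothetical; the overlap score is only a rough proxy, since the real trigger decision is made by the model, not by keyword matching.

```python
from __future__ import annotations

import re
from pathlib import Path

MIN_WORDS = 8  # hypothetical threshold for "too vague to trigger"


def read_description(skill_md: str) -> str | None:
    """Extract a single-line `description:` value from leading frontmatter."""
    match = re.match(r"^---\s*\n(.*?)\n---\s*(?:\n|$)", skill_md, re.DOTALL)
    if not match:
        return None
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "description":
            return value.strip().strip("'\"")
    return None


def keyword_overlap(description: str, phrase: str) -> float:
    """Fraction of the phrase's words (3+ letters) found in the description."""
    def words(s: str) -> set[str]:
        return set(re.findall(r"[a-z]{3,}", s.lower()))

    wanted = words(phrase)
    if not wanted:
        return 0.0
    return len(wanted & words(description)) / len(wanted)


def audit(root: Path) -> list[tuple[Path, str]]:
    """List skills under `root` whose description is missing or very short."""
    findings = []
    for skill_file in sorted(root.glob("*/SKILL.md")):
        desc = read_description(skill_file.read_text(encoding="utf-8"))
        if desc is None:
            findings.append((skill_file, "no description in frontmatter"))
        elif len(desc.split()) < MIN_WORDS:
            findings.append((skill_file, f"description is only {len(desc.split())} words"))
    return findings
```

A script like this narrows the list; typing the phrase as a prompt and watching whether the skill fires remains the real test.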
Principle 18 — Subagents have isolated context; never let them recurse
Statement. Subagents (.claude/agents/) run in their own context window. They cannot spawn further subagents. Use this property: delegate noise to subagents, keep the main thread clean.
Why it matters. Subagent isolation is the harness feature that lets you do expensive exploration without polluting the main loop. Anthropic's multi-agent Research feature uses this extensively — the per-subagent token cap is the load-bearing constraint that makes the topology affordable. The "no recursion" rule is the design that prevents cost explosions.
Tomorrow morning.
- Build one subagent for your noisiest task type — exploration, security review, audit.
- Use it on a task that would normally fill the main context with Read and Grep output. Watch the main context stay clean.
- Don't let a subagent invoke another subagent. Claude Code enforces this; honour it.
Principle 19 — Pin everything
Statement. Claude Code CLI version, model name, skill SHAs, MCP server versions. Pin all of it. Floating versions equal silent behavioural changes.
Why it matters. The "it worked last week" failure mode is the most expensive form of unmanaged drift. A model upgrade silently changes behaviour, you can't reproduce the regression, and you can't even tell when it happened. Goldman Sachs and other named deployments run pinned for a reason. Pinning costs a one-line config; not pinning costs investigation time you'll wish you had spent on the configuration.
Tomorrow morning.
- Set minimumVersion in managed settings (Claude Code CLI pin).
- Set ANTHROPIC_DEFAULT_SONNET_MODEL to a specific version, not sonnet.
- Pin skills, MCP servers, and plugins to SHA or version tag in the marketplace.
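A pin check is simple enough to run in CI. The sketch below is illustrative only: the alias list, the assumption that a pinned model name ends in an 8-digit date stamp, and the model name example-model-20250101 are all hypothetical, so adjust them to the identifiers your provider actually publishes.

```python
from __future__ import annotations

import re

# Assumed floating aliases; extend to match your environment.
FLOATING_ALIASES = {"sonnet", "opus", "haiku", "default", "latest"}


def is_pinned_model(value: str) -> bool:
    """Heuristic: a pinned model name ends in an 8-digit date stamp."""
    v = value.strip().lower()
    if not v or v in FLOATING_ALIASES or v.endswith("-latest"):
        return False
    return bool(re.search(r"-\d{8}$", v))


def is_pinned_ref(ref: str) -> bool:
    """True for a full 40-character commit SHA or a version tag like v1.0.0."""
    return bool(
        re.fullmatch(r"[0-9a-f]{40}", ref)
        or re.fullmatch(r"v?\d+\.\d+\.\d+", ref)
    )


def floating_entries(model: str, refs: dict[str, str]) -> list[str]:
    """Return a description of every entry that is not pinned."""
    problems = []
    if not is_pinned_model(model):
        problems.append(f"model={model!r}")
    problems += [
        f"{name}={ref!r}"
        for name, ref in sorted(refs.items())
        if not is_pinned_ref(ref)
    ]
    return problems
```

For example, floating_entries("sonnet", {"skill-a": "main", "skill-b": "v1.0.0"}) reports the model alias and the branch reference, and leaves the tagged skill alone.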
Principle 20 — Stage 5 (Distribution) is the team multiplier
Statement. The five-stage maturity model culminates in Stage 5: Distribution via a plugin marketplace. This is where teams stop reinventing and start inheriting.
Why it matters. Stage 4 produces individual mastery. Stage 5 produces fleet capability. Spotify Honk's 650+ PRs/month, Box's 85% daily adoption, and the Anthropic-Accenture 30,000-developer rollout are all Stage 5 deployments — the maturity-model post names the trajectory. The Stage 4 plateau — a team that masters subagents but never packages — is the named anti-pattern. Each new repo, each new team starts from zero. Distribution fixes that.
Tomorrow morning.
- Identify your three most-used skills, agents, or hooks.
- Sanitise them — remove personal references, hard-coded paths.
- Push them to a private marketplace repo.
- Tag v1.0.0. Add it to extraKnownMarketplaces for one other team.
The harness in one line
CLAUDE.md is an index, hooks are for incidents, skills live by their description, subagents isolate noise, pin everything, and Stage 5 is where one team's discipline starts benefiting the next team.
Part 4 covers principles 21–25: the governance and safety layer that keeps the harness defensible.
Series Navigation — The 30 Principles for Agentic Engineering
- Part 1: The Kernel
- Part 2: The Lifecycle
- Part 3: The Harness (you are here)
- Part 4: Governance and Safety
- Part 5: Calibration and Reality