Agent design on core-agent

System instructions

Mon, 01 Jan 0001 00:00:00 +0000

AGENTS.md is the most-impactful customization surface in core-agent. The model sees it on every turn. Get it right and the agent behaves consistently across operators and across sessions; get it wrong and the agent acts unpredictably regardless of what skills or tools you wire up.

This page covers the patterns that work, the failure modes to watch for, and how to iterate on an AGENTS.md instead of writing it once and hoping.

Skills

Mon, 01 Jan 0001 00:00:00 +0000

Skills are named, reusable procedures the agent invokes when the task matches the skill’s description. This page covers the design patterns: when to write a skill vs. an AGENTS.md rule, how to write a description that actually triggers, what belongs in the body vs. a references/ file, and how to test that a skill fires when you expect it to.

For the schema and discovery details (file locations, YAML frontmatter, allow/deny lists), see Reference → Skills. This page is the prescriptive companion.

Subagents and wrappers

Mon, 01 Jan 0001 00:00:00 +0000

Two ways to push work off the parent agent:

Agentic tool wrappers (agentic_read_file, agentic_grep, agentic_research, agentic_fetch_url) — synchronous, bounded, single-purpose. The parent calls them like any other tool; under the hood they spawn a focused subtask on a (typically cheaper) model and return only the digest. Raw tool output never enters the parent’s context.
Background subagents (spawn_agent, list_agents, check_agent, stop_agent) — asynchronous, longer-running, multi-turn. The parent dispatches a goal; the subagent works in its own session until done; alerts and completion summaries land back in the parent’s chat.

This page covers when to use each, how to actually get the model to use them (the model-side adoption story is non-trivial), and the failure modes worth designing around.

Cost efficiency

Mon, 01 Jan 0001 00:00:00 +0000

What actually moves the cost needle on core-agent sessions, in rough order of impact. Built-in price tracking and per-model breakdowns make the tradeoffs measurable rather than guesswork; this page covers how to read those signals and the patterns that consistently reduce cost without sacrificing capability.

For the mechanism details see Context management and Configuration → pricing.

The cost hierarchy

A rough sense of where dollars go on a typical coding session:

Driver	Typical share of session cost	Lever
Input tokens (cumulative across turns)	70-90%	Compaction, checkpoints, agentic wrappers
Output tokens	10-25%	Tighter `AGENTS.md` (“be concise”), structured output formats
Per-turn model rate	Multiplier on both	Model selection (Pro vs Flash for subtasks)
Subtask + compaction LLM calls	2-10%	Bounded by infrastructure, not operator-tunable

The input-token share is the killer. Every turn re-sends the entire conversation history to the model. By turn 30, you’re paying input-token prices for everything that came before, plus the new turn. The single biggest lever on cost is preventing that input-token total from growing without bound — which is exactly what compaction, checkpoints, and agentic wrappers do.