Agentic Architecture & Orchestration
Heaviest domain — roughly 27% of the exam.
Pick the simplest tier that works
Everything goes through one endpoint, but architectures sit on a ladder. The exam loves questions that test whether you reach for the right rung:
- Single call — classification, summarization, extraction, Q&A. One request, one response. If the task is fully specifiable in advance, stop here.
- Workflow — multi-step pipelines where your code controls the sequence and the model fills in steps (often with tool use).
- Agent — the model decides its own trajectory in a loop with tools. Only justified when the task is hard to specify in advance, the outcome is valuable enough to pay for, the model is actually capable at it, and errors can be caught and recovered (tests, review, rollback). Fail any one criterion → drop down a tier.
The agentic loop and stop reasons
The loop is: send request → inspect stop_reason → act → repeat.
tool_use— execute the requested tool(s), return results, continue.end_turn— Claude is done; exit the loop.-
pause_turn— a server-side tool loop hit its iteration limit. Append the assistant content and re-send; the server resumes. Do not inject a "continue" user message. max_tokens— your output cap truncated the response; raise it or stream.refusal— surface to the user; don't blind-retry the same prompt.
When Claude requests several tool calls in one response, execute them all and return all
tool_result blocks in a single user message, each carrying its
tool_use_id. Failures go back as is_error: true with a useful
message so the model can adapt. Set a max-iteration safety cap, but the natural exit is
end_turn.
Tool surface design (the bash-vs-dedicated-tool question)
A bash tool gives breadth but hands your harness an opaque string. Promote an action to a dedicated tool when you need to gate it (confirmation before hard-to-reverse actions like sending email or deleting data), audit it with typed arguments, render it specially, or parallelize it safely. Rule of thumb: start with bash for breadth, promote when the harness needs a hook.
Subagents and orchestration
- The core benefit of subagents is context isolation: exploratory noise stays in the subagent's window; only conclusions return to the orchestrator.
- Caches are model-scoped — switching the main loop's model mid-session destroys its cache. Want a cheaper model for grunt work? Give it to a subagent and keep the main loop pinned.
- Delegate when work fans out across independent items; work directly for single-file reads and sequential steps.
Managing long-running context
- Context editing — prunes stale tool results/thinking within a session.
- Compaction — server-side summarization when the conversation nears the
context limit. Critical detail: append the full
response.contentback each turn — the compaction block is load-bearing state; keeping only the text silently breaks it. - Memory — file-based, survives across sessions. The only one of the three that persists past a restart.
Cost & latency levers
- Effort — lower settings give fewer, more consolidated tool calls and terser output; higher settings explore and verify more.
- Task budgets — a token allowance the model can see and
self-moderate against across a whole loop. Contrast with
max_tokens: an enforced per-response ceiling the model is unaware of. - Tool search — for very large tool sets; discovered schemas are appended, not swapped, so the prompt cache survives.
- Programmatic tool calling — the model writes a script that calls tools; intermediate results stay in the execution container, never entering model context.
- Streaming — mandatory in practice for large outputs; buffered requests
with big
max_tokenshit HTTP timeouts.
Operating agents well
- Give the full task specification up front in one well-specified turn and run at higher effort — drip-feeding scope reduces quality and efficiency.
- Need human approval on destructive actions? Use a manual loop — SDK tool runners auto-execute whatever the model requests.
- Define checkable success criteria and verify against them; separate fresh-context verification beats self-critique.
Independent community study resource — not affiliated with or endorsed by Anthropic. Claude is a trademark of Anthropic, PBC. All study material and practice questions here are original, written from Anthropic's public documentation. Everything runs in your browser; nothing you answer is stored or transmitted.