Prompt Engineering & Structured Output
Roughly 20% of the exam.
Prompt structure
- System prompt — stable role, constraints, and behavioral rules for the whole conversation. Keep volatile per-request data (timestamps, user IDs) out: it belongs later in the messages, and putting it in the system prompt also destroys prompt caching.
- XML-style tags (
<instructions>,<document>,<example>) — unambiguous boundaries so data is never confused with instructions. Claude is trained to respect them. - Few-shot examples — the most reliable lightweight fix for format drift. Two or three concrete examples beat paragraphs of description.
- Messages rules — first message must be from the user; consecutive same-role messages are merged into one turn.
Long-context technique
- Put bulk documents near the top of the prompt, instructions and the question at the end — measurably better on long inputs, and cache-friendly (stable content first).
- For document Q&A, ask Claude to extract relevant quotes first, then answer from them — grounding cuts unsupported claims.
Structured outputs
-
output_config.formatwith a JSON schema constrains the response to valid, parseable JSON — no regex post-processing. -
strict: trueon a tool validates tool parameters against the tool schema. Response shape vs parameter shape — know which feature governs which. -
Schema subset: objects need
additionalProperties: false; recursive schemas and numeric min/max constraints are unsupported; enums, refs, and basic types are fine. -
Assistant-turn prefilling (starting the reply with
{) was the old trick and now returns a 400 on current model generations — structured outputs are the replacement.
Steering modern models
-
Newest models removed sampling parameters (
temperature,top_p,top_k). Want variety? Ask for it in the prompt — e.g. propose several distinct directions, or instruct varied phrasing across responses. - Recent models follow instructions literally. Aggressive scaffolding written for older models ("CRITICAL: you MUST ALWAYS use X") now over-triggers — dial back to plain conditions ("Use X when …").
-
On adaptive-thinking models, "think step by step" scaffolding is largely redundant; depth
is controlled with the
effortparameter. -
Response length: instruct brevity in the prompt; keep
max_tokensas a hard safety ceiling, not the mechanism — hitting it truncates mid-sentence. -
stop_sequenceshalt generation at custom markers (stop_reason: "stop_sequence") — handy for cutting after one section.
Independent community study resource — not affiliated with or endorsed by Anthropic. Claude is a trademark of Anthropic, PBC. All study material and practice questions here are original, written from Anthropic's public documentation. Everything runs in your browser; nothing you answer is stored or transmitted.