Prompt Engineering & Structured Output

Roughly 20% of the exam.

Prompt structure

System prompt — stable role, constraints, and behavioral rules for the whole conversation. Keep volatile per-request data (timestamps, user IDs) out: it belongs later in the messages, and putting it in the system prompt also destroys prompt caching.
XML-style tags (<instructions>, <document>, <example>) — unambiguous boundaries so data is never confused with instructions. Claude is trained to respect them.
Few-shot examples — the most reliable lightweight fix for format drift. Two or three concrete examples beat paragraphs of description.
Messages rules — first message must be from the user; consecutive same-role messages are merged into one turn.

Long-context technique

Put bulk documents near the top of the prompt, instructions and the question at the end — measurably better on long inputs, and cache-friendly (stable content first).
For document Q&A, ask Claude to extract relevant quotes first, then answer from them — grounding cuts unsupported claims.

Structured outputs

output_config.format with a JSON schema constrains the response to valid, parseable JSON — no regex post-processing.
strict: true on a tool validates tool parameters against the tool schema. Response shape vs parameter shape — know which feature governs which.
Schema subset: objects need additionalProperties: false; recursive schemas and numeric min/max constraints are unsupported; enums, refs, and basic types are fine.
Assistant-turn prefilling (starting the reply with {) was the old trick and now returns a 400 on current model generations — structured outputs are the replacement.

Steering modern models

Newest models removed sampling parameters (temperature, top_p, top_k). Want variety? Ask for it in the prompt — e.g. propose several distinct directions, or instruct varied phrasing across responses.
Recent models follow instructions literally. Aggressive scaffolding written for older models ("CRITICAL: you MUST ALWAYS use X") now over-triggers — dial back to plain conditions ("Use X when …").
On adaptive-thinking models, "think step by step" scaffolding is largely redundant; depth is controlled with the effort parameter.
Response length: instruct brevity in the prompt; keep max_tokens as a hard safety ceiling, not the mechanism — hitting it truncates mid-sentence.
stop_sequences halt generation at custom markers (stop_reason: "stop_sequence") — handy for cutting after one section.

→ Drill this domain in practice mode

Independent community study resource — not affiliated with or endorsed by Anthropic. Claude is a trademark of Anthropic, PBC. All study material and practice questions here are original, written from Anthropic's public documentation. Everything runs in your browser; nothing you answer is stored or transmitted.