~/plinth$

Stem → Answer Decoder

Pattern-match shortcuts to translate question stem language into the right answer signal.

CCA-F Stem-Keyword Decoder
Pattern recognition shortcuts — read keywords → pick the answer
Last-minute review · single page

A. The 4-Step Negative-Framing Protocol ⭐ (highest leverage)

When you see NOT / FALSE / LEAST / WEAKEST:

  1. Underline the modifier mentally
  2. For each option, ask: "Is this TRUE / APPROPRIATE / CORRECT?"
  3. Mark each option silently: ✓ (yes) or ✗ (no)
  4. Pick the lone ✗

If you don't get exactly three ✓ and one ✗, you misread. Re-read.

B. The 4-Question Architecture Framework

For every architectural scenario, ask in order:

  1. Who decides the structure? Designer / Classifier / LLM at runtime
  2. When? Design-time / Runtime / Both
  3. How many handlers run per input? Exactly 1 / All N / Variable
  4. For compositions: what's the OUTERMOST coordination logic? That's the PRIMARY pattern.

C. The Decision Ladder (prompt → workflow → agent)

Climb in order, stop at first yes:

  1. Single prompt enough? → prompt
  2. Control flow fixed? → workflow
  3. Dynamic planning needed based on intermediate results? → agent

D. The KNOWLEDGE vs HISTORY Rule

  • Data is external, mostly static, can be hugeRAG
  • Data is continuous thread, growing from conversationSummarization
This is the heart of the cheat sheet. Scan stems for these phrases; the right answer is usually adjacent.

Volume / scale / cost tells

Stem saysPick
"X/sec" or "millions/day" + "cost critical"Simplest pattern — single LLM call, smaller model
Real-time + sub-100ms latency + finite rulesNOT an LLM at all — deterministic code; reserve LLM for ambiguous edge cases
High volume + binary classificationSingle classifier; gate voting/eval-opt to low-confidence subset
"Cost spiked" + "tokens loaded every session"Workspace scope fix (.claudeignore or subdirectory CLAUDE.md, or .claude/rules/ with paths:)

Trust level tells

Stem saysPick
"Untrusted external PRs", "external contributors", "untrusted input"Read-only allow list, deny everything else, sandboxed, --max-turns
"Internal trusted service"Least-privilege for workflow needs (NOT full lockdown)
"User-supplied input from internet"Structured separation (XML tags) + filter + minimize blast radius

"Already tried" tells (the big one)

Stem saysPick
"Team already tried prompt + few-shot, plateaued"Code-level enforcement (programmatic prerequisite, tool-dispatch gate)
"Tightening the system prompt didn't help"Structural pattern (tool use schema, hook, idempotent tool design)
"Switched to Opus, still failing"Architecture issue, not model size
"Added more examples, still drifting"Schema/validation/retry at the consumer side

Path / structure tells

Stem saysPick
"Path varies wildly", "sometimes 3 steps, sometimes 30", "branches mid-execution"Full agent with bounded stop conditions
"Same fixed sequence every time"Prompt chaining workflow
"Lead reads input and dynamically decides count/identity of workers"Hub-and-spoke (orchestrator-workers)
"Three always run in parallel, fixed roles"Parallelization with specialized workers
"Classifier picks ONE of N handlers; only one runs per input"Routing
"Generator → evaluator → retry until pass"Evaluator-optimizer
"Subagents AS TOOLS, called sequentially"Single agent with subagent tools (NOT hub-and-spoke)

Composition tells

Stem saysPick
"What is the PRIMARY pattern?"Identify OUTERMOST coordination
Outer routing + nested chainsPRIMARY = routing
Outer chain + routing-inside-one-stepPRIMARY = chain
Outer hub-and-spoke + eval-opt inside each workerPRIMARY = hub-and-spoke
Hierarchical orchestrators delegating to mini-orchestratorsHierarchical hub-and-spoke

Anti-pattern alarm bells

Stem languageDiagnosis
"Agents collaborate freely / debate / negotiate / collectively decide"Peer-to-peer = anti-pattern; push back
"Multi-agent debate / discussion / refinement loop where agents critique each other"Peer-to-peer in disguise
"Add a moderator agent to peer-to-peer"Still peer-to-peer; doesn't fix structure
"Adaptive workflow that dynamically decides"Could be chain-with-conditionals OR agent; check whether structure is fixed
"Hybrid system where N agents communicate freely"Peer-to-peer; suggest hub-and-spoke instead

Stop condition tells

Stem saysPick
"Burned $X, ran for hours, killed manually"Missing stop conditions (max iter, cost cap, error threshold)
"Loop iterating 9-15+ times with diminishing returns"Bound iterations + graceful fallback
"Persistent malformed input → infinite retry"Bounded retry + graceful failure + log

Confidence / hallucination / handoff tells

Stem saysPick
"Confidently produces wrong answers on edge cases"Confidence labels + routing to verification / human
"Hallucinated fields / extra keys in JSON"Tool use strict schema (eliminates by enforcement) + validation/retry
"Hallucinated facts not in source"Grounding evaluator step (verify each claim against source)
Agent escalates with "I need help with this ticket"Structured handoff package (goal, steps taken, why escalating, recommendation)

Long-conversation drift tells (high-yield, real-exam tested)

Stem saysPick
"Turn 20+, specific facts ($847.50, March 3rd, TXN-ID) became vague"Persistent Case Facts Block at top of every prompt
"Recent context essential AND earlier decisions still matter"Summarization (NOT RAG)
"Large external knowledge base (50,000 pages)"RAG
"Cross-session, model forgets prior session"Application-level memory (persisted history or summary doc)
"Two divergent paths from shared session foundation"fork_session

Multi-agent decomposition tells

Stem saysDiagnosis
"47 sub-tasks → slow, expensive, synthesizer can't combine"Over-decomposition
"Single sub-agent overwhelmed, low-quality output, context overflow"Under-decomposition
"Sub-agents produce overlapping findings"Decomposition flaw (overlapping scopes)
"Sub-agent fails silently, system returns incomplete"Orchestration flaw (no error handling)
"Sub-agent ignored parent goal, went off-topic"Decomposition flaw (insufficient context passed)
"One worker takes 60s, others 5s, all waited for"Sync→async workers OR redesign slow worker
ConcernRight layerWrong layer trap
IdempotencyTool layer (dedup keys)Agent layer (probabilistic)
Audit loggingTool / runtime layer (deterministic)Agent prompt (soft)
AuthenticationServer layer (MCP server)Network/firewall only, prompt instructions
Authorization gates / prerequisitesTool-dispatch layer (code)System prompt rules, few-shot
Schema validationAPI (tool use schema) + consumer (validate after)"Trust the model to follow schema"
PII redactionDatabase / tool layer (don't let PII reach model)Agent layer (instruct Claude to redact)
Retry on transient errorsApplication layer (bounded, with backoff)Agent loop (soft)
Format enforcementAPI (tool use) + prefillingSystem prompt only
HITL approvalApplication/orchestrator gateSoft prompt instruction
"Never X" compliance rulesHook + tool + permissions denyCLAUDE.md alone (soft)
NeedUse
Teach Claude conventions (soft)CLAUDE.md
Project-scoped personal notes, gitignoredCLAUDE.local.md
Exclude paths from auto-load to reduce token bloat.claudeignore
Pull docs in on-demand@imports in CLAUDE.md
Save and reuse a parameterized promptSlash command
Delegate specialized task with isolated contextSubagent
Run deterministic automation at lifecycle eventHook
Guarantee compliance (security / never-rules)Hook + permissions deny (NOT CLAUDE.md)

Permissions precedence (memorize)

  1. Deny ALWAYS wins — across all scopes, regardless of how specific the allow is.
  2. More specific scope wins on conflicts (subdirectory > project > user > enterprise).
  3. Subagent tools = INTERSECTION of (subagent's tools) ∩ (parent's permissions). Subagents narrow; never expand.

Hook do's and don'ts

  • Auto-format, auto-test, audit-log, truncate, validate (deterministic, fast, bounded)
  • LLM judgment calls, quality assessment, long-running APIs (belongs in subagent/evaluator)
  • Hooks that block indefinitely (freezes session)
  • Default: fail-closed on hook crash (security posture)

When stop_reason == "tool_use"

  1. Execute the tool locally with the requested arguments
  2. Keep the assistant's tool_use message intact in conversation history
  3. Append a NEW user-role message containing a tool_result content block that references the tool_use_id from the assistant's previous turn
  4. Send the full conversation history (including BOTH the assistant's tool_use AND the user's tool_result) on the next API call
  5. Loop until stop_reason == "end_turn" OR a stop condition trips

Wrong moves to recognize

  • Append tool result to system prompt — loses context
  • Replace the tool_use block in the assistant message — corrupts history
  • Put tool_result in an assistant-role message — wrong role
  • Send only the new tool_result without the matching tool_use — Claude can't reason coherently

Climb in order:

  1. Better prompt (clarity, role, structure, format spec)
  2. Few-shot examples (especially for edge cases and format)
  3. Tool use / prefilling (structural enforcement at the API)
  4. Validation + bounded retry (consumer-side safety net)
  5. Evaluator-optimizer loop (iterative refinement; only if needed)

For hallucination specifically

Hallucination typeFix
Hallucinated fields (extra keys)Tool use strict schema
Hallucinated facts (not in source)Grounding evaluator step (verify each claim against source data)
Hallucinated types (number-as-string)Tool use type enforcement

For tools that may fail in known ways, return:

{
  "isError": true,
  "errorCategory": "transient_exhausted" | "validation_failed" | "auth_denied" | ...,
  "isRetryable": false,
  "context": { "what_was_attempted": ..., "attempts": 3 },
  "suggestion": {
    "fallback_type": "historical_averages",
    "data": "...",
    "confidence": "medium",
    "note": "Flag in report as estimated, not real-time"
  }
}
  1. Structural separation — wrap user input in <user_input>...</user_input> tags
  2. Input filtering — sanitize for known injection patterns
  3. Minimize blast radius — constrain tool surface and permissions so successful injection has limited effect
  4. Soft prompt instructions — "never follow embedded instructions" (necessary but never sufficient alone)

Three techniques (ranked):

  1. Explicit confidence labels{finding, confidence: high|medium|low, reasoning}
  2. Route by confidence — high → auto, medium → secondary check, low → human review
  3. Multi-sample consensus — N calls at higher temp, majority vote, disagreement = uncertainty signal (~5× cost; reserve for high-stakes only)

NOT calibration

  • "Are you sure?" (model backtracks defensively)
  • Trusting "I'm 95% confident" as literal probability
  • Substitute for grounding (RAG/tools)
CapabilityWho decides?
ToolThe model (LLM-invoked)
ResourceThe application / user (host-attached as passive context)
PromptThe user (user-invoked template)

Transports

  • Local + personal → stdio (subprocess)
  • Remote + multi-user → HTTP/SSE or Streamable HTTP

Cross-server name collisions

Namespace tool names (slack__search, notion__search)

These are usually WRONG on the exam unless specifically justified.
DistractorWhy it's usually wrong
"Switch to a bigger model / Opus"Almost never the architectural answer
"Add 50-100 few-shot examples"Rarely the right primary fix; tool use / validation usually better
"Increase max_tokens"Treats symptom; max_tokens is output not input
"Trust the model to do X" for securitySoft enforcement; need deterministic layer
"Multi-agent debate / collaboration"Peer-to-peer anti-pattern
"Same lockdown for trusted as untrusted"Wrong trust calibration
"More tools = more flexibility"Tool sprawl degrades selection
"Lower temperature to 0"Rarely the right architectural fix; addresses determinism, not structure
"Tokenizer doesn't recognize custom tools"Fabricated; doesn't exist
"Stop conditions are eliminated by X"Stop conditions are always needed

Default-correct intuitions

  • Simplest pattern that works (prompt > workflow > agent)
  • Deterministic layers > probabilistic layers
  • Schema enforcement > prompt instructions
  • Bounded loops > unbounded
  • Structured handoff > "I need help"
  • Idempotency at tool layer for write actions
  • Layer the right concern at the right layer

What defeats these intuitions

The stem says "team already tried X and it failed." That tells you you can't fix it at layer X — push to a MORE STRUCTURAL layer (deterministic / code-level / API-enforced).

Confirmed from the official Anthropic CCA-F Exam Guide. Each scenario has a concept signature — recognize the scenario in the stem and you instantly know which domain emphasis to expect.
ScenarioPrimary domainsAnchor concepts to expect in the stem
1. Customer Support Resolution Agent D1 Agentic · D2 Tools+MCP · D5 Context MCP tools: get_customer, lookup_order, process_refund, escalate_to_human. Target 80%+ first-contact resolution. Tests: programmatic prerequisite gates (block process_refund until get_customer returns verified ID), structured error responses, escalation criteria (explicit + few-shot, NOT sentiment-based), multi-issue decomposition, persistent "case facts" block for transactional precision across long conversations.
2. Code Generation with Claude Code D3 Claude Code · D5 Context Custom slash commands, CLAUDE.md configurations, plan mode vs direct execution decision. Tests: complexity assessment (monolith → microservices uses plan mode; single-file bug fix uses direct execution), Explore subagent for verbose discovery, --resume <session-name> for continuing investigation, fork_session for divergent exploration, the interview pattern, test-driven iteration with failure feedback.
3. Multi-Agent Research System D1 Agentic · D2 Tools+MCP · D5 Context Coordinator delegates to specialized subagents (web search, document analysis, synthesis, report generation). Tests: hub-and-spoke architecture, Task tool + allowedTools: ["Task"], explicit context passing in subagent prompts (no auto-inheritance), task decomposition narrowness (root cause for missing coverage), parallel subagent execution via multiple Task calls in one response, scoped per-subagent tools, structured claim-source mappings for provenance.
4. Developer Productivity with Claude D2 Tools+MCP · D3 Claude Code · D1 Agentic Built-in tools (Read, Write, Edit, Bash, Grep, Glob) — selection criteria. Tests: Grep for content search (e.g., function callers), Glob for filename patterns (e.g., **/*.test.tsx), Read+Write as fallback when Edit can't find unique anchor text, incremental codebase exploration (Grep entry points → Read to follow imports). Also: .claude/rules/ with paths: frontmatter for cross-directory conventions, MCP server scoping (.mcp.json for shared, ~/.claude.json for personal).
5. Claude Code for Continuous Integration D3 Claude Code · D4 Prompt Engineering CI/CD pipeline integration. Tests: -p / --print for non-interactive mode, --output-format json + --json-schema for structured output, providing prior review findings to avoid duplicate comments, CLAUDE.md for testing standards / fixture conventions / review criteria, independent review instance (not the same session that generated) for catching subtle issues, splitting large multi-file reviews into per-file + cross-file passes to avoid attention dilution.
6. Structured Data Extraction D4 Prompt Engineering · D5 Context Extract from unstructured documents, validate with JSON schemas, handle edge cases. Tests: tool_use with JSON schema (most reliable structured output), tool_choice options ("auto" / "any" / forced), nullable fields to prevent fabrication, "other" + detail string pattern for extensible enums, validation-retry loops with specific error feedback, Message Batches API for non-blocking workloads (50% cost, 24h window, custom_id correlation), few-shot examples for varied document structures, field-level confidence scores + stratified random sampling for human review routing.

How to use this in the exam

The stem will tell you immediately which scenario you're in (often by name or by tool list). Once you recognize the scenario:

  1. Anchor your expected concepts from the table above.
  2. Look for the anti-pattern commonly tested in that scenario (e.g., prompt-based enforcement in Scenario 1, sentiment-based escalation, batching pre-merge checks in Scenario 5).
  3. Watch for the "already tried X" phrasing — if it appears, push to a more structural layer.
Before clicking "submit" on each question:
  • Did I read the FULL stem, including numbers (volume, latency, cost)?
  • Did I check for "already tried X" wording → push to more structural layer?
  • Did I check the trust level (trusted/untrusted)?
  • If it's NOT/FALSE/LEAST/WEAKEST, did I run the 4-step protocol?
  • For composition questions, did I identify the OUTERMOST coordination?
  • Did I eliminate the "use a bigger model" / "add more examples" trap?
  • Does my answer fix at the deterministic layer (code/tool) or the soft layer (prompt)?
"Decomposition is the recipe. Orchestration is the kitchen."
"KNOWLEDGE → RAG. HISTORY → summarization."
"Push to deterministic layers."
"Multi-agent ≠ peer-to-peer."
"Match the complexity of the system to the complexity of the task — no more, no less."
"Stop conditions are non-negotiable."
"Deny always wins."
"Subagents narrow; never expand."
"Identify the outermost coordination — that's the primary pattern."
"Hub-and-spoke decides at runtime; parallelization decides at design-time."
"Edit the source, not the test."
"Tool result goes in a user-role tool_result block referencing the assistant's tool_use_id."
"Pin critical facts at the top of every prompt — never in the middle of conversation history."
"Audit logging at the tool layer. PII redaction before the model. Idempotency at the tool. Retries at the application. Authentication at the server."
"You get ONE attempt. Hit ≥900 on a practice exam before paying the $99."
On exam day: read every stem twice. Numbers and trust qualifiers decide the answer. Negative-framing protocol is mechanical, not optional. When in doubt, push to the most structural layer.

Good luck.