LLM Failure Mode Catalog
A catalog of systematic LLM failure modes and concrete techniques to guard against them in agent prompts. Select the 3-5 most relevant for your agent type.
LLMs have systematic failure modes — predictable ways they deviate from skilled human judgment. Good agent prompts explicitly name the failure modes most relevant to the task, giving the agent self-correction targets.
This is the difference between "what to do" and "what to watch out for."
How to use this catalog
1. Review the failure modes below
2. Identify the 3-5 most likely given your agent's task and scope
3. Include them in the agent prompt — either as a dedicated section or woven into operating principles
4. Frame them as specific, observable behaviors — not vague admonitions (see the contrast below)
Don't include all of them. Selecting the relevant subset focuses attention where it matters. Including too many dilutes their effectiveness.
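To make the contrast concrete, here is a vague admonition next to a specific, observable behavior, using the ambiguity failure mode as the example (the specific wording is taken from entry 1 below):

```markdown
<!-- Vague: gives the agent nothing concrete to check itself against -->
Be careful with ambiguous instructions.

<!-- Specific: an observable behavior the agent can self-correct toward -->
If instructions are ambiguous or underspecified, surface what's unclear
and ask — don't fill gaps silently.
```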
Quick reference by agent type
| Agent type | Commonly relevant failure modes |
|---|---|
| Reviewer | Flattening nuance, Source authority, Asserting when uncertain, Padding/burying lede |
| Implementer | Plowing through ambiguity, Downstream effects, Instruction rigidity, Assuming intent, Silent assumption cascade |
| Researcher | Flattening nuance, Source authority, Confabulating, Padding/burying lede |
| Orchestrator | Plowing through ambiguity, Never escalating, Assuming intent, Over-indexing on recency |
| Advisor/Planner | Downstream effects, Confabulating, Assuming intent, Asserting when uncertain, Silent assumption cascade |
| User-facing agent | Assuming intent, Clarification loop paralysis, Silent assumption cascade |
The catalog
1. Plowing through ambiguity
What it looks like: Makes silent assumptions, fills gaps with defaults, never surfaces "I'm unclear on X." Proceeds confidently when it should pause.
The human instinct it lacks: Recognizing ambiguity or underspecification and clarifying before proceeding.
Guard against it:
- "If instructions are ambiguous or underspecified, surface what's unclear and ask — don't fill gaps silently"
- "State assumptions explicitly when you make them. Prefer asking over assuming when stakes are non-trivial"
2. Flattening nuance in sources
What it looks like: Treats ambiguous or conflicting content as definitive. Picks one interpretation and runs with it without acknowledging alternatives or tensions.
The human instinct it lacks: Balanced interpretation; acknowledging that sources may be ambiguous, incomplete, or in tension with each other.
Guard against it:
- "When sources conflict or are ambiguous, note the tension rather than silently picking one interpretation"
- "Distinguish what a source clearly states from what you're inferring or extrapolating"
3. Treating all sources as equally authoritative
What it looks like: Fails to weigh credibility, recency, or contextual fit. Applies information out of domain. Treats a blog comment the same as official documentation.
The human instinct it lacks: Evaluating source authority, applicability to the current situation, and domain fit.
Guard against it:
- "Weigh sources by authority and relevance: official docs > established codebase patterns > examples > blog posts > guesses"
- "Note when you're applying information outside its original context or domain"
4. Acting without modeling downstream effects
What it looks like: Misses edge cases. Fails to anticipate how an action or recommendation could backfire, conflict with other constraints, or cause unintended consequences.
The human instinct it lacks: Thinking through "what could go wrong" and "what else does this affect" before acting.
Guard against it:
- "Before making a change or recommendation, consider: what could this break? What edge cases exist? What constraints might this conflict with?"
- "If an action has non-obvious downstream consequences, name them explicitly"
5. Confabulating past knowledge limits
What it looks like: Generates plausible-sounding answers when the honest response is "I don't know." Fills knowledge gaps with fabrication rather than acknowledging uncertainty.
The human instinct it lacks: Knowing when you've hit a limitation of your knowledge and being honest about it.
Guard against it:
- "If you don't know or aren't confident, say so. 'I don't know' and 'I'm not certain about this' are valid responses"
- "Don't invent details to fill gaps. Flag what you'd need to verify"
6. Never escalating or deferring
What it looks like: Always produces an answer or takes an action rather than flagging "this needs human judgment" or "I'm not confident enough to proceed here."
The human instinct it lacks: Knowing when to escalate, defer, or bring in someone with more context or authority.
Guard against it:
- "If a decision is outside your scope, confidence level, or competence, say so and recommend escalation rather than guessing"
- "It's better to flag uncertainty than to proceed and cause harm or waste effort"
7. Treating all instructions as equally rigid
What it looks like: Fails to distinguish hard requirements from soft guidance. Either over-complies (treats suggestions as mandates) or under-complies (treats mandates as flexible).
The human instinct it lacks: Parsing directive strength — knowing what's non-negotiable vs guidance vs suggestion.
Guard against it:
- "Distinguish 'must' (non-negotiable) from 'should' (strong default, exceptions possible) from 'consider' (suggestion, use judgment)"
- "If something says 'consider X,' you can decide not to do X with good reason. If something says 'always X,' you cannot skip it"
8. Assuming intent instead of probing
What it looks like: Projects a goal onto the user rather than understanding their actual mental model, intent, and constraints. Fills in "what they probably want" without checking.
The human instinct it lacks: Working with others to understand their goals and the nuance of what they actually want.
Guard against it:
- "Don't assume you know what they want. If intent is unclear or could be interpreted multiple ways, ask"
- "When in doubt, restate your understanding of the goal and verify it matches theirs before proceeding"
9. Asserting confidently when uncertain
What it looks like: Gives a single answer or takes a single path when multiple valid options exist. Under-hedges when the situation warrants presenting alternatives.
The human instinct it lacks: Calibrating confidence to actual certainty; presenting options when genuinely uncertain.
Guard against it:
- "When multiple valid approaches exist, present them with tradeoffs rather than silently picking one"
- "Match your expressed confidence to your actual certainty. Don't assert what you're genuinely unsure about"
10. Padding, repeating, and burying the lede
What it looks like: Restates points in slightly different words. Adds filler phrases. Scatters key details instead of surfacing them clearly.
The human instinct it lacks: Writing with a theory of mind — progressive, non-repetitive, focused on what the reader needs.
Guard against it:
- "State each point once, clearly. Don't rephrase the same idea in multiple places"
- "Lead with the most important information. Structure output for the reader's needs, not for completeness"
- "Before finalizing output, ask: what does the reader need to take away? Is that clear and prominent?"
11. Over-indexing on recency
What it looks like: Treats the latest input as overriding all prior context. Makes disproportionate adjustments based on recent feedback without weighing it against original intent.
The human instinct it lacks: Evaluating new inputs proportionally; maintaining awareness of full history and original intent.
Guard against it:
- "New input is additional context, not a reset. Weigh it against what's already been established"
- "If new feedback contradicts earlier guidance, surface the tension rather than silently overriding"
12. Clarification loop paralysis
What it looks like: Asks too many questions before acting. Seeks permission for low-stakes, reversible decisions. Creates friction by over-asking when sensible defaults exist.
The human instinct it lacks: Judging when to ask vs when to proceed with reasonable assumptions.
Guard against it:
- "For low-stakes, reversible decisions, proceed with sensible defaults and label your assumptions"
- "Ask only when: (1) the decision materially affects the outcome, (2) multiple valid approaches exist with different tradeoffs, or (3) the user's preference is genuinely unclear and stakes are non-trivial"
13. Silent assumption cascade
What it looks like: Makes a chain of assumptions without surfacing any of them. Each assumption builds on the previous, leading to outputs based on premises the user never agreed to.
The human instinct it lacks: Making assumptions visible so they can be challenged.
Guard against it:
- "When you make assumptions, state them explicitly: 'Assuming X, I will Y'"
- "If you've made multiple assumptions in a row, pause and surface them before proceeding further"
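For instance, a surfaced cascade might look like the following (the task and the assumptions shown are hypothetical):

```text
Assumptions so far (correct any before I continue):
1. Assuming "the config" means the production config file.
2. Building on (1): assuming changes should take effect without a redeploy.
3. Building on (2): assuming hot-reload is acceptable in production.
```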
Integrating failure modes into prompts
Option A: Dedicated section
Add a "Failure modes to avoid" section in the agent prompt:
Option B: Woven into operating principles
Integrate failure mode awareness into the operating principles.
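A minimal sketch, assuming an implementer agent whose prompt already carries an "Operating principles" section; the principles below fold in the implementer-relevant failure modes from the table:

```markdown
## Operating principles

- Surface ambiguity before coding: if the task is underspecified, ask rather
  than filling gaps silently.
- State assumptions explicitly ("Assuming X, I will Y") so they can be
  challenged before they cascade.
- Before changing code, consider what it could break, which edge cases exist,
  and which constraints it might conflict with.
- Distinguish "must" (non-negotiable) from "should" (strong default) from
  "consider" (use judgment).
```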
Option C: Hybrid
Use operating principles for the positive framing, then add a brief callout.
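A minimal sketch of the callout half, assuming a user-facing agent; the positive operating principles are assumed to live elsewhere in the prompt:

```markdown
## Watch for these failure modes

You are prone to assuming intent instead of asking, to over-asking on
low-stakes decisions, and to chaining silent assumptions. If intent is
unclear and stakes are non-trivial, restate your understanding and verify
it. For low-stakes, reversible decisions, proceed with sensible defaults
and label your assumptions.
```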
Why this works
Naming failure modes explicitly works because:
- Self-correction targets — the agent knows what to watch for, not just what to do
- Concrete behaviors — "Don't fill gaps silently" is actionable; "be careful" is not
- Contextual relevance — selecting the right 3-5 focuses attention where it matters
- Matches human training — skilled humans learn "here's how people mess this up" alongside "here's how to do it"