The Analog Incident Tarot: Designing Physical Fate Cards for Your Next Production Outage
How tarot-style fate cards can transform incident response practice into a collaborative, low-stakes game that improves skills, psychological safety, and systemic resilience.
Modern incident response is full of dashboards, alerts, and runbooks—but strangely light on play. We practice outages in docs, in slides, sometimes in chaos experiments, but rarely with tools that are tactile, social, and fun.
Enter The Analog Incident Tarot: a physical deck of tarot-style cards designed to help teams rehearse outages, explore failure, and unpack post-incident learnings—together, around a table.
This isn’t about mysticism. It’s about using familiar, game-like rituals to:
- Turn stressful topics into low-stakes practice
- Gamify retrospectives without losing rigor
- Build shared language for behaviors and patterns
- Make incident drills accessible to beginners and experts alike
In other words: we’re designing fate cards for your next production outage.
Why Analog Cards Belong in a Digital World
Most incident tooling is digital: monitoring dashboards, Slack bots, on-call schedulers. So why bring in analog cards?
1. Physical objects change the social dynamic
Passing a card, placing it on the table, or flipping it over creates a small ritual. That ritual:
- Slows down heated conversations
- Focuses group attention on a shared object
- Makes abstract problems more concrete
In group settings, cards help redistribute power. A junior engineer flipping a “Leadership Step-Up” or “Ask the Basic Question” card can participate more confidently than in a free-form discussion dominated by senior voices.
2. Cards support psychological safety
Traditional postmortems can feel like cross-examination, especially in blame-prone cultures. Cards shift the tone:
- You’re playing a game with scenarios and personas, not defending past decisions.
- The focus moves from “who messed up?” to “how does this system behave when fate deals us this card?”
- Prompts and personas give people language to describe their reactions without self-incrimination.
3. They’re technology-agnostic
A card deck doesn’t care whether your stack is Kubernetes, serverless, or a monolith from 2009. It embodies concepts: conflicting priorities, incomplete observability, ambiguous ownership, unexpected failure modes. That makes it reusable across teams and platforms.
Building Your Analog Incident Tarot Deck
Think of your deck as composed of four primary suits:
- Incident Scenarios (Fate Cards)
- Chaos & Failure Modifiers
- Persona & Behavior Cards
- Reflection & Retrospective Prompts
Each suit supports a different phase of practice: from simulating the outage itself to exploring how your team responds.
1. Incident Scenario (Fate) Cards
These set the stage: what went wrong in your imaginary (or reconstructed) outage?
Examples:
- The Silent Pager – Alerts didn’t fire or were routed incorrectly.
- The Slow Boil – Latency creeps up over hours, only noticed by customers.
- The Phantom Feature Flag – A forgotten flag re-enables a risky code path.
- The Third-Party Eclipse – A dependency degrades or fails entirely.
- The Split Brain – Conflicting sources of truth: logs vs metrics vs traces.
Each card describes:
- Symptoms (what users and systems show)
- Initial visibility (what your tools reveal)
- Stakeholder pressure (customers, execs, external partners)
These cards emulate the messiness of real production issues—but in a controlled, conversational setting.
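If you want to keep card text in version control and print from it, the three fields above map onto a small record type. The following is a minimal sketch in Python, not a prescribed format: the class name `ScenarioCard`, its field names, and the sample wording on the card are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioCard:
    """One Incident Scenario (Fate) card. Names here are illustrative."""
    title: str
    symptoms: str              # what users and systems show
    initial_visibility: str    # what your tools reveal
    stakeholder_pressure: str  # customers, execs, external partners

    def render(self) -> str:
        """Return the card as plain text, ready to print and cut out."""
        return "\n".join([
            self.title.upper(),
            f"Symptoms: {self.symptoms}",
            f"Initial visibility: {self.initial_visibility}",
            f"Stakeholder pressure: {self.stakeholder_pressure}",
        ])


# Sample wording is invented for illustration; write your own from real incidents.
silent_pager = ScenarioCard(
    title="The Silent Pager",
    symptoms="Errors are climbing, but nobody has been paged.",
    initial_visibility="Dashboards look quiet; the alert route points at an old rotation.",
    stakeholder_pressure="Support is forwarding customer complaints to the team channel.",
)

print(silent_pager.render())
```

Keeping the fields separate makes it easy to spot cards that describe symptoms but forget the stakeholder pressure, which is often the part that makes a drill feel real.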
2. Chaos & Failure Modifier Cards
Borrowing from chaos engineering, these cards layer complexity onto your scenario. Rather than attacking your live systems, they attack your assumptions.
Examples:
- The Missing Runbook – The documented playbook is outdated or gone.
- The No-Rollback Twist – Rollback is impossible due to data or contract changes.
- The Tool Outage – Your primary observability tool is degraded.
- The Surprise Coupling – An “unrelated” service is secretly critical.
- The Weekend Shift – Only a skeleton crew is on call, and senior responders are offline.
Use them to ask:
- How does the team respond when the obvious path is blocked?
- What systemic weaknesses become visible?
- Which assumptions about safety nets fail under this twist?
This is chaos testing for minds and processes instead of machines.
3. Persona & Behavior Cards
Inspired by persona-style decks (like Lean Tarot’s 18 personas), these cards represent common team behaviors and archetypes during outages.
Examples:
- The Hero – Takes over, fixes things, hoards context.
- The Optimizer – Wants to refactor mid-incident to “fix it properly.”
- The Narrator – Communicates well, keeps everyone aligned.
- The Skeptic – Questions assumptions, pushes for more evidence.
- The Vanisher – Disappears when pressure rises.
- The Guardian of Scope – Protects the team from scope creep and distractions.
There are two powerful ways to use these:
- Role-play during drills: Assign personas at the start of an exercise. Ask people to lean into the archetype and see what happens to team dynamics.
- Pattern recognition in retros: After a real incident, lay out persona cards and ask:
  - Which personas were present?
  - Which were missing (e.g., no one played Narrator)?
  - Which did you over-index on (e.g., too many Heroes)?
Persona cards help teams talk about behavior patterns instead of individuals, which softens defensiveness and supports psychological safety.
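If someone notes which persona cards the group placed on the table, the three retro questions (present, missing, over-indexed) reduce to a simple tally. This is a rough sketch under stated assumptions: the function name `summarize_personas` is hypothetical, and the cutoff for "over-indexed" (more than two of the same persona) is an arbitrary choice you should tune to your team size.

```python
from collections import Counter

# Persona names taken from the example cards in this section.
PERSONAS = [
    "Hero", "Optimizer", "Narrator",
    "Skeptic", "Vanisher", "Guardian of Scope",
]


def summarize_personas(observed, over_index_threshold=2):
    """Tally persona cards placed during a retro.

    `observed` has one entry per card placed. A persona counts as
    over-indexed when it appears more than `over_index_threshold`
    times; that cutoff is an assumption, not a rule.
    """
    counts = Counter(observed)
    unknown = set(counts) - set(PERSONAS)
    if unknown:
        raise ValueError(f"Unknown persona cards: {sorted(unknown)}")
    return {
        "present": [p for p in PERSONAS if counts[p] > 0],
        "missing": [p for p in PERSONAS if counts[p] == 0],
        "over_indexed": [p for p in PERSONAS if counts[p] > over_index_threshold],
    }


summary = summarize_personas(["Hero", "Hero", "Hero", "Skeptic"])
for question, personas in summary.items():
    print(f"{question}: {', '.join(personas) or 'none'}")
```

The tally only starts the conversation. The interesting part is asking why no one picked up the Narrator card.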
4. Reflection & Retrospective Prompt Cards
These are your “major arcana”: powerful prompts that guide exploration after (or during) simulated incidents or real outages.
Examples:
- The Hidden Dependency – “What invisible or informal dependencies shaped this incident?”
- The First Misleading Clue – “Which signal sent us in the wrong direction?”
- The Slowed Down Conversation – “Where should we have paused to realign?”
- The Trade-Off Ledger – “What reliability vs velocity trade-offs surfaced here?”
- The System That Remembered – “What logging/metrics/traces helped? How could they be better?”
- The System That Forgot – “Where did our tooling or documentation abandon us?”
Use these to structure post-incident conversations, replacing vague questions (“What went wrong?”) with targeted, repeatable angles.
Running an Incident Tarot Session
Here’s a simple flow you can adapt to team drills, onboarding, or post-incident reviews.
Step 1: Set the Frame
Explain the goal explicitly:
- This is a practice space, not a performance review.
- The purpose is to explore systems and behaviors, not assign blame.
- We’re using cards to uncover patterns and gaps.
Step 2: Deal the Fate
- Draw one Incident Scenario card.
- Draw one or two Chaos Modifiers to complicate it.
The facilitator reads the scenario aloud, answering clarifying questions but resisting the urge to over-specify. Ambiguity is part of the learning.
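For remote teams, or for testing a deck before you print it, the deal can be simulated in a few lines. The sketch below assumes the card titles listed earlier in this article; the function name `deal_fate` is hypothetical, and choosing between one and two modifiers at random is one possible reading of "one or two".

```python
import random

SCENARIOS = [
    "The Silent Pager", "The Slow Boil", "The Phantom Feature Flag",
    "The Third-Party Eclipse", "The Split Brain",
]
CHAOS_MODIFIERS = [
    "The Missing Runbook", "The No-Rollback Twist", "The Tool Outage",
    "The Surprise Coupling", "The Weekend Shift",
]


def deal_fate(scenarios, chaos_modifiers, rng=None):
    """Draw one scenario card and one or two distinct chaos modifiers."""
    if rng is None:
        rng = random.Random()
    scenario = rng.choice(scenarios)
    how_many = rng.randint(1, 2)  # inclusive on both ends
    modifiers = rng.sample(chaos_modifiers, k=how_many)  # no repeats
    return scenario, modifiers


# Passing a seeded generator makes a deal repeatable, which helps if you
# want two groups to play the same hand and compare notes afterwards.
scenario, modifiers = deal_fate(SCENARIOS, CHAOS_MODIFIERS, random.Random(7))
print("Scenario:", scenario)
print("Modifiers:", ", ".join(modifiers))
```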
Step 3: Assign Personas (Optional but Powerful)
- Hand each participant a Persona card (or let them draw).
- Encourage them to inhabit the archetype, but not at the expense of safety—it’s okay to break character for clarity.
Step 4: Play the Incident
Give the group 20–40 minutes to:
- Discuss how they’d detect and diagnose the issue
- Decide who does what (incident commander, communications lead, subject-matter experts)
- Walk through mitigation options and trade-offs
You can add structure:
- Timebox into phases (Detection → Triage → Mitigation → Follow-up)
- Introduce new cards mid-way (“At minute 20, draw another Chaos card”)
- Ask the “Narrator” persona to summarize the evolving story periodically
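A facilitator who wants the timeboxes worked out in advance could generate them like this. It is a sketch under one simplifying assumption: the session is split evenly across the four phases, which real incidents rarely respect. The function names `build_schedule` and `phase_at` are hypothetical.

```python
PHASES = ["Detection", "Triage", "Mitigation", "Follow-up"]


def build_schedule(total_minutes):
    """Split a session evenly across the four phases.

    Returns (start, end, phase) tuples in minutes. When the total does
    not divide evenly, the earliest phases get the extra minute.
    """
    if total_minutes < len(PHASES):
        raise ValueError("Session is too short to give every phase a minute")
    base, extra = divmod(total_minutes, len(PHASES))
    schedule, start = [], 0
    for index, phase in enumerate(PHASES):
        length = base + (1 if index < extra else 0)
        schedule.append((start, start + length, phase))
        start += length
    return schedule


def phase_at(schedule, minute):
    """Return the phase in progress at `minute`, or None after the end."""
    for start, end, phase in schedule:
        if start <= minute < end:
            return phase
    return None


schedule = build_schedule(40)
for start, end, phase in schedule:
    print(f"{start:>2}-{end:<2} {phase}")

# Where does the "at minute 20, draw another Chaos card" twist land?
print("Chaos draw lands in:", phase_at(schedule, 20))
```

With a 40-minute session the extra Chaos card arrives just as Mitigation begins, so the group has to revisit a plan it has only just agreed on.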
Step 5: Reflect with Prompt Cards
After the scenario, draw Reflection cards and discuss:
- What surprised you?
- Where did communication break down?
- What documentation or tooling would have helped?
- Which personas were most/least helpful in this scenario?
Capture insights like you would in a retro: notes, action items, systemic changes. The difference is that people typically stay more engaged because the conversation is concrete, interactive, and game-like.
Benefits for Both Newcomers and Veterans
A card-based incident game works across experience levels.
For beginners:
- Concrete scenarios reduce the fear of “saying something wrong.”
- Persona cards give them a defined role and script to lean on.
- They get to rehearse the rhythm of incidents before facing real ones.
For seasoned professionals:
- Chaos modifiers surface brittle assumptions they may not notice.
- Reflection prompts push them beyond technical root cause into organizational learning.
- Personas highlight leadership, communication, and collaboration gaps.
The shared format builds a common language: “We drifted into three Heroes again” or “We totally hit a ‘Missing Runbook’ moment last night.” That vocabulary can outlast the game and give responders a quick way to name a pattern while a real incident is still unfolding.
From Blame to Systems Thinking
The deepest value of an Analog Incident Tarot deck is cultural.
- Instead of asking, “Who’s at fault?” you ask, “What card did the system deal us, and how did we respond?”
- Instead of hiding mistakes, people explore them as possible futures to practice for.
- Instead of heroic firefighting as the only celebrated behavior, you recognize narration, skepticism, and scope-guarding as equally valuable.
By making incident practice playful, physical, and structured, you de-escalate the emotional stakes while keeping the learning stakes high.
Conclusion: Shuffle, Deal, Learn
The next time you’re designing an incident drill or planning a retro, consider leaving the slide deck closed. Pick up an analog tool instead.
Design a simple Incident Tarot with:
- Scenario cards that mirror your real failure modes
- Chaos cards that challenge your safety nets
- Persona cards that surface team patterns
- Reflection prompts that keep the focus on systems, not blame
Then gather your team around a table, shuffle the deck, and see what fate your next “outage” holds.
You’ll still talk about SLIs, alerts, and runbooks. But you’ll also talk about how you think, how you behave, and how your organization responds when the unexpected happens—and you’ll do it in a way that feels safe, engaging, and surprisingly fun.
Production will break again. When it does, you’ll be ready—not because you read another doc, but because you’ve already played the game.