The Analog Incident Compass Train Table: Building a Paper Control Surface for Live Outages

Modern incident response is full of dashboards, alerts, graphs, and automation. Yet some of the most powerful insights about how your organization behaves under stress come from something surprisingly low-tech: paper, markers, and a table.

This is the idea behind the Analog Incident Compass Train Table—a simple, physical “control surface” for running tabletop exercises that simulate live outages. Rather than staring at dashboards, teams gather around a paper map of the system and walk through what they would actually do during an incident.

In this post, we’ll unpack what this approach looks like, why it works so well, and how it helps you uncover the gaps, dependencies, and coordination challenges that your tools alone can’t show.

What Is an Analog Incident Compass Train Table?

Imagine spreading a large sheet of paper across a table. On it, you sketch:

Your key systems and services
External dependencies (providers, partners, APIs)
User entry points (apps, websites, devices)
Critical business processes (payments, onboarding, search, etc.)

You then add movable elements—index cards, sticky notes, or markers—for:

Incidents (e.g., “Database latency spike”, “DNS failure”)
Teams (SRE, security, support, legal, comms)
Decisions and actions (rollback, failover, comms updates)

This becomes your train table: a simplified but recognizable model of your operational world. You “drive” incidents across this landscape like trains on tracks, watching how decisions ripple and where collisions happen.

It’s deliberately analog:

No real systems are touched.
No dashboards are required.
No one is typing commands in production.

Instead, you focus on how humans respond—who talks to whom, who decides what, and how the organization collectively navigates a messy, evolving situation.

Why Tabletops Beat Theory: Practicing Before the Real Outage

Most organizations say they have an incident response process: runbooks, escalation paths, on-call rotations. But in a real outage, theory meets reality—and gaps become painfully obvious.

Tabletop exercises (TTX) close this gap by giving teams hands-on practice with realistic, scenario-based discussions before disaster strikes.

In an Analog Incident Compass Train Table session, you:

Present a realistic outage scenario.
Advance time in small increments (“10 minutes later…”, “30 minutes in…”).
Ask each role: What do you see? What do you do? Who do you talk to?
Move cards and markers on the table to represent system changes, user impact, and decisions.
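
If it helps to prepare the session in advance, the steps above can be sketched as a small facilitator script. This is only an illustration: the `Inject` class, the `facilitator_script` function, and the sample scenario are hypothetical names and data invented here, not part of any standard tabletop format. The output is meant to be printed and brought to the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inject:
    """One facilitator prompt, read aloud at a given point in the scenario."""
    minute: int  # minutes since the scenario started
    event: str   # what changes on the table at this point


# The three questions asked of each role at every time step.
QUESTIONS = ("What do you see?", "What do you do?", "Who do you talk to?")


def facilitator_script(injects, roles):
    """Return printable lines: each inject in time order, then each role's questions."""
    lines = []
    for inject in sorted(injects, key=lambda i: i.minute):
        lines.append(f"T+{inject.minute} min: {inject.event}")
        for role in roles:
            lines.append(f"  {role}: " + " ".join(QUESTIONS))
    return lines


# Hypothetical scenario, invented for illustration only.
injects = [
    Inject(30, "Support queue doubles; customers report failed payments"),
    Inject(0, "Database latency spike"),
    Inject(10, "Checkout error rate climbs"),
]
roles = ["Incident Commander", "SRE", "Support"]

script = facilitator_script(injects, roles)
print("\n".join(script))
```

Sorting by minute means you can jot down injects in any order while brainstorming and still get a clean timeline to read from.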

The magic is in the detail:

How quickly is the incident discovered? By whom?
What assumptions do people have about other teams’ actions?
Does everyone know where to find the incident channel, doc, or runbook?

Because it’s low-risk and conversational, people are more willing to admit uncertainty, ask naive questions, and explore “what ifs” that they might avoid during a real outage.

Clarifying Roles and Responsibilities

In many organizations, incident response roles exist on paper, but not in practice. During a crisis, people default to habit:

Engineers dive straight into debugging.
Managers jump into status updates.
Comms teams wait for clarity that never comes.

A tabletop exercise makes you confront the question: Who is actually responsible for what?

Using the train table, you can physically represent roles:

Place a card for Incident Commander in the middle.
Put Operations, Security, Customer Support, and Communications around it.
As the scenario unfolds, trace who each action and decision is routed through.

This exposes issues like:

Two people both acting as Incident Commander.
No one explicitly responsible for customer updates.
Legal or security being pulled in too late.
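
If someone takes notes on who claimed which role during the exercise, the first two issues above can be checked mechanically afterward. The sketch below is a hypothetical helper, not an established tool: the `role_gaps` function, the `SINGLE_OWNER_ROLES` set, and the names in `claims` are all assumptions made for illustration.

```python
from collections import Counter

# Roles that should be held by exactly one person (an assumption for this sketch).
SINGLE_OWNER_ROLES = {"Incident Commander", "Customer updates"}


def role_gaps(claims):
    """claims: (person, role) pairs as stated during the exercise.

    Returns (unowned, contested): single-owner roles nobody claimed,
    and single-owner roles claimed by more than one person.
    """
    # set() drops repeated notes of the same person claiming the same role.
    counts = Counter(role for _, role in set(claims))
    unowned = sorted(r for r in SINGLE_OWNER_ROLES if counts[r] == 0)
    contested = sorted(r for r in SINGLE_OWNER_ROLES if counts[r] > 1)
    return unowned, contested


# Hypothetical notes from a session.
claims = [
    ("Ana", "Incident Commander"),
    ("Raj", "Incident Commander"),
    ("Mei", "Debugging"),
]

unowned, contested = role_gaps(claims)
print("Unowned:", unowned)
print("Contested:", contested)
```

With these sample notes, "Customer updates" comes back unowned and "Incident Commander" comes back contested, which are exactly the conversations to have in the debrief.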

By the end of the session, you don’t just have a better plan—you have shared mental models. People leave with a much clearer answer to:

“In a real outage, what exactly is my job, and how do I coordinate with others?”

Stress-Testing Communication and Coordination

Incidents are rarely purely technical problems. They’re communication problems under time pressure.

During the Analog Incident Compass Train Table, you can simulate this by:

Introducing conflicting information ("Monitoring says X, logs suggest Y").
Having external partners “go dark” or respond slowly.
Adding concurrent business pressures (a big launch, a marketing campaign, an executive demo).

You then watch how teams communicate:

Does engineering proactively brief customer support, or does support find out from Twitter?
Does anyone inform external partners about impact, or are they forgotten?
Are executive stakeholders updated with a clear, non-technical view of risk?

These exercises reveal communication gaps you might never see in documentation:

Unclear escalation paths.
Overreliance on a single senior person.
Teams that assume “someone else will handle” customer-facing updates.

By making these visible in a low-stakes environment, you can redesign communication flows, channels, and expectations long before you’re in a real crisis.

Revealing Hidden Dependencies and Risks

One of the biggest strengths of scenario-based TTX work is how it surfaces hidden dependencies—the things that “obviously work” until they suddenly don’t.

On your train table, dependencies are visually represented:

Lines between services
Arrows from internal systems to external vendors
Notes about data flows and trust relationships

As you move an outage card across the board (e.g., “Primary database unavailable”), you can ask:

What else quietly fails when this goes down?
Which business processes halt, degrade, or become unsafe?
Are there manual workarounds, or are we stuck?
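
The first question has a simple mechanical core: follow the lines on the map backward from the failed component. If you transcribe the paper map after the session, a minimal sketch of that walk might look like the following. The service names and the `DEPENDS_ON` map are hypothetical, and `blast_radius` is just an illustrative breadth-first traversal; it only knows about the dependencies you drew, so it can't reveal the ones nobody thought of.

```python
from collections import deque

# Hypothetical map: each key depends on the items in its list.
DEPENDS_ON = {
    "checkout": ["payments-api", "auth"],
    "payments-api": ["primary-db", "payment-provider"],
    "auth": ["primary-db"],
    "search": ["search-index"],
}


def blast_radius(failed, depends_on):
    """Return everything that directly or indirectly depends on `failed`."""
    # Invert the edges: for each component, who depends on it?
    dependents = {}
    for node, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(node)

    # Breadth-first walk outward from the failed component.
    affected, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for node in dependents.get(current, ()):
            if node not in affected:
                affected.add(node)
                queue.append(node)
    affected.discard(failed)  # only matters if the map contains a cycle
    return affected


print(sorted(blast_radius("primary-db", DEPENDS_ON)))
# → ['auth', 'checkout', 'payments-api']
```

The other two questions, about which business processes become unsafe and whether manual workarounds exist, still need the people around the table.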

Frequently, organizations discover:

One third-party provider is a single point of failure for multiple critical workflows.
Monitoring for upstream systems is fine, but there’s no visibility into downstream user impact.
Backup or failover mechanisms exist but are untested or poorly understood.

Because you’re not touching production, you can safely explore uncomfortable scenarios:

"What happens if our primary and secondary regions are both partially degraded?"
"What if we’re dealing with an outage and a major security incident?"
"What if our main status communication channel itself fails?"

This is the true strength of a tabletop: low cost, high learning, zero production risk.

Low-Cost, High-Impact: Making It Easy to Involve Stakeholders

One of the reasons digital simulations and full-scale chaos engineering can be hard to adopt is that they’re resource-intensive and, frankly, intimidating for non-technical stakeholders.

The Analog Incident Compass Train Table flips that dynamic:

Low-cost: Paper, markers, sticky notes, a meeting room.
Low-stress: No one is breaking anything for real.
Accessible: Executives, legal, finance, and partners can all join and contribute.

This format is ideal for involving:

Customer-facing teams who must explain outages to users and clients.
Risk, compliance, or legal teams who care about obligations and exposure.
External partners whose systems or services are critical to your operations.

By bringing these voices into the exercise, you:

Test complex operational and technical plans against real-world constraints.
Make trade-offs visible (e.g., speed vs. regulatory obligations).
Build trust and shared understanding across organizational boundaries.

And since the stakes are low, it becomes a space where people feel safer saying, “I don’t know” or “We’ve never tried that,” which is precisely the honesty you need to improve.

Turning Insights Into a Stronger Incident Program

A tabletop exercise is only as valuable as what you do afterward. The Analog Incident Compass Train Table naturally generates concrete insights you can act on:

Refine incident response plans: Clarify roles, escalation paths, and decision-making authority.
Update runbooks and playbooks: Add missing steps, remove outdated assumptions, and capture new workarounds.
Improve tooling and visibility: Identify missing alerts, dashboards, or status pages that would have changed the outcome.
Strengthen training: Use lessons learned to onboard new team members and align partners.

Over time, running these exercises regularly demonstrates to stakeholders—executives, customers, regulators—that you are not just reacting to incidents but proactively building readiness.

It’s evidence that you:

Understand your risk landscape.
Practice your response before it’s needed.
Continuously improve based on realistic scenarios.

Getting Started: A Simple Playbook

You don’t need anything fancy to begin. Try this:

Pick a scenario: Choose a realistic outage or incident based on your history or top risks (e.g., “Payment provider downtime” or “Authentication service partial outage”).
Draw your map: On a large sheet of paper, sketch your systems, user entry points, and key external providers.
Assign roles: Identify an Incident Commander, technical leads, support, comms, and any relevant stakeholders.
Walk the timeline: Advance time in increments, narrate evolving conditions, and ask participants to explain what they would see, decide, and do.
Capture insights: Keep a visible list of gaps, surprises, and follow-ups. Don’t solve everything live; just record it.
Debrief and iterate: After the exercise, review what you learned, update your plans, and schedule the next session with a different scenario.

Conclusion: Analog Tools for Digital Resilience

The Analog Incident Compass Train Table is deliberately simple. That’s its strength.

By stripping away complex tools and bringing people around a shared paper model of your systems, you:

Give teams hands-on practice with realistic incidents before they happen.
Clarify roles, responsibilities, and decision paths.
Expose communication breakdowns and hidden dependencies.
Involve key stakeholders without touching production.
Turn insights into stronger plans and greater organizational confidence.

In a world obsessed with automation and dashboards, a pen, a marker, and an honest conversation around a table can do something your tools alone never will: show you how your organization actually behaves under stress—and give you the chance to improve it before the next real outage arrives.