The Analog Outage Story Cabinet of Rivers: Hand‑Drawing Paper Currents to See Where Incidents Really Flow
How hand‑drawn ‘paper currents’ and systems thinking can reveal hidden failure paths that SIEM dashboards miss—turning outages into a navigable river system instead of a chaotic storm.
Modern security operations centers are awash in data. SIEM platforms ingest logs from everywhere—firewalls, endpoints, cloud services, identity providers—and promise real‑time monitoring and faster incident response. Yet in the middle of a major outage or security incident, it often still feels like flying blind.
Why? Because seeing log events is not the same as seeing how failures actually flow through your system.
This is where an unexpected ally can help: paper.
In this post, we’ll explore how hand‑drawn “paper currents”—analog maps inspired by rivers, ocean circulation, and concept maps—can help teams see cascading failures and systemic risk more clearly. We’ll connect ideas from complex systems education (like STELLA models and thermohaline circulation) to security practice, and show how blending analog mapping with SIEM data reveals patterns your dashboards may be hiding.
The Problem: Dashboards Without Direction
SIEM systems are powerful:
- They aggregate log data from diverse sources
- They provide correlation rules, alerts, and dashboards
- They enable real-time monitoring and faster response
But when a complex failure hits—an outage that hops from identity to networking to application tiers—teams often experience:
- Alert floods with no obvious narrative
- Confusion over causality (what broke first?)
- Blind spots in dependencies that no one realized existed
On screen, you might see:
- Authentication failures in one region
- Latency spikes at an API gateway
- Database connection pool exhaustion
All of these appear as charts, tables, and alerts. But how do they connect? What is the path the incident is taking through your infrastructure? That’s a systems question, not just a logging question.
Cascading Failures: When One Leak Floods the Whole River System
Understanding how cascading failures propagate is fundamental to managing systemic risk.
In complex infrastructures:
- A subtle DNS misconfiguration can cascade into authentication failures
- A single overloaded message queue can stall multiple downstream services
- A rate limit in one microservice can back up traffic across an entire region
These are not isolated incidents. They are flows of failure moving through:
- Dependencies (service A depends on B, which depends on C)
- Shared resources (databases, caches, identity providers)
- Implicit coupling (shared libraries, shared control planes, shared credentials)
Researchers in network science and complex systems tell us that not all nodes are equal. Some are critical mediators of failure propagation:
- They sit on many shortest paths
- They broker traffic between sub‑networks
- They quietly tie together subsystems that “belong to somebody else”
Algorithms that focus on local network topology—how a node is connected to its neighbors and how its neighbors are connected—can help pinpoint these critical nodes. Those are the ones that should be:
- Prioritized for hardening and redundancy
- Monitored with extra care
- Included in game days and failure simulations
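One way to make "sits on many shortest paths" concrete is betweenness centrality. The sketch below is a minimal Python version of Brandes' algorithm for an unweighted, directed dependency graph. The service names and edges are hypothetical, and betweenness is a whole-graph measure, so treat it as one illustration of node criticality rather than the local-topology approach itself.

```python
from collections import deque

# Hypothetical dependency graph: each service points to what it depends on.
DEPENDS_ON = {
    "mobile-app": ["api-gateway"],
    "admin-portal": ["api-gateway"],
    "api-gateway": ["identity", "billing"],
    "identity": ["database"],
    "billing": ["database"],
    "database": [],
}

def betweenness(graph):
    """Brandes' algorithm: how much shortest-path traffic passes through each node."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    score = {n: 0.0 for n in nodes}
    for source in nodes:
        # Breadth-first search from source, counting shortest paths (sigma).
        sigma = {n: 0 for n in nodes}
        dist = {n: -1 for n in nodes}
        preds = {n: [] for n in nodes}
        sigma[source], dist[source] = 1, 0
        order, queue = [], deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, []):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting intermediaries.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score

scores = betweenness(DEPENDS_ON)
most_critical = max(scores, key=scores.get)
print(most_critical, scores[most_critical])
```

In this toy graph the gateway scores highest because every path from a front end to a core service runs through it, which is the kind of node worth drawing as a wide river on the map.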
But to make this actionable in incident response, teams need a way to actually see these propagation paths. That’s where visual, analog representations can be surprisingly powerful.
Why Hand‑Drawn “Paper Currents” Work
Digital diagrams tend to be tidy, static, and idealized. Real outages are messy. Paper currents embrace that mess.
Imagine a large sheet of paper where your system is drawn not as boxes and arrows, but as a network of rivers and tributaries:
- Core services (identity, DNS, messaging, databases) are marked as deep rivers
- Downstream applications are smaller streams flowing from these rivers
- External dependencies (SaaS, cloud providers, payment gateways) are inflows from outside the map
- Controls (rate limits, circuit breakers, WAFs, IAM policies) are locks, dams, and levees
Now, when an incident begins, you:
- Mark the first visible symptom where it appeared (e.g., failed login at a mobile app)
- Trace upstream along the paper river: what feeds this service?
- Mark every component that shows correlated anomalies in SIEM data
- Watch how the “flood” spreads downstream and sideways to other services
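The tracing steps above can be sketched in a few lines of Python. This is an illustration only: the dependency map, the service names, and the set of "anomalous" components (standing in for whatever your SIEM queries return) are all hypothetical.

```python
from collections import deque

# Hypothetical map: each service lists the services that feed it.
FED_BY = {
    "mobile-app": ["api-gateway"],
    "api-gateway": ["identity", "billing"],
    "identity": ["dns", "database"],
    "billing": ["database"],
}

def trace_upstream(fed_by, symptom, anomalous):
    """Walk upstream from the first symptom, following only components
    that currently show correlated anomalies."""
    path = [symptom]
    visited = {symptom}
    queue = deque([symptom])
    while queue:
        node = queue.popleft()
        for upstream in fed_by.get(node, []):
            if upstream in anomalous and upstream not in visited:
                visited.add(upstream)
                path.append(upstream)
                queue.append(upstream)
    return path

# Assumed SIEM findings: these components show anomalies right now.
anomalous_now = {"api-gateway", "identity", "dns"}
flood_path = trace_upstream(FED_BY, "mobile-app", anomalous_now)
print(" <- ".join(flood_path))
```

Components that are upstream but healthy (here, billing and the database) drop out of the trace, which mirrors leaving them unmarked on the paper map.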
The value of analog mapping:
- It makes invisible flows visible: how an identity outage propagates to billing, support tools, and admin portals
- It encourages collaboration: multiple team members can literally stand around the map, add notes, and argue about connections
- It surfaces assumptions and unknowns: any dashed line or “we think this depends on X” gets called out
Analog maps do not replace the SIEM. They give structure and narrative to what the SIEM is already telling you.
Borrowing from Complex Systems Education
If this sounds like teaching tools from science classes, that’s intentional. Complex systems educators have long wrestled with how to help students understand flows, feedbacks, and emergent behavior.
Some inspirations worth stealing:
STELLA‑like Flow Models
Tools like STELLA let learners build models using stocks and flows:
- Stocks: amounts of something (e.g., heat in the ocean, CO₂ in the atmosphere)
- Flows: rates of change (e.g., emissions, radiation in/out)
In security and reliability terms, stocks and flows might look like:
- Stocks: authenticated sessions, queued requests, open DB connections
- Flows: login attempts per second, messages per second, connection open/close rates
Thinking in stocks and flows helps teams ask: Where is this incident actually accumulating, and where is it just passing through?
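To see the difference between accumulating and passing through, here is a minimal stock-and-flow sketch in Python. It is not STELLA, just the same bookkeeping: a stock of queued requests changes each step by inflow minus drain capacity. The numbers and function names are assumptions chosen for illustration.

```python
def simulate_stock(initial, inflow, capacity, steps):
    """Track a stock over time: stock += inflow - capacity, never below zero."""
    stock = initial
    history = [stock]
    for t in range(steps):
        stock = max(0, stock + inflow(t) - capacity(t))
        history.append(stock)
    return history

def steady_inflow(t):
    # Assumed: 100 requests arrive every step.
    return 100

def degraded_capacity(t):
    # Assumed: the consumer handles 100 per step, but only 60 during steps 3-5.
    return 60 if 3 <= t <= 5 else 100

queue_depth = simulate_stock(0, steady_inflow, degraded_capacity, 8)
print(queue_depth)  # [0, 0, 0, 0, 40, 80, 120, 120, 120]
```

Note that the backlog stays at 120 after the consumer recovers: the flow is healthy again, but the stock has nowhere to drain because capacity only matches inflow. On the paper map, that is a pool that keeps standing after the rain stops.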
Thermohaline Circulation and Hidden Conveyors
Thermohaline circulation—the global ocean “conveyor belt”—moves heat around the planet in slow, deep currents that are largely invisible from the surface.
Your infrastructure has similar deep currents:
- Background sync jobs
- Replication streams
- Control planes and configuration propagation
A failure may appear first in a “surface” service (a web API) but actually be driven by disruption in a deep current (control-plane configuration stuck in one region). Mapping these as deep underwater rivers on paper helps teams ask: What invisible current might be carrying this failure?
Energy Balance Models and Trade‑Offs
Simple energy balance models show how small changes (e.g., in reflectivity) can shift the overall climate.
Likewise, small tuning changes in:
- Timeouts
- Retry policies
- Rate limits
can dramatically shift system behavior during incidents. On your paper currents map, these become valves and spillways, helping teams reason about: Where can we release pressure, and where are we just pushing the problem downstream?
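A rough back-of-the-envelope calculation shows why retry policies deserve a valve symbol. The Python sketch below assumes immediate retries with no backoff, independent failures, and that each layer retries the whole downstream call. Real systems will differ, so read the numbers as an upper-bound intuition rather than a prediction.

```python
def expected_attempts(failure_rate, max_retries):
    """Average calls sent per request when each attempt fails independently
    with probability failure_rate and is retried up to max_retries times."""
    return sum(failure_rate ** k for k in range(max_retries + 1))

def worst_case_amplification(max_retries, layers):
    """Load multiplier on the deepest service during a hard outage,
    if every layer in the call chain retries the layer below it."""
    return (max_retries + 1) ** layers

print(expected_attempts(0.5, 3))       # 1.875 calls per request at 50% failures
print(expected_attempts(1.0, 3))       # 4.0 calls per request in a full outage
print(worst_case_amplification(3, 3))  # 64x load when three layers each retry 3 times
```

Under these assumptions, three layers with three retries each can turn one user request into 64 calls against the failing service, which is how a local leak becomes a downstream flood.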
Concept Maps: Introducing Systems Thinking to Security Teams
If paper currents are the geography, concept maps are the grammar.
Concept mapping is a simple but powerful technique:
- Write important concepts ("Identity Provider", "API Gateway", "Rate Limit Policy") as nodes
- Connect them with labeled arrows ("depends on", "throttled by", "logs to", "secured by")
Concept maps are ideal for introducing systems thinking in security contexts because they:
- Force clarity on relationships, not just components
- Make controls first‑class citizens (e.g., "WAF mitigates injection attacks", "MFA reduces credential stuffing success")
- Support collaborative mapping during design reviews and post‑incident analysis
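If you later want to digitize a concept map, one simple representation is a list of (concept, label, concept) triples. The Python sketch below is a hypothetical example of that idea; the specific nodes and edges are invented, with labels taken from the examples above.

```python
# Each entry reads as a sentence: subject, labeled arrow, object.
CONCEPT_MAP = [
    ("Mobile App", "depends on", "API Gateway"),
    ("API Gateway", "depends on", "Identity Provider"),
    ("API Gateway", "throttled by", "Rate Limit Policy"),
    ("API Gateway", "logs to", "SIEM"),
    ("Identity Provider", "secured by", "MFA"),
]

# Labels that mark a control rather than a dependency (an assumption).
CONTROL_LABELS = {"throttled by", "secured by"}

def related(concept_map, subject, label):
    """Everything the subject points to with the given arrow label."""
    return [obj for subj, lbl, obj in concept_map if subj == subject and lbl == label]

def controls_on(concept_map, subject):
    """Controls attached to a concept, so they stay first-class on the map."""
    return [(lbl, obj) for subj, lbl, obj in concept_map
            if subj == subject and lbl in CONTROL_LABELS]

print(related(CONCEPT_MAP, "API Gateway", "depends on"))
print(controls_on(CONCEPT_MAP, "API Gateway"))
```

Keeping the label as data, rather than only drawing an arrow, is what lets you ask relationship questions such as "which nodes have no control attached?"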
Combine this with your paper rivers:
- Rivers show where things flow (requests, credentials, messages, failures)
- Concept map labels show how and why they connect (trust, control, dependency, mediation)
Over time, your wall of paper becomes an Outage Story Cabinet of Rivers—a living archive of how past incidents have flowed through your system and how controls shaped their paths.
Blending Analog Maps with Digital SIEM Data
The most powerful approach is hybrid: analog for structure and narrative, digital for detail and precision.
Here’s a practical workflow:
- Before incidents
  - Run dependency workshops using concept maps and river metaphors
  - Identify critical nodes that mediate many flows
  - Tag these critical nodes in your SIEM and monitoring systems
- During incidents
  - Start with the paper map on the wall or whiteboard
  - Highlight the first failing service
  - Query the SIEM for upstream and downstream services and mark them as you go
  - Annotate the map with times and evidence (e.g., "12:03 – auth failures spike", "12:05 – queue lag > 5m")
- After incidents
  - Turn the final annotated map into a case study and keep it in your Outage Story Cabinet
  - Review which critical nodes were involved and whether existing controls worked
  - Feed insights back into:
    - SIEM correlation rules
    - Runbooks and playbooks
    - Architectural decisions and redundancy plans
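The timestamped annotations from the "during incidents" step can be turned into a propagation order with very little code. The Python sketch below assumes annotations are transcribed by hand as (time, node, note) tuples. The node names and the 12:04 and 12:06 entries are made up for illustration; the other two notes echo the examples above.

```python
from datetime import time

# Hand-transcribed sticky notes from the map, in no particular order.
ANNOTATIONS = [
    (time(12, 5), "message-queue", "queue lag > 5m"),
    (time(12, 3), "identity", "auth failures spike"),
    (time(12, 4), "api-gateway", "latency spike"),        # hypothetical
    (time(12, 6), "identity", "auth failures continue"),  # hypothetical
]

def first_seen(annotations):
    """Earliest annotation per node, sorted into the order the incident spread."""
    earliest = {}
    for when, node, note in annotations:
        if node not in earliest or when < earliest[node][0]:
            earliest[node] = (when, note)
    return sorted(earliest.items(), key=lambda item: item[1][0])

propagation = first_seen(ANNOTATIONS)
for node, (when, note) in propagation:
    print(when.strftime("%H:%M"), node, "-", note)
```

The first node in the list is a candidate origin, not a proven root cause: it is simply where evidence showed up earliest, which is exactly what the paper map records.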
Patterns begin to emerge:
- The same two or three rivers repeatedly appear as early mediators of failure
- Particular controls (e.g., aggressive retries) regularly turn local issues into systemic ones
- Certain dashboards are consistently consulted too late in the incident
These insights can be hard to see in an ocean of log lines. They are much easier to grasp when you can literally trace the path of the incident with your finger.
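Once several annotated maps are in the cabinet, spotting the recurring early rivers can be as simple as counting. The Python sketch below assumes each archived incident has been reduced to an ordered list of affected nodes; the incident data is invented for illustration.

```python
from collections import Counter

# Hypothetical archive: each incident as nodes in the order they were affected.
ARCHIVED_INCIDENTS = [
    ["dns", "identity", "api-gateway", "billing"],
    ["identity", "api-gateway", "mobile-app"],
    ["message-queue", "identity", "billing"],
]

def recurring_early_nodes(incidents, first_n=2):
    """Count how often each node appears among the first_n affected nodes."""
    counts = Counter()
    for path in incidents:
        counts.update(path[:first_n])
    return counts

early = recurring_early_nodes(ARCHIVED_INCIDENTS)
print(early.most_common(1))  # [('identity', 3)]
```

A node that keeps showing up early across unrelated incidents is a strong candidate for the hardening, redundancy, and extra monitoring discussed earlier.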
Conclusion: Make Failures Flow on Paper Before They Flood in Production
SIEM systems are indispensable for modern security and operations. But they are only half the story. To manage real systemic risk, teams must understand how incidents move through their infrastructure, not just where they surface.
By:
- Borrowing tools from complex systems education (stocks and flows, hidden currents, energy balance)
- Using concept maps to surface relationships and controls
- Drawing paper currents and rivers to depict flows of traffic, trust, and failure
- Integrating all of this with rich SIEM data
you can transform each outage from chaotic log triage into a narrative of flow that everyone on the team can see and learn from.
Over time, your Analog Outage Story Cabinet of Rivers becomes more than a quirky artifact. It becomes a collective memory of systemic behavior—a guide to where future incidents are likely to flow, and where your next investments in protection, redundancy, and monitoring will matter most.
If your incidents still feel like storms on random dashboards, it may be time to pick up a pen, unroll some paper, and start drawing the rivers your failures really follow.