The Two-Notebook Debugger: Splitting Thinking and Doing to Untangle Hard Bugs Faster

Debugging is where real software engineering happens. Features are fun; bugs are where you find out if you actually understand your system.

But under deadline pressure, most of us debug like this:

Stare at the broken behavior
Make a vague guess
Poke at the code
Run it again
Repeat until either the bug or your willpower gives out

There’s a better way, one that fits both debugging research and how our brains actually work: split “thinking” from “doing” using a simple two-notebook system.

In this post, you’ll learn:

Why debugging is mostly about hypotheses and mental models, not just tools
How cognitive load sabotages your debugging
A practical two-notebook workflow for hard bugs
How this decomposes debugging into learnable subskills

Debugging Is a Thinking Problem, Not Just a Tool Problem

Decades of research on debugging and program comprehension converge on a few themes:

Good debuggers form clear hypotheses.
They don’t just say, “Something is wrong in the auth flow.” They say, “I think token.expiry is being interpreted in seconds in service A and milliseconds in service B.” That idea is specific, testable, and falsifiable.
They build and refine mental models.
As they test hypotheses, they continuously update an internal model of how the code, data, and environment behave. When something doesn’t fit the model, that gap is a clue.
They iterate deliberately.
They run experiments, interpret results, and adjust direction instead of thrashing randomly.
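
To see why the specific version of that hypothesis counts as testable, here is a minimal Python sketch. The function names and the sample timestamp are made up for illustration; the only point is that a seconds-versus-milliseconds guess predicts a concrete symptom you can check.

```python
from datetime import datetime, timezone

def read_as_seconds(raw_expiry: int) -> datetime:
    """Interpret a raw epoch value as seconds (hypothetical service A)."""
    return datetime.fromtimestamp(raw_expiry, tz=timezone.utc)

def read_as_millis(raw_expiry: int) -> datetime:
    """Interpret the same raw value as milliseconds (hypothetical service B)."""
    return datetime.fromtimestamp(raw_expiry / 1000, tz=timezone.utc)

# Illustrative value: an epoch timestamp in seconds that falls in late 2023.
raw = 1_700_000_000

seen_by_a = read_as_seconds(raw)
seen_by_b = read_as_millis(raw)

# If the hypothesis is right, B sees an expiry decades in the past and
# rejects a token that A considers valid. If both services agree, the
# hypothesis is falsified and you can cross it off.
print("A sees expiry:", seen_by_a.isoformat())
print("B sees expiry:", seen_by_b.isoformat())
print("Units disagree:", seen_by_a.year != seen_by_b.year)
```

A vague hypothesis like “something is wrong in the auth flow” gives you nothing comparable to run.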

Tools like debuggers, profilers, and logs are critical—but only as amplifiers of your thinking. If your mental models and hypotheses are vague, more data just means more noise.

The Hidden Enemy: Cognitive Load While Debugging

Hard bugs usually share one trait: they overload your working memory.

You’re trying to juggle:

Several candidate causes
Partial call stacks and data flows
Edge cases and weird inputs
Whispers from your intuition (“Feels like a race condition… or caching… or both?”)
Pressure from the team or deadlines

Research in cognitive psychology is blunt here: your working memory is small and fragile. When it’s overloaded, your reasoning degrades. You:

Forget which hypotheses you already tried
Rerun the same failed experiment
Tunnel-vision on one theory and ignore contradictory evidence

The solution isn’t to “try harder” mentally. It’s to offload thinking into the right external structure, so your brain is free to reason instead of remember.

That’s where the two-notebook debugger comes in.

The Two-Notebook Debugger: Overview

Idea: Physically or digitally separate the space where you think from the space where you do.

Notebook 1: Thinking Notebook
Purpose: Capture hypotheses, mental models, plans, and conclusions.
Notebook 2: Doing Notebook
Purpose: Log concrete actions, experiments, commands, and observations.

They can be real notebooks, two documents in your note app, two sections in a debugging template, or even two panes in your knowledge base. The key is conceptual separation.

Why this works:

It structures your debugging as an experiment loop instead of chaos
It reduces cognitive load by offloading memory to external notes
It forces clarity: “What do I actually think is happening?”
It creates a learning trail you can revisit and improve

Let’s walk through how to use each notebook.

Notebook 1: The Thinking Notebook

This is where you plan, model, and interpret. Nothing here is about implementation details like exact commands; it’s about what’s going on conceptually.

A simple template:

1. Problem Statement

Write a clear description of the bug.

Observed behavior: What is actually happening?
Expected behavior: What should happen instead?
Minimal reproduction (if known): Inputs, steps, environment.

This prevents the “bugs that shift under your feet” problem where your understanding of the bug changes mid-debug.

2. Environment & Context

Capture constraints that might matter:

Branch/commit
Services/versions involved
Relevant configs or feature flags

This gives you a snapshot of the world the bug lives in.

3. Current Mental Model

Describe how you think this part of the system works, even if you’re unsure.

For example:

“Request hits Gateway → AuthService → UserService. Auth tokens are validated in AuthService and attached to the request context. UserService trusts that context and doesn’t revalidate.”

You’re not documenting the whole system—just the slice relevant to the bug.

4. Hypotheses List

This is the core.

List each hypothesis as a numbered, testable statement:

H1: “The auth token is missing on requests coming from the mobile client due to an outdated SDK.”
H2: “The token is present but expired, and the expiry check is off by a timezone offset.”
H3: “The token is valid, but UserService sometimes bypasses auth for cached responses.”

For each hypothesis, add:

Confidence (e.g., 20%, 50%)
Test plan: “Add logging in AuthService to print token + expiry and confirm for mobile vs web.”

You’re turning fuzzy intuition into explicit bets and specific experiments.
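
If you keep your notes digitally, the hypotheses list could be sketched as a small data structure like the one below. This is only one possible shape: the class name, field names, confidence numbers, and the test plans for H1 and H3 are all illustrative assumptions, not part of any standard tool.

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    label: str          # e.g. "H1"
    statement: str      # a specific, testable claim
    confidence: float   # your current bet, from 0.0 to 1.0
    test_plan: str      # the experiment that could falsify it

# Confidence values and some test plans are invented for this example.
hypotheses = [
    Hypothesis("H1", "Token missing on mobile requests (outdated SDK)", 0.30,
               "Log whether the token is present, split by mobile vs web"),
    Hypothesis("H2", "Token present but expiry check is off by a timezone offset", 0.40,
               "Log token expiry and current time in AuthService"),
    Hypothesis("H3", "UserService bypasses auth for cached responses", 0.20,
               "Retry the failing request with the cache disabled"),
]

def by_confidence(items):
    """Return hypotheses ordered from strongest bet to weakest."""
    return sorted(items, key=lambda h: h.confidence, reverse=True)

for h in by_confidence(hypotheses):
    print(f"{h.label} ({h.confidence:.0%}): {h.statement}")
    print(f"    test: {h.test_plan}")
```

Writing the number down matters more than getting it right; it gives you something to revise after each experiment.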

5. Results & Model Updates

After each experiment, return here and write:

Which hypothesis you tested
What you expected to see
What you actually observed
How your mental model changes

This is where you cultivate the habit of learning from false hypotheses instead of just discarding them.
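
One way to make that habit concrete is to record each result in a fixed shape, so the “expected” field has to be filled in too. The sketch below assumes hypothetical `Result` and `ThinkingNotebook` classes; the names, fields, and the wording of the sample entry are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Result:
    hypothesis: str        # which hypothesis was tested, e.g. "H2"
    expected: str          # what you predicted before running the test
    observed: str          # what actually happened
    model_update: str      # how your mental model changes
    new_confidence: float  # your revised bet, from 0.0 to 1.0

@dataclass
class ThinkingNotebook:
    confidence: dict = field(default_factory=dict)  # label -> current bet
    results: list = field(default_factory=list)     # history of Result entries

    def record(self, result: Result) -> float:
        """Store the result and return how much the confidence moved."""
        old = self.confidence.get(result.hypothesis, 0.0)
        self.confidence[result.hypothesis] = result.new_confidence
        self.results.append(result)
        return result.new_confidence - old

# Sample entry; the starting confidence and wording are made up.
notebook = ThinkingNotebook(confidence={"H2": 0.40})
shift = notebook.record(Result(
    hypothesis="H2",
    expected="Expiry lands in the past because of a timezone offset",
    observed="Expiry is 5 minutes in the future; current time is correct",
    model_update="Timezone handling in AuthService looks fine; look elsewhere",
    new_confidence=0.05,
))
print(f"H2 confidence moved by {shift:+.0%}")
```

Keeping the full history, rather than overwriting the old confidence, is what lets you look back later and see which false hypotheses taught you something.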

Notebook 2: The Doing Notebook

This is your lab log—a chronological record of what you actually did.

Each entry might include:

Timestamp
Hypothesis reference (e.g., “Testing H2”)
Commands run, endpoints hit, data inspected
Key snippets of logs or outputs

For example:

15:42 — Testing H2 (expiry timezone)

Added temporary log in AuthService: log.info("expiry={}, now={}", token.expiry, Instant.now())

Reproduced via mobile client

Observed: expiry is 5 minutes in the future; now is in the correct timezone

Conclusion: expiry timezone is not the issue → lower H2 confidence from 40% to 5%

Why keep this separate?

It prevents your Thinking Notebook from becoming a noisy transcript
It lets you reconstruct the sequence afterward for postmortems or docs
It helps you avoid repeating the same failed experiment

You don’t have to be verbose—just enough detail to re-run or understand later.
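
If your Doing Notebook lives in code or a script-friendly format, a lab log with a “have I already tried this?” check could look roughly like this. The `Entry` and `DoingLog` classes are hypothetical, and the date in the sample entry is invented so the output is repeatable.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Entry:
    timestamp: datetime
    hypothesis: str   # reference into the Thinking Notebook, e.g. "H2"
    action: str       # command run, endpoint hit, or data inspected
    observation: str  # key snippet of output, kept short

@dataclass
class DoingLog:
    entries: list = field(default_factory=list)

    def add(self, hypothesis, action, observation, when=None):
        """Append one entry, stamping it with the current time by default."""
        stamp = when if when is not None else datetime.now()
        entry = Entry(stamp, hypothesis, action, observation)
        self.entries.append(entry)
        return entry

    def already_tried(self, action):
        """Return earlier entries with the same action (case and outer spaces ignored)."""
        key = action.strip().lower()
        return [e for e in self.entries if e.action.strip().lower() == key]

    def render(self):
        """Format the log as plain chronological text."""
        return "\n".join(
            f"{e.timestamp:%H:%M} | Testing {e.hypothesis} | {e.action} | {e.observation}"
            for e in self.entries
        )

log = DoingLog()
log.add("H2", "Reproduce via mobile client with expiry logging",
        "expiry 5 minutes in the future",
        when=datetime(2024, 1, 1, 15, 42))  # invented date for the example

print(log.render())
print("Tried before:", bool(log.already_tried("reproduce via mobile client with expiry logging")))
```

Exact-match comparison is deliberately crude here; even a rough check is enough to catch the “didn’t I already run this?” moment.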

How the Two-Notebook System Decomposes Debugging Skills

Another benefit of this approach is that it breaks debugging into smaller, trainable subskills:

Hypothesis Generation (Thinking Notebook)
Practice: force yourself to write three plausible causes before testing any. This counters premature fixation on the first theory.
Experiment Design (Thinking Notebook)
Practice: for each hypothesis, design the cheapest, fastest test that can falsify it. Not “rewrite subsystem X,” but “log Y and see if it ever becomes null.”
Experiment Execution (Doing Notebook)
Practice: run the designed test with minimal “just this extra tweak” scope creep.
Result Interpretation & Model Updating (Thinking Notebook)
Practice: always write a one-line summary: “This experiment increased my belief in H3 because…”
Search Strategy (Both)
Over multiple bugs, your notes reveal patterns in how you explore the space of causes. You can refine that strategy deliberately.

By naming and separating these subskills, you make debugging something you can deliberately practice, not just “hope to get better at with experience.”
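
Two of the practice rules above, writing three causes before testing and preferring the cheapest test, are mechanical enough to sketch in code. Everything here is an assumption for illustration: the `Candidate` class, the cost estimates, and especially the confidence-per-minute ratio, which is just one simple heuristic for choosing what to test next.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    label: str
    confidence: float    # current bet, from 0.0 to 1.0
    test: str            # the experiment that could falsify it
    cost_minutes: float  # rough estimate of the test's duration, must be > 0

MIN_HYPOTHESES = 3  # the "write three plausible causes first" rule

def ready_to_test(candidates):
    """True once enough alternatives are written down."""
    return len(candidates) >= MIN_HYPOTHESES

def next_experiment(candidates):
    """Pick the test with the most belief at stake per minute spent.

    The ratio is an illustrative heuristic, not a rule from this post.
    """
    if not ready_to_test(candidates):
        raise ValueError(f"Write at least {MIN_HYPOTHESES} hypotheses before testing")
    return max(candidates, key=lambda c: c.confidence / c.cost_minutes)

# Confidence and cost numbers are invented for the example.
candidates = [
    Candidate("H1", 0.30, "Check SDK version on failing mobile requests", 30),
    Candidate("H2", 0.40, "Log expiry and current time in AuthService", 10),
    Candidate("H3", 0.20, "Retry failing request with cache disabled", 60),
]

choice = next_experiment(candidates)
print(f"Next: {choice.label} via '{choice.test}'")
```

The guard in `next_experiment` is the useful part: it turns “don’t fixate on your first theory” from advice into a step you can’t skip.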

Choosing Tools for Your Two Notebooks

You don’t need anything fancy, but a bit of structure helps.

Possible setups:

Paper + Digital: Paper for Thinking (easy sketching, freeform), text file or scratchpad in your editor for Doing.
Two Files in Your Repo: bug-123-thinking.md and bug-123-doing.md in a /debug-notes folder.
Note App: One note with two headings: # Thinking and # Doing.
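
For the two-files setup, a tiny helper can create both notes with the section names from this post already filled in. This is a sketch under assumptions: the `create_notebooks` function, the template text, and the relative `debug-notes` path are hypothetical choices, so adapt them to your repo.

```python
from pathlib import Path

THINKING_TEMPLATE = """\
Thinking notes for {bug_id}

1. Problem Statement
2. Environment & Context
3. Current Mental Model
4. Hypotheses List
5. Results & Model Updates
"""

DOING_TEMPLATE = """\
Doing notes for {bug_id}

(timestamp, hypothesis reference, commands run, key output)
"""

def create_notebooks(bug_id, notes_dir="debug-notes"):
    """Create both note files if they do not exist yet and return their paths."""
    folder = Path(notes_dir)
    folder.mkdir(parents=True, exist_ok=True)
    thinking = folder / f"{bug_id}-thinking.md"
    doing = folder / f"{bug_id}-doing.md"
    for path, template in ((thinking, THINKING_TEMPLATE), (doing, DOING_TEMPLATE)):
        if not path.exists():  # never overwrite notes that are already there
            path.write_text(template.format(bug_id=bug_id), encoding="utf-8")
    return thinking, doing
```

Calling `create_notebooks("bug-123")` from the repo root would give you `debug-notes/bug-123-thinking.md` and `debug-notes/bug-123-doing.md`. Running it again leaves existing notes untouched.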

Look for tools that make it easy to:

Add timestamps
Link to code, logs, and PRs
Search past bugs by keyword or component

The more frictionless the notes, the more likely you’ll actually keep them up to date.
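
If your notes are plain files, searching past bugs by keyword needs very little tooling. The sketch below assumes notes stored as `.md` files in a single folder; the `search_notes` name and the default folder are hypothetical.

```python
from pathlib import Path

def search_notes(keyword, notes_dir="debug-notes"):
    """Return (file name, line number, line) for each case-insensitive match."""
    needle = keyword.lower()
    hits = []
    for path in sorted(Path(notes_dir).glob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits
```

For example, `search_notes("AuthService")` would list every line in your past debugging notes that mentions that component, which is a quick way to check whether a similar bug has bitten you before.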

Teaching and Coaching Debugging with Two Notebooks

If you mentor junior engineers or teach programming, the two-notebook system also gives you a structure for feedback.

Review their Thinking Notebook: Are the hypotheses specific? Are they updating their model? Or just flailing?
Review their Doing Notebook: Are they running huge, slow experiments instead of targeted ones? Repeating steps?

Because debugging is so cognitively demanding, splitting thinking and doing reduces overload for learners. They don’t have to hold the whole bug in their head; the notes carry part of the weight.

You can even model your own process live: share your screen and narrate while you fill in both notebooks.

Conclusion: Slowing Down to Debug Faster

The two-notebook debugger feels slower at first. You stop to write instead of jumping straight into the code.

But in practice, it:

Cuts down on random thrashing
Makes your reasoning visible and reviewable
Reduces cognitive load, so you think more clearly
Turns debugging into a set of skills you can improve

Next time you hit a gnarly bug, resist the urge to just start poking at it. Open two notebooks—one for thinking, one for doing—and let your notes carry the mental load while you do the real work: building and refining a correct mental model of your system.

You’ll ship fixes faster. More importantly, you’ll understand your code better every time a bug forces you to look closely—which is the most valuable outcome of debugging in the long run.