Root Cause Is a Fallacy
- Categories: Systems
- Sources: How Complex Systems Fail
Catastrophe in a complex system has no single root cause. It arises when multiple contributing conditions combine, none of them sufficient on its own. Naming one "root cause" is a choice driven by the need for closure, not by the structure of the failure.
Why It Matters
Stopping at a root cause, usually "human error," prevents learning, because it ignores the many other conditions that had to align. Durable fixes address the combination and the structure, not a scapegoat.
Signals
- Incident reviews that end at "operator error" or one bad commit.
- A tidy single cause for a messy event.
- The same class of incident recurring after the "root cause" was fixed.
Benefits
Deeper learning, fixes that target the system rather than a person, and fewer recurrences.
Risks
The comfort of a single cause stops investigation early; blaming the sharp-end operator hides the blunt-end conditions; and "five whys" is often pursued as if a single causal chain exists, when several conditions actually converge.
Tensions
Organizations need actionable conclusions and accountability, which pull toward a single cause, while honest analysis resists one.
Examples
An outage blamed on the engineer who ran a command, ignoring the missing safeguard, the misleading interface, and the schedule pressure that all enabled it.
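The outage example can be sketched as a toy model to show why singling out one cause is arbitrary. The condition names and the functions `outage_occurs`, `necessary_conditions`, and `sufficient_alone` are hypothetical, and the rule that the outage requires every condition at once is a simplifying assumption; real incidents rarely reduce to a clean conjunction.

```python
# Hypothetical contributing conditions, named after the example above.
CONDITIONS = frozenset({
    "engineer_ran_command",
    "missing_safeguard",
    "misleading_interface",
    "schedule_pressure",
})


def outage_occurs(present):
    """Toy assumption: the outage happens only if every condition is present."""
    return CONDITIONS <= set(present)


def necessary_conditions(present):
    """Conditions whose removal alone would have prevented the outage."""
    present = set(present)
    if not outage_occurs(present):
        return set()
    return {c for c in present if not outage_occurs(present - {c})}


def sufficient_alone(conditions):
    """Conditions that would cause the outage with nothing else present."""
    return {c for c in conditions if outage_occurs({c})}


if __name__ == "__main__":
    print("necessary:", sorted(necessary_conditions(CONDITIONS)))
    print("sufficient alone:", sorted(sufficient_alone(CONDITIONS)))
```

Under this assumption every condition is equally necessary and none is sufficient alone, so the model gives no reason to label the engineer's command "the" root cause over the missing safeguard or the schedule pressure.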