Kübler-Ross Model

The Five Stages of Deploying to Production
Phenomenon · First observed 1969 (Elisabeth Kübler-Ross, On Death and Dying); in software, every day since (On Deployment and Debugging) · Severity: Universal

The Kübler-Ross Model is a framework originally developed by psychiatrist Elisabeth Kübler-Ross in 1969 to describe the five stages of grief experienced by the terminally ill. It has since been adopted — without modification, because none was needed — by the software industry to describe the five stages experienced by a developer who has just been paged at 3 AM, or who has discovered that the production database was the staging database all along, or who has been told that the team is migrating to a new framework.

The five stages are: Denial, Anger, Bargaining, Depression, and Acceptance. They are experienced in order, out of order, simultaneously, repeatedly, and — in the case of legacy system maintainers — permanently frozen at stage one.

“Kübler-Ross described the stages of grief. She did not know she was also describing the stages of a deployment pipeline. She did not need to know. The pipeline knows.”
The Lizard, who has watched developers cycle through all five stages in a single standup

Stage 1: Denial

“It works on my machine.”

This is the opening statement of denial. It is the first thing a developer says when confronted with evidence that the software is broken, and it is always true, and it is always irrelevant, and it is said with such conviction that for a brief, beautiful moment everyone in the room believes that the problem is somewhere else — the network, the load balancer, the DNS, the phase of the moon, anything except the code that the developer wrote and tested and deployed with confidence.

Denial manifests in several documented forms:

The Refresh: “Let me just refresh the page.” The developer refreshes the page. The error persists. The developer refreshes the page again, as though HTTP were a slot machine and persistence might change the outcome.

The Cache: “It’s probably cached.” This is the developer’s universal solvent. Every unexplained behaviour is attributed to caching, because caching is invisible, ubiquitous, and — critically — someone else’s responsibility.

The Alert Dismissal: The monitoring dashboard turns red. The developer looks at the dashboard, looks away, and says “that’s probably just a flaky test” or “PagerDuty has been weird lately” or “it’s probably nothing,” which in software is the equivalent of hearing a noise in a horror film and investigating alone.

Denial can last anywhere from thirty seconds (experienced developers) to three years (legacy system maintainers who have decided that the memory leak is a feature and the crash is scheduled maintenance).

Stage 2: Anger

“Who wrote this?”

git blame is the tool of Stage 2. The developer, having accepted that something is broken, now seeks to determine who broke it. The git blame output scrolls. The developer’s eyes scan the commit history with the focused intensity of a detective reading a crime scene.

The commit log shows a name. The developer reads the name. The developer reads it again.

It is their name.

This is the specific anger of Stage 2: the rage of discovering that the person who caused the problem is you, three weeks ago, at 11 PM, in a commit whose message reads “fix stuff” or “temp workaround” or — most damningly — “this should never break.”
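For readers who would like to reach the end of Stage 2 faster, the search can be automated. The following is a minimal sketch, not an established tool: it assumes the text was produced by `git blame --line-porcelain`, which emits an `author <name>` header for every blamed line, and the helper names `blame_authors` and `prime_suspect` are hypothetical.

```python
from collections import Counter
from typing import Optional


def blame_authors(porcelain: str) -> Counter:
    """Count blamed lines per author in `git blame --line-porcelain` output."""
    counts: Counter = Counter()
    for line in porcelain.splitlines():
        # Each blamed line carries an "author <name>" header. File content
        # lines are prefixed with a tab, so they can never match this check,
        # and "author-mail"/"author-time" lack the trailing space.
        if line.startswith("author "):
            counts[line[len("author "):]] += 1
    return counts


def prime_suspect(porcelain: str) -> Optional[str]:
    """Return the author responsible for the most lines, or None if empty."""
    counts = blame_authors(porcelain)
    return counts.most_common(1)[0][0] if counts else None
```

Feeding it the output of `git blame --line-porcelain <file>` returns the name the developer is about to read twice.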

“I once watched a developer spend forty-five minutes in git blame, getting increasingly angry, following the trail of bad decisions across six months of commits, each one authored by the same person, until the developer said — and I am quoting exactly — ‘WHO IS THIS IDIOT’ and then realised the idiot was them, six months younger and apparently six months stupider.”
The Caffeinated Squirrel, who has authored its own share of “fix stuff” commits

Stage 3: Bargaining

“If I just revert this one commit, everything will be fine.”

Bargaining is the stage where the developer believes that the problem has a simple cause and a simple fix. One commit. One config change. One environment variable. If they can just find the one thing that changed, they can undo it, and the world will return to the state it was in before the world broke.

This is sometimes true. When it is true, the developer does not pass through stages 4 and 5. They revert the commit, the tests go green, and they spend the rest of the day with the quiet satisfaction of someone who narrowly avoided a disaster.

This is usually not true.

The bargaining escalates:

“If I revert these three commits…”

“If I restart the service…”

“If I restart all the services…”

“If I drop the cache and restart all the services and rotate the credentials and restart the database…”

Each bargain is larger than the last. Each one represents a widening circle of desperation. The developer is no longer fixing the problem. The developer is negotiating with the problem, offering increasingly expensive concessions in exchange for a green dashboard.

The final form of bargaining is the Hail Mary deploy — a new release pushed to production not because the developer has fixed the bug but because the developer has changed enough things that the bug might not manifest in exactly the same way, which is not the same as fixing it but which, at 4 AM, feels close enough.

Stage 4: Depression

The Hail Mary deploy did not work.

Stage 4 is quiet. The developer has stopped typing. The developer is staring at the screen. The screen shows the same error it showed in Stage 1, which means that every action taken in Stages 2 and 3 — the blaming, the reverting, the restarting, the Hail Mary — has produced exactly zero progress. The developer is in the same place they started, but now it is ninety minutes later and the coffee is cold and the sandwich is untouched and the Slack channel has 200 unread messages, all of which say “any update?”

Depression in software is not sadness. It is the absence of hypothesis. Every idea has been tried. Every theory has been tested. The developer does not know what is wrong, does not know how to find out what is wrong, and has begun to suspect that the code is correct and reality is broken.

The head goes on the desk.

This stage lasts between five minutes and the rest of the developer’s career, depending on whether the developer has a senior engineer nearby who can say the five most powerful words in software: “have you checked the logs?”

Stage 5: Acceptance

The developer checks the logs.

The log shows the error. The error is clear. The error was always clear. The error was in the logs the entire time, from the very first moment of Stage 1, sitting quietly in a file that nobody read because reading logs is the first thing you learn and the last thing you do.
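What checking the logs first might look like, as a hedged sketch: the severity keywords (`ERROR`, `FATAL`, `CRITICAL`) and the function names are assumptions for illustration, and a real service's log format may well differ.

```python
import re
from pathlib import Path

# Assumed severity markers; adjust to whatever the service actually logs.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL|CRITICAL)\b")


def first_errors(lines, limit=5):
    """Return up to `limit` (line_number, text) pairs that look like errors."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
            if len(hits) == limit:
                break
    return hits


def check_logs_first(path, limit=5):
    """Scan a log file from the top and report the earliest error lines."""
    # errors="replace" keeps the scan going if the log contains stray bytes.
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return first_errors(handle, limit)
```

Scanning from the top is a deliberate choice here: the earliest error is usually the cause, and everything after it is usually the consequence.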

The fix takes four minutes.

The developer pushes the fix. The dashboard turns green. The Slack channel receives a message: “Resolved. Root cause: [two sentences]. Fix: [one sentence]. Time to resolve: 94 minutes. Time the fix actually took: 4 minutes.”

The developer opens a new document. Types “Post-mortem” at the top. Begins writing.

This is Acceptance. Not the acceptance that the software is broken — the software is fixed now. The acceptance that the developer is the kind of person who will spend ninety minutes in denial, anger, bargaining, and depression before checking the logs. The acceptance that this will happen again. The acceptance that the post-mortem will recommend “improve observability” and “check logs first” and that next time, at 3 AM, the developer will not check the logs first, because the Kübler-Ross Model is not a description of a process. It is a description of a person. And the person has not changed.

“The five stages are not sequential. They are not linear. They are a loop. The post-mortem closes the loop and opens the next one. The only developers who escape the loop are the ones who quit. The ones who stay learn to cycle faster.”
The Lizard, who does not cycle because the Lizard checks the logs first, always, because the Lizard is not a person and therefore not subject to the model

The Senior Engineer Exception

Senior engineers do not experience five stages. Senior engineers experience two:

  1. Denial (0.5 seconds): “Hm.”
  2. Acceptance (immediate): Opens logs.

The intervening stages — Anger, Bargaining, Depression — have been compressed into a single, subconscious process that occurs between the “H” and the “m” of “Hm.” The senior engineer has cycled through the Kübler-Ross Model so many times that the stages have been JIT-compiled into a single instruction.

This is sometimes mistaken for calm. It is not calm. It is the scar tissue of a thousand 3 AM pages, compressed into an efficiency that looks like wisdom but is actually fatigue wearing a cardigan.

The Organisational Variant

The Kübler-Ross Model applies not only to individuals but to organisations.

Denial: “Our architecture is fine.” (The architecture has not been fine since 2019.)

Anger: “Who approved this migration?” (Everyone approved it. The Confluence page has forty-seven thumbs-up reactions.)

Bargaining: “If we just add a caching layer…” (The caching layer will become the sixth caching layer.)

Depression: The quarterly engineering review, in which a slide deck demonstrates that velocity has decreased by 40% and everyone nods because they already knew and the slide deck is a ritual, not an investigation.

Acceptance: The rewrite. Which begins the cycle anew, because the rewrite will eventually become the legacy system that triggers Stage 1 in the next generation of developers.

Measured Characteristics

Stages:                                             5
Stages experienced in order:                        rarely
Stages experienced simultaneously:                  frequently
Time in Denial (junior):                            3 hours
Time in Denial (senior):                            0.5 seconds
Time in Denial (legacy maintainer):                 3 years (ongoing)
git blame searches that revealed own name:          100% (eventual)
Commit messages reading "fix stuff":                too many
Commit messages reading "this should never break":  all of them broke
Hail Mary deploys that worked:                      some (terrifyingly)
Minutes to resolution:                              94 (typical)
Minutes the fix actually took:                      4
Minutes spent not checking the logs:                90
Post-mortems recommending "check logs first":       all of them
Developers who check logs first next time:          0
The loop:                                           infinite
Senior engineer stages:                             2
Words in the senior engineer's denial:              1 ("Hm")
Caching layers proposed during Bargaining:          always one more
The Lizard's stage count:                           0 (checks logs first)
