Legacy Code

Legacy Code is working software that has been declared unworthy of respect by people who did not write it, do not understand it, and cannot explain what it does — but who are nonetheless certain it should be replaced, preferably with something written in whatever language was on the cover of a technology magazine in the last eighteen months.

The word “legacy” is unique in software engineering for being simultaneously a compliment and an insult, depending on context. In every other human endeavour — architecture, law, literature, family — a legacy is something valuable left behind by those who came before. In software, a legacy is something shameful left behind by those who came before, and the shame attaches not to the people who cannot understand it but to the code itself, which committed the unforgivable crime of being written in the past.

“It was legacy.”
“Did it work?”
“It was legacy.”
“Did it work?”
— The Consultant, asking the question that ends all architecture reviews, Interlude — The Blazer Years

The Definition

Legacy code has been defined in several ways, each revealing more about the definer than the defined:

Michael Feathers (Working Effectively with Legacy Code, 2004): “Code without tests.” This is precise, actionable, and wrong in the way that most precise, actionable definitions are wrong — it captures one dimension of the problem and ignores every other. Code with 100% test coverage that nobody understands is still legacy. Code with zero tests that one developer wrote last week and fully understands is not legacy. The tests are not the issue. The understanding is the issue.

The Industry: “Code that is old.” This is the definition in practice. Code written three years ago is mature. Code written five years ago is legacy. Code written ten years ago is a liability. Code written twenty years ago is infrastructure. The code has not changed. The calendar has.

The Honest Definition: Code written by someone else, or by you more than two years ago, in a style or language that is no longer the one you would choose today. The code works. The code has been working. The code will continue working. But the code is unfashionable, and in software engineering, unfashionable is the most damning adjective available.

The Lizard’s Definition: Code that has survived.

The Legacy Paradox

The longer code runs without failure, the more likely it is to be called legacy. This is the paradox: survival is punished.

A Java application from 2009 has been processing transactions for seventeen years. It has survived OS upgrades, JVM migrations, framework fashions, three CTOs, two reorgs, and the entire React era. It starts in four seconds. It serves twelve million requests per day. Its error rate is 0.001%. It has never been featured in a conference talk.

It is legacy.

A React application from 2023 has been rewritten twice, migrated from Create React App to Vite to Next.js, upgraded through four major versions, and requires a build pipeline with eleven dependencies. It serves three hundred thousand requests per day. Its error rate is 0.3%. It has been featured in seven conference talks.

It is modern.

The Java application processes forty times the traffic at three hundred times the reliability. The React application has a blog post. The industry considers the React application superior, because the industry measures code not by what it does but by when it was written and which Twitter accounts endorse it.

"‘Legacy’ means: old enough to be blamed, too functional to be deleted."
— Technical Debt

The Archaeology of Understanding

Legacy code is not a technical problem. It is an epistemological problem — a problem of knowledge, specifically the knowledge that walked out the door when the original developers left.

The code itself is a perfect record of what was decided. Every if statement is a decision. Every edge case handler is a lesson learned in production. Every comment that says // DO NOT REMOVE — see incident #4732 is a scar from a battle that was fought and won by someone who is no longer available to explain what happened.

The new developer reads the code. The new developer does not understand it. The new developer has two options:

Understand it. Read the code. Trace the logic. Find the tests (if they exist). Talk to the users. Discover why every decision was made. This takes weeks. This is boring. This does not produce a conference talk.
Replace it. Declare the code legacy. Propose Rewrite. Build a new system that handles all the cases the new developer can imagine, which is a subset of the cases the old system handles, because the old system handles cases that can only be imagined after encountering them in production.

Option 2 is chosen approximately 90% of the time. Not because developers are lazy — they are not. Because understanding someone else’s code is one of the hardest things in software engineering, and replacing it feels like progress, and progress feels better than study.

“The code is the documentation. The code is COBOL. The COBOL stays.”
— COBOL

The Consultant’s Question

The single most powerful question in the presence of legacy code is the one The Consultant asked during the Blazer Years:

“What worked before this?”

This question is devastating because it inverts the narrative. The standard narrative is: the old system is broken, we need a new one. The consultant’s question asks: was the old system actually broken, or did it merely become unfashionable?

A financial services company replaced a working monolith with forty-seven microservices. The monolith was “legacy” — it was written in Java, it was five years old, it had 47-millisecond response times, and it ran on one server at £400 per month.

The forty-seven microservices had 2.3-second response times, 17% error rates, and cost £47,000 per month.

The monolith was legacy. The microservices were modern. The monolith worked. The microservices did not.

“Your monolith worked. Your forty-seven microservices don’t. That’s not theory. That’s your Grafana dashboard.”
— The Consultant, Interlude — The Blazer Years

The consultant’s recommendation was the one that nobody wanted to hear: go back to the monolith. Call it something else if you must. Call it a “consolidated service.” Call it a “unified platform.” Call it anything except “legacy,” because the word “legacy” makes the code unsaveable, and the code is not the thing that needs saving — the organisation’s relationship with working software is the thing that needs saving.

The COBOL Testament

COBOL is legacy code elevated to civilisational infrastructure.

Ninety-five percent of ATM transactions. Eighty percent of in-person transactions. Three trillion dollars per day. Written in a language that no university teaches, maintained by developers whose average age is over sixty, running on mainframes that have been continuously operational for longer than most startups have existed.

COBOL is the ultimate test of the “legacy” label. If legacy means “old,” then COBOL is legacy. If legacy means “should be replaced,” then replacing COBOL would cost more than the GDP of most countries, would require rewriting sixty-five years of business logic that nobody documented because the code was the documentation, and would produce — if it succeeded, which it wouldn’t — a system that handles fewer edge cases than the one it replaced.

Every decade, someone announces a project to replace COBOL. The project follows a predictable arc: announcement, discovery (there are 220 billion lines), panic, partial execution, cancellation. The COBOL stays. It always stays. Not because it is good. Not because it is beautiful. Because it works, and working is the only quality that matters at three trillion dollars per day.

The developers who wrote COBOL in 1977 did not know they were writing legacy code. They were writing code. The code worked. It kept working. It worked for so long that it became old, and old became legacy, and legacy became a problem — not because the code changed, but because the industry’s tolerance for age decreased until “working for forty years” was reclassified from “remarkable achievement” to “technical risk.”

The SAP Permanence

SAP represents legacy code’s final form: code so deeply embedded in organisational structure that replacing it would require replacing the organisation.

SAP implementations take 2-7 years. SAP stabilisation takes 3 years. SAP migration to S/4HANA takes 4 years. Some organisations are migrating from systems they only finished implementing five years ago. The cycle is eternal: implement, stabilise, migrate, stabilise, new version. The code is perpetually legacy and perpetually current. The consultants are perpetually billing.

SAP is not legacy in the way that a Java application from 2009 is legacy. SAP is legacy the way that English is legacy — technically replaceable, practically permanent, and deeply weird if you examine it closely, but so thoroughly embedded in everything that replacing it would require a civilisational reset.

Forty thousand configuration parameters. Every parameter interacts with other parameters. The configuration space is effectively infinite. Nobody understands the whole system. Everybody depends on the whole system. This is legacy code’s endpoint: code that outlives not just its authors but its comprehensibility.

The Feathers Method

Michael Feathers, having defined legacy code as “code without tests,” also provided the antidote: don’t rewrite it. Characterise it.

The method:

Identify the change point — the specific place where you need to modify the code.
Find the seams — places where you can alter behaviour without modifying the code itself.
Write characterisation tests — tests that document what the code actually does, not what you think it should do.
Make the change — small, safe, under the protection of the characterisation tests.
Repeat — each change makes the code slightly more understood, slightly more tested, slightly less “legacy.”

This is Refactoring applied to code you don’t understand. It is slow. It is unglamorous. It does not produce a conference talk. It produces something far more valuable: understanding. And understanding is the thing that makes code stop being legacy — not rewriting it in a new language, not migrating it to a new framework, not replacing it with microservices.

Understanding it.

The code was always understandable. The effort was always available. The choice to replace rather than understand was never a technical decision — it was a preference for the excitement of building over the discipline of reading.

Working Effectively with Legacy Developers

A less-discussed variant of the legacy problem is the legacy developer — the developer who understands the legacy code, maintains it, and is routinely undervalued because their expertise is in a system that the organisation has declared unworthy.

The legacy developer knows why the if statement on line 847 checks for a negative date. They know because they were on-call the night the negative date appeared, caused by a timezone conversion in a system three hops upstream, and the if statement was the fix that kept the system running while the upstream team spent six months not fixing the root cause.

The legacy developer is the most valuable person on the team and the least likely to be promoted, because promotions go to people who build new things, not to people who keep old things running. The incentive structure rewards replacement over maintenance, which produces organisations that are very good at starting rewrites and very bad at finishing them, staffed by developers who are very good at greenfield projects and very bad at understanding existing code.

The legacy developer eventually leaves. The knowledge leaves with them. The code becomes slightly more legacy. The cycle continues.

The Lizard’s Position

The Lizard does not have legacy code. The Lizard has code — some of it written yesterday, some of it written six months ago, all of it understood by the one developer who wrote it, who maintains it, who deploys it with scp, and who will never call it legacy because the Lizard does not believe in fashion.

The Lizard believes in function. Does it work? Does it serve the user? Does it run in 47 milliseconds? Then the code is current, regardless of when it was written, regardless of which language it uses, regardless of which conference talk endorses a different approach.

“THE CODE THAT RUNS
IS NOT OLD
THE CODE THAT RUNS
IS ALIVE

THE CODE THAT IS REPLACED
BECAUSE IT IS OLD
DIES YOUNG
AT THE HANDS OF THOSE
WHO COULD NOT READ IT”
— The Lizard

The Litmus Test

The practitioner who encounters code labelled “legacy” is advised to ask:

Does it work? If yes, proceed with caution. Working code has earned the right to be understood before it is replaced.
Have you read it? Not glanced at it. Not skimmed the class names. Read it. Traced the logic. Understood the decisions. If you have not read it, you do not know whether it is legacy — you know only that it is unread, and unread is not the same as unredeemable.
Do you know why every decision was made? If not, the code is not the problem. The documentation is the problem. The solution is documentation, not replacement.
Would the rewrite handle the same edge cases? If you cannot list the edge cases, you cannot rewrite the system. You can only build a new system that will spend its first two years rediscovering the edge cases that the old system handles, at which point the new system will become the old system, and someone will call it legacy.

The word “legacy” in software engineering is not a technical assessment. It is a social judgment — the community declaring that code has aged past acceptability, the way fashion declares that a perfectly functional garment has aged past wearability.

In every other field, a legacy is the best thing you can leave behind.

In software, it is the worst thing you can leave behind.

The code does not know the difference. The code just runs.

Type	Phenomenon
First Observed	The moment after the second program was written and the first became old
Severity	Existential (to the code's reputation, not to its functionality)
Natural Predator	The new architect who doesn't understand it
Tags	principles architecture
Cited in	Anger episode Bar episode Bohrbug episode COBOL episode Coal episode +16 more Denial episode Depression episode EDI episode FORTRAN episode Foo episode Hyrum's Law episode Kernighan's Lever episode Mainframe episode Microsoft episode Oil episode Postel's Law episode Rubber Duck Debugging episode SAP episode Spaghetti Code episode The Hitchhiker's Guide to the Maze of Enterprise IT episode Windows episode