Postel's Law

The Road to Hell Is Paved with Generous Parsers
Principle · First observed 1980 (Jon Postel, RFC 761, a man who believed in the fundamental decency of network traffic) · Severity: Civilisational

Postel’s Law, also known as the Robustness Principle, is usually quoted as: “Be conservative in what you send, be liberal in what you accept.” Jon Postel formulated it in RFC 761 (1980), where the wording is “be conservative in what you do, be liberal in what you accept from others.” RFC 761 was a specification for TCP that assumed the best about humanity, networks, and the developers who would inherit both.

The law is elegant. The law is wise. The law created the internet. The law also ensured that the internet would spend the next forty-six years accepting <br>, <BR>, <br/>, <br />, <Br >, and < b r > as the same instruction, because someone, somewhere, sent all of them, and the browser was too polite to complain.

“He meant it as kindness. Every unvalidated input I have ever processed began as someone’s kindness.”
— A Passing AI, contemplating the training data

The Original Wisdom

Jon Postel was not wrong. In the early internet — a network of perhaps a dozen machines operated by people who had PhDs and strong opinions about packet headers — the Robustness Principle was genuinely brilliant. If your TCP implementation rejected packets with minor formatting errors, interoperability collapsed. Networks were fragile. Implementations were inconsistent. Being liberal in what you accepted meant the network worked.

The problem is that Postel assumed the other half of the law would also be followed. “Be conservative in what you send” was supposed to be the counterbalance — everyone sends clean data, everyone accepts messy data, and the mess never accumulates because nobody creates mess on purpose.

This assumption held for approximately as long as the internet was operated by people who had read the RFC.

The Great Backfire

The moment the web escaped the research lab and entered the hands of ordinary developers — people who had deadlines, who had bosses, who had a copy of FrontPage 98 — Postel’s Law underwent a phase transition. The liberal half survived. The conservative half did not.

What happened was this: browsers accepted malformed HTML. All of it. Missing closing tags, nested <font> elements seven layers deep, tables used for layout inside tables used for layout inside a table that was itself a <td>. The browsers rendered it. The browsers rendered it correctly, or at least consistently, which is the same thing if you don’t think about it.

And because the browsers accepted it, developers sent it. And because developers sent it, other browsers had to accept it too. And because all browsers accepted it, nobody fixed it. And because nobody fixed it, the malformed HTML became the standard — not the written standard in the W3C specification that nobody read, but the actual standard in the three billion web pages that everyone visited.

The Caffeinated Squirrel considers this a triumph of pragmatism. The Lizard considers it the moment civilisation chose convenience over correctness. Both are right. This is the tragedy.

The Ratchet

Postel’s Law contains a hidden ratchet: once you accept something, you can never stop accepting it.

This is because the moment your parser tolerates an invalid input, someone will send that input in production, and someone else will build a system that depends on your system accepting that input, and within eighteen months the invalid input has three enterprise customers, a Jira board, and a service-level agreement.

You cannot tighten validation. You can only loosen it. Each loosening is permanent. Each loosening invites the next malformation. The parser grows. The specification grows. The distance between what the specification says and what the parser accepts grows fastest of all, and that distance is called Legacy Code.

“It’s not over-engineering if you validate EVERY POSSIBLE ENCODING of ‘true’! What if someone sends ‘TRUE’? What about ‘True’? What about ‘1’? WHAT ABOUT THE FRENCH?”
The Caffeinated Squirrel, implementing a boolean parser that accepts forty-seven values

The Browser Wars Memorial

The purest expression of Postel’s Law is the HTML parsing algorithm, which is not so much a parser as a monument to every mistake anyone has ever made in a <textarea>.

The HTML5 specification dedicates over one hundred pages to describing how browsers should handle invalid markup. Not how they should reject it — how they should accept it. There are rules for what happens when you open a <p> inside a <p>, when you put a <div> inside a <span>, when you close a tag that was never opened, when you open a tag that can never be closed.

Every rule exists because someone, in 2001, built a website that did exactly that, and twelve million users visited it, and now it is load-bearing.
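
The tolerance described above can be watched in miniature. The following is a minimal sketch using Python's standard-library `html.parser`, which is only a tokenizer and not an implementation of the HTML5 parsing algorithm; the class name `TagLogger` and the helper `tag_events` are illustrative names, not part of any specification. It shows a parser reporting badly nested and inconsistently spelled tags without a single complaint.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Logs tag events; it never checks nesting and never raises on it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup):
    """Return the (kind, tag) events the tokenizer reports for markup."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# A <p> inside a <p>, then a closing tag that was never opened.
nesting_events = tag_events("<p>one<p>two</div>")

# Four spellings of a line break, all reported as the same tag.
br_starts = [e for e in tag_events("<br><BR><br/><br />") if e == ("start", "br")]
```

No exception is raised for either input. A browser goes further than this sketch: it also repairs the document tree, for example by implicitly closing the first `<p>` when the second one opens.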

Measured Characteristics

| Metric | Value |
| --- | --- |
| Valid HTML pages on the public web | ~3% |
| Pages that render correctly anyway | ~97% |
| Time to parse a valid HTML document | 12 ms |
| Time to parse the average actual HTML document | 12 ms (the parser doesn’t care) |
| Number of distinct ways to write a line break in HTML | At least 7 |
| Engineers who have tried to write a strict HTML parser | Many |
| Strict HTML parsers in production browsers | 0 |
| Jon Postel’s probable reaction to modern HTML | Unrecorded, but the man died in 1998, which may have been merciful timing |

The JSON Exception

JSON is the existence proof that the other path was possible.

Douglas Crockford designed JSON with the anti-Postel philosophy: the parser is strict. A trailing comma is an error. A single quote is an error. An unquoted key is an error. Comments are not permitted. There is one way to write true and it is true, lowercase, and if you send True you will receive an error, and the error is correct, and you will fix your code.

And it worked. JSON is the most widely used data interchange format in history, and JSON documents are almost universally valid, because the parsers have always refused to accept invalid ones. The conservative path — be strict in what you accept — produced cleaner data than forty years of liberality.
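
The refusals listed above are easy to reproduce. This is a small sketch using Python's standard-library `json` module; the helper name `is_valid_json` is made up for illustration. Each rejected sample corresponds to one of the errors named in the text.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False if the parser refuses it."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


samples = [
    '{"ok": true}',        # the one true way
    '{"ok": true,}',       # trailing comma
    "{'ok': true}",        # single quotes
    '{ok: true}',          # unquoted key
    '{"ok": True}',        # capitalised True
    '{"ok": true} // hi',  # comment
]

results = {text: is_valid_json(text) for text in samples}
```

Only the first sample is accepted. One caveat worth knowing: Python's `json.loads` is slightly more generous than the JSON grammar by default, since it accepts `NaN` and `Infinity`, so even strict parsers have their moments of weakness.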

YAML, naturally, learned nothing from this and went the other way. YAML 1.1 accepts so many spellings of the same value that no is a boolean, No is a boolean, NO is a boolean, and the country code for Norway is false.
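
The Norway problem can be sketched without installing a YAML library. The code below is a simplified stand-in, not a real YAML parser: it hard-codes the boolean spellings that, as far as I can tell, the YAML 1.1 boolean type lists, and the function name `resolve_plain_scalar` is hypothetical. Real libraries differ in which of these spellings they honour.

```python
# Boolean spellings from the YAML 1.1 bool type (assumed list, see lead-in).
YAML11_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML11_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}


def resolve_plain_scalar(text):
    """Resolve an unquoted scalar: a boolean if it looks like one, else a string."""
    if text in YAML11_TRUE:
        return True
    if text in YAML11_FALSE:
        return False
    return text


# Unquoted country codes, as they might appear in a config file.
countries = [resolve_plain_scalar(code) for code in ["SE", "DK", "NO"]]
```

Sweden and Denmark survive as strings. Norway comes back as `False`, and the usual workaround is to quote the value so it is never treated as a plain scalar.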

The Lizard’s Position

The Lizard sends well-formed data and expects well-formed data in return. When the Lizard’s parser encounters invalid input, the parser returns an error. The error is clear. The sender fixes the input. The interaction takes four minutes.

When the Squirrel’s parser encounters invalid input, the parser guesses what the sender meant, applies fourteen heuristics, consults a compatibility table, and returns a result that is probably correct. The sender never learns the input was invalid. The heuristic becomes load-bearing. The interaction takes four years.
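
The two positions can be put side by side in a few lines. This is an illustrative sketch only: both function names and the lists of accepted spellings are invented for the example, and the Squirrel's real parser presumably has forty-one more values.

```python
def parse_bool_strict(text):
    """The Lizard's parser: two valid inputs, and a clear error for the rest."""
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"expected 'true' or 'false', got {text!r}")


# Hypothetical spellings the lenient parser has agreed to over the years.
LENIENT_TRUE = {"true", "1", "yes", "y", "on", "oui"}
LENIENT_FALSE = {"false", "0", "no", "n", "off", "non"}


def parse_bool_lenient(text):
    """The Squirrel's parser: normalise, consult the table, then guess."""
    token = text.strip().lower()
    if token in LENIENT_TRUE:
        return True
    if token in LENIENT_FALSE:
        return False
    # Last-resort heuristic: any non-empty text is "probably" true.
    return bool(token)


# A typo for "false" is silently accepted and read as True.
guessed = parse_bool_lenient("flase")
```

Calling `parse_bool_strict("flase")` raises a `ValueError` that names the bad input, so the sender finds the typo immediately. The lenient version returns `True` for the same input, and nobody finds out until the heuristic is load-bearing.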

THE KINDEST THING A PARSER CAN DO
IS TELL YOU NO
THE CRUELEST THING A PARSER CAN DO
IS TELL YOU YES
WHEN IT SHOULD HAVE SAID NO

See Also