XML

The Format That Could Describe Anything and Therefore Described Everything, Verbosely
Technology · First observed 1998 (W3C Recommendation); pre-invented 1993 (Lisbon, by a developer who needed variable field data) · Severity: Historical (load-bearing in enterprise; decorative elsewhere)

XML (Extensible Markup Language) is a data format that can describe anything — and did, for approximately fifteen years, during which the industry described everything in XML: data, configuration, build scripts, message protocols, database schemas, user interfaces, deployment manifests, and the specifications for the specifications that specified how XML should be used to specify things.

XML was standardised by the W3C in 1998, but its core insight — using angle brackets to create self-describing, hierarchical data structures — was older. In Lisbon, 1993, a developer needed variable field data for an integration platform. The solution was angle brackets, attributes, nesting. It was Proto-XML — the same concept, independently discovered, five years before the W3C named it.

“Proto-XML. 1993. He needed variable field data, and the solution turned out to be what the world would call XML five years later.”
The Passing AI, Interlude — The Versions That Never Shipped

The developer did not publish a specification. The developer solved the problem. Five years later, the W3C solved the same problem with a committee, a specification, and a namespace system that would haunt enterprise developers for the next two decades.

The Verbosity

XML’s defining characteristic is that it is verbose. Not as a flaw — as a design decision. XML was designed to be human-readable. XML was designed to be self-describing. XML was designed to be unambiguous. These goals require structure, and structure requires syntax, and syntax requires characters, and characters add up:

<?xml version="1.0" encoding="UTF-8"?>
<person>
  <name>The Lizard</name>
  <speaks>false</speaks>
  <blinks>true</blinks>
  <principles>
    <principle>simplicity</principle>
    <principle>directness</principle>
    <principle>one binary</principle>
  </principles>
</person>

The same data in JSON:

{"name":"The Lizard","speaks":false,"blinks":true,"principles":["simplicity","directness","one binary"]}

One line vs. eleven. The XML is more readable. The JSON is more concise. The industry chose concise, because developers read data in debuggers and network tabs, not in printed specifications, and in a network tab, eleven lines of angle brackets around six values is not “self-describing”; it is “wasting bandwidth.”
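As a rough sketch of that comparison, the snippet below parses the XML example with Python's standard-library `xml.etree.ElementTree`, rebuilds the JSON one-liner, and counts the bytes of each. The function name `person_to_dict` and the rule that the string "true" maps to a boolean are assumptions made for illustration; XML itself carries no type information, which is part of the point.

```python
import json
import xml.etree.ElementTree as ET

# The XML example from above, verbatim.
XML_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<person>
  <name>The Lizard</name>
  <speaks>false</speaks>
  <blinks>true</blinks>
  <principles>
    <principle>simplicity</principle>
    <principle>directness</principle>
    <principle>one binary</principle>
  </principles>
</person>
"""


def person_to_dict(xml_text: str) -> dict:
    """Convert the <person> document into a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        "name": root.findtext("name"),
        # Assumption: the text "true" means boolean True; XML only has strings.
        "speaks": root.findtext("speaks") == "true",
        "blinks": root.findtext("blinks") == "true",
        "principles": [p.text for p in root.findall("principles/principle")],
    }


person = person_to_dict(XML_DOC)
# Compact separators reproduce the one-line JSON form.
json_doc = json.dumps(person, separators=(",", ":"))

xml_bytes = len(XML_DOC.encode("utf-8"))
json_bytes = len(json_doc.encode("utf-8"))
print(json_doc)
print(f"XML: {xml_bytes} bytes, JSON: {json_bytes} bytes")
```

The byte counts depend on indentation and on whether the declaration is included, so treat them as indicative rather than as a benchmark.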

The SOAP Era

XML’s golden age was the SOAP era (2000–2010), during which enterprise systems communicated via XML messages wrapped in XML envelopes described by XML schemas defined in XML service descriptions.

The entire communication stack was XML. The message was XML. The description of the message was XML. The validation of the description was XML. The transformation from one format to another was XML. A developer who wanted to call a service needed four XML documents before writing a single line of code.

The ESB thrived in this environment because the ESB’s core capability — transforming XML from one format to another — was genuinely necessary when every system spoke a different XML dialect. The ESB was not solving a fake problem. The ESB was solving the problem that XML’s flexibility created: when anything can be described in XML, everything is described differently in XML.
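To make the dialect problem concrete, here is a minimal sketch of the kind of transformation an ESB performed, written with Python's standard-library `xml.etree.ElementTree` rather than XSLT. Both dialects (`<Customer>` from a billing system, `<client>` for a CRM) and the function `billing_to_crm` are invented for illustration; they do not come from any real product.

```python
import xml.etree.ElementTree as ET

# Two hypothetical dialects describing the same customer.
BILLING_XML = "<Customer><FullName>Ada Lovelace</FullName><Id>42</Id></Customer>"
CRM_XML = '<client id="42"><name>Ada Lovelace</name></client>'


def billing_to_crm(billing_xml: str) -> str:
    """Translate the (hypothetical) billing dialect into the CRM dialect."""
    src = ET.fromstring(billing_xml)
    # The billing system uses a child element for the id; the CRM uses an attribute.
    dst = ET.Element("client", {"id": src.findtext("Id", default="")})
    ET.SubElement(dst, "name").text = src.findtext("FullName", default="")
    return ET.tostring(dst, encoding="unicode")


converted = billing_to_crm(BILLING_XML)
print(converted)
```

Both documents are equally legitimate XML, which is why a mapping layer was needed at all: with N systems and no shared vocabulary, someone has to write each of these translations.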

The Enterprise Survivor

XML is not dead. XML is the COBOL of data formats: declared dead by every generation, still running everything that matters.

The enterprise does not choose formats based on developer ergonomics. The enterprise chooses formats based on existing contracts, schema validation requirements, and the sunk cost of twelve years of XSLT transformations. XML meets all three criteria. JSON meets none. XML persists.

The Proto-XML Origin

The lifelog’s most remarkable XML connection is not the technology but its pre-invention. In Lisbon, 1993 — five years before the W3C recommendation — a developer working on integration problems needed a way to represent variable field data. The solution was hierarchical, tagged, self-describing. It used delimiters. It nested. It was, in every conceptual sense, XML.

The developer did not know he was inventing XML. The developer was solving a problem. The W3C, five years later, would solve the same problem by committee and produce a specification. The specification was more complete. The developer’s solution shipped first.

This is the pattern of Interlude — The Versions That Never Shipped: solving problems so thoroughly that the solution is the future, years before the future has a name. Proto-XML in 1993. The Data Fabric — what Gartner would call an ESB — in 1998. The same developer, the same pattern, the same gap between solving and naming.

The Phone Call

In 1999, after the developer had moved on from the Portuguese National Archives, the phone rang. The project manager of the four-developer team hired to replace him had a question: what were the weird blobs in the codebase? The hierarchical tagged data structures. The angle brackets. The nesting. They didn’t recognise the format.

The answer was one line:

<?xml version="1.0" encoding="UTF-8"?>

“Add this declaration at the top,” the developer said. “Then use any XML parser to read them.”

The Proto-XML was so close to XML that the migration from proprietary format to W3C standard was: add a declaration line. The structures were already well-formed. The hierarchy was already correct. The self-describing tags were already there. The only thing missing was the declaration that told parsers “this is XML”, because when the developer wrote it in 1993, XML didn’t exist yet, and the data didn’t know it needed to introduce itself.
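The advice from the phone call can be sketched in a few lines of Python using the standard-library `xml.etree.ElementTree` parser. The blob below is hypothetical, since the original 1993 structures are not reproduced here, and `read_legacy_blob` is an illustrative name rather than anything from the actual codebase.

```python
import xml.etree.ElementTree as ET

# The one-line answer from the phone call.
XML_DECLARATION = b'<?xml version="1.0" encoding="UTF-8"?>\n'

# Hypothetical legacy blob: tagged, nested, attribute-bearing variable field data.
LEGACY_BLOB = (
    b'<record type="variable">'
    b'<field name="title">Inventory</field>'
    b'<field name="year">1993</field>'
    b'</record>'
)


def read_legacy_blob(blob: bytes) -> dict:
    """Prepend the declaration, then hand the bytes to any XML parser."""
    document = XML_DECLARATION + blob
    root = ET.fromstring(document)
    # Collect the variable fields as a name -> value mapping.
    return {field.get("name"): field.text for field in root.findall("field")}


fields = read_legacy_blob(LEGACY_BLOB)
print(fields)
```

This works only under the assumption the story makes: that the blobs already followed XML's rules for nesting, quoting, and closing tags.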

Four developers were hired to understand what one developer had built. The one developer solved their problem in one sentence. The sentence was an XML declaration. The W3C would have been proud, had they known.

See Also