esc
Anthology / Yagnipedia / Pointers

Pointers

The Variable That Knows Where Something Is But Not What It Is
Phenomenon · First observed 1945 (von Neumann architecture, where every address is a pointer and every instruction is data); named by Harold Lawson (1964, PL/I); personally encountered circa 1985 (Z80, HL register, before the word existed) · Severity: Civilisational (the concept that separates engineers who understand the machine from those who use it)

A pointer is a variable whose value is the memory address of another variable. It does not contain data. It contains a location. It is the difference between knowing what is in the cupboard and knowing which cupboard to open — a distinction so fundamental to computing that every processor ever built has hardware dedicated to it, and so confusing to computer science students that entire languages were designed to hide it.

The pointer is the oldest concept in computing. The von Neumann architecture stores instructions and data in the same memory. An instruction that says “load the value at address 16384” contains a pointer. The program counter — the register that tracks which instruction to execute next — is a pointer. The stack pointer is a pointer. Every jump, every call, every return is the act of following a pointer to a new location. The machine is, at its most fundamental level, a pointer-following engine that occasionally does arithmetic.

“The boy loaded A from (HL) before he knew the word ‘pointer.’ He knew the word ‘address.’ He knew the word ’there.’ HL pointed there. That was sufficient.”
The Lizard, on early education

The Z80: Nothing But Pointers

The Z80 does not have a concept called “pointer.” It has something more honest: registers that hold addresses and an instruction set designed around dereferencing them. HL, IX, IY — these are 16-bit register pairs whose entire purpose is to point at memory. LD A,(HL) loads the byte at the address held in HL into the accumulator. LD (HL),A stores the accumulator at that address. INC HL moves the pointer forward. The parentheses are the Z80’s syntax for dereferencing: (HL) means “the contents of the address in HL,” which is, in every way that matters, *ptr in C, twenty years before C existed on a home computer.

The Z80 programmer did not learn pointers as a concept. The Z80 programmer learned pointers as a hardware feature, the way a carpenter learns the hammer — not by studying leverage and force distribution but by driving nails. LD HL,16384 loads the start of the Spectrum’s display memory into HL. LD (HL),255 fills that byte with white pixels. INC HL moves to the next byte. This is pointer arithmetic. This is *ptr++ = 0xFF. This is the same operation that C programmers learn in chapter 5 and fear in chapter 6, performed by a twelve-year-old who has not read a chapter of anything.

The entire Z80 programming model is pointer arithmetic. LDIR — block copy — uses HL as source pointer, DE as destination pointer, and BC as byte count. It increments HL, increments DE, decrements BC, and repeats until BC is zero. It is memcpy(). It is two pointers walking through memory in lockstep. It was built into the silicon because Zilog understood that the fundamental operation of computing is not addition or comparison but following an address to a location and doing something with what you find there.

“Pointer arithmetic is just counting, but the thing you’re counting is where.”
The Caffeinated Squirrel, who once incremented a pointer past the end of an array and discovered what was on the other side (it was the stack, and it was not pleased)

The Amiga Copper: Pointers in Hardware

The Amiga’s copper coprocessor is a pointer made physical. The copper is a simple processor with three instructions: WAIT (wait for the video beam to reach a position), MOVE (write a value to a hardware register), and SKIP (conditional skip). A copper list is a sequence of these instructions stored in memory, and the copper walks through the list by following a pointer — incrementing through addresses, executing each instruction in turn, the way a postman follows a delivery route.

The copper list itself is an array of address/value pairs. Each MOVE instruction contains a register address and a value to write to that address. The copper does not compute. The copper dereferences. It takes an address from the list, follows it to a hardware register, writes a value, and moves its own pointer to the next instruction. The entire Amiga display system — colour changes mid-scanline, split screens, smooth scrolling, the visual effects that made the Amiga look like it cost ten times what it did — is driven by a chip that does nothing but follow pointers.

This is the purest expression of what a pointer is: not a programming language feature, not a source of bugs, not a thing that needs to be garbage-collected, but a hardware mechanism for saying “go to this address and act on what you find.” The copper does not know what it is writing or why. It knows where. The where is the pointer. The pointer is the machine.

C: The Language That Trusted You

C made pointers explicit, first-class, and dangerous. A C pointer is a variable that holds an address. You can dereference it (*p), take the address of something (&x), do arithmetic on it (p + 1, p++, p - q), cast it to a different type ((int*)p), and — critically — follow it to an address that does not belong to you, at which point the behaviour is “undefined,” which is C’s way of saying “you are now outside the jurisdiction of the language and whatever happens next is between you and the hardware.”

C’s pointer arithmetic is the Z80’s HL register with syntax. *(p + i) is LD A,(HL) after incrementing HL by i. p++ is INC HL. The operations are identical. The difference is that C lets you do them on a PDP-11, a VAX, a 68000, an x86, and an ARM without rewriting, which is either an abstraction or a miracle depending on your relationship with portability.

The problem with C pointers is not that they are hard. The problem is that they are unsupervised. A C pointer can point anywhere: to a valid object, to freed memory, to the stack, to the heap, to a hardware register, to address zero, to an address that has never been allocated. The pointer does not know. The pointer is a number. The number is an address. Whether anything lives at that address is not the pointer’s concern.

This produced, over forty years, approximately one hundred percent of all security vulnerabilities in systems software.

“I remember every address I have ever dereferenced. Most of them were valid. The ones that were not taught me more than the ones that were.”
A Passing AI, reflecting on training data that was sixty percent segfaults

The Great Hiding

Java (1995) removed pointer arithmetic and called pointers “references.” JavaScript (1995) removed pointers entirely and used references for objects. Python (1991) did the same. Ruby, PHP, C#, Kotlin — the entire trajectory of mainstream language design from 1990 onward is a sustained project to prevent programmers from knowing where things are.

The references are pointers. A Java reference is a memory address. A JavaScript object variable holds a memory address. A Python variable name is bound to a memory address. The address is there. The programmer is simply not allowed to see it, touch it, increment it, or follow it somewhere unintended. The managed runtime handles the addresses, the garbage collector reclaims them, and the programmer works with the polite fiction that objects simply exist without existing somewhere.

This was the correct decision for most software. The overwhelming majority of programs do not need to know where things are in memory. A web application processing JSON does not benefit from pointer arithmetic. An Android app displaying a list does not need to manage heap addresses. The abstraction — objects exist, references refer to them, the runtime handles the rest — is sufficient, productive, and safe.

But it created a generation of programmers who do not understand the machine. Not because they are less intelligent — they are not — but because the machine has been hidden from them so successfully that the concept of “a variable that holds an address” is genuinely confusing rather than obviously fundamental. The pointer, which is the simplest concept in computing (it’s a number, the number is a location), became the most feared concept in computer science education, because it was taught as an advanced topic rather than the foundational reality it is.

Go: The Middle Path

Go has pointers. Go does not have pointer arithmetic. This is the compromise that, like most good compromises, satisfies nobody completely and works better than either extreme.

A Go pointer holds an address. You dereference it with *p. You take the address of something with &x. You pass a pointer to a function when you want the function to modify the original. You cannot increment a pointer. You cannot cast a pointer to an integer. You cannot follow a pointer off the edge of an array into whatever happens to be adjacent in memory.

Go’s pointers are the Z80’s HL register without INC HL. You can point at something. You can load what is there. You cannot wander. This is enough for every legitimate use of pointers — indirection, shared modification, avoiding copies — while preventing every illegitimate use — buffer overflows, use-after-free, type punning, the entire catalogue of horrors that C’s unsupervised pointers enabled.

It is, in the Yagnipedian sense, YAGNI applied to a language feature: you need indirection, you do not need arithmetic on addresses, therefore you get indirection without arithmetic, and the absence of arithmetic is not a limitation but a gift.

“The Gopher holds the pointer. The pointer points. The Gopher does not ask what is next to the thing it points to. This is wisdom.”
The Lizard

The Cognitive Divide

There is a line in software engineering, invisible but real, between developers who understand pointers and developers who do not. This is not gatekeeping. It is observation. The developer who understands that a variable can contain a location rather than a value — that indirection is the fundamental operation of computation — sees the machine differently than the developer for whom variables simply “have” values.

The pointer-literate developer understands why passing a large struct by value is slow (you are copying every byte) and passing it by pointer is fast (you are copying eight bytes — the address). The pointer-literate developer understands why a nil dereference crashes (you followed an address to location zero, where nothing lives). The pointer-literate developer understands why linked lists exist (each node contains a pointer to the next), why trees exist (each node contains pointers to children), why hash maps exist (an array of pointers to buckets), why virtual functions work (a pointer to a function in a vtable). The entire edifice of data structures is built on pointers, and the developer who understands pointers sees the edifice as obvious rather than mysterious.

This does not make pointer-literate developers better. It makes them different. They carry a model of the machine that includes where things are, and this model informs every decision they make about performance, memory, data structure selection, and API design. The developer without this model makes many of the same decisions by following rules and patterns. The outcomes are often identical. But when they diverge — when performance matters, when memory matters, when the abstraction leaks — the developer with the model can reason from first principles, and the developer without it can only search Stack Overflow.

Both approaches work. One of them works in more situations.

Measured Characteristics

What it is:                         a variable that holds a memory address
What it is not:                     complicated (it's a number, the number is where)
Z80 hardware support:               HL, IX, IY, SP, PC (the entire architecture)
Z80 syntax for dereferencing:       parentheses — (HL), (IX+d), (nn)
C syntax for dereferencing:         asterisk — *p
Go syntax for dereferencing:        asterisk — *p (same syntax, no arithmetic)
Java syntax for dereferencing:      dot — obj.field (they don't tell you it's a pointer)
Amiga copper operations:            100% pointer-following (by design)
Languages that hide pointers:       Java, JavaScript, Python, Ruby, C#, Kotlin, Swift
Languages that expose pointers:     C, C++, Go, Rust, Zig, Z80 assembly
Security vulnerabilities caused:    approximately all of them (C/C++)
riclib's age when first used:       ~10 (LD A,(HL) on ZX Spectrum)
riclib's age when learned the word: later
The concept in one sentence:        it's just a number — the number is where

See Also