Story Points

Story Points are a unit of estimation used in Agile software development to express the size, complexity, and effort of a task — without using time, without using lines of code, and without using any other unit that might accidentally be verifiable.

They are not hours. They are not days. They are not difficulty ratings. They are a number — typically drawn from the Fibonacci sequence — that a team assigns to a piece of work based on how the work makes them feel, relative to how a different piece of work made them feel last Tuesday.

The practice was introduced to free developers from the tyranny of time-based estimation, which was always wrong. Story points replaced it with feeling-based estimation, which is also always wrong but cannot be proven wrong because the unit of measurement has no external referent. You cannot be late in story points. You can only be surprised.

“Story points won’t measure the complexity you’re adding with every ‘small improvement.’”
— riclib, Mythology Driven Development — Substack Draft

The Fibonacci Liturgy

Story point values are drawn from the Fibonacci sequence: 1, 2, 3, 5, 8, 13, 21. The theological justification is that the gaps between numbers increase as the numbers grow, reflecting the increasing uncertainty of larger tasks. A developer can distinguish between a 2 and a 3. A developer cannot meaningfully distinguish between a 13 and a 15, so 15 does not exist. The developer must choose 13 or 21 — a gap of eight imaginary units that represents either “somewhat uncertain” or “entirely different task,” depending on the monk.
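
As a minimal sketch of the scale described above, the snippet below builds the deck and shows that 15 is not on it, which leaves the estimator choosing between 13 and 21. The helper names `fibonacci_scale` and `neighbours` are illustrative assumptions, not part of any real planning tool.

```python
def fibonacci_scale(limit=21):
    """Return the card values: Fibonacci numbers from 1 up to `limit`, without the duplicate 1."""
    scale = []
    a, b = 1, 2
    while a <= limit:
        scale.append(a)
        a, b = b, a + b
    return scale


def neighbours(raw, scale):
    """Return the nearest card at or below `raw` and the nearest card at or above it."""
    lower = max((v for v in scale if v <= raw), default=None)
    upper = min((v for v in scale if v >= raw), default=None)
    return lower, upper


scale = fibonacci_scale()
print(scale)                  # [1, 2, 3, 5, 8, 13, 21]
print(15 in scale)            # False
lower, upper = neighbours(15, scale)
print(lower, upper)           # 13 21
print(upper - lower)          # 8, the gap of eight imaginary units
```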

Why Fibonacci and not, say, powers of two? Because Fibonacci numbers feel more natural. Because the golden ratio is aesthetically pleasing. Because if you are going to build a measurement system on vibes, you might as well build it on beautiful vibes.

Some teams use t-shirt sizes instead — S, M, L, XL — arguing that this is “less precise” and therefore “more honest.” The argument about whether a task is a Medium or a Large is acoustically identical to the argument about whether it is a 5 or an 8. The unit changed. The argument did not.

Planning Poker

Planning Poker is the ceremony by which story points are assigned. Each developer holds a set of cards printed with Fibonacci numbers. The Product Owner describes a task. The developers think privately about the task. On the count of three, everyone simultaneously reveals their card.

The simultaneous reveal is critical. If developers showed their estimates sequentially, the first person’s number would anchor everyone else’s. By revealing simultaneously, Planning Poker ensures that the anchor is instead set by whoever shouts their justification the loudest in the three seconds after the cards are flipped.

The ceremony proceeds as follows:

1. The Product Owner reads a ticket
2. The team asks clarifying questions (7 minutes)
3. Everyone holds up a card simultaneously
4. Three people say 5, two people say 8, one person says 13
5. The 13 is asked to explain (4 minutes)
6. The 13 reveals an edge case nobody considered
7. The 5s adjust to 8, except one who adjusts to 3 because they now consider the edge case trivial
8. A second round of voting occurs
9. The team “converges” on 8
10. The actual implementation takes however long it takes, unrelated to the number 8

The entire process takes twelve minutes per ticket. A well-groomed backlog has forty tickets. The math is left as an exercise.
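
For readers who would rather not do the exercise by hand, here is a small sketch using the figures from the text: twelve minutes per ticket, forty tickets, and the six first-round votes from the ceremony. The variable names are illustrative, and the person-hours figure assumes all six voters sit through every ticket.

```python
from collections import Counter

MINUTES_PER_TICKET = 12   # from the text
TICKETS_IN_BACKLOG = 40   # from the text

total_minutes = MINUTES_PER_TICKET * TICKETS_IN_BACKLOG
meeting_hours = total_minutes / 60
print(total_minutes)   # 480
print(meeting_hours)   # 8.0

# First-round votes from the ceremony: three 5s, two 8s, one 13.
first_round = [5, 5, 5, 8, 8, 13]
tally = Counter(first_round)
print(tally.most_common(1))                   # [(5, 3)]
print(max(first_round) - min(first_round))    # 8

# Assumption: every voter attends the whole session.
person_hours = meeting_hours * len(first_round)
print(person_hours)    # 48.0
```

Note that the most common first-round vote is 5, while the ceremony still ends on 8.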

"Sprint Planning — A meeting in which the team estimates how much work they can complete in the next two weeks, using story points — a unit of measurement that measures neither stories nor points but rather the team’s collective uncertainty about how long things take. The estimates are wrong. They are always wrong. The team knows they are wrong. The process requires them anyway."
— observed during Scrum ceremonies across the known world

The Calibration Problem

For story points to function, the team must share a common understanding of what a “1” means. This is called calibration. A “1-point story” is a task so trivial that estimating it takes longer than doing it. A “3-point story” is the team’s Platonic ideal of a normal task: well understood, bounded, and achievable within a sprint by a competent developer who is not also attending five ceremonies about how to achieve things within a sprint.

The problem is that no two developers, given the same task, will arrive at the same number, because the number is not measuring the task. The number is measuring the developer’s relationship with the task — their confidence, their familiarity with the codebase, their memory of the last time a “simple” task turned into a three-day rabbit hole, and whether they had coffee.

A five-point story and an eight-point story differ by exactly one theological argument.

Velocity: The Chart That Goes Up

Once story points are assigned and work is completed, the points are summed per sprint to produce velocity — the team’s throughput expressed in imaginary units per arbitrary time period.

Velocity is used to predict future capacity. If a team completed 42 points last sprint, they will probably complete approximately 42 points next sprint. This is technically true in the same way that yesterday’s weather predicts today’s weather: mostly accurate, occasionally catastrophic, and fundamentally unable to account for The Caffeinated Squirrel proposing a DistributedStoryPointReconciliationEngine on a Tuesday.
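
A minimal sketch of the calculation, assuming completed stories are recorded as (sprint, points) pairs and that the forecast is simply the average of recent sprints. The data and the `velocity_by_sprint` and `forecast` helpers are invented for illustration; real teams and tools vary.

```python
from collections import defaultdict

# Hypothetical completed stories: (sprint number, story points).
completed = [
    (1, 5), (1, 8), (1, 13), (1, 8), (1, 8),      # sprint 1: 42 points
    (2, 3), (2, 5), (2, 13), (2, 8), (2, 13),     # sprint 2: 42 points
    (3, 21), (3, 8), (3, 5), (3, 2),              # sprint 3: 36 points
]


def velocity_by_sprint(stories):
    """Sum the completed points for each sprint."""
    totals = defaultdict(int)
    for sprint, points in stories:
        totals[sprint] += points
    return dict(sorted(totals.items()))


def forecast(velocities, window=3):
    """Predict the next sprint as the average of the last `window` sprints."""
    recent = list(velocities.values())[-window:]
    return sum(recent) / len(recent)


velocities = velocity_by_sprint(completed)
print(velocities)             # {1: 42, 2: 42, 3: 36}
print(forecast(velocities))   # 40.0
```

Like the weather analogy, the forecast knows nothing about what the next sprint actually contains.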

The velocity chart, over time, goes up. It always goes up. It goes up because teams learn to estimate larger numbers. A task that was “3 points” in January is “5 points” by June — not because the task grew, but because the team discovered that larger estimates make the chart go up, and the chart going up makes management happy, and management being happy means the retrospective is shorter.
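
The inflation can be shown with a toy model, assuming the team delivers the same ten tasks every sprint while the estimate for each drifts from 3 to 5. The task count is invented; the 3 and the 5 come from the text.

```python
TASKS_PER_SPRINT = 10   # hypothetical: the delivered work never changes

january_estimate = 3    # points per task in January (from the text)
june_estimate = 5       # points for the same task by June (from the text)

january_velocity = TASKS_PER_SPRINT * january_estimate
june_velocity = TASKS_PER_SPRINT * june_estimate

# Relative growth in reported velocity with zero growth in output.
growth = (june_velocity - january_velocity) / january_velocity
print(january_velocity, june_velocity)   # 30 50
print(f"{growth:.0%}")                   # 67%
```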

Nobody had checked the velocity numbers since the Retro.
— The Watercooler — The Morning Five Identical Strangers Shared a Screen and One of Them Opened a Window to Say Hello to a Sixth

When actual velocity was finally measured for a team of one human and eight Claudes, the numbers told a different story entirely. The Retrospective had reported 36 tickets per week — a 5x improvement. The actual measurement was 53. The projected trajectory was 68. The story points, had anyone been assigning them, would have been irrelevant — because the thing that changed was not the estimation method but the architecture. The code got simpler. Simpler code moves faster through any measurement system, or through no measurement system at all.

The Solo Developer Exception

The Solo Developer does not need story points because the estimator is also the implementer. The feedback loop is one skull wide.

When the person guessing how hard something is and the person discovering how hard something actually is share a brain, estimation becomes a private conversation between the prefrontal cortex and the hippocampus. There is no need to externalize this conversation into Fibonacci numbers. There is no need to hold up cards. There is no need to argue about whether something is a 5 or an 8, because the Solo Developer will simply do the thing and know, by the end, exactly how hard it was.

The Solo Developer’s estimation process:

Step	Action
1	Look at the task
2	Think “that’ll take about a day”
3	It takes three days
4	Learn nothing from this
5	Repeat

This is exactly as accurate as Planning Poker, but it takes four seconds instead of twelve minutes, and no cards are harmed.

Story point estimation: Unnecessary; the estimator is also the implementer.
— Solo Developer, on the arithmetic of working alone

The Theology

Story points occupy a unique position in software methodology: they are a measurement that everyone agrees is not a measurement, used to produce forecasts that everyone agrees are not forecasts, tracked on charts that everyone agrees do not mean what they appear to mean.

They persist because they solve a real problem — the problem of coordinating expectations between people who build things and people who wait for things to be built. The builder cannot say “three days” because they do not know. The waiter cannot accept “I don’t know” because they have a roadmap. The story point is the diplomatic compromise: a number that satisfies the need for a number without committing to what the number means.

It is, in essence, a socially acceptable way to say “I don’t know” while holding up a card with an 8 on it.

The Agile Coach facilitates this exchange with the solemnity it deserves. The Fibonacci sequence lends mathematical authority. The simultaneous reveal lends democratic legitimacy. The velocity chart lends historical precedent. None of these change the fundamental reality, which is that software estimation is hard, humans are bad at it, and adding a ritual does not make them better; it just makes them feel better about being bad at it.

“Better standups won’t save you from building features nobody needs. Retrospectives won’t undo the microservices the Squirrel added at 2 AM. Story points won’t measure the complexity you’re adding with every ‘small improvement.’”
— riclib, Mythology Driven Development — Substack Draft

The Squirrel’s Position

The Caffeinated Squirrel loves story points. Not because they are useful, but because they are debatable. Every Planning Poker session is an opportunity to propose that a task is actually much larger than it appears, because what if we also need a CachingLayerWithPredictiveInvalidation? The Squirrel’s estimates trend toward 21. The Squirrel’s justifications involve architectural concerns that will never materialize. The Squirrel once estimated a button color change at 13 because “what about the design system implications?”

The Squirrel’s natural habitat is the moment after the cards are flipped, when someone said 3 and someone said 13, and the room must now negotiate the ten-point gap between pragmatism and imagination.

The Lizard’s Position

The Lizard has never estimated a story point. The Lizard’s process is:

Do the thing.
It took as long as it took.
Next thing.

This is not estimation. This is reality. The Lizard does not predict the future because the Lizard is too busy building it.

Type	Phenomenon
First Observed	Circa 2000 (the mists of early XP), though the practice of assigning imaginary numbers to unknowable work predates software by several millennia
Severity	Theological
Natural Predator	Doing the task and finding out how long it actually took
Tags	agile
Cited in	Episodes: Backlog, Backlog Refinement, Burndown Chart, Dunning-Kruger Effect, Goodhart's Law, Grooming, Manager, Performance Review, Rapid Application Development, Retrospective, Sprint, Sprint Planning, Systems Thinking, The Hitchhiker's Guide to the Maze of Enterprise IT, User Story, Yagnipedia