esc
Anthology / Yagnipedia / Nushell

Nushell

The Shell Where Data Has a Shape and Pipes Know What They're Carrying
Technology · First observed 2019 (Jonathan Turner, Yehuda Katz, and Andrés Robalino — a shell that looked at forty years of text pipelines and asked "what if the data knew what it was?") · Severity: Promising — the shell that riclib means to try, because structured data in pipes is the idea that makes everything else feel like string parsing (because it is string parsing)

Nushell (nu) is a shell created in 2019 by Jonathan Turner, Yehuda Katz, and Andrés Robalino. It is written in Rust. It pipes structured data instead of text. This sounds like a small change. It is not a small change. It is the change that makes everything else look like what it is: forty years of string parsing pretending to be data processing.

riclib means to try it. The structured data in pipes is the idea that makes the traditional ls -la | grep | awk | sort pipeline feel like what it is — an elaborate workaround for the fact that Unix pipes don’t know what they’re carrying.

“The Unix pipe is the greatest idea in computing. The fact that it carries unstructured text is the greatest missed opportunity.”
The Lizard, who rarely admits that anything about Unix could be improved

The Idea

In every traditional Unix shell — sh, Bash, zsh, even fish — pipes carry text. Bytes. Streams of characters with no structure, no types, no schema. When you run ls -la, the output is text. The columns are separated by spaces. The spaces are irregular. The file size is in column 5, unless the group name has a space in it, in which case it’s in column 6. The date is in columns 6-8, or 7-9, depending.

To extract the file size, you use awk '{print $5}'. This works until it doesn’t. It doesn’t when the data has spaces in unexpected places, when the locale changes the date format, when the output has a header line you forgot to skip. The entire pipeline is string parsing — splitting text on delimiters, counting columns, hoping the structure is what you think it is.

Nushell’s pipeline carries structured data. When you run ls in Nushell, the output is not text. The output is a table of records. Each record has typed fields: name (string), type (string), size (filesize), modified (date). The fields have names. The fields have types. The pipe knows what it’s carrying.

> ls | where size > 1mb | sort-by modified

This is not string parsing. This is a query. where size > 1mb is a filter on a typed column. sort-by modified sorts by a date field. The shell understands that size is a filesize and 1mb is a filesize and comparison is meaningful. The shell understands that modified is a date and sorting is chronological. No awk. No column counting. No hoping.

The PowerShell Question

Someone will point out that PowerShell had objects in pipes first. This is technically correct. PowerShell pipes .NET objects. Nushell pipes structured records.

The difference is not technical. The difference is aesthetic, philosophical, and practical. PowerShell’s objects are .NET objects — CLR types with methods, properties, inheritance, and the full weight of the .NET runtime. Nushell’s records are data — tables, rows, columns, values. PowerShell feels like piping objects between method calls in a language you didn’t choose. Nushell feels like querying a database that happens to be your filesystem.

PowerShell got the idea right. Nushell got the experience right.

The Composability

The structured pipeline makes everything more composable. In Bash:

ps aux | grep python | grep -v grep | awk '{print $2}' | xargs kill

In Nushell:

ps | where name == "python" | get pid | each { kill $in }

The Nushell version is shorter, readable, and — crucially — correct by construction. There is no “grep -v grep” hack (the classic trick to exclude the grep process from its own results). There is no column counting. The name field is a field, not “the eleventh column if the command name doesn’t have spaces.”

Every Unix pipeline that uses awk, cut, sed, or grep to extract structured data from text is doing manually what Nushell does natively. Every column-counting, delimiter-splitting, header-skipping hack is a workaround for the fact that the pipe doesn’t know what it’s carrying.

Nushell’s pipe knows.

Measured Characteristics

Year created:                            2019
Creators:                                Jonathan Turner, Yehuda Katz, Andrés Robalino
Written in:                              Rust
Pipeline content:                        structured data (tables, records, typed values)
Traditional pipeline content:            text (bytes, untyped, unstructured)
POSIX compatible:                        no (different language, different philosophy)
Data types:                              integers, strings, dates, filesizes, durations, booleans, records, tables
The key insight:                         pipes should know what they're carrying
PowerShell had it first:                 technically yes (objects in pipes since 2006)
PowerShell got the experience right:     no (Nushell did)
riclib has tried it:                     not yet (means to)
Column counting required:               never
grep -v grep required:                   never
awk '{print $5}' required:              never

See Also