What we can learn from Unix

Software Design, Programming

The Unix Philosophy is a wonderfully cohesive way to think about program execution and composition. Its beauty lies in its simplicity, though somewhat unfortunately that simplicity is tied to the fact that Unix deals with software at the operating system level. We're going to explore what the Unix Philosophy is and what it allows us to do when working in a Unix shell, all the while exploring how Unix can guide our attempts to create simpler programs at a higher level of the stack.

What is the Unix Philosophy?

The Unix Philosophy, as described by Doug McIlroy - inventor of the Unix pipe - is the following:

  • Write programs that do one thing and do it well.
  • Write programs that work together.
  • Write programs that handle text streams, because it's a universal interface.

The core focus of Unix is really simplicity. In part this is possible because Unix operates at the operating system level and therefore deals with simpler representations, mainly text. As an operating system, Unix can make more assumptions about what protocols it uses and can therefore exert more control over the content.

What is perhaps more critical is that Unix commands focus on manipulating text streams. All traditional Unix commands work by taking some text as input from a stream and responding with more text. For example, if I run ls -1 in my blog directory, I get back the following:

404.html
CNAME
LICENSE
README.md
_config.yml
_data
_drafts
_ideas
_includes
_layouts
_pages
_posts
_projects
_refs
_sass
_site
atom.xml
css
favicon.ico
public

This is just text, which means I can count the number of lines in it to know how many files or folders I have here, or I can replace some of the text, and so on. It also means that any command can consume the output of any other command, or feed its own output into one. This is particularly powerful, as any possible composition can be made from these small building blocks, and it is the most essential ingredient in making Unix as useful as it is.
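
As a quick sketch of the text-replacement idea (the sed expression is just an illustration), sed can rewrite each line as it streams past, here stripping the leading underscore from those directory names:

$ ls -1 | sed 's/^_//'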

Benefits when working in the shell

Commands like ls, rm or wc do very simple tasks. For those who aren't familiar: ls displays the files in your current directory, rm removes a file at the specified path, and wc counts characters, words and lines in a file. For example, if you want to know how many files or folders you have in a directory, you can run the following at the command prompt:

$ ls | wc -l

Here ls gives you a list of files line by line, and wc -l tells you the number of lines in the output of ls. The key idea is that you can pipe the output of one program into the input of another; that is exactly what the | operator does.

If I want to do something more elaborate, that can be done pretty easily as well. Say I'm doing some writing and I want to know how many unique words I've used in a post.

$ cat unix.md | tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq | wc -l

Here's another, even more complicated example. Say you want to know which words in your post you've used most often.

$ cat unix.md | tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq -c | sort -rn | sed 10q

This will print out a listing of the 10 most common words and how often they've been used in the file unix.md. There is a bit of arcane knowledge here, but that isn't necessarily bad. The tr command replaces characters in the first set with characters in the second; the -c flag complements the first set and -s squeezes repeated replacements into one. So tr -cs A-Za-z '\n' takes every run of characters that aren't letters and replaces it with a single newline.
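
To make that concrete, here's a small sketch using a throwaway sentence:

$ echo "Don't panic, Arthur." | tr -cs A-Za-z '\n'
Don
t
panic
Arthur

The apostrophe, comma, spaces and full stop all collapse into newlines, which is what lets the later sort and uniq stages treat each word as its own line.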

For comparison, here is a roughly equivalent Python script:

from collections import Counter

with open("unix.md") as f:
    text = f.read()

# Keep only the purely alphabetic words, lowercased
words = [w.lower() for w in text.split() if w.isalpha()]

for word, num in Counter(words).most_common(10):
    print(num, word)

It's worth noting that the Python script will not count words that have a comma or other punctuation after them, and it would not count the word wouldn't, as it isn't strictly alphabetical. In order to count those you would have to bring regular expressions or Python's str.translate method into the mix, which seemed like overkill.
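
For what it's worth, the shell pipeline handles the trailing-comma case for free, because tr strips the punctuation before anything is counted; a quick sketch with a made-up input:

$ echo "stream, stream" | tr -cs A-Za-z '\n' | sort | uniq -c
      2 stream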

I hope I've shown that using Unix commands can be incredibly useful in the right scenarios. There's also a lot more than what I've covered here, including AWK and grep.

Can we apply these concepts higher up the stack?

If we know that Unix principles are useful and make building systems and programs easier, how can we apply those insights higher up the stack?

If we write smaller functions that focus on a single task, the quality of those functions will go up and the overall quality of the library will increase, but so will the amount of work an individual has to put in to learn that library. Furthermore, the library will probably not be as successful as libraries that offer all-in-one magic methods for common tasks.

Unix requires very active users who are willing to learn and comb through man pages. Developers are of course willing to do this to an extent, but what they really want are magic boxes that solve their problems; and if we're honest, who can blame them? The issue is that you end up relying on code that you don't understand and therefore can't debug. That obviously isn't a good situation to be in, and it's part of why full-stack developers are so sought after. Unix forces you to learn, which slows things down and harms adoption, but also produces better developers.

We know that writing commands or functions that can be composed is useful, and that in order to compose them the functions themselves must generally be stateless: if you run a command twice with the same parameters you ought to get the same results. This is true of Unix commands, and the same lesson is applied daily by functional programmers.
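
A minimal sketch of that property: this pipeline produces the same output no matter how many times you run it, because neither sort nor uniq keeps any state between invocations.

$ printf 'b\na\nb\n' | sort | uniq
a
b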

One of the areas I think we need to spend more time on is an expansion of text streams as the universal interface. I do think that text is incredibly useful, and very close to being a universal interface even outside the Unix ecosystem. Where text streams fall short, at least as far as Unix is concerned, is when you're working with highly nested data formats such as JSON or XML.

As a result, most developers spend the vast majority of their time parsing data streams and working with text. Of course, the lack of support in Unix is not the only issue, but if we look at the way Unix models text streams, they contain ordinary characters, which make up the entirety of the stream, plus special delimiter characters such as \n or \t. Common operations therefore focus on working with characters, lines or words, as defined by those delimiters or by whitespace. There are good, simple command-line tools like jq for working with JSON which aim to fix this issue, but essentially the problem boils down to mapping between two different representations, which is not a simple task for a machine to do automatically.
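
As a rough sketch of what that looks like (assuming jq is installed; the JSON document is made up for illustration), a jq filter can reach into nested structure in a way the line-oriented tools above can't easily express:

$ echo '{"post": {"title": "What we can learn from Unix"}}' | jq '.post.title'
"What we can learn from Unix"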

There's a lot to be learned from Unix, primarily to keep it simple and keep it focused. The result is software that is easy to understand and relatively simple to use.