In a previous post, we talked about Von Neumann’s Turing-complete model for a computer: a CPU, some Memory, and I/O. It was beautiful, but it had its disadvantages, and an unfortunate legacy. Today, John Backus will lead our journey through these matters.
Enter Sandman: Backus
Thirty-two years later, a thorough critique arrived, ironically, from none other than the pioneer John Backus himself. Backus, the man who developed the formal notation for context-free grammars known as the Backus-Naur Form; the man who led the development of the first compiler in history, that of Fortran, a language deeply rooted in the Von Neumann model, for Von Neumann machines. During his renowned Turing Award lecture, a prize given for those very merits, he ironically slashed it.
The paper, later published as “Can Programming Be Liberated from the von Neumann Style?”, is a must-read for any aspiring programmer. It’s so brilliant that it’s hard for me to say anything about it other than: go read it yourself. Seriously, read it.
Ok, but I’ll give it a try nonetheless. As an abbreviation, {p.} will hereafter denote “paraphrasing Backus again”, and italics denote literal quotation. I guess you get it.
The Obesity
Let me ask you: which language do you program in, or feel most confident with? Java? C++? JS? Do you think you know everything about the language? Would you be able to build a parser for it? Answer the most intricate questions? Have you even read the standard documentation? Do you know how big these documents are?
Let me give you some examples:
- Fortran 66, 39 pages!
- C++. The first official standard, C++98: 776 pages. The big revision, C++11: 1368 pages.
- Java 6, 684 pages. Java 8, 778 pages.
The first thing Backus notes is how languages have followed this terrible trend of obesity, growing ever more enormous, but not stronger.
The Bottleneck
Why is this happening? {p.}[§3] When we consider that a Von Neumann machine consists of sending words back and forth between the CPU and Memory, and that a large part of this traffic is not data but names of data, it’s no wonder this communication channel is called a bottleneck. Backus argues [§4] that conventional languages are basically “high level, complex versions of the Von Neumann computer”. This is not surprising at all when you think about it: “V.N. languages use variables to imitate the computer’s storage cells, control statements (conditionals and loops) elaborate its jump and test instructions, and assignment statements imitate its fetching, storing, and arithmetic.”
And the assignment statement is the bottleneck of these languages: [cci]a = b + 2[/cci] means: fetch [cci]b[/cci] from memory, load 2 from the instruction, add them, and post the value back to memory in [cci]a[/cci]’s place. A typical program essentially consists, {p.} [§4, p.616], of a series of assignments to be executed sequentially, and the main concern of a program is handling this flow of words through the bottleneck.
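To make the traffic concrete, here is that assignment spelled out as the individual transfers it generates, sketched in Python with a dictionary standing in for the machine’s store (the names are mine, not Backus’s):

```python
# A toy von Neumann store: a mapping from names (addresses) to words.
store = {"b": 40}

# `a = b + 2`, spelled out as the word-at-a-time traffic it generates:
word = store["b"]   # fetch b's word from memory through the bottleneck
word = word + 2     # add the literal 2, carried by the instruction itself
store["a"] = word   # push the result back through the bottleneck into a

print(store["a"])   # 42
```

Three lines of machine traffic for one conceptual operation — and two of the three are pure plumbing.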
The disorderly world
Assignments have many disadvantages. Giving names to values means names must be carefully scoped and hidden to avoid namespace cluttering, name collisions, aliasing, and whatnot. It also {p.} splits programming into two worlds. To the right, an orderly world of expressions with algebraic properties; to the left, a disorderly world of statements. Let’s look at Backus’s example:
A comparison of an inner product
An inner product in a C‑like language:
[cc_cpp]
int c = 0;
for (int i = 0; i < n; i++) {
    c = c + a[i] * b[i];
}
[/cc_cpp]
Or in Backus’s pseudo-language with modern notation:
[cc]
innerProduct = (fold +) . (map [fn (x,y) -> x * y]) . zip
[/cc]
The Von Neumann version is the one we all know. It’s repetitive, and one must mentally execute it to understand it. And while it’s concerned with how the product is calculated, the functional version describes what an inner product is, by composing other functional pieces. 1
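The pseudo-language above transcribes almost verbatim into a runnable language. Here is one possible rendering in Python, using `functools.reduce` as the fold (the translation is mine, not Backus’s):

```python
from functools import reduce

def inner_product(a, b):
    # fold(+) . map(fn (x,y) -> x * y) . zip, read right to left
    pairs = zip(a, b)                                   # zip
    products = map(lambda p: p[0] * p[1], pairs)        # map [fn (x,y) -> x * y]
    return reduce(lambda acc, x: acc + x, products, 0)  # fold +

print(inner_product([1, 2, 3], [4, 5, 6]))  # 32
```

Note that there is no index, no counter, and no assignment to mentally execute: each stage consumes and produces a whole conceptual unit.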
Some big remarks are in order here, so following Backus’s style, I’ll enumerate them:
- C operates on an invisible state, pulling and pumping words through the Von Neumann bottleneck one word at a time, while F operates exclusively on whole conceptual units, its arguments.
- C does not build complex entities out of simpler ones, and it names its arguments: apart from the hurdles mentioned before, making it generic requires a procedure declaration; F is, by construction, a generic composition of smaller pieces.
- C is not pure: the state is merely a glorified global variable, and side effects might be noticed by other threads, even during intermediate steps; F is pure, with no complex transition rules and no side effects: data has no names or addresses that could be leaked.
The problem
And it gets worse when we realise that {p.} this process of obesity has made language development much less exciting, turning it into a world of big manuals rather than one of innovative ideas, and diverting all creativity into the fattening of established languages rather than the exploration of new ones. Hardware is developed for these languages because they have the biggest market share; languages are developed for this hardware for the very same reason.
These languages don’t even have any useful mathematical properties to reason about. Aliasing and impurity open the door to incredibly complex specifications and optimizations I’ll talk about in the future; and they require Byzantine specifications whose semantics are limited to operational ones: it’s all about specifying a transition function for any given term. Programs can’t be proven correct if there’s no algebraic relation or logical deduction to lean on.
The Hope
Backus, as a good researcher, actually spends most of his paper drafting an alternative. His critiques and propositions are among the biggest advertisements functional programming ever got, and the topic has evolved a lot since then. That’s why I’m not spending time on the alternatives in proportion to the problem: I will talk about the whole topic from a modern perspective quite a lot later on. But I won’t go away without at least outlining the core ideas from Backus.
In [§11–14], Backus presents his new language, called FP. The presentation is divided into four pieces: an informal functional system, an algebra of functional programs, a formal system, and an applicative state transition system. The functional system is made of objects (data), functions (mappings from objects to objects), functional forms (mappings from functions to functions), a notion of function application, and a set of predefined and pre-named functions (projections of a tuple, head and tail of a list, identity, length, fold, Boolean operators, append) that anybody acquainted with functional programming is familiar with. Some standard functional forms are composition, tupling, map, fold, zip, condition (possibly undefining), and a while construct (but no mention of a fixed-point operator! 2). This model has semantics decoupled from the state: evaluation, much closer to the $latex \lambda$-calculus model, consists of symbol replacement and equational expansions and reductions, and Backus proves in [§11.4.3] that any other system containing this one would be just as powerful, hence proving that FP is complete.
Backus then goes on to prove, much in the style of how modern languages would prove it with constructs from Category Theory, but without relying on Category Theory at all, certain algebraic properties of this language: composition distributes over construction ($latex [f, g] \circ h \equiv [f \circ h, g \circ h]$), properties of the predefined functions, and more complex theorems of recursion and expansion, which Backus proves quite formally. In [§12.10], he remarks on some open questions for research in this area.
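To see what such a law buys us, here is the distribution law checked mechanically in Python, with construction [f, g] and composition encoded as higher-order functions (my encoding, not Backus’s notation):

```python
def construct(f, g):
    # Backus's construction [f, g]: feed the same argument to both functions.
    return lambda x: (f(x), g(x))

def compose(f, g):
    # Ordinary function composition: (f . g)(x) = f(g(x)).
    return lambda x: f(g(x))

f = lambda n: n + 1
g = lambda n: n * 2
h = lambda n: n - 3

lhs = compose(construct(f, g), h)              # [f, g] . h
rhs = construct(compose(f, h), compose(g, h))  # [f . h, g . h]

# The two programs agree on every input we try: an equation between
# programs, usable for refactoring without mental execution.
print(all(lhs(n) == rhs(n) for n in range(-50, 50)))  # True
```

This is exactly the kind of algebraic rewriting that assignment-based programs deny us.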
Finally, an Applicative State Transition system is described to model history sensitivity: programs need to have state nevertheless, and this state needs to be remembered, that is, visible in a subsequent execution of the program. But Backus argues the semantics need not model small-step state transitions: rather, a state can be computed with big-step semantics: “A new state is computed along with output and replaces the old state when output is issued”.
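A minimal sketch of that idea in Python, assuming a transition function that takes the whole old state plus an input and returns the output together with a whole new state (the names and the running-sum example are my own, not Backus’s):

```python
def transition(state, inp):
    # One big-step transition: no intermediate stores are ever visible.
    new_state = state + [inp]  # the whole new state is built at once...
    output = sum(new_state)    # ...with the output computed alongside it
    return output, new_state   # the new state replaces the old when output is issued

state = []
for inp in [1, 2, 3]:
    output, state = transition(state, inp)
    print(output)  # prints 1, then 3, then 6
```

Between outputs, the program is just a pure function from (state, input) to (output, state); only the replacement of the whole state is history-sensitive.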
The Summary
At the expense of not going into much detail on the answers Backus gives, I want to remark on the questions he opens in this exceptional paper. For anybody working anywhere close to computer science, this paper is an eye-opener, a good reminder of how important it is to question everything, and not to take the establishment for granted.
To me, this opens a whole new question: the relationship between hardware and languages. I have been deeply interested in languages for a long time, but it was this paper that motivated me to write about computers publicly, when I realised that, in the end, languages are often modelled on top of an abstract machine, and often not decoupled from it, which they should be.
Bear with me as I explore these ideas of abstract machines and hardware development, not to mention Programming Language Theory along with mathematical constructs such as Type Theory and Category Theory.
- In my revision of Backus’s version, zip is a function that takes two lists and returns a list of pairs (the element-wise pairs of the input lists); map is a function that takes a function and a list, and returns a new list made of applying the function to each element of the input list; and fold is a function that takes a function and a list, and returns the result of applying the input function to the elements of the input list serially, accumulating a value.
- I’ll go through the fixed-point operator some day in the future.