Why do some programming languages feel neat and orderly and others seem loaded with inconsistencies?
When I first started trying to write my own programming language, I was surprised by how difficult it was to find best practices on language design. In an industry filled with opinions, where people will fight to the death over tabs vs. spaces, there isn’t much guidance for would-be programming language designers.
Eventually I came to my own conclusions: the pathway to success with programming language design is to think of programming paradigms as logical groupings of abstractions and to be intentional about what is included and what isn’t.
What Is a Programming Paradigm?
A programming paradigm can be thought of as a philosophy of structuring and executing code. Unlike styles and conventions, paradigms cannot be leveraged unless they are built into the design of the language. You probably already know many programming paradigms:
- Object-oriented programming
- Functional programming
- Procedural programming
Then there are some more obscure ones you might not know about:
- Logic programming
- Agent-oriented programming
Most mainstream programming languages mix paradigms. Single-paradigm languages are not really all that useful, but the languages that we find clean and beautiful tend to implement support for paradigms in a more structured way. The languages that people tend to think are “ugly” present abstractions as alternative ways of doing the same thing, without any reference to the paradigm each one was created to support. When we mix abstractions that have fundamentally different assumptions, we end up with really ugly workarounds and unknown consequences.
Dig into the paradigm support of the most common programming languages today, and you’ll start to notice some obvious groupings. Some paradigms are more likely to coincide than others. For example, functional languages tend to also have support for the logic paradigm and/or metaprogramming. This makes sense because in all three the fundamental abstractions are the same: they prefer immutable data with pure functions.
Peter Van Roy’s article “Programming Paradigms for Dummies: What Every Programmer Should Know” describes a taxonomy for paradigms based on these kinds of abstractions. Specifically, he places paradigms based on their alignment along three different axes:
- Deterministic vs Nondeterministic
- Sequential vs Concurrent
- Named State vs Unnamed State
The deterministic vs nondeterministic axis is basically the degree to which functions are allowed to have side effects and data is allowed to be changed. There aren’t many languages that maintain completely pure functions (I/O interactions make that difficult) or completely immutable data. Immutable variables can sometimes be reassigned, and properties can be added or changed on immutable objects. Some languages will allow a variable to be changed but shadow the old value so that functions defined before the change remain unaffected. How a language handles these issues determines where it sits on the axis and what paradigms fit best with it. Functional languages lean towards the deterministic side, whereas imperative languages tend to be more forgiving of nondeterminism.
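To make the "degrees of immutability" point concrete, here is a small sketch in Python, chosen only as a convenient illustration. The names `record`, `rate`, and `scale` are made up for this example. It shows an immutable container whose contents can still change, and it shows that Python does not shadow old values: a function defined before a reassignment sees the new value.

```python
# A tuple is immutable: its slots cannot be rebound...
record = ("alice", [1, 2])
try:
    record[0] = "bob"
    slot_rebound = True
except TypeError:
    slot_rebound = False

# ...but a mutable list stored inside the tuple can still change.
record[1].append(3)

# Python does not shadow old values. A function defined before a
# reassignment sees the new value, because the name is looked up
# when the function is called, not when it is defined.
rate = 2

def scale(value):
    return value * rate

before = scale(10)
rate = 3
after = scale(10)

print(slot_rebound, record, before, after)
```

A language that shadows on reassignment would have `scale` keep using the old `rate`. Neither choice is wrong, but each one places the language at a different point on the axis.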
Sequential vs concurrent describes the extent to which the parts of a program execute one after the other or at the same time. As with determinism, there is a range of different implementations in programming language design.
Named and unnamed state, however, is about the interfaces that the language exposes to be manipulated in the first place. When something is assigned an identifier (named) it can be programmed, which invites a host of other decisions that have consequences. Toy calculator languages, for example, are immutable by default because you cannot save state in a named variable and therefore cannot change the state of previous parts of any program. Languages where you cannot name functions will struggle to support first-class functions. Of course, most languages we program in support both named variables and named functions, but can you name a thread? Or a channel? If you can name it, you can program it. If you can program it, you need to manage conflicts caused by change of state.
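As a rough sketch of "if you can name it, you can program it", the Python example below gives names to threads, to a channel (a `queue.Queue` standing in for one), and to a piece of shared state. All of the identifiers (`channel`, `state`, `worker`, the `worker-N` thread names) are invented for illustration. Because the shared state is named and mutable, the program has to manage conflicts itself, here with a lock.

```python
import queue
import threading

channel = queue.Queue()          # a channel we can refer to by name
state = {"count": 0}             # named, shared, mutable state
state_lock = threading.Lock()    # needed because the state can change

def worker(iterations):
    for _ in range(iterations):
        # Without the lock, concurrent updates could be lost.
        with state_lock:
            state["count"] += 1
    # Report completion over the named channel, using the thread's name.
    channel.put(threading.current_thread().name)

threads = [
    threading.Thread(target=worker, args=(1000,), name=f"worker-{i}")
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

finished = sorted(channel.get() for _ in threads)
print(state["count"], finished)
```

In a language that never exposed threads or shared variables as nameable things, neither the lock nor the bug it prevents would exist.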
Designing a Language with Paradigms
The languages that seem to do best with lots of paradigms are languages that either have strong opinions in other ways (e.g., a well-designed type system) or are really only used for specific tasks (like Julia and Wolfram Mathematica). To support very different paradigms, some languages have specific data structures that are immutable alongside mutable alternatives. Some provide trap doors that allow restrictions necessary for one paradigm to be ignored or turned off.
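Python happens to offer small examples of both ideas, so here is a hedged sketch. The `Config` class and variable names are hypothetical. `frozenset` is an immutable counterpart to `set`, and a frozen dataclass rejects ordinary attribute assignment, yet `object.__setattr__` works as a trap door around that restriction.

```python
from dataclasses import dataclass, FrozenInstanceError

# Mutable and immutable alternatives side by side.
mutable_tags = {"a", "b"}
frozen_tags = frozenset(mutable_tags)
mutable_tags.add("c")                    # fine for a set
can_add = hasattr(frozen_tags, "add")    # frozenset has no add method

@dataclass(frozen=True)
class Config:  # hypothetical example type
    retries: int

cfg = Config(retries=3)
try:
    cfg.retries = 5                      # normal assignment is rejected
    blocked = False
except FrozenInstanceError:
    blocked = True

# Trap door: bypass the frozen check by calling the base implementation.
object.__setattr__(cfg, "retries", 5)

print(can_add, blocked, cfg.retries)
```

The trap door is useful, but any code that assumed `cfg` could never change is now wrong, which is exactly the kind of unknown consequence that mixing assumptions creates.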
For example, if you’re using a lazy functional data structure like Stream, and you map over the Stream, it doesn’t actually do anything until the result is consumed. If you place a side effect, such as a print statement, inside the map function, you’re not going to see anything.
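The Stream in question is presumably from another language, but Python’s built-in `map` is also lazy, so it can serve as an approximate analogue. In this sketch the `double` function and the `calls` list are made up for illustration. The side effect inside the mapped function does not happen until something consumes the result.

```python
calls = []

def double(x):
    calls.append(x)   # side effect: record that the function ran
    return x * 2

lazy = map(double, [1, 2, 3])   # builds a lazy iterator; nothing runs yet
calls_before = list(calls)      # snapshot taken before consumption

result = list(lazy)             # consuming the iterator forces evaluation
calls_after = list(calls)       # snapshot taken after consumption

print(calls_before, result, calls_after)
```

A programmer who expects eager, imperative behavior would look for the side effect right after the `map` call and conclude the code is broken.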
A huge part of code safety is what the programmer assumes the behavior of a particular piece of code should be. When multiple paradigms are supported, it is rarely communicated to the user which parts go with which paradigm. It’s no coincidence that the programming languages that do multiple paradigms best also tend to add new paradigms via modules. That separates all the abstractions for a paradigm into one self-contained set of documentation.