Next Language?
Recently I’ve been thinking, “I’ve done the Lisp thing. Also, SML and OCaml. What’s the next language I could learn that will change the way I think about programming?”
Erlang seems pretty cool. Really inexpensive parallelism and built-in capability for distributed process on a bunch of machines. Pattern matching. Built-in tuples. All good things… but it’s dynamically typed. The benefit of strong typing (every expression has a type, inferred or declared, at compile-time; and types are checked) is well known. I have taken this to heart, and programmer friends of mine know I get very testy when someone threatens my Hindley and Milner.
Haskell, on the other hand, has a magnificently expressive type system. It has dependent types, generalised algebraic data types, and type classes, all which make for a fruitful compilation step and contribute to code confidence. Type classes in particular provide operator overloading0 in a type-safe way. But if languages are schoolteachers, Haskell is a pushover: it has non-strict semantics. This property is less-accurately1, but more commonly known as laziness.
I don’t like this aspect of Haskell. It seems to make it hard to reason about the complexity of an algorithm. “Am I accumulating stack?” While it might be quick to code something correctly, coding correctly and efficiently is a less obvious process for me.
Ignoring these options, however: who’s to say I’ve really understood Lisp, SML, or OCaml? Some would have it that it takes longer than a few years to really become an expert in any reasonably general-purpose language. I like this, and so my current plan is to stick it out for a while with OCaml (education permitting).
Currently I’m exercising my OCaml-fu by implementing a parallel Sudoku solver in Jocaml. Jocaml is an extension of OCaml that implements the join calculus, a formal language for asynchronous communicating processes. Basically, instead of just wrapping some system threads library, Jocaml provides orthogonal primitives to describe fundamental building-blocks of processes and how they communicate. As a bonus, the whole thing is formal, so you could conceivably prove theorems about a program written in Jocaml, because there is a mathematical theory underlying all the asynchronous bits.
[0]Also called ad-hoc polymorphism, though I’ve yet to discover why.
[1]In my vocabulary, strict semantics is a language property meaning that evaluating an expression involves first completely evaluating its subexpressions. It follows, then, that a language has non-strict semantics when it does not have strict semantics.
One consequence of non-strict semantics is the possibility of embedding non-terminating expressions in a terminating program. For example, suppose use-first-argument is a binary function that does exactly what it says, namely does something with its first argument only, ignoring its second. In a strict language (e.g. everything, mostly), where stands for an expression whose evaluation never terminates, the function call
use-first-argument (X, will never terminate. In a non-strict language (Haskell), that code fragment will return whatever )
use-first-argument returns. If the expression’s value is never needed, it is never computed, and hence the infinite recursion is never triggered.
To return to the point, laziness is the property of a program or data structure — a value can be lazily computed by saving intermediate results until the final result is manually requested, at which point evaluation occurs. Laziness can be written into Java, or OCaml, or any language. One example is a common pattern in OO system implementation: set the values of member variables the first time they are accessed, rather than upon initialisation of the class.
SML/NJ and Threads
It has been observed that “the most important thing in [a] programming language is the name. A language will not succeed without a good name.” However, and I am sure much to that person’s dismay, I have discovered a counterexample: the functional programming language ML, which stands for the astonishingly dull, “meta-language” [1]*.
(It may be countered that ML has in fact not succeeded, due to its non-widespread adoption among major companies to the extent that (e.g.) Java** has penetrated. To these disparagers I reply, “Microsoft”.)
SML/NJ, a principal implementation of ML, has higher-order functions, first-class continuations, strong typing combined with automatic type inference, and threading.
But what’s cool is the function-call stack. In the typical memory layout one has a stack and a heap (along with other stuff, which I elide). The stack is used for function frames and the heap for arbitrary data, usually the kind that persists across function calls and function returns.
However, SML/NJ uses the heap for function frames, ignoring the stack. Conceptually, any time a function needs to allocate a new frame, it allocates it in the heap, and sometime after it’s done with the frame, the garbage collector sweeps up the mess.
This has a few advantages, not the least of which is that functions don’t overflow the stack without overflowing available memory (at which point you’re screwed anyway). The second advantage is that threading support is *dead easy*. Every thread just allocates its own space in the heap, and if that thread has a particularly stack-intensive function, it allocates more space. No problems of thread A scribbling on thread B’s stack space because the compiler failed to separate their base stack frame addresses enough. When a thread has finished execution, the garbage collector again cleans up the mess. No explicit thread reaping need apply.
Nifty, no?
* I hereby affirm the absolute reliability of Wikipedia.
** Opinions forthcoming.
-
Archives
- July 2009 (1)
- March 2009 (2)
- July 2008 (1)
- May 2008 (1)
- April 2008 (1)
- March 2008 (1)
- February 2008 (1)
- October 2007 (1)
- August 2007 (1)
- July 2007 (2)
- April 2007 (4)
- August 2006 (1)
-
Categories
-
RSS
Entries RSS
Comments RSS