Emulating Haskell type classes in F#

One language feature I like most in Haskell is type class. It was originally conceived as a way of implementing overloaded arithmetic and equality operators. It is a clever trick to support ad-hoc polymorphism, which does not require extensive modifications of the compiler or the type system.

F# does not provide type class, but we can emulate it with other F# language features such as operator overloading and inline function. Type Classes for F# shows this trick. Here’s the Functor example taken from the blog:

type Fmap = Fmap with
    static member ($) (Fmap, x:option<_>) = fun f -> Option.map f x
    static member ($) (Fmap, x:list<_>  ) = fun f -> List.map   f x
let inline fmap f x = Fmap $ x <| f

Here, fmap function must be inline because inline functions can have statically resolved type parameters. Without the inline modifier, type inference forces the function to take a specific type. In this case, the compiler can’t decide between option and list and emits an error.

You can use it as in the following:

> fmap ((+) 2) [1;2;3] ;;
val it : int list = [3; 4; 5]
> fmap ((+) 2) (Some 3) ;;
val it : int option = Some 5

I transliterated the TreeRec example of Simon Thompson’s paper, Higher-order + Polymorphic = Reusable into F# by emulating type class in this way.

type Tree<'a> =
    | Leaf
    | Node of 'a * Tree<'a> * Tree<'a>

type LeafClass = LeafClass with
    static member ($) (LeafClass, t:'a list)  = []
    static member ($) (LeafClass, t:Tree<'a>) = Leaf
let inline leaf () : ^R = (LeafClass $ Unchecked.defaultof< ^R> )

type NodeClass = NodeClass with
    static member ($) (NodeClass, l1:'a list)  = fun a l2 -> List.concat [l1; [a]; l2]
    static member ($) (NodeClass, t1:Tree<'a>) = fun a t2 -> Node(a, t1, t2)
let inline node a x1 x2 = (NodeClass $ x1) a x2

type TreeRecClass = TreeRecClass with
    static member ($) (TreeRecClass, l:'a list) = fun f st ->
        let listToTree = function
            | [] -> failwith "listToTree"
            | a::x ->
                let n = List.length x / 2
                let l1 = Seq.take n x |> Seq.toList
                let l2 = Seq.skip n x |> Seq.toList
                (a, l1, l2)
        let rec treeRec' = function
            | [] -> st
            | l ->
                let a, t1, t2 = listToTree l
                let v1 = treeRec' t1
                let v2 = treeRec' t2
                f v1 v2 a t1 t2
        treeRec' l
    static member ($) (TreeRecClass, t:Tree<'a>) = fun f st ->
        let rec treeRec' f st = function
            | Leaf -> st
            | Node(a, t1, t2) -> f (treeRec' f st t1) (treeRec' f st t2) a t1 t2
        treeRec' f st t
let inline treeRec f st x = (TreeRecClass $ x) f st

let inline tSort x =
     // FIXME: Implement sorting!
    let mVal sort1 sort2 v = List.concat [sort1; sort2; [v]]
    let mergeVal sort1 sort2 v t1 t2 = mVal sort1 sort2 v
    treeRec mergeVal [] x

One problem with this approach is that we no longer can group related operations together into a single class. It can’t express that Tree-like type t* has three operations: leaf, node and treeRec. We end up having three distinct types LeafClass, NodeClass and TreeRecClass.

A Critique of Homoiconicity

Homoiconicity is a property of Lisp in which program and data have the same representation, namely S-expressions. This gives Lisp uniquely strong power to write programs that manipulate programs, such as interpreters and compilers. But it also makes it easy for a novice programmer to be confused between program and data.

Philip Wadler described several such confusions in his paper, “A critique of Abelson and Sussman – or – Why calculating is better than schemin“. The original paper compared Scheme and Miranda, but I changed them to more modern languages Clojure and Haskell respectively.

Clojure lists are not self-quoting

In Clojure, lists are not self-quoting while numbers as data are self-quoting. For example, 3 evaluates to 3 itself, but (1 2 3) evaluates as a function call, not as a list. To include list as a datum, one must quote it as in (quote (1 2 3)) or ‘(1 2 3).

user=> 3
3
user=> (1 2 3)
ClassCastException java.lang.Long cannot be cast to clojure.lang.IFn user/eval5 (NO_SOURCE_FILE:4)
user=> (quote 1 2 3)
1
user=> ‘(1 2 3)
(1 2 3)

When combined with the substitution model to explain the evaluation of (list (list 1 2) nil), this is quite confusing because the intermediate steps are no longer legal Lisp expressions.

(list (list 1 2) nil)
-> (list (1 2) nil)
-> (list (1 2) ())
-> ((1 2) ())

In Haskell, one just writes [[1,2], []]. It is already a value, so there is no evaluation to explain.

Further confusion with quote

Surprisingly, (peek ”abracadabra) evaluates to quote.

user=> (peek ”abracadabra)
quote

because (peek ”abracadabra) expands to (peek (quote (quote abracadabra))) which evaluates as in

(peek (quote (quote abracadabra)))
-> (peek (quote abracadabra))
-> quote

Evaluating too little or too much

As Clojure represents program and data in the same form, one often mistakenly evaluates either too little or too much.

(peek (quote (a b)))

The result of evaluating the expression above is a. However, many programmers give the answer quote (too little) or the value of the variable a (too much).

In Haskell, no such problem arises because there is no quote.

Other confusions with lists

A list containing just one element x is (list x) or (x) in Clojure, while it is [x] in Haskell. As people tend to drop parenthesis, (x) often mistakenly read as x.

People also get confused between cons and list because (cons x y) and (list x y) in Clojure look similar while x:y and [x, y] in Haskell look distinct.

These are just notational differences, but some notations certainly make people more confused than others.

Syntax

Programming language research does not talk much about syntax because the preference of one syntax over the others depends much on the taste of a programmer. But the famous S-expression certainly hinders reasoning with algebraic properties, such as associativity. Also most people who are accustomed to infix notations taught at math classes find it hard read S-expressions at first.

Conclusion

A Lisp family language such as Clojure traditionally has been regarded as a good introductory language to students or novice programmers thanks to its simple syntax (or lack of) and semantics. But one of its powerful language feature, homoiconicity is a mixed blessing for beginners as it makes it hard to reason about programs. Maybe this is the reason why other functional programming languages such as OCaml, F#, Scala and Haskell end up not including homoiconicity in the language.

The guilt of impurity

ML is guilty of impurity, so it can’t escape from a restriction. In other words, the presence of mutable ref cells and parametric polymorphism in ML requires the language to be controlled by the value restriction.

Here is a small example which shows that the ML type system is unsound without the value restriction. This allows putting a value of int where we expect a value of type string.

val r = ref NONE (* val r: 'a option ref *)
val _ = r := SOME "hi"
val i = 1 + valOf (!r)

ML fixes this by enforcing the value restriction: a variable-binding can have a polymorphic type only if the expression is a variable or value. ref NONE is a function call which belongs to neither, so the type of ref NONE is filled with dummy types that are no longer usable.

But there is a downside too. The value restriction can cause problems even when we don’t use mutable cells. The type-checker does not know that Lisp.map is not making a mutable reference, so it must impose the value restriction.

val pairWithOne = List.map (fn x => (x, 1))

To circumvent the value restriction, we must wrap it in a function binding (η-conversion).

fun pairWithOne xs = List.map (fn x => (x, 1)) xs

Haskell is free from the guilt of impurity. In Haskell, because IORefs can only be created (and used) in the IO monad, no such restriction is necessary. This is another reason why we love purity of Haskell.