To be empty or to be null, that’s the problem.

Most programming languages including C# and Java use null to represent the absence of value. Null references are convenient when there is no particular value to return as we can simply return null without thinking too much about the consequences.

However, null references are notorious for causing bugs that can be found only at runtime with NullPointerException. Tony Hoare calls it The Billion Dollar Mistake. He admits that null was introduced simply because it was so easy to implement.

One problem I found particularly irritating about null references is that some types have natural encoding of the absence of value. For example, we often represent the absence of a string with an empty string “”. It means we now have two ways to represent the absence of a string: either an empty string or null.

This is the reason why .NET String class provides IsNullOrEmpty. People use both null and String.Empty so we have to check both to be conservative. Checking only one of them often leads a bug.

Having two values that represent the same concept often leads to bugs. Another example is undefined and null in JavaScript. Some prefer null and others prefer undefined. Unless there is a strong convention that can enforced upon the entire JavaScript community, both undefined and null must be checked.

So I think it is crucial to have a single universal way to represent the absence of value. I think Option or Maybe type used in functional programming languages is a good start though hybrid languages such as F# and Scala still have null to interoperate with .NET and JVM respectively.

Value restriction of F#

In F#, we can declare a polymorphic value

> let a = [];; 

val a : 'a list

However, if we make it mutable, fsharpi complains with value restriction error

> let mutable a = [];;

  let mutable a = [];;
  ------------^

/Users/kseo/stdin(1,13): error FS0030: Value restriction. The value 'a' has been inferred to have generic type
    val mutable a : '_a list    
Either define 'a' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.
> let a = [];; 

val a : 'a list

This looks unreasonable at first! But let’s see what happens if we get rid of this restriction.

let mutable a = []
let f x = a <- (x :: a)

f(1);;
f(true);;
f(“Hello World”);;

The type of function f is ‘a -> unit. Hence f(1), f(true) and f(“Hello World”) type check. However, a ends up having three different types [“Hello World”; true; 1]. Oops! So, the type system of F# has a rule called value restriction to prevent this kind of bad behaviour.

In fact, all ML family languages have value restriction. See Notes on SML97’s Value Restriction for further information.

Emulating Haskell type classes in F#

One language feature I like most in Haskell is type class. It was originally conceived as a way of implementing overloaded arithmetic and equality operators. It is a clever trick to support ad-hoc polymorphism, which does not require extensive modifications of the compiler or the type system.

F# does not provide type class, but we can emulate it with other F# language features such as operator overloading and inline function. Type Classes for F# shows this trick. Here’s the Functor example taken from the blog:

type Fmap = Fmap with
    static member ($) (Fmap, x:option<_>) = fun f -> Option.map f x
    static member ($) (Fmap, x:list<_>  ) = fun f -> List.map   f x
let inline fmap f x = Fmap $ x <| f

Here, fmap function must be inline because inline functions can have statically resolved type parameters. Without the inline modifier, type inference forces the function to take a specific type. In this case, the compiler can’t decide between option and list and emits an error.

You can use it as in the following:

> fmap ((+) 2) [1;2;3] ;;
val it : int list = [3; 4; 5]
> fmap ((+) 2) (Some 3) ;;
val it : int option = Some 5

I transliterated the TreeRec example of Simon Thompson’s paper, Higher-order + Polymorphic = Reusable into F# by emulating type class in this way.

type Tree<'a> =
    | Leaf
    | Node of 'a * Tree<'a> * Tree<'a>

type LeafClass = LeafClass with
    static member ($) (LeafClass, t:'a list)  = []
    static member ($) (LeafClass, t:Tree<'a>) = Leaf
let inline leaf () : ^R = (LeafClass $ Unchecked.defaultof< ^R> )

type NodeClass = NodeClass with
    static member ($) (NodeClass, l1:'a list)  = fun a l2 -> List.concat [l1; [a]; l2]
    static member ($) (NodeClass, t1:Tree<'a>) = fun a t2 -> Node(a, t1, t2)
let inline node a x1 x2 = (NodeClass $ x1) a x2

type TreeRecClass = TreeRecClass with
    static member ($) (TreeRecClass, l:'a list) = fun f st ->
        let listToTree = function
            | [] -> failwith "listToTree"
            | a::x ->
                let n = List.length x / 2
                let l1 = Seq.take n x |> Seq.toList
                let l2 = Seq.skip n x |> Seq.toList
                (a, l1, l2)
        let rec treeRec' = function
            | [] -> st
            | l ->
                let a, t1, t2 = listToTree l
                let v1 = treeRec' t1
                let v2 = treeRec' t2
                f v1 v2 a t1 t2
        treeRec' l
    static member ($) (TreeRecClass, t:Tree<'a>) = fun f st ->
        let rec treeRec' f st = function
            | Leaf -> st
            | Node(a, t1, t2) -> f (treeRec' f st t1) (treeRec' f st t2) a t1 t2
        treeRec' f st t
let inline treeRec f st x = (TreeRecClass $ x) f st

let inline tSort x =
     // FIXME: Implement sorting!
    let mVal sort1 sort2 v = List.concat [sort1; sort2; [v]]
    let mergeVal sort1 sort2 v t1 t2 = mVal sort1 sort2 v
    treeRec mergeVal [] x

One problem with this approach is that we no longer can group related operations together into a single class. It can’t express that Tree-like type t* has three operations: leaf, node and treeRec. We end up having three distinct types LeafClass, NodeClass and TreeRecClass.

Forbidden words

There are many scary sounding terms in functional programming. These terms include: “currying”, “homomorphism”, “existential quantification”, “beta reduction”, “category theory”, “algebraic data type”, “Kleisli arrows”, “Curry–Howard correspondence”, “functor”, “applicative”, “monoid” and “monad”. Most functional programming tutorials try to explain what these terms mean as precisely as possible using even more scary looking mathematical notations.

But what about object oriented programming? OOP also includes many scary sounding terms such as “inheritance polymorphism”, “covariance”, “visitor pattern”, “SOLID”. If we explain what these terms mean as precisely as possible, we encounter the exact same situation as FP. Novice programmers would think that OOP is something very scary.

I recently read the articles of F# for fun and profit. Scott Wlaschin explains functional programming concepts without using any of these scary sounding terms. He even has the list of forbidden words. He especially tries hard to avoid the use of words beginning with letter “m”.

I think this is a good approach. Many programmers are interested in functional programming these days. C# programmers want to learn F# and Java programmers want to learn Scala. But they definitely do not want to learn lambda calculus or category theory from the beginning just to learn a new programming language.