Encoding algebraic data type in JavaScript

JavaScript lacks algebraic data type, but we can emulate it because JavaScript has first-class functions.

Here’s a simple Haskell program declaring Tree data type with Bud, Leaf and Tree data constructors. treeSum calculates the sum of leaf nodes by iterating all nodes in the given tree.

data Tree = Bud | Leaf Int | Tree Tree Tree deriving Show

treeSum :: Tree -> Int
treeSum Bud = 0
treeSum (Leaf n) = n
treeSum (Tree t1 t2) = treeSum t1 + treeSum t2

main = do
	let t0 = treeSum Bud
	putStrLn $ show t0
	let t1 = treeSum (Leaf 2)
	putStrLn $ show t1
	let t2 = treeSum (Tree (Leaf 3) (Tree (Leaf 5) Bud))
	putStrLn $ show t2

In JavaScript, we can emulate the data constructors of Tree with higher order functions. The trick used here is to pass b, l and p as arguments to data constructors. Only one of these 3 arguments is used by each data constructor:

  • b is a value for Bud.
  • l is a function taking an integer n as an argument and returns a value for Leaf.
  • p is a function taking left and right trees as arguments and returns a value for Tree.

(ES6 let and arrow are used for conciseness.)

let Bud = b => l => p => b;
let Leaf = n => b => l => p => l(n);
let Tree = t1 => t2 => b => l => p =>
           p(t1(b)(l)(p))(t2(b)(l)(p))

You can think of Bud as a value by itself. For Leaf and Tree, you can create each node by partially applying arguments.

var t0 = Bud;
var t1 = Leaf(2);
var t2 = Tree(Leaf(2), Bud);

With this encoding scheme, defining treeSum is trivial. You just need to pass 3 functions to handle Bud, Leaf and Tree respectively. If w is Bud, it returns 0. If w is a Leaf node, it returns n. If w is a Tree node, it returns the sum of left and right trees.

let treeSum = w => w(0)(n => n)(x => y => x + y);

The client code looks almost the same to Haskell.

var v0 = treeSum(Bud);
console.log(v0);

var v1 = treeSum(Leaf(2));
console.log(v1);

var v2 = treeSum(Tree(Leaf(3))(Tree(Leaf(5))(Bud)));
console.log(v2);

The origin of notation λ in lambda calculus

Here is an extract from the impact of lambda calculus in logic and computer science.

In Russel and Whitehead's [1910-13] Principia Mathematica the notation for function f with f(x) = 2x + 1 is 2x̂ + 1. Church originally intended to use the notation x̂.2x + 1. The typesetter could not position the hat on the of the x and placed it in front of it, resulting in ^x.2x + 1. Then another typesetter changed it into λx.2x + 1.

TypeScript: Best Common Type

TypeScript tries to find the best common type when a type inference is made from several expressions.

Let’s experiment a bit with following types. Square is a subtype of Shape and Circle is also a subtype of Shape.

class Shape {
    color: string;
}

class Square extends Shape {
    sideLength: number;
}

class Circle extends Shape {
	radius: number;
}

The type of xs1 is Shape[] because Shape is the common supertype of Shape, Square and Circle. However, surprisingly the type of xs2 is not Shape[] though Shape is the common supertype of Square and Circle. Here the type of xs2 is (Circle | Square)[] because there is no object that is strictly of type Shape in the array.

var xs1 = [new Shape(), new Square(), new Circle()];
var xs2 = [new Square(), new Circle()];

You can override this type by annotating the type explicitly with Shape[].

var xs3: Shape[] = [new Square(), new Circle()];

Now let’s see how the return type of a function is inferred when the function returns two values with different types.

function createShape1(square: boolean) {
    if (square) {
        return new Square();
    } else {
        return new Shape();;
    }
}

The return type of createShape1 function is Shape as expected because Shape is the common supertype of Square and Shape.

function createShape2(square: boolean) { // Type Error
    if (square) {
        return new Square();
    } else {
        return new Circle();
    }
}

However, if we change the return value of else branch to new Circle(), createShape2 function no longer type checks. TypeScript complains about “error TS2354: No best common type exists among return expressions.” It fails to find the best common type though it can be inferred as either Square|Circle or Shape.

We can satisfy the type checker by giving an explicit type. The return type can be either Square | Circle or Shape.

function createShape3(square: boolean): Square | Circle {
    if (square) {
        return new Square();
    } else {
        return new Circle();
    }
}

function createShape4(square: boolean): Shape {
    if (square) {
        return new Square();
    } else {
        return new Circle();
    }
}

From the examples above, we can see that the type inference rule of TypeScript is not orthogonal. When you mix multiple types in an array, TypeScript finds the best common type that is the union of the types of the element expressions. But when a function mixes multiple return types in return values, it finds the best common type that is the first supertype of each of the others.

You can see the difference in the type inference rules taken from the TypeScript 1.4 specification.

Array Type

  • If the array literal is empty, the resulting type is an array type with the element type Undefined.
  • Otherwise, if the array literal is contextually typed by a type that has a property with the numeric name ‘0’, the resulting type is a tuple type constructed from the types of the element expressions.
  • Otherwise, the resulting type is an array type with an element type that is the union of the types of the element expressions.

Function Return Type

  • If there are no return statements with expressions in f’s function body, the inferred return type is Void.
  • Otherwise, if f’s function body directly references f or references any implicitly typed functions that through this same analysis reference f, the inferred return type is Any.
  • Otherwise, if f is a contextually typed function expression (section 4.9.3), the inferred return type is the union type (section 3.4) of the types of the return statement expressions in the function body, ignoring return statements with no expressions.
  • Otherwise, the inferred return type is the first of the types of the return statement expressions in the function body that is a supertype (section 3.10.3) of each of the others, ignoring return statements with no expressions. A compile-time error occurs if no return statement expression has a type that is a supertype of each of the others.

Thoughts on Intersection Types

Facebook’s Flow is a static type checker, designed to find type errors in JavaScript programs. Flow’s type system is similar to that of TypeScript in that it is a gradual type system and relies heavily on type inference to find type errors.

One interesting feature of Flow that is missing in TypeScript is intersection types. For example, in Flow you can represent a function which takes either Bar or Foo with intersection types.

/* @flow */
class Foo {}
class Bar {}
declare var f: ((x: Foo) => void) & ((x: Bar) => void);
f(new Foo());
f(true); // Flow will error here.

The function f has type ((x: Foo) => void) & ((x: Bar) => void). The notation A & B represents the intersection type of A and B, meaning a value of A & B belongs to both A and B. We can capture this intuition with the following sub-typing rules:

[1] A & B <: A
[2] A & B <: B
[3] S <: A, S <: B ~> S <: A & B

In Flow, intersection types are primarily used to support finitary overloading. func takes either number or string and returns nothing. This can be represented as intersection of two function types as in the following:

type A = (t: number) => void & (t: string) => void
var func : A;

However, TypeScript doesn’t need intersection types to support finitary overloading because it directly supports function overloading.

interface A {
  (t: number): void;
  (t: string): void;
}
var func: A

Or in TypeScript 1.4, you can represent the same function more succinctly with union types.

interface A {
  (t: number | string): void;
}
var func: A

This is not a coincidence! Intersection types are the formal dual of union types. When an arrow is distributed over a union, the union changes to an intersection.

(A -> T) & (B -> T)  <:  (A | B) -> T

So it means the following two types are equivalent:

  • (t: number | string): void
  • (t: number) => void & (t: string) => void

So far intersection types seem redundant once a language have union types (or vice versa). However, there are some real world use cases where intersection type are actually required.

Let’s check how ES6’s new function Object.assign is typed in TypeScript (lib.core.es6.d.ts).

interface ObjectConstructor {
    /**
      * Copy the values of all of the enumerable own properties from one or more source objects to a 
      * target object. Returns the target object.
      * @param target The target object to copy to.
      * @param sources One or more source objects to copy properties from.
      */
    assign(target: any, ...sources: any[]): any;
    ...
}

Currently, both the argument types and the return type of Object.assign is any because the type system of TypeScript can’t represent the type correctly. Let’s try to type this function correctly. First, we can make this function polymorphic by assigning A to target and B[] to sources.

Object.assign<A,B>(target: A, ...sources: B[]): ?;

Okay. We are now stuck with the return type. The return value has all the properties of both A and B. It means with structural typing, the return value is a subtype of both ‘A’ and ‘B’ (belongs to both A and B). Yeah! We need intersection types to represent this value. So the correct signature of this function is:

Object.assign<A, B>(target: A, ...sources: B[]): A & B;

The same reasoning also applies to mixins.

mixins<A,B>(base: { new() :A }, b: B}) : { new(): A & B} 

TypeScript developers are still discussing on adding intersection types. Many people think that there are not enough compelling use cases. Also intersecting two primitive types like string & number makes no sense, which makes intersection types less desirable.

I think adding intersection types to TypeScript makes the language more consistent because these two concepts are the dual of each other and can’t really be separable. But I also think that the language design must be conservative. So let’s just wait until we have more compelling reasons to add intersection types.