To be empty or to be null, that’s the problem.

Many programming languages, including C# and Java, use null to represent the absence of a value. Null references are convenient: when there is no particular value to return, we can simply return null without thinking too much about the consequences.

However, null references are notorious for causing bugs that surface only at runtime, for example as a NullPointerException in Java. Tony Hoare, who introduced null references, calls them his “billion-dollar mistake”. He admits that null was introduced simply because it was so easy to implement.

One problem I find particularly irritating about null references is that some types already have a natural encoding of the absence of a value. For example, we often represent the absence of a string with an empty string “”. This means we now have two ways to represent the absence of a string: either an empty string or null.

This is the reason why the .NET String class provides IsNullOrEmpty. People use both null and String.Empty, so we have to check both to be conservative. Checking only one of them often leads to a bug.

Having two values that represent the same concept often leads to bugs. Another example is undefined and null in JavaScript. Some prefer null and others prefer undefined. Unless there is a strong convention that can be enforced upon the entire JavaScript community, both undefined and null must be checked.

So I think it is crucial to have a single universal way to represent the absence of a value. I think the Option or Maybe type used in functional programming languages is a good start, though hybrid languages such as F# and Scala still have null in order to interoperate with .NET and the JVM respectively.
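As a minimal sketch of what a single representation could look like in Scala, the helper below funnels both null and the empty string into None, so callers only ever see one "absent" case. The name normalize is hypothetical and used only for illustration; it is not a library function.

```scala
// Hypothetical helper: collapses null and "" into a single "absent" value.
// Option(s) yields None when s is null; filter then drops the empty string.
def normalize(s: String): Option[String] =
  Option(s).filter(_.nonEmpty)

val fromNull  = normalize(null)     // None
val fromEmpty = normalize("")       // None
val fromValue = normalize("scala")  // Some("scala")

// Callers handle absence once, instead of checking null and "" separately.
val greeting = normalize("").map(n => s"Hello, $n").getOrElse("Hello, stranger")
```

This plays the same role as IsNullOrEmpty, except that the check happens once at the boundary and the result type records that the value may be missing.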

Forbidden words

There are many scary-sounding terms in functional programming. These terms include: “currying”, “homomorphism”, “existential quantification”, “beta reduction”, “category theory”, “algebraic data type”, “Kleisli arrows”, “Curry–Howard correspondence”, “functor”, “applicative”, “monoid” and “monad”. Most functional programming tutorials try to explain what these terms mean as precisely as possible, using even scarier-looking mathematical notation.

But what about object-oriented programming? OOP also includes many scary-sounding terms such as “inheritance polymorphism”, “covariance”, “visitor pattern” and “SOLID”. If we explained what these terms mean as precisely as possible, we would encounter the exact same situation as with FP. Novice programmers would think that OOP is something very scary.

I recently read the articles on F# for fun and profit. Scott Wlaschin explains functional programming concepts without using any of these scary-sounding terms. He even has a list of forbidden words. He tries especially hard to avoid words beginning with the letter “m”.

I think this is a good approach. Many programmers are interested in functional programming these days. C# programmers want to learn F# and Java programmers want to learn Scala. But they definitely do not want to learn lambda calculus or category theory from the beginning just to learn a new programming language.

Scala Option.fold vs Option.map/getOrElse

Scala's Option offers two different ways to transform the contained value while supplying a default for the None case.

Option.map/getOrElse

val name: Option[String] = request getParameter "name"
name map { _.toUpperCase } getOrElse ""

Option.fold

val name: Option[String] = request getParameter "name"
name.fold("") { _.toUpperCase }
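To check that the two idioms really compute the same thing, here is a small self-contained sketch. It replaces the request object with plain Option values, and the method names upperOrEmptyMap and upperOrEmptyFold are hypothetical, chosen only for this comparison.

```scala
// Same logic written with map/getOrElse: transform first, default last.
def upperOrEmptyMap(name: Option[String]): String =
  name.map(_.toUpperCase).getOrElse("")

// Same logic written with fold: default first, transform second.
def upperOrEmptyFold(name: Option[String]): String =
  name.fold("")(_.toUpperCase)

// One present and one absent value cover both branches.
val inputs: List[Option[String]] = List(Some("alice"), None)

val viaMap  = inputs.map(upperOrEmptyMap)   // List("ALICE", "")
val viaFold = inputs.map(upperOrEmptyFold)  // List("ALICE", "")
```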

On the spark-dev mailing list, there was a discussion on using Option.fold instead of the Option.map/getOrElse combination.

The two idioms look almost the same, but people seem to prefer one over the other for readability reasons. Here is the summary of the discussion:

  • Option.map/getOrElse is more idiomatic Scala code because Option.fold was only introduced in Scala 2.10.
  • Fold on Option is not obvious to most developers.
  • Option.fold is not readable.
    • It reverses the order of Some and None: the None case comes first.
    • Putting the default case first makes it less intuitive.
    • When code gets long, the lack of an obvious boundary with two closures is confusing. (“} {” compared to getOrElse)
  • Fold is a more functional idiom in general.
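The readability point about longer closures is easier to judge with a multi-line body. The sketch below is only an illustration: the User case class and the labelFold and labelMap methods are hypothetical and do not come from the mailing list thread.

```scala
// Hypothetical type used only for this illustration.
final case class User(first: String, last: String)

// fold: the default comes first, and the two closures meet at "} {".
def labelFold(user: Option[User]): String =
  user.fold {
    "anonymous"
  } { u =>
    val full = s"${u.first} ${u.last}"
    full.trim
  }

// map/getOrElse: the method name getOrElse marks where the default starts.
def labelMap(user: Option[User]): String =
  user.map { u =>
    val full = s"${u.first} ${u.last}"
    full.trim
  }.getOrElse {
    "anonymous"
  }

val ada = Some(User("Ada", "Lovelace"))
```

Both methods return the same string for every input; the difference is purely in how the boundary between the two branches reads.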

It seems people are getting used to functional idioms such as map and filter, but are still reluctant to accept less familiar ones such as Option.fold.

I prefer Option.map/getOrElse because I think putting the default value first is not intuitive and much Scala code is already written with Option.map/getOrElse. However, both options are fine as long as only one style is used throughout the project. Consistency is more important than taste!

Functions are Objects in Scala

Scala is a multi-paradigm language that supports both object-oriented programming and functional programming, so it has both functions and objects. But at the implementation level, function values are treated as objects. The function type A => B is just an abbreviation for the trait scala.Function1[A, B], which is roughly defined as follows:

package scala
trait Function1[A, B] {
    def apply(x: A): B
}

There are traits Function2, …, Function22 for functions which take more arguments.

An anonymous function such as (x: Int) => x * x is expanded to

new Function1[Int, Int] {
    def apply(x: Int) = x * x
}

A function call such as f(a, b) is expanded to f.apply(a, b).

So the translation of

val f = (x: Int) => x * x
f(7)

would be

val f = new Function1[Int, Int] {
    def apply(x: Int) = x * x
}
f.apply(7)
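The expansion above can be tried out directly. The following is a small sketch, with square, squareObj, add and the other names chosen only for illustration, showing that a function literal and a hand-written Function1 instance are interchangeable, and that a function value carries methods like any other object.

```scala
// A function literal; its type Int => Int is shorthand for Function1[Int, Int].
val square: Int => Int = (x: Int) => x * x

// The same function written out as an anonymous Function1 instance.
val squareObj: Function1[Int, Int] = new Function1[Int, Int] {
  def apply(x: Int): Int = x * x
}

val viaCall   = square(7)        // sugar for square.apply(7)
val viaApply  = square.apply(7)  // the explicit form
val viaObject = squareObj(7)     // the hand-written object is called the same way

// Two parameters: (Int, Int) => Int is shorthand for Function2[Int, Int, Int].
val add: (Int, Int) => Int = (x, y) => x + y
val sum = add.apply(3, 4)

// Because functions are objects, they have methods, such as andThen on Function1.
val squareThenNegate: Int => Int = square.andThen(n => -n)
val negated = squareThenNegate(3)
```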

This trick is necessary because the JVM does not allow passing or returning functions as values. So to overcome this limitation of the JVM (the lack of first-class functions), the Scala compiler wraps function values in objects.

The following method f is not itself a function value.

def f(x: Int): Boolean = ...

But when f is used in a place where a Function type is expected, it is converted automatically to the function value

(x: Int) => f(x)

This is an example of eta expansion.
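Here is a minimal sketch of eta expansion with a concrete method. The method isEven and the value names are hypothetical and stand in for the elided f above.

```scala
// A method, not a function value: it belongs to its enclosing scope.
def isEven(x: Int): Boolean = x % 2 == 0

val numbers = List(1, 2, 3, 4)

// filter expects an Int => Boolean, so the compiler expands isEven
// to the function value (x: Int) => isEven(x).
val implicitly = numbers.filter(isEven)

// The same call with the expansion written by hand.
val explicitly = numbers.filter((x: Int) => isEven(x))

// An expected function type on a val triggers the same conversion.
val isEvenFn: Int => Boolean = isEven
val check = isEvenFn(10)
```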

The code examples in this article are taken from Martin Odersky’s Functional Programming Principles in Scala lecture.

Classes and Substitutions in Scala

In functional programming, it is conventional to define the meaning of a function application using a computation model based on substitution. In Scala, the meaning of classes and objects is also defined using the substitution model. Let’s assume that we have a class definition,

class C(x1, …, xm){ … def f(y1, …, yn) = b … }

  • The formal parameters of the class are x1, …, xm.
  • The class defines a method f with formal parameters y1, …, yn.

Then, the expression new C(v1, …, vm).f(w1, …, wn) is rewritten to

[w1/y1, …, wn/yn][v1/x1, …, vm/xm][new C(v1, …, vm)/this]b

There are three substitutions at work here:

  • the substitution of the formal parameters y1, …, yn of the method f by the arguments w1, …, wn,
  • the substitution of the formal parameters x1, …, xm of the class C by the class arguments v1, …, vm,
  • the substitution of the self reference this by the value of the object new C(v1, …, vm).
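To make the rewriting rule concrete, here is a small worked example. The class Scaler and its methods are hypothetical, invented only for this sketch; the comments trace the substitution steps by hand, and the code itself simply evaluates the same expressions.

```scala
// Hypothetical class: two class parameters and two methods.
class Scaler(factor: Int, offset: Int) {
  def scale(y: Int): Int = factor * y + offset
  def twice(y: Int): Int = this.scale(y) + this.scale(y)
}

// new Scaler(2, 3).scale(5)
//   → [5/y][2/factor, 3/offset][new Scaler(2, 3)/this] (factor * y + offset)
//   = 2 * 5 + 3
//   = 13
val scaled = new Scaler(2, 3).scale(5)

// new Scaler(2, 3).twice(5)
//   → [5/y][2/factor, 3/offset][new Scaler(2, 3)/this] (this.scale(y) + this.scale(y))
//   = new Scaler(2, 3).scale(5) + new Scaler(2, 3).scale(5)
//   = 13 + 13
//   = 26
val doubled = new Scaler(2, 3).twice(5)
```

The second trace is the one where the substitution for this matters: the self reference is replaced by the object expression itself, and each resulting call is then rewritten by the same rule.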

Martin Odersky explained the evaluation of Scala classes and objects using this model in his lecture, Functional Programming Principles in Scala.