We need to simplify the following expression:
e = (1 + 0 × x) × 3 + 12,
which should reduce to the constant 15.
Luckily for us, we won't have to remember any elementary school arithmetic, because Harrison's excellent Handbook of
Practical Logic and Automated Reasoning begins with a simple algorithm to do exactly that. It's not complicated, but
it's a pretty good barometer of how painful a programming language will be for the kind of hybrid (probabilistic logic,
or statistical relational) approaches I work with. Here, I compare implementations of Harrison's simple algorithm in
F#, Scala, Haskell, C++, and Julia. No programming languages were hurt while writing this post. It's not a competition, and I avoided talking about languages I dislike. Sum types are discussed at length because they are awesome and useful for this problem (and many, many others).
The ML family
Harrison's book uses OCaml, a popular language for solvers. F#, Haskell, and Scala all share roots with OCaml, with F# being the closest thing to an OCaml dialect. I'll start with F#:
/// A sum type for the expression.
/// An expression is either a var (which is a string), a constant
/// (which is an integer), an addition (made of two expressions)
/// or a multiplication (also made of two expressions).
type Expr =
    | Var of string
    | Const of int
    | Add of Expr * Expr
    | Mul of Expr * Expr
/// Simplify a single component of the expression. This function
/// takes an expression and uses pattern matching to select the
/// right rewrite based on type and value. For example, if we
/// add a constant 0 to some x (which can be any expression), then
/// we return x.
let simplify1 e =
    match e with
    | Add (Const 0, x)
    | Add (x, Const 0)
    | Mul (x, Const 1)
    | Mul (Const 1, x) -> x
    | Mul (x, Const 0)
    | Mul (Const 0, x) -> Const 0
    | Add (Const a, Const b) -> Const (a + b)
    | Mul (Const a, Const b) -> Const (a * b)
    | _ -> e

/// Recursive function to simplify an entire expression: simplify the
/// subexpressions, then pipe the rebuilt expression into simplify1.
let rec simplify e =
    (match e with
     | Add (x, y) -> Add (simplify x, simplify y)
     | Mul (x, y) -> Mul (simplify x, simplify y)
     | _ -> e)
    |> simplify1
/// Return the value string if the expression can be reduced to a constant.
let exprStr e =
    match e with
    | Const x -> string x
    | _ -> "The expression could not be simplified to a constant."
/// The |> operator sends the result on its left to its right, for example
/// "5.0 |> log |> sqrt" computes log(5.0) and then the square root of the
/// result. This is nice because it allows a more natural left-to-right
/// flow for functional programming.
[<EntryPoint>]
let main argv =
    Add (Mul (Add (Const 1, Mul (Const 0, Var "x")), Const 3), Const 12)
    |> simplify
    |> exprStr
    |> printf "%s"
    0 // F#'s main returns 0 for success, à la C
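For reference, after loading the definitions above into F# Interactive, the simplification can be checked directly. The session below is approximate, but the expression does collapse to the constant 15:
> simplify (Add (Mul (Add (Const 1, Mul (Const 0, Var "x")), Const 3), Const 12));;
val it : Expr = Const 15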
It's almost the same as the OCaml version in Harrison's book. The key trick is to define an expression (Expr) as a variable (string) or a constant (integer) or an addition or a multiplication (both made of two expressions). The or is important: object-oriented programming languages focus on hierarchies of objects, while sum types define a type as a series of alternatives. Sum types are important for another reason: they provide an easy way to express things like "this function might return a value, or nothing", for example in Haskell if we want a data structure that maps keys to values:
import Data.Map (Map)
import qualified Data.Map as Map

capitals = Map.fromList [("Finland", "Helsinki"), ("France", "Paris"),
                         ("Japan", "Tokyo"), ("South Korea", "Seoul"),
                         ("Arrakis", "Arrakeen")]

lookupCapitals country = case Map.lookup country capitals of
    Just capital -> "The capital of " ++ country ++ " is " ++ capital ++ "."
    Nothing      -> "Is " ++ country ++ " even a country?"
The point is that a key-value store will only return a value if the key is present. In this example, the map takes a country (string) and returns its capital (string). However, when we try to get a value from the map with the lookup function, Haskell returns a Maybe type: either Just String, if the key is found in the map, or Nothing if the key is absent. We then use pattern matching to deal with both possibilities in the lookupCapitals function. One of the most common mistakes in programming is to return a null and not deal with it properly. The solution with sum types is to return a wrapped value and handle the possibilities explicitly with pattern matching. It solves with types what many languages would solve with exceptions and try/catch constructs.
ghci> lookupCapitals "Arrakis"
"The capital of Arrakis is Arrakeen."
ghci> lookupCapitals "Canada"
"Is Canada even a country?"
Speaking of Haskell, the code for the algorithm is:
data Expr = Var String
          | Const Int
          | Add Expr Expr
          | Mult Expr Expr

simplify1 :: Expr -> Expr
simplify1 e = case e of
    Add (Const 0) x          -> x
    Add x (Const 0)          -> x
    Add (Const a) (Const b)  -> Const $ a + b
    Mult x (Const 0)         -> Const 0
    Mult (Const 0) x         -> Const 0
    Mult x (Const 1)         -> x
    Mult (Const 1) x         -> x
    Mult (Const a) (Const b) -> Const $ a * b
    _                        -> e

simplify :: Expr -> Expr
simplify e = case e of
    Add x y  -> simplify1 $ Add (simplify x) (simplify y)
    Mult x y -> simplify1 $ Mult (simplify x) (simplify y)
    _        -> simplify1 e

e = Add (Mult (Add (Const 1) (Mult (Const 0) (Var "x"))) (Const 3)) (Const 12)

s = simplify e

main = putStrLn $ case s of
    Const x -> show x
    _       -> "Could not simplify the expression to a constant."
It's quite similar to F#. I decided to add type signatures explicitly to simplify1 and simplify, but Haskell is smart enough to infer the types without them. Arguably the only thing worth explaining is the $ operator. It applies the function on its left to the entire expression on its right; because it has the lowest precedence of any operator, it effectively plays the role of parentheses, and if it reminds you of parentheses, you are absolutely right. x and y have the same value here:
x = log (sqrt (exp 5.0))
y = log $ sqrt $ exp 5.0
The operator is there to reduce visual clutter. In my opinion, F# is easier to read because the |> operator enforces left-to-right reading, which is more natural than reading code inside-out:
let z = exp 5.0 |> sqrt |> log
Although it's trivial to simulate this operator in Haskell:
(|>) :: t0 -> (t0 -> t1) -> t1
(|>) x f = f x
-- Now valid Haskell:
z = exp 5.0 |> sqrt |> log
And now for something a bit different: Scala. It's also a statically typed functional programming language with sum types, but its tighter integration with the object-oriented paradigm is evident:
object Simplify {
  sealed abstract class Expr { override def toString = show(this) }
  case class Variable(name: String) extends Expr
  case class Const(value: Int) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr
  case class Mult(left: Expr, right: Expr) extends Expr

  def evalOne(e: Expr): Expr = e match {
    case Add(Const(0), r) => r
    case Add(l, Const(0)) => l
    case Add(Const(a), Const(b)) => Const(a + b)
    case Mult(Const(0), r) => Const(0)
    case Mult(l, Const(0)) => Const(0)
    case Mult(Const(1), r) => r
    case Mult(l, Const(1)) => l
    case Mult(Const(a), Const(b)) => Const(a * b)
    case _ => e
  }

  def eval(e: Expr): Expr = e match {
    case Add(l, r) => evalOne(Add(eval(l), eval(r)))
    case Mult(l, r) => evalOne(Mult(eval(l), eval(r)))
    case _ => e
  }

  // show returns a String (rather than printing) so it can back toString.
  def show(e: Expr): String = e match {
    case Const(x) => x.toString
    case _ => "The expression could not be simplified to a constant."
  }

  def main(args: Array[String]): Unit = {
    val e = Add(Mult(Add(Const(1), Mult(Const(0), Variable("x"))),
                     Const(3)), Const(12))
    val s = eval(e)
    println(s)
  }
}
Everything is an object in Scala, so we have to define the simplification functions as methods inside a singleton object. I named the functions evalOne and eval since it felt a bit odd to have a function named simplify inside a Simplify object.
C++
Few understand every corner of C++'s monstrous standard. It's huge. Surely, with so many features, there must be a quick way to solve this problem in C++. Well... no. It's a well-known lacuna in C++; see Open and Efficient Type Switch for C++ for a library built to implement pattern matching (an effort co-authored by Bjarne Stroustrup, the creator of the language). That said, here I'll use the Boost library, (A) because solutions based only on the standard library are contrived and (B) because Boost is about as close to standard as a third-party library gets.
#include <iostream>
#include <string>
#include <boost/variant.hpp>

struct Add;
struct Mult;

using Expr = boost::variant<
    int,
    std::string,
    boost::recursive_wrapper<Add>,
    boost::recursive_wrapper<Mult>>;

struct Add {
  Expr left, right;
  Add(const Expr &left_, const Expr &right_) : left(left_), right(right_) {
  }
};

struct Mult {
  Expr left, right;
  Mult(const Expr &left_, const Expr &right_) : left(left_), right(right_) {
  }
};
struct add_visit : public boost::static_visitor<Expr> {
  Expr operator()(int l, int r) const {
    return Expr(l + r);
  }
  template<class X> Expr operator()(int l, const X &x) const {
    return l == 0 ? Expr(x) : Add(Expr(l), Expr(x));
  }
  template<class X> Expr operator()(const X &x, int r) const {
    return r == 0 ? Expr(x) : Add(Expr(r), Expr(x));
  }
  template<class X, class Y>
  Expr operator()(const X &x, const Y &y) const {
    return Add(Expr(x), Expr(y));
  }
};

struct mul_visit : public boost::static_visitor<Expr> {
  Expr operator()(int l, int r) const {
    return Expr(l * r);
  }
  template<class X> Expr operator()(int l, const X &x) const {
    return l == 0 ? Expr(0) : (l == 1 ? Expr(x) : Mult(Expr(l), Expr(x)));
  }
  template<class X> Expr operator()(const X &x, int r) const {
    return r == 0 ? Expr(0) : (r == 1 ? Expr(x) : Mult(Expr(r), Expr(x)));
  }
  template<class X, class Y>
  Expr operator()(const X &x, const Y &y) const {
    return Mult(Expr(x), Expr(y));
  }
};
struct simplify1 : public boost::static_visitor<Expr> {
  Expr operator()(const Add &a) const {
    return boost::apply_visitor(add_visit(), a.left, a.right);
  }
  Expr operator()(const Mult &m) const {
    return boost::apply_visitor(mul_visit(), m.left, m.right);
  }
  template<class X> Expr operator()(const X &x) const {
    return Expr(x);
  }
};

struct simplify : public boost::static_visitor<Expr> {
  Expr operator()(const Add &a) const {
    auto left = boost::apply_visitor(simplify(), a.left);
    auto right = boost::apply_visitor(simplify(), a.right);
    auto add_lr = boost::apply_visitor(add_visit(), left, right);
    return boost::apply_visitor(simplify1(), add_lr);
  }
  Expr operator()(const Mult &m) const {
    auto left = boost::apply_visitor(simplify(), m.left);
    auto right = boost::apply_visitor(simplify(), m.right);
    auto mul_lr = boost::apply_visitor(mul_visit(), left, right);
    return boost::apply_visitor(simplify1(), mul_lr);
  }
  template<class X> Expr operator()(const X &x) const {
    return x;
  }
};

struct print_expr : public boost::static_visitor<std::string> {
  std::string operator()(int x) const { return std::to_string(x); }
  template<class X> std::string operator()(const X &x) const {
    return "The expression could not be simplified to a constant.";
  }
};
int main() {
  auto e = Expr(Add(
      Expr(Mult(Expr(Add(Expr(1), Expr(Mult(Expr(0), Expr("x"))))),
                Expr(3))), Expr(12)));
  auto s = boost::apply_visitor(simplify(), e);
  std::cout << boost::apply_visitor(print_expr(), s) << std::endl;
  return 0;
}
This is boost::variant in action. My biggest qualm with this kind of clever header-heavy code is that you get to see a big chunk of the developers' lifework unroll before your eyes every time you make a small mistake. If you want to know how this code works, you need to read up a bit on the visitor pattern.
C++ template metaprogramming is both awesome and terrifying. Mostly terrifying. Someone might find a simple way to use C++11/C++14 features to build a small and nice matching library. This library is halfway there (with boost::variant, it might already allow matching on types, but I have yet to try). C++ is great for scientific computing, with tons of awesome libraries, so I'd love to see a nice way to simulate pattern matching and sum types.
Julia
Julia is an attempt to build a fast and flexible replacement for R/Python/Matlab. An issue with most dynamic languages is that there is no elegant way to switch on type. To be fair, you cannot really do it with most static languages either (see the previous section...). However, Julia supports multiple dispatch based on type annotations. To be clear, it's quite different from the F#/Scala/Haskell approach. In those languages, it is possible to define sum types and pattern match on their constructors. With Julia, we define several methods of a function with type annotations and let the runtime dispatch on the types of the arguments. Multiple dispatch is in Julia largely for performance: it lets the compiler generate optimized code for each method and pick the best one, adding predictability while keeping the language dynamic. Here's the algorithm in Julia:
abstract Expr

type Const <: Expr; val::Int end
type Var <: Expr; name::String end
type Add <: Expr; left::Expr; right::Expr end
type Mult <: Expr; left::Expr; right::Expr end

add(x::Const, y::Const) = Const(x.val + y.val)
add(x::Const, y::Expr) = x.val == 0 ? y : Add(x, y)
add(x::Expr, y::Const) = add(y, x)
add(x::Expr, y::Expr) = Add(x, y)

mult(x::Const, y::Const) = Const(x.val * y.val)
mult(x::Const, y::Expr) = x.val == 1 ? y : (x.val == 0 ? Const(0) : Mult(x, y))
mult(x::Expr, y::Const) = mult(y, x)
mult(x::Expr, y::Expr) = Mult(x, y)

simplify1(a::Add) = add(a.left, a.right)
simplify1(m::Mult) = mult(m.left, m.right)
simplify1(e::Expr) = e

simplify(a::Add) = simplify1(Add(simplify(a.left), simplify(a.right)))
simplify(m::Mult) = simplify1(Mult(simplify(m.left), simplify(m.right)))
simplify(e::Expr) = e

printExpr(c::Const) = print(c.val)
printExpr(e::Expr) =
    print("The expression could not be simplified to a constant.")

e = Add(Mult(Add(Const(1), Mult(Const(0), Var("x"))), Const(3)), Const(12))
s = simplify(e)
printExpr(s)
Unlike pattern matching, we can only dispatch on type, so we need an if expression (the ? operator in Julia, just like in C); I could also have used a match macro, but it would be overkill here. It's not too inelegant, and at first I thought it was a good enough way to simulate sum types and pattern matching. It matters for the Julia ecosystem because these features are very useful for building solvers, logic and theorem-proving systems, and so on. Pretty nice for a technical computing platform. Unfortunately, while Julia does well with this simple example, I think an oddity of the language would soon bite us: return type declarations are not allowed, and yes, it is a big deal.
First, it's a question of correctness: you can return a float thinking you're returning an integer. That's awful. Also, since annotations are not allowed for the return value, it's impossible to annotate a higher-order function (a function taking functions as input). As a concrete example, first-order logic has predicates mapping objects to booleans and functions mapping objects to objects. We'd like to write:
solve(pre::(Object -> Bool), ...)
solve(fun::(Object -> Object), ...)
But instead, we'd have to test the type of the return value inside the function. That said, Julia is young and it might get return type declarations at some point.
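For contrast, here is a minimal F# sketch of the kind of annotations I'm after; the Obj type and the two solve functions are hypothetical placeholders, not part of any real solver:
/// A hypothetical object type, for illustration only.
type Obj = Obj of string

/// A solver taking a predicate (objects to booleans); the parameter
/// and return types are all spelled out.
let solveWithPredicate (pred: Obj -> bool) (objs: Obj list) : Obj list =
    List.filter pred objs

/// A solver taking a function (objects to objects).
let solveWithFunction (fn: Obj -> Obj) (objs: Obj list) : Obj list =
    List.map fn objs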
Conclusion
Sum types and pattern matching are awesome.
let world = "世界" in print $ "Hello " ++ world ++ "!"
let world = "世界" in print $ "Hello " ++ world ++ "!"