Friday, 16 February 2007

Benefits of pattern matching

Modern functional programming languages like OCaml provide many beneficial features not found in older languages like Lisp. ML-style pattern matching is one such feature that is heavily used by OCaml programmers both because it is fundamental to the language and because it is useful in many different situations.

In a recent thread on comp.lang.functional entitled "ML vs. Lisp", Paul Rubin asked what a Lisp programmer learning ML should focus on. I suggested pattern matching as one of the more important topics, giving the example of OCaml functions to simplify symbolic expressions:

let rec ( +: ) f g = match f, g with
| Q n, Q m -> Q (n +/ m)
| Q (Int 0), e | e, Q (Int 0) -> e
| f, Add(g, h) -> f +: g +: h
| f, g -> Add(f, g)

let rec ( *: ) f g = match f, g with
| Q n, Q m -> Q (n */ m)
| Q (Int 0), e | e, Q (Int 0) -> Q (Int 0)
| Q (Int 1), e | e, Q (Int 1) -> e
| f, Mul(g, h) -> f *: g *: h
| f, g -> Mul(f, g)

let rec simplify = function
| Q _ | Var _ as e -> e
| Add(f, g) -> simplify f +: simplify g
| Mul(f, g) -> simplify f *: simplify g

Two Lisp programmers responded with Lisp implementations of this function. Nathan Froyd gave an implementation resembling the OCaml code after pattern match compilation:

(defun simplify-common-funcs (xexpr)
(declare (optimize (speed 3)))
(if (atom xexpr)
xexpr
(labels ((make-expr (op f g)
(list op f g))
(optimize-plus (f g)
(cond
((eql f 0) g)
((eql g 0) f)
((and (numberp f) (numberp g)) (+ f g))
(t (optimize-associatively '+ #'optimize-plus f g))))
(optimize-multiply (f g)
(cond
((or (eql f 0) (eql g 0)) 0)
((eql f 1) g)
((eql g 1) f)
((and (numberp f) (numberp g)) (* f g))
(t (optimize-associatively '* #'optimize-multiply f g))))
(optimize-associatively (op opt-func f g)
(cond
((and (listp g) (eq op (first g)))
(let ((a (funcall opt-func f (second g))))
(funcall opt-func a (third g))))
(t (make-expr op f g)))))
(declare (inline make-expr optimize-associatively))
(let* ((op (first xexpr))
(f (simplify-common-funcs (second xexpr)))
(g (simplify-common-funcs (third xexpr))))
(cond
((eq op '+) (optimize-plus f g))
((eq op '*) (optimize-multiply f g)))))))

Pascal Constanza gave a Lisp implementation that used structs instead of s-exprs to represent the symbolic expression and leveraged method dispatch to gain some of the benefits of pattern matching:

(defgeneric simplify-add (x y)
(declare (optimize (speed 3)))
(:method ((x number) (y number)) (+ x y))
(:method ((x (eql 0)) y) y)
(:method (x (y (eql 0))) x)
(:method (x (y add))
(simplify-add (simplify-add x (add-x y)) (add-y y)))
(:method (x y) (make-add :x x :y y)))

(defgeneric simplify-mul (x y)
(declare (optimize (speed 3)))
(:method ((x number) (y number)) (* x y))
(:method ((x (eql 0)) y) 0)
(:method (x (y (eql 0))) 0)
(:method ((x (eql 1)) y) y)
(:method (x (y (eql 1))) x)
(:method (x (y mul))
(simplify-mul (simplify-mul x (mul-x y)) (mul-y y)))
(:method (x y) (make-mul :x x :y y)))

(defgeneric simplify (exp)
(declare (optimize (speed 3)))
(:method (exp) exp)
(:method ((exp add))
(simplify-add (simplify (add-x exp)) (simplify (add-y exp))))
(:method ((exp mul))
(simplify-mul (simplify (mul-x exp)) (simplify (mul-y exp)))))

Firstly, note that the expressiveness of OCaml's built-in pattern matching makes for very concise code (almost half the size). Secondly, note that OCaml's pattern matching provides static checking, catching many possible errors in pattern matches at compile time. Thirdly, performance measurements show that the simple OCaml implementation is 3-20x faster than the Lisp implementations.

So ML-style pattern matching has many benefits and is one of the most interesting aspects of this family of languages.

For more information on pattern matching, see here

Cheers,
Jon.

No comments: