Sunday, 7 February 2010

Recursive descent parsing with OCaml's streams

The OCaml distribution includes a mutable stream type that allows sequences to be dissected efficiently. The camlp4 syntax for OCaml includes support for pattern matching over streams.

The following example uses the camlp4 stream-parsing syntax extension to implement a recursive descent parser for a simple calculator:

let rec factor = parser | [< ''('; e = expr; '')' >] -> e | [< 'c >] -> int_of_string (String.make 1 c) and term = parser | [< e1 = factor; e = term_aux e1 >] -> e and term_aux e1 = parser | [< ''*'; e2 = term >] -> e1 * e2 | [< ''/'; e2 = term >] -> e1 / e2 | [< >] -> e1 and expr = parser | [< e1 = term; e = expr_aux e1 >] -> e and expr_aux e1 = parser | [< ''+'; e2 = expr >] -> e1 + e2 | [< ''-'; e2 = expr >] -> e1 - e2 | [< >] -> e1 let rec statement = parser | [< e = expr; '';'; ''.'; stream >] -> Printf.printf "= %d\n%!" e; statement stream let rec read() = match input_char stdin with | ' ' | '\t' | '\n' -> [< read() >] | c -> [< 'c; read() >] let () = statement (read())

This program may be compiled and run as follows:

$ ocamlc -g -dtypes -pp camlp4o parser.ml -o parser $ ./parser 1+2+3;. = 6 1+2*3;. = 7 1+2*3+4;. = 11

No comments: