Monday, 25 December 2006

Discrete wavelet transforms

Following an interesting thread in comp.lang.python, I spent some time implementing convolution and lifting-scheme based implementations of the discrete wavelet transform in C++, OCaml and F#.

The original Python implementation used the lifting scheme algorithm and was implemented using slicing:

max_level = int(math.floor(math.log(x.shape[axis],2)))
if level is None:
level = max_level
assert level <= max_level, "Level param too high"
coeffs = []

cA = x.swapaxes(0, axis)
while level > 0:
level -= 1
even = cA[::2]
odd = cA[1::2]
even += C1*odd
odd[0] -= C2*even[0] + C3*even[-1]
odd[1:] -= C2*even[1:] + C3*even[:-1]
even[:-1] -= odd[1:]
even[-1] -= odd[0]
even *= C4
odd *= C5
cA, cD = even, odd

coeffs.append(cD.swapaxes(axis, 0))
coeffs.append(cA.swapaxes(axis, 0))

return coeffs

The Python implementation eagerly allocates intermediate arrays that are trivial combinations of small numbers of elements from existing arrays.

There are several different ways to translate this into OCaml. If we start from scratch with the task of computing this wavelet transform then we are most likely to write in a C style using convolution:

let rec d4_aux tmp a n =
let n2 = n lsr 1 in
for i=0 to n2-2 do
tmp.[i] <- a.[i*2] *. h0 +. a.[i*2+1] *. h1 +. a.[i*2+2] *. h2 +. a.[i*2+3] *. h3;
tmp.[i+n2] <- a.[i*2] *. g0 +. a.[i*2+1] *. g1 +. a.[i*2+2] *. g2 +. a.[i*2+3] *. g3;
tmp.[n2-1] <- a.[n-2] *. h0 +. a.[n-1] *. h1 +. a.[0] *. h2 +. a.[1] *. h3;
tmp.[2*n2-1] <- a.[n-1] *. g0 +. a.[0] *. g1 +. a.[1] *. g2 +. a.[2] *. g3;
Array.blit tmp 0 a 0 n;
if n > 4 then d4_aux tmp a (n lsr 1)

let d4 a =
let tmp = Array.create (Array.length a) 0. in
d4_aux tmp a (Array.length a)

This convolution based approach is 5x faster than the Python. However, some people argued that this is an unfair comparison because the implementations in different languages are using different algorithms.

We can also mimic the slicing approach used by Python. A naive translation generates many intermediate arrays (7 array allocations per iteration) but this is easily reduced by composing closures rather than allocating new arrays (a deforesting optimisation):

let rec d4_aux cA = function
1 -> [cA]
n ->
let n = n/2 in
let even i = cA.(2*i) in
let odd i = cA.(2*i + 1) in
let even i = even i +. c1 *. odd i in
let odd = function
0 -> odd 0 -. c2 *. even 0 -. c3 *. even (n-1)
i -> odd i -. c2 *. even i -. c3 *. even (i-1) in
let even = function
i when i = n-1 -> even i -. odd 0
i -> even i -. odd (i+1) in
let even = init n (fun i -> c4 *. even i) in
let odd = init n (fun i -> c5 *. odd i) in
odd :: d4_aux even n

let d4 a = d4_aux a (Array.length a)

This implementation allocates only two arrays and fills them from a composition of closures. This is slower than the cache-friendly convolution-based approach but is still 60% faster than the Python.


Saturday, 18 November 2006

Camlp4 exposed by Bruno

Bruno De Fraine did an excellent job answering a question I posted about using camlp4 to statically differentiate an OCaml expression:

Didn't occur to me to get familiar with camlp4's representation of OCaml's AST by playing with it in the top level. It's actually really easy... :-)

Friday, 17 November 2006

OpenGL demos on-line

We recently revamped the site of my OCaml book. People might be interested in the OpenGL demos that are on-line there:

Let me know what you think of the new design!


Will Farr's blog

I recently discovered an excellent blog by Will Farr. Check out this entry on symbolic differentiation:
Welcome to my first ever blog site. I'll be uploading anything of interest that I find or make.

In the mean time, please buy a few copies of my OCaml book: