Showing posts with the label quicksort

Parallel generic quicksort in Haskell

Haskell has a history of making easy problems difficult. Perhaps the most infamous example is the Sieve of Eratosthenes, which is easily implemented in any imperative language but proved so difficult to write in Haskell that almost all of the solutions taught in universities and used in research over the preceding 18 years were wrong. Melissa O'Neill then published a seminal paper, "The Genuine Sieve of Eratosthenes", that gave a beautiful description of what they had been doing wrong and how it should be corrected. Her solution was to use a priority queue to implement a rolling wheel of numbers. The correct solution turned out to be 10× longer than a much simpler F# solution and a whopping 100× longer than the original bastardized algorithm in Haskell. Today, quicksort is the new Sieve of Eratosthenes. Again, the academics have addressed Haskell's failure by bastardizing the algorithm, trading orders of magnitude in performance for something that Haskel...

OCaml vs F#: Burrows Wheeler Transform

The Burrows-Wheeler Transform (BWT) is a block-sorting data compression algorithm that acts as a preconditioner, in many cases dramatically improving the compression ratios of subsequent compression phases. This algorithm is the core of the bzip2 data compression utility, which is ubiquitous on Unix and often compresses much more effectively than zip and gzip. The core of the BWT algorithm is easily written in F#:

    let cmp (str: _ array) i j =
      let rec cmp i j =
        if i=str.Length then 1 else
          if j=str.Length then -1 else
            let c = compare str.[i] str.[j] in
            if c<>0 then c else
              cmp (i+1) (j+1)
      cmp i j

    let bwt (str: byte array) =
      let n = str.Length
      let a = Array.init n (fun i -> i)
      Array.sortInPlaceWith (cmp str) a
      Array.init n (fun i -> str.[(a.[i] + n - 1) % n])

However, this implementation is very slow: it is many times slower than the extremely heavily optimized C implementation found in the bzip2 program. The main ca...
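To make the transform in this excerpt concrete, here is a minimal sketch of the textbook formulation: sort every cyclic rotation of the input and keep the last character of each. The name `naiveBwt` is hypothetical, it takes a `string` rather than a `byte array`, and it is not the suffix-comparison code from the post, so the two need not agree on inputs that lack a unique end marker. It is meant only to show how the transform clusters equal characters.

```fsharp
// Hypothetical helper (not from the post): textbook BWT by sorting rotations.
// Builds n strings of length n, so it is for illustration only.
let naiveBwt (s: string) : string =
    let n = s.Length
    // Rotation i is the input starting at index i, wrapped around to the front.
    let rotation i = s.Substring(i) + s.Substring(0, i)
    let sorted =
        Array.init n rotation
        |> Array.sortWith (fun a b -> System.String.CompareOrdinal(a, b))
    // The transform is the last character of each sorted rotation.
    System.String(sorted |> Array.map (fun r -> r.[n - 1]))

printfn "%s" (naiveBwt "banana")   // prints "nnbaaa"
```

In the output the two `n` characters and the three `a` characters sit next to each other, which is the kind of locality that later compression phases can exploit.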