sutton036

Posts

Showing posts with the label haskell

October 16, 2018

Some objective quantitative measurements I once posted on Usenet: Jon Harrop wrote: > Andreas Rossberg wrote: >> That is a wild claim, and I doubt that you have any serious statistics >> to back it up. > > Here are some statistics on the proportion of lines of code devoted to > type annotations from 175kLOC of production OCaml and 5kLOC of production > Haskell: > > OCaml: > Hevea 9.0% > ADVI 8.6% > FFTW3 5.2% > Unison 3.5% > MLDonkey 2.5% > LEdit 1.4% > MTASC 0.0% > HLVM 0.0% > > Haskell: > XMonad 19% > Darcs 12% For further comparison, here are some statistics for compilers written in OCaml and Standard ML: OCaml: 6.3% of 217kLOC MosML: 13% of 69kLOC

Background reading on the reference counting vs tracing garbage collection debate

January 31, 2018

Eight years ago I answered a question on Stack Overflow about the suitability of OCaml and Haskell for soft real-time work like visualization: " for real-time applications you will also want low pause times from the garbage collector. OCaml has a nice incremental collector that results in few pauses above 30ms but, IIRC, GHC has a stop-the-world collector that incurs arbitrarily-long pauses " My personal experience has always been that RAII in C++ incurs long pauses when using non-trivial data (i.e. nested, structured, collections of collections of collections, trees, graphs and so on), non-deferred reference counting has the same problem for the same reason, tracing garbage collectors like OCaml work beautifully but there are many notoriously bad tools like Java that have given tracing garbage collection a bad name. Now that I am revisiting this issue I am surprised to find many individuals and organisations repeating exactly the same experimental tests that I did and coming...

Efficient Parallel Stencil Convolution in Haskell

September 04, 2013

Until recently, all research on parallel Haskell programs neglected CPU caches and existing high-performance software solutions like BLAS , LAPACK , FFTW and OpenCV . We criticized the paper Regular, shape-polymorphic, parallel arrays in Haskell for comparing parallel Haskell programs only with poorly-written C code, particularly because this buries the real performance differences in unnecessary cache misses. In 2011, Ben Lippmeier, Gabriele Keller and Simon Peyton-Jones published a paper entitled Efficient Parallel Stencil Convolution in Haskell that is the first to address caches and locality in the context of parallel Haskell programs and to compare parallel Haskell programs to existing high-performance solutions. The paper discusses an algorithm for edge detection in computer vision that was 10x slower using the old REPA approach and walks through the design and implementation of a more advanced solution using new approaches that take CPU caches into account and, ultimately, a...

Hash table insertion performance: F# vs OCaml vs Haskell

September 03, 2013

Time taken to insert ten million int→T bindings for T as each of int , float , complex and string using F# on .NET 4.5, OCaml 3.12 and GHC 7.4.2: On a 3.4GHz Intel Core i7-3770 running Windows 7. Value types and reified generics facilitate .NET's very efficient Dictionary implementation when handling value types like int , float and complex (unmatched by any other language/VM including C, C++ and Java). F# is 50% faster than OCaml and Haskell using the reference type string perhaps because the .NET write barrier is significantly more efficient. Note: we use "int" to refer to ordinary 32-bit ints in F# and Haskell but 31-bit ints in OCaml. Using 32-bit ints in OCaml would be much slower because they are boxed (i.e. heap allocated).

IO throughput: Haskell vs F#

January 04, 2011

Many applications require the ability to churn through gigabytes of data. In such cases, high IO throughput is valuable. This article examines high-throughput IO in the Haskell and F# programming languages. The Haskell solutions use the ByteString library and the F# solutions use ordinary .NET 4 IO. An eager Haskell solution that loads the entire file into memory may be written as follows: import System import Bits import Data.List import Data.ByteString as B main = getArgs ...

Getting paid to remove features

December 15, 2010

Although the Industrial Haskell Group has yet to garner its first industrial member since its inception almost two years ago, they have managed the impressive feat of getting paid to remove a feature from Haskell. Specifically, to make it easier to build programs written in Haskell that do not rely upon the GNU Multiprecision library for arbitrary-precision arithmetic (bignums). We made this interesting observation when considering adding bignums using GMP as a primitive type for HLVM . Apparently, having bignums in the language is not very useful beyond irrelevant microbenchmarks like computing the digits of π . The slides here also criticize the CAML Consortium (which has garnered 11 members) for charging too little and states that the IHG aimed to garner five members each paying £12k per annum. Why has this target not yet been reached? Our guess is insufficient sales and marketing directed at decision makers in industry who could benefit from using Haskell. As an aside, we believ...

Texas Multicore Technologies on Haskell

December 05, 2010

US-based startup Texas Multicore Technologies have published some of the results they obtained using Haskell for parallel programming. Their results mirror our own : Thanks to Manuel Chakravarty of the University of New South Wales for drawing this to our attention .

More OCaml trends

August 06, 2010

Paolo from Italy pointed out that the number of blog posts on OCaml has continued to increase in recent years (6,420, 10,500 and 12,100 in 2007/8/9 according to Google blog search) and referred to the success of this year's OCaml meeting with 80 delegates in France. These are certainly encouraging results but it may be worth bringing more data to the table. Firstly, Google Trends can be used to graph the proportion of Google searches for different search terms over time. The following graph shows the trends for the keywords OCaml and F# since 2004: As you can see, the proportion of searches for OCaml (blue) is in steady decline whereas the proportion of searches for F# (red) is on the increase and the two crossed over in 2007. In fact, we have found that Google Trends correlates very strongly with our revenue. Secondly, we can examine statistics about the job market. The following bar chart illustrates the change in UK jobs from 2008 to 2010 for four functional programming langu...

Parallel generic quicksort in Haskell

August 01, 2010

Haskell has a history of making easy problems difficult. Perhaps the most infamous example was the Sieve of Eratosthenes, which is easily implemented in any imperative language but was so difficult to write in Haskell that almost all of the solutions that had been taught in universities and used in research for the preceding 18 years had been wrong until Melissa O'Neill published a seminal paper The Genuine Sieve of Eratosthenes that gave a beautiful description of what they had been doing wrong and how it should be corrected. Melissa's solution was to use a priority queue to implement a rolling wheel of numbers. The correct solution turned out to be 10× longer than a much simpler F# solution and a whopping 100× longer than the original bastardized algorithm in Haskell. Today, quicksort is the new Sieve of Eratosthenes. Again, the academics have addressed Haskell's failure by bastardizing the algorithm , trading orders of magnitude in performance for something that Haskel...

Haskell's hash tables revisited: part 2

July 22, 2010

Our previous blog post contained several benchmark results comparing the new GHC 6.12.3 with F#. We have since discovered some other points of interest regarding this benchmark. Firstly, the results for Haskell rely on the use of a garbage collector that prevents parallelism. If the more modern multicore-friendly GC is used (by compiling with -threaded and running with +RTS -N8 ) then the time taken increases from 4.5s to 10.6s. This is over 2× slower than before and now over 13× slower than F#. Naturally, the F# was already using the multicore-capable .NET garbage collector so this was an unfair bias in favor of Haskell. Secondly, the Haskell code exploits an algorithmic optimization on the assumption that the keys are unique. This is often not the case in practice and, again, the F# code did not exploit such an assumption so this was another unfair bias in favor of Haskell. A fairer comparison may be obtained by changing from the insert function to the update function in the Hask...

Haskell's hash tables revisited

July 18, 2010

Update: We have since discovered that these results were biased towards Haskell. Mikhail Glushenkov recently announced the Haskell Platform 2010.2 RC for Windows. In particular, this is the first release to include a version of the Glasgow Haskell Compiler (6.12.3) that has the new garbage collector fix to address the performance problems Haskell programmers have been experiencing with mutable arrays of boxed values over the past 5 years, such as the spines of hash tables. Naturally, we couldn't resist benchmarking the new release to see if it lives up to the promise of decent hash table performance. Even though this installer for the Haskell Platform is just a release candidate, we found that it installed smoothly and ran correctly first time. First up, a repeat of our previous benchmark which inserted 10,000,000 bindings with int keys mapping to int values into an initially-empty hash table: GHC 6.12.1: 19.2s GHC 6.12.3: 4.48s F# .NET 4: 0.8s The new version of GHC is clear...

Purely functional games

July 05, 2010

A recent blog post entitled " Follow up to functional programming doesn't work " recently caused a bit of a stir, encouraging Haskell programmers to dredge up the nearest things Haskell has to real computer games. A Super Mario clone , Frag (a reimplementation of part of Quake), a Gradius clone , 4Blocks and a game that blows all of the others away called Bloxors: This beautiful little OpenGL-based puzzle game weighs in at just 613 lines of purely functional code (many of the other games use unsafe* functions to introduce uncontrolled side effects). This is also one of the few programs that cabal install actually works on. Check it out here !

Haskell's hash table woes (again)

June 18, 2010

Alessandro Stamatto recently asked a question on Stack Overflow about the current state of Haskell's hash tables following our rediscovery last year that Haskell's hash tables were up to 32× slower than common alternatives like C++ and .NET and even slower than Python. Don Stewart edited Alessandro's original question , deleting the citation to the original study that provided a verifiable experiment with complete code, before writing an answer claiming that " the original citation doesn't present any code or benchmarks ". Although Don provided a variety of Haskell-based performance measurements, he omitted any measurements of decent hash table implementations like those readily available from C++ or any .NET language that are still 10× faster than anything that can be written in Haskell. Norman Ramsey's answer took this a step further by drawing the conclusion that "functional data structures perform pretty well" despite the overwhelming evid...

Regular, shape-polymorphic, parallel arrays in Haskell

June 02, 2010

This recent paper describes the state-of-the-art in Data Parallel Haskell (DPH). Some of the code referred to in the paper is available here . This post is in response to a question from one of the coauthors, Roman Leshchinskiy, asking what is wrong with the paper. The problem with this paper is, in a word, cache . Specifically, the widespread adoption of CPU caches two decades ago forced a change in the way programmers traverse data structures when performance is of interest. The old style was to iterate directly using a for loop. The new style is either a cache aware set of loops that exploit knowledge of the sizes of caches in the machine, or a cache oblivious style that uses divide and conquer to subdivide the problem until it fits in-cache regardless of the cache's size. This paper uses neither of the modern approaches, opting instead to use direct loops that have not been seen in real numerical code for several decades. The paper incorrectly asserts that these are "wi...

Why is Haskell used so little in the industry?

May 31, 2010

Bugspy asked an interesting question on Stack Overflow recently: It is a wonderful, very fast, mature and complete language. It exists for a very long time and has a big set of libraries. Yet, it appears not to be widely used. Why ? I suspect it is because it is pretty rough and unforgiving for beginners, and maybe because its lazy execution makes it even harder Jeff Atwood, cofounder of Stack Overflow, has since closed the question for being subjective and argumentative. Sadly, Marc Gravell chose to edit my answer and delete half of my comments so the discussion no longer makes sense. However, my response will be of interest to anyone considering gambling their own money on Haskell so I shall repeat it here. The first point of interest is Norman Ramsey's meta-answer which was written from his point of view as an academic. Norman explains why he believes that major institutes have already made "great use" of Haskell: "Don't tell that to Credit Suisse or Standa...

Purely functional Heap Sort in OCaml, F# and Haskell

May 30, 2010

Here's Markus Mottl's OCaml translation of Okasaki's purely functional leftist heap: module LeftistHeap (Element : ORDERED) : (HEAP with module Elem = Element) = struct module Elem = Element type heap = E | T of int * Elem.t...

ML's powerful higher-order module system

April 11, 2010

For around 30 years, the ML family of languages have been characterised by several unusual features including a novel and powerful higher-order module system. This language feature was pioneered by Dave MacQueen in the early 1980s and found its way into the Standard ML and OCaml languages. Higher-order modules allow one module to be parameterized over another module in the form of functors that map modules to modules. Higher-order modules turn out to be especially useful for parameterizing concrete data structures over more primitive data structures. Perhaps the most compelling application of higher-order modules comes from graph theory, where they allow algorithms over graphs to be abstracted over the concrete data structures used to represent the graphs. This application was discussed in our most recent OCaml Journal article " The A* algorithm " that describes a generalization of Dijkstra's shortest path algorithm that is widely used for route finding in game AI. Th...