Friday, 9 November 2007

Most popular functional languages on Linux

The Linux operating system offers developers an unusually wide variety of tools and, in particular, an incredible variety of programming languages. This post describes our attempt to measure the popularity of functional programming languages on Linux.

There are many language popularity comparisons out there. The TIOBE programming community index is a famous one, based upon the number of hits reported by various search engines. Like every comparison, the TIOBE results are flawed in several ways. Some of the most important problems with this particular measure are:

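  • Legacy: older languages have accumulated more out-of-date web pages, which still count as hits.
  • Unpopularity: a page criticizing a language counts as a hit just as much as a page praising it, so this metric is an equally good measure of the unpopularity of a language.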
  • Subjectivity: the estimated number of search results returned by search engines is highly dependent upon unrelated factors like Google's algorithm du jour.

We are going to try to measure language popularity on Linux using a more objective metric: the results of the Debian and Ubuntu package popularity contests. Among other things, these results tell us how many installations there are of the core development tools for each language. Summing those installation counts should give a more accurate estimate than search hits of the number of people actually developing in each language.

Before we go into detail, let's consider some of the flaws in this approach. Firstly, the absolute number of installations is not equivalent to the number of users: many users will have their favorite language installed on several different systems. Secondly, programmers using a language with several implementations are likely to have more than one compiler for that language installed on each machine. This will bias the results in favor of languages with multiple implementations (such as Haskell, with GHC and Hugs). Finally, these results only apply to Ubuntu and Debian users who elected to contribute to the popularity contests. We are assuming that other Linux distributions will give similar results, and we can test this to some extent by comparing the results between Debian and Ubuntu.

The results were compiled by summing the contributions from the following major development packages for each language:

  • Erlang: erlang-base
  • OCaml: ocaml-nox
  • Haskell: ghc6 and hugs
  • Lisp: clisp, sbcl, gcl and cmucl
  • Scheme: mzscheme, mit-scheme, bigloo, scheme48 and stalin
  • Standard ML: smlnj, mosml and mlton
  • Eiffel: smarteiffel
  • Mercury: mercury
  • Oz: mozart
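
The summation described above can be sketched in a few lines of Python. This is only an illustration, not the script used for this post. It assumes the popularity contest data is available as whitespace-separated text whose first three columns are rank, package name and installation count, with header lines starting with "#". The function names and the numbers in SAMPLE are made up for the example and are not real results.

```python
from typing import Dict, Iterable, List

# Package groups exactly as listed in the post.
LANGUAGE_PACKAGES: Dict[str, List[str]] = {
    "Erlang": ["erlang-base"],
    "OCaml": ["ocaml-nox"],
    "Haskell": ["ghc6", "hugs"],
    "Lisp": ["clisp", "sbcl", "gcl", "cmucl"],
    "Scheme": ["mzscheme", "mit-scheme", "bigloo", "scheme48", "stalin"],
    "Standard ML": ["smlnj", "mosml", "mlton"],
    "Eiffel": ["smarteiffel"],
    "Mercury": ["mercury"],
    "Oz": ["mozart"],
}


def parse_installs(lines: Iterable[str]) -> Dict[str, int]:
    """Map package name -> installation count.

    Assumed layout (not confirmed by the post): "rank name inst ...".
    Lines that do not fit that layout, such as headers, are skipped.
    """
    installs: Dict[str, int] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit() or not fields[2].isdigit():
            continue
        installs[fields[1]] = int(fields[2])
    return installs


def installs_per_language(installs: Dict[str, int]) -> Dict[str, int]:
    """Sum the installation counts of each language's packages.

    Packages missing from the data contribute zero.
    """
    return {
        language: sum(installs.get(package, 0) for package in packages)
        for language, packages in LANGUAGE_PACKAGES.items()
    }


# Hypothetical data for illustration only; these are NOT real popcon numbers.
SAMPLE = """\
#rank name inst vote old recent no-files (maintainer)
1 ocaml-nox 40 10 20 10 0 (Placeholder)
2 ghc6 30 10 15 5 0 (Placeholder)
3 hugs 12 2 8 2 0 (Placeholder)
"""

if __name__ == "__main__":
    totals = installs_per_language(parse_installs(SAMPLE.splitlines()))
    for language, count in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{language}: {count}")
```

Running the same function separately over the Debian and the Ubuntu data keeps the two distributions comparable, which is how the results are presented here.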

The results are illustrated in the graph above. Sure enough, the number of installations is similar between Debian and Ubuntu and, therefore, it seems likely that these results will reflect the trend for most Linux users.

We found the results surprising for several reasons:

  • Lisp is often cited as the world's most popular functional programming language yet it comes 4th after OCaml, Haskell and Erlang in our results.
  • There is no clear preference for a most popular functional programming language. Instead, we find that OCaml, Haskell and Erlang are all equally popular.
  • Despite the bias against OCaml, which has only a single implementation, this language still appears to be among the most popular functional programming languages on Linux. This is even more surprising because there are few OCaml books.

Following Microsoft's productization of their OCaml derivative F#, it seems likely that OCaml will continue to grow in popularity on the Linux platform.
