Wednesday, December 31, 2008

What's Wrong With Reduce?

Functional programming seems to be on the rise—there's even an O'Reilly book on Haskell, now. I'm not quite sure how that increase in popularity happened, but would guess that Paul Graham's advocacy of Lisp had something to do with it, as did Douglas Crockford's work with JavaScript. I also suspect that the higher-order constructs in Python played an important role, allowing imperative programmers to gradually start using a functional style.

However, there is something curious about the functional constructs in Python. One of the higher-order functions in Python—reduce—seems not to find much acceptance. Probably the best example of this is from the BDFL himself, who wrote:
…almost every time I see a reduce() call with a non-trivial function argument, I need to grab pen and paper to diagram what's actually being fed into that function before I understand what the reduce() is supposed to do. So in my mind, the applicability of reduce() is pretty much limited to associative operators, and in all other cases it's better to write out the accumulation loop explicitly.



The fate of reduce() in Python 3000

Guido van Rossum, 2005

I don't think the above is out of line with most Python users think, but it clashes severely with conventional thinking of users of functional programming languages.

For example, in an influential paper, John Hughes wrote
…a little modularisation can go a long way. By modularising a simple function (sum) as a combination of a “higher order function” and some simple arguments, we have arrived at a part (reduce) that can be used to write down many other functions on lists with no more programming effort.
and
…functional languages allow functions which are indivisible in conventional programming languages to be expressed as a combination of parts—a general higher order function and some particular specialising functions. Once defined, such higher order functions allow many operations to be programmed very easily. Whenever a new datatype is defined higher order functions should be written for processing it. This makes manipulating the datatype easy, and also localises knowledge about the details of its representation. The best analogy with conventional programming is with extensible languages—it is as though the programming language can be extended with new control structures whenever desired.


Why Functional Programming Matters

John Hughes, 1984

My own experience with ML is quite in line with what Hughes wrote: reduce, or foldl as it is also known, is used all the time, and seems very natural.

Why the difference? I'm not aware of any case where GvR has demonstrated much knowledge of functional programming—maybe he was just speaking outside his area of expertise, and got it wrong? The PLT Scheme people may have thought so, satirizing him shortly after in a (quite funny!) April Fool's Day announcement.

However, I think that explanation is overly facile. Personally, while I use folds in functional languages, I rarely do in Python. I think there are real reasons for that, and will examine them in future blog posts.

No comments: