Tuesday, January 6, 2009

W3R-5: Cart Follows Horse

Higher order functions can be used to implement control abstractions, but that doesn't mean all higher order functions provide useful control abstractions. As we've seen, a useful control abstraction in one language may not work so well in another. In SML, folds are frequently used, while the analogous reduce is uncommon in Python. In retrospect, this doesn't seem so surprising: SML is principally a functional programming language depending on declarative (i.e., recursive) control flow, while Python is a hybrid procedural and object-oriented programming depending on imperative control flow. Where the two languages overlap is where reduce becomes useful, but that overlap is relatively small.

If reduce abstracts over the wrong pattern of control flow, then what are the correct patterns for Python? Let's return to map and see what we can conclude. An imperative definition of map is natural, along the lines of:
def map_imper(function, iterable):
acc = []
for x in iterable:
return acc
This version of map utilizes a typical imperative pattern of control, consisting of mutating an object based on values encountered as we iterate though a collection using a for loop. Python has no higher order procedure capturing this imperative pattern; there is such a higher order procedure for Standard ML, where it is called app.

Let's define app for Python. It's easy:
def app(mutate, iterable):
for x in iterable:
Note that app has no useful return value, always returning None, emphasizing that it is always called for its side effects. Using app to define map, we obtain
def map_app(function, iterable):
acc = []
def mutate(x): acc.append(function(x))
app(mutate, iterable)
return acc
The definition of map_app is no simpler than that of map_imper, but does illustrate an appropriate control abstraction.

Although not needed for map, several generalizations of app could be made. One possibility would be to include optional arguments in order to allow slicing of the iterable:
import itertools
def app(mutate, iterable, start=None, stop=None, step=None):
if start is not None or stop is not None or step is not None:
iterable = itertools.islice(iterable, start, stop, step)
for x in iterable:
Many other variations are possible. I'll not suggest which is correct, as that is a question best addressed by empirical analysis of existing code bases.

Now let's reconsider our other motivating example, sum. An imperative implementation is quite similar to that for map:
def sum_imper(numbers):
acc = 0
for x in numbers:
acc += x
return x
This is the same pattern of control as used in map, mutation within a for loop, but it cannot be expressed using app. In sum, the mutation is that of the variable acc. We can't pass in a variable to app and expect it to be modified, as only the local variable would be changed. That is, app mutates values, while sum mutates a variable.

I can think of two solutions. First, we could introduce another higher order procedure to abstract over what happens in sum. That procedure—function, in this case—is just reduce. Second, we could adapt the value from an immutable number to some mutable object. Which to choose is pretty much a matter of taste. Having two substantially similar control flow mechanisms may be undesirable, but reduce is also a known quantity. The adaptor approach is not as well explored, but may turn out to be more natural. Let's take a look.

When combined with app, the adaptor needs to produce the same effect as reduce. Since app is handling the control flow, the adaptor just needs to handle the accumulation. One possible definition is to use an Accumulator class, defined as:
class Accumulator(object):
def __init__(self, function, init):
self.function = function
self.value = init

def update(self, x):
self.value = self.function(self.value, x)
As an example of usage, here is an accumulator for summation:
>>> acc_sum = Accumulator(operator.add, 0)
>>> acc_sum.update(4)
>>> acc_sum.value
>>> acc_sum.update(37)
>>> acc_sum.value
Putting together an Accumulator instance and app, we can give another definition of sum:
def sum_adapt(numbers):
acc = Accumulator(operator.add, 0)
app(acc.update, numbers)
return acc.value
As a second example, here is yet another definition of a fold:
def fold_app(function, init, iterable):
acc = Accumulator(function, init)
app(acc.update, iterable)
return acc.value
Personally, I find these definitions to be quite natural and appealing; app and Accumulator would fit in well with my programming style.

No comments: