reduce
in Python, we need to come to grips with what reduce
actually is. I find that the documentation for reduce
is rather poor, requiring a fair amount of thinking to determine just what you need to pass in as arguments. Instead, let's examine the origins of reduce
to better understand it.Reduce
originates outside of Python. It comes to Python through a contributed patch: About 12 years ago, Python aquired lambda, reduce(), filter() and map(), courtesy of (I believe) a Lisp hacker who missed them and submitted working patches.That's not very helpful, so let's derive
The fate of reduce() in Python 3000
Guido van Rossum, 2005
reduce
by abstracting from some examples. Instead of Lisp, I'll use Standard ML. SML has two key advantages over Lisp: the syntax is more approachable for the uninitiated, and its robust (Hindley-Milner) type system and pattern matching will make things a little clearer. I'll follow the definitions in SML rather than Python, so we'll refer to reduce
-like functions as folds, with argument order matching those in the Standard ML Basis Library.Consider adding up a list of integers. In SML, that list is a linked list, so we'll have a typical recursive solution:
fun sum [] = 0I've made use of pattern matching to define
| sum (h::t) = h + sum t
sum
in terms of two cases, that of an empty list and of a non-empty list. The double colon is a cons operator, joining the head h
and tail t
of the list.To define
map
, a similar recursion scheme is used:fun map f [] = []
| map f (h::t) = (f h) :: (map f t)
Map
is written as a curried function, so different list processors can be defined by specifying just the function map f
, only giving the list later.1 Just about any function over a list uses the same recursion scheme. Let's extract that recursions scheme into a higher order function called
foldr
:fun foldr f acc [] = acc
| foldr f acc (h::t) = f(h, (foldr f acc t))
Foldr
deconstructs a list from right to left, starting with the tail and accumulating the values. As with map
, foldr
is a curried function. The type of
foldr
can make it easier to understand:('a * 'b -> 'b) -> 'b -> 'a list -> 'bThe first term, in the parentheses, describes the function
f
; it takes a list element and an accumulator, returning an updated accumulator. The second argument to foldr
is an initial value for the accumulator. Third is the list to process. Importantly, the types of the accumulator and the list elements can differ. We can use
foldr
to define sum
and map
. Here's sum
:val sum = foldr Int.+ 0For
map
, we'll introduce a helper function and use foldr
on that:fun map f lst = letTo get the helper function
fun maphead(x, acc) = (f x) :: acc
in
foldr maphead [] lst
end
maphead
right, I just looked at the types needed for foldr
and wrote the obvious function to satisfy those types. While
foldr
allows elegant definition of our two example functions, there is a problem. As written, foldr
walks through the entire list before starting to do any work. If the list is long, this can cause a stack overflow. We need a tail-recursive version, leading us naturally to foldl
:fun foldl f acc [] = accIn contrast to
| foldl f acc (h::t) = foldl f (f(h, acc)) t
foldr
, foldl
processes the list from left to right, starting with the head. This definition of foldl
is tail recursive, so operates in constant space. It can also be used to define a tail recursive foldr
by reversing the list before passing it to foldl
. Foldl
is substantially similar to reduce
in Python. Next time, I'll translate the above definitions to Python, to make the connection transparent.1 Having a curried
map
in SML makes for a big conceptual difference from the map
in Python. It's quite reasonable to view map
as transforming a function from acting on values to a function acting on lists of values, without map
itself actually having anything to do with the list.
No comments:
Post a Comment