wiki:Tutorial9

Folds

This is the classic foldl function. It "folds" a list from left to right, applying a given function to each element of a list and a value accumulated so far. It is a proper tail-recursion.

(define foldl
  (lambda (fn acc xs)
    (if (null? xs)
	acc
	(foldl fn (fn acc (car xs)) (cdr xs)))))

If we wish to apply + function to a list of numbers we would get the sum of all its elements

1 ]=> (foldl + 0 '(1 2 3 4))
;Value: 10

There is some simple visualization of how it works with each recursive call

1 ]=> (foldl d 0 '(1 2 3 4))
acc: 0, x: 1
acc: 1, x: 2
acc: 3, x: 3
acc: 6, x: 4
;Value: 10

Everything is pretty straightforward - imperative, constant space recursive function, we are traversing a list from left to right in simple and efficient way.

The true power of foldl comes from the possibility of supplying one's own combining function of two arguments (lambda (acc x) ...) while the elements of a list could be of any type. Foldl is a polymorphic function, it deals with the structure of a list, not with its elements.

So far so good, what if we would like to traverses a list from right to left?

Well, there is a naive, straightforward and very inefficient attempt to rewrite foldl:

(define foldr
  (lambda (fn acc xs)
    (if (null? xs)
	acc
	(foldr fn (fn (last xs) acc) (butlast xs)))))

For those, who remember college - last is O(n) and butlast is O(n), but it works correctly

1 ]=> (foldr + 0 '(1 2 3 4))
;Value: 10

Traditionally we switch the order of arguments for supplied function for right-to-left folds - the element comes first, and accumulator the second (lambda (x acc) ...) so, our visualization becomes:

1 ]=> (foldr d 0 '(1 2 3 4))
x: 4, acc: 0
x: 3, acc: 4
x: 2, acc: 7
x: 1, acc: 9
;Value: 10

This makes sense, if we imagine that a given function "eats" one element form the end of the list and accumulates the result. In case of foldl a function "eats" an element from the head of a list.

The story continues. There are very common situation, when the initial element for the accumulator comes from the list itself. In case of right fold it could be the last element of a list.

There is a function called foldr1 - it folds list from right to left, and uses its rightmost element as the initial value for an accumulator.

(define (last? xs) (null? (cdr xs)))

(define (foldr1 f xs)
   (cond ((null? xs) xs)
         ((last? xs) (car xs))
         (else (f (car xs)
                  (foldr1 f (cdr xs))))))

This is a moment to appreciate the beauty of recursion.

1 ]=> (foldr1 d '(1 2 3 4 0))
x: 4, acc: 0
x: 3, acc: 4
x: 2, acc: 7
x: 1, acc: 9
;Value: 10

It traverses the list only once, until it reaches the base case, and then, while unfolding the recursion, performs the computation in a proper order.

This means we could define foldr in terms of foldr1 and make it a bit more efficient:

(define foldr
  (lambda (fn acc xs)
    (foldr1 fn (append xs (cons acc '()))))

append is still O(n) function, and foldr1 itself is O(n) too, but it is much better than calling last and butlast for each element. This is still a naive reasoning.

Now meet the beautiful, classic, declarative but, unfortunately, NOT tail-recursive fold

(define (fold fn b xs)
    (if (null? xs)
        b
        (fn (car xs)
            (fold fn b (cdr xs)))))

This fold[r] function should be printed, framed and nailed to a wall, along with map and filter functions.

Another tricky possibility is to define foldr as a reverse of forldl, but in this case we must keep the order of arguments for combining function in the left fold's way (lambda (acc x) ...) - there is flip for it

(define (flip f)
   (lambda (x y) (f y x)))

(define (foldr fn acc xs)
    (foldl (flip fn) acc (reverse xs))))

These two also worth to be printed.

What about foldl1? Well, it is just foldl on the list without the first element

(define foldl1
  (lambda (fn xs)
    (foldl fn (car xs) (cdr xs))))

And here we go:

1 ]=> (foldl1 + '(0 1 2 3 4))
;Value: 10

1 ]=> (foldl1 d '(0 1 2 3 4))
acc: 0, x: 1
acc: 1, x: 2
acc: 3, x: 3
acc: 6, x: 4
;Value: 10

Now a few important ideas to realize.

  1. The structure of the data, lists in this case, determines the structure (or shape) of a function that traverses it.
  1. Due to asymmetrical nature of a list the functions also has different shape - to traverse a list from head to tail is not the same as from tail to head, due to the very nature of a list as a recursive data-structure.
  1. Sometimes it is much better define more general functions in terms of more specialized ones and vise versa.

Folding functions, by the way, considered to be very difficult to grasp, but I cannot see any difficulties. All we need is lambda and attention to details.

Last modified 7 years ago Last modified on Jul 30, 2013, 8:31:53 AM
Note: See TracWiki for help on using the wiki.