wiki:Folds

Folds

TODO:

last     :: [a] -> a
last xs  =  foldl (\_ x -> x) lastError xs
-- a necessary cludge
lastError  :: a
lastError  =  errorEmptyList "last"

Foldr

foldr encapsulates a recursive traverse of a list (of nested conses and the empty list - (cons 1 (cons 2 (cons 3 '()))) with '() as the base-case of recursion, and uses the initial element for a given binary combiner procedure.

Another way of "visualizing" what right fold or foldr does is to imaging that it transforms a whole given list (re-writes it) into a new expression (data-structure), such that all the "conses" (:) will be replaced with given binary combining function f and the initial value i will be used instead of [] at the end.

Then this "new expression" will be evaluated to produce a final value.

Another way of visualizing what folds are doing is to realize that expression foldr (+) 0 [1,2,3,4,5] is equivalent of transforming the list [1,2,3,4,5] into an expression 1 + (2 + (3 + (4 + (5 + 0))) (parenthesis in Haskell are used for precedence, like in Math) and then evaluate it, while expression foldl (+) 0 [1,2,3,4,5] "produces" an expression ((((0 + 1) + 2) + 3) + 4) + 5 and evaluates it.

Think first in terms of plain substitution, without recursion.

foldr f z [1,2,3,4] = f 1 (f 2 (f 3 (f 4 z)))
foldl f a [1,2,3,4] = ((( a `f` 1) `f` 2) `f` 3) `f` 4

Right fold (foldr) is a non-tail-recursive process - O(n) growth of stack space.

(define (foldr f z xs)
    (if (null? xs)
        z
        (f (car xs) (foldr f z (cdr xs)))))

same in Haskell

foldr f z []     = z
foldr f z (x:xs) = f x (foldr f z xs)

and Erlang

foldr(F, Z, [H|T])  ->  F(H, foldr(F, Z, T));
foldr(F, Z, []) when is_function(F, 2) ->  Z.

In lazy languages such as Haskell foldr could be applied to so-called infinite lists like [1..] or let ones = 1 : ones when f is non-strict function. Each call to foldr then would "go only one level deep into the list xs" and a thunk (or a promise) would be generated, to be called on the next recursive call for foldr.

Also notice that recursion on "infinite lists" has no base case, so early termination must be encoded into f. Think of take or short-circuiting or.

Strict function on an infinite list will not terminate (no base case will be ever reached).

Now realize that using foldr with a strict function (like +) in a lazy language is a "bad idea", because it generates a long chain of thunks "up to the base case" - exactly like it would be in case of a strict language like Erlang.


Giving that we could re-define (re-write) almost any recursive procedure which does traverse of a list of nested conses in terms of foldr

First, realize that foldr (:) [] replaces all the (:) with (:) and [] with []. In other words, it builds a new list.

We could define a copy procedure like this

copy = foldr (:) []

so

> copy [1,2,3]
[1,2,3]

builds a new list (technically - a new "spine" of a list aliasing old values).

In Scheme we have to write explicit lambda while Haskell uses "section" - another bit of syntactic sugar.

(define (copy xs)
   (foldr (lambda (x xs) (cons x xs)) '() xs))

In Haskell (:) is an abbreviation for \x -> \y -> x : y which is roughly equivalent to (lambda (x y) (cons x y)) given that Haskell uses curried procedures to "emulate" a procedure with multiple arguments, using syntactic sugar for definition and application for curried procedures. Details are here

There are few more classic recursive procedures on lists defined in terms of foldr and a combine procedure:

product = foldr (*) 1

Note that for multiplication 1 is an identity. 0 would turn everything to 0. it is identity for addition.

and = foldr (&&) True

notise the initial value.

length = foldr (\_ a -> a+1) 0

which is nothing but

(define (length xs)
   (foldr (lambda (_ a) (+ 1 a)) 0 xs))

every using list comprehensions

every f xs = and [f x |x <- xs]

which is the same as with explicit map

every f xs = and (map f xs)

which is

every f = foldr (\x a -> f x && a) True

remember that f must be a predicate function of type a -> Bool,

Here is tail-recursive reverse which uses concatenation procedure (++) also known as append

reverse []     =  []
reverse (x:xs) =  reverse xs ++ [x]

It is efficient (it will be TCO-ed by compiler) but (++) itself is non-tail recursive:

[] ++ ys     =  ys
(x:xs) ++ ys =  x : xs && ys

So, in terms of foldr we could define it as

xs ++ ys = foldr (:) ys xs

which shows chow exactly foldr encapsulates recursion.

There is another definition of {{append}}} which captures reverse in order of arguments ys xs

append = flip (foldr (:))

Now reverse

reverse = foldr (\x xs -> xs ++ [x]) []

Notice that append accepts only lists as its arguments, so we have to transform x into [x] before applying ++ and that we have reversed order of concatenation.

Here is the same in Scheme

(define (reverse xs)
   (foldr (lambda (x xs) (append xs (list x))) '() xs)

where

(define (append xs ys)
   (foldr (lambda (x xs) (cons x xs)) ys xs))

note how we have to write explicit lambda for consing.

map in terms of foldr and compose - (.)

map f = foldr ((:) . f) []

where

f . g = \x -> f (g x)

which is just

(define (compose f g)
   (lambda (x) (f (g x))))

but we are using lambda instead of compose

(define (map f xs)
   (foldr (lambda (x xs) (cons (f x) xs)) '() xs))

because we don't have implicit currying in Scheme, and our cons requires exactly two arguments, while in Haskell (:) is a curried procedure.

and filter

filter p = foldr f []
  where
    f x xs
        | p x       = x : xs
        | otherwise = xs

Foldl

Left fold foldl is a tail-recursive process (a constant-space loop with an accumulator, that's why we are using a instead of i for a binding).

(define (foldl f a xs)
    (if (null? xs)
        a
        (foldl f (f a (car xs)) (cdr xs))))

or in Haskell

foldl f a []     = a
foldl f a (x:xs) = foldl f (f a x) xs

or in Erlang

foldl(F, A, [H|T])  ->  foldl(F, F(H, A), T);
foldl(F, A, []) when is_function(F, 2)  -> A.

foldl is tail-recursive, will be TCO-ed.

In Haskell, due to its lazy evaluation the accumulator will "grow" by building up an expression like (((0 + 1) + 2) + 3) + 4 without evaluating it, until foldl will reach the base case of recursion.

There is a quick fix for this

foldl' :: (a -> b -> a) -> a -> [b] -> a
foldl' f a []     = a
foldl' f a (x:xs) = let a' = f a x in seq a' (foldl f a' xs)

which just "evaluates" accumulator a' on each recursive call, like an eager evaluation version (Scheme) would do.

Right fold could be defined as a "flipped" Left fold on a reversed list, to avoid non-tail recursion.

(define (foldr f z xs)
    (foldl (flip f) z (reverse xs))))

where

(define (flip f)
    (lambda (x y) (f y x)))

This is more efficient definition of reverse - it runs in O(n) but still does consing (which is inefficient in non-pool-allocated implementations).

(define (reverse xs)
   (foldl (flip cons) '() xs))

same in Haskell

reverse = foldl (flip (:)) []

Notice that there is no append.

foldl goes from left to right and flip does the job.

Last modified 4 years ago Last modified on Feb 28, 2017, 9:14:07 PM
Note: See TracWiki for help on using the wiki.