wiki:ListComprehensions

List Comprehensions

List Comprehensions in Haskell is a concise, clear declarative way to define lists. This syntax is inspired by traditional list comprehension notation they use in Math.

Some talking heads are excited about what they call "generators" (abstracted stateful loops) while we are dismissing the generator pattern due to being stateful and clumsy. Python loves it. We are in love with "Iteratees".

In Haskell List Comprehensions is a special syntax (syntactic sugar) which could be considered as a sub-language (or a DSL) for defining lists in declarative way.

The simplest example could be something like this

> [x^2 |x <- [1..10]]
[1,4,9,16,25,36,49,64,81,100]

This should be read like in Math - a list of squares of x while x is drawn from the set of positive integers form 1 to 10 inclusive.

There is a lot of what's going on.

First, that [1..10] expression. It consists of two parts. Creating a list, using the standard notation, like this

> []
[]
> [1,2,3]
[1,2,3]

which is nothing but a different syntax for "in-place" definitions of lists, like we do with numbers or strings

> '()
()
> '(1 2 3)
(1 2 3)

Second, instead of writing down all the elements of the list, we are using another piece of special syntactic sugar - the range notation - 1..10 which means "all integers form 1 to 10",

The rough equivalent could be something like this:

> (define (range x y)
     (if (> x y)                       ; inclusive
         '()
         (cons x (range (+ 1 x) y))))
range
> (range 1 10)
(1 2 3 4 5 6 7 8 9 10)

So, we could think of this [x..y] notation as a syntactic sugar for calling the range x y procedure for lists.

What they call generator expression is another bit of special syntax v <- expr which roughly means "loop over", or as we would prefer, "traverse" the whole list.

So, the whole expression [x^2 |x <- [1..10]] is equivalent to

> (map (lambda (x) (* x x)) (range 1 10))
(1 4 9 16 25 36 49 64 81 100)

using the definition of range for lists given above and remembering that map is a convenient way to transform a whole list given a procedure which transforms a single element.

It is important to realize, that Haskell's list comprehensions is almost entirely declarative definition of a list, while in Lisps we must describe the way to perform a transformation - by using the map procedure.

But there is much more than that.

List comprehensions in Haskell could be nested.

> [(x,y) |x <- [1,2,3], y <- [4,5]]
[(1,4),(1,5),(2,4),(2,5),(3,4),(3,5)]

The meaning of this is very straight-forward - "for each element of the first list, take each element of the second".

This could be also described in imperative way as "nested loops" - all the iterations of the "inner loop" (which iterates over the second list) is executed with every iteration of the "outer loop".

In Haskell we could use bindings in nested "generators" (think of them as mere implicit lambdas)

[(x,y) |x <- [1..3], y <- [x..5]]

this is a lot like nested lets (or a let* expression). You could use bindings of the outer let (or lambda) inside the inner let (or lambda).

Remember that

> (let ((a 1))
   (let ((b a))
      (+ a b)))
2

is nothing, but

> ((lambda (a)
     ((lambda (b)
        (+ a b)) a)) 1)
2

in a Racket-like pseudo-code we could write

...
   (for/each ((x (in-range 1 3)))
      (for/each (y (in-range (x 5))) ; we could use x as a free variable
         (cons (cons x y) ... )))))

which is ugly! Nested lambdas would be much better.

Now let's look at this

concat xss = [x | xs <- xss, x <- xs]

It could be read as follows: create a list of x by taking each one element from xss as xs and then take each element of xs as x.

This is beautiful! Think of this as two nested recursive procedures

This re-formatting could convey the notion of nesting better

concat xss = [x | xs <- xss
                , x  <- xs]

In Scheme it could be

(define (concat xss)
  (if (null? xss)
      '()
      (append (car xss) (concat (cdr xss)))))

where append is, of course

(define (append  xs ys)
  (if (null? xs)
      ys
      (cons (car xs) (append (cdr xs) ys))))

Again, in Scheme we must define how do we traverse a list recursively, while in Haskell we omit all the details and just "declare" which values we need.

Guards are predicates, so we could use a predicate on x as a "filter"

> [x | x <- [1..10], even x]
[2,4,6,8,10]

which is semantically the same as

> (filter even? (range 1 10))
(2 4 6 8 10)

where filter is

(define (filter f xs)
  (cond ((null? xs)   '())
        ((f (car xs)) (cons (car xs) (filter f (cdr xs))))
        (else         (filter f (cdr xs)))))

and again we have to explicitly define all the details - consing and the base case of recursive traversal

Guards are nothing more than a syntactic sugar for a conditionals, so a predicate expression must be of type a -> Bool

So, it looks like List Comprehensions syntax in Haskell is a kind of DSL or a sub-language. We have seen something like this before - in Common Lisp looping constructs - loop, dolist macros.

But Haskell syntax is much more clever. We could think of a notation from Math - x | x <- [1..10], even x (which could be expanded into a list generation and its recursive traversal inside the filter procedure) inside a list [ ... ].

Now we could understand Haskell better. There is no miracles in it, only a syntactic sugar fest.)

Last modified 5 years ago Last modified on Nov 20, 2014, 3:12:18 PM
Note: See TracWiki for help on using the wiki.