wiki:Bootstraping

Take 5

The magic and beauty of a Lisp comes from multiple causes, or, to be more precise, from the mutual evolution (a-la mutual recursion) of all its core features.

This unique set of features is remarkable small and carefully chosen, or we shall say miraculously discovered long time ago by John McCarthy and his disciples.

No feature could be removed, since the whole marvel will collapse and, as strange as it sounds, no features should be added, because it ruins the balance, hence the magic and beauty of a Lisp. Everything suddenly becomes cluttered and messy.

The reasoning behind the choices are simple and powerful - very few selected concepts which are absolutely necessary to keep the whole system in harmony, well-balanced and, most importantly - good-enough.

The concepts are:

  • first-class high-order functions (mostly without side-effects)
  • uniform prefix notation for lisp code (lisp forms)
  • one uniform representation for the code and data (cons-cells)
  • automatic memory management (garbage-collection)
  • persistent data structures (mostly immutable)

The most important feature is about using the same underlying list structure (made out of cons-cells) for the code (lisp forms, which are lists) and the data. Logically, there is no difference between compound procedures and compound data, and any lisp forms (code) could be transformed or created by procedures the same way they process the data.

This feature, together with prefix notation, provides an unique mechanism for performing syntactic transformation of lisp expressions (forms), keeping its semantics unchanged. The procedures which describes the transformations are written in Lisp itself, which provides absolutely unmatched flexibility. Lisp thus become a Meta-language for itself.

High-order first-class functions, together with a meta-language (of macros) is the most convenient way to define domain-specific languages (DSLs) in Lisp "withing" a Lisp (a-la recursion again).

As long as structure of code is uniform (the same list of symbols in prefix notation) without introducing additional data-structures to the reader function (we should faithfully use lists), everything remains simple and transparent. Lisp forms could not just be manipulated or even created at run-time, but, and this is unique feature of Lisp, transformed at read-time by special kind of procedures written in Lisp. This is how DSLs are implemented.


Let's try to bootstrap an abstract Lisp system from the most fundamental special forms and primitive procedures.

So-called primitive procedures are pre-compiled procedures which are bound to corresponding symbols in the environment.

Special forms are Lisp expressions, which must be evaluated (by eval) according to its own special rules, in contrast to the uniform general evaluation rule for all other Lisp forms (expressions).

There are our mantras: All you need is lambda. Lambda is the ultimate form. Even conditionals could be defined in terms of lambdas.

There is a rough outline of the process.

  • All we need is lambda.
  • Out of lambdas we have blocks of code and compound procedures.
  • Out of lambdas we have conditionals.
  • With primitive procedures 'cons 'car 'cdr we have Pairs.
  • With Pairs we glue everything together.
  • Out of Pairs we create Lists.
  • Reader procedure (read) transforms Lisp forms into Lists.
  • Primitive-procedures are pre-loaded and bound to symbols in the environment.
  • So are few special forms.
  • Compound-procedures are composed out of Lisp forms (which are lists).
  • Some compound-procedures are special, they called reader macros.
  • Macros are procedures that transform Lisp forms (lists) at read-time.
  • Interpretation is a recursive process of invoking 'read 'eval 'apply 'print procedures.
  • Resulting compound-procedures are evaluated (eval) at run-time.
  • Special forms are evaluated (by eval) according to their own rules.
  • Macros are evaluated by the 'read procedure at read-time.

It seems that lots of other languages have almost the same set of features, but they cannot come even close to Lisps in its expressive power and flexibility. What this means is that other languages have more semantic restrictions and its syntax is much more cluttered and more cumbersome than Lisp's.

Why? Because a Lisp has almost no syntax, just this uniform prefix notation for defining Lists using parenthesis, which then transformed by the 'read procedure into "natural" Lists made out of cons cells.

At this time special kind of procedures - the reader macros, which perform syntactic transformations of Lisp forms (keeping semantics unchanged) are evaluated. These procedures themselves are written in Lisp, which means that the whole language is available as its-own meta-language. (recursion?)

With this meta-language whole layers of DSLs could be build "within" Lisp (or "on Lisp") in order to "grow" and adapt Lisp to match the natural (human) language people are using to describe the domain. Cons cells, therefore, is a common "glue" that holds together the layered code, data and macros.


Special forms. Each special forms has its own evaluation rules:

  • lambda - lambda (creates/produces an anonymous function in-place)
  • if - conditional (evaluates only one expression, depending on given condtiton)
  • define - binding (associate a symbol with a value in the environment)
  • set! - assignment (overwrite the value of a symbol)

user-defined special forms:

  • let - local bindings for a block of code (implicit application of a lambda)
  • progn/begin/do - a block of code (implicit lambda) for side-effects

Primitive procedures for Pairs:

  • cons - construct (creates a pair)
  • car - returns car of a pair
  • cdr - returns cdr of a pair

primitive procedures of the reader (macros):

  • quote - do not evaluate (returns a symbol of a form "as is")
  • quasi-quote - do not evaluate these symbols (but do parse the form)
  • unquote - evaluate (de-reference) a symbol
  • unquote-splicing - flatten the list, evaluating each element (into a list)

rules:

  • unquote should be used only "inside" (quasi-quote)
  • unquote-splicing should be used only "into" a list inside (quasi-quote

Lets start with foundation: procedures and its application, cons cells and lists made out of them, and macros. These things are not separate, but interviewed with one another.

There are two kinds of procedures - so-called primitive procedures, which are pre-loaded and associated with corresponding symbols in the environment, and compound procedures, which are "user-defined" procedures, composed out of lisp forms (expressions) also associated with symbols (names) in environment.

Logically, both kinds are the same - a compound procedure which was created using lambda form (and associated with a symbol by define) is immediately available for use and is no different from any other pre-defined procedure.

So, all we need is lambdas

((lambda () 1))         ; in-place creation and invocation of a procedure which
1                       ; behaves like constant - has no parameters and always returns 1

((lambda (x) x) 1)        ; in-place creation and invocation of the identity function
1

((lambda (x) (* x x)) 2)  ; this is the famous square procedure
4

and lists

((lambda xs xs) 1 2 3)    ; in-place creation of a list
(1 2 3)

((lambda xs xs))
()

How does this work and why? It is because lisp forms are lists and the read function performs the conversion. Let's trace what is going on here. To do this we need some knowledge we still do not posses, so watch out.

First, evaluation of the lambda expression creates a procedure with variable number of arguments. How so? Because we explicitly indicated that the second part of the expression is not a list (there are no parentheses around it) but just a symbol. This means that whatever it could be, we just reference it by this symbol.

So what exactly could it be? Well, it will be (of course) a list. Why? Because a procedure supposed to have so-called parameter list (and almost everything in Lisp is a list). This list could have zero or more formal parameters. If we wish to have a procedure with fixed arity, that is with zero, one or two arguments, we shall explicitly define its parameter list as () or (x) or (x y). Notice that these are lists - look at the parenthesis.

When we do not know in advance how many arguments we would have, then we just use a symbol instead of explicit list, thus we indicate to the interpreter that this procedure has variable number of arguments and there is a symbol to which the actual argument list shall be bound after invocation. Everything is simple and logical when one understands hows and whys.

Notice how a lambda expression relies on the list structure (it is a nested list itself) and how list structures are made by invoking a procedure (if not, recur till enlightenment). This is an example of how all the features of Lisp are supporting and using each other.

Remember that Lists are made out of cons cells. For cons cells we need three procedures: constructor, which creates a cons cell

(cons 1 2)
(1 . 2)

and selectors such that they fulfill the contract:

(car (cons 1 2))      ; car is the first part of a cons cell
1
(cdr (cons 1 2))      ; cdr is the second part of a cons cell
2

There are some more magic forms which would be explained later (Well, these are binding forms, they associate symbols with values in an environment. Notice again how interdependent all the features of Lisp are).

(define x 1)                          ; this creates a binding between the symbol x and value 1
(define ys ((lambda xs xs) 1 2 3))    ; this binds a symbol to a list

For macros we need "list-aware" 'read procedure which we already have (by magic)

(read)

and four "primitive" procedures the 'read function will use:

(quote x)             ; returns a following lisp form "as is", unevaluated
x
(quasi-quote (f x))   ; returns list of symbols unevaluared but "looksinside it" for..
(f x)

(unquote x)           ; evaluates (de-references) a given symbol
(unquote-splicing xs) ; flattens a list evaluating all its elements into a list     

such that

(quasi-quote (unquote x))
1
(quasi-quote ((unquote-splicing ys)))
(1 2 3)

Remember the rules. First thing is mere evaluation of a symbol (de-referencing), while second one is more complicated - it evaluates all the elements of a given list ys and places (splices) them into some "outer" list inside a quasi-quote. It should be invoked only from inside of a list, inside a quasi-quote.

Why do we need such things? Because lists are the basic building block of everything in Lisp itself, so we need a few nice little helper procedures.

Because the reader procedure is "aware of" and runs these quoting procedures, it is possible to abbreviate them in a very clever way by binding another symbols to them (aliasing). Such abbreviation of syntax is called "syntactic sugar".

  • quote then is associated with a quotation symbol '
  • quasi-quote is a symbol we used to call back-quote `
  • unquote becomes comma , (beautiful, yeah?)
  • and unquote-splicing is aliased as two symbols ,@ (comma-at)

Now instead of typing

(quote 1)
1
(quote x)
x
(quote +)
+

we could just say

'1
'x
'+

or

,1
,x
,+

which is not semantically correct, because unquote should be used only within outer back-quote.

Now lets see some simplest transformations

`(x ,x)
`(xs ,xs)
`(xs ,@xs)

See how it works?

Now let's talk about the 'list procedure

This very beautiful definition (meditation on which could lead to a premature enlightenment)

(define list (lambda xs xs))

(list)
()

is possible only when we already have the working 'read procedure (part of Lisp's REPL) but how could we define the 'list procedure which can be used to write the 'read procedure?

(define (list . xs)
    (if (null? xs)
        '()
        (cons (car xs)
              (apply list (cdr xs)))))

where 'apply is another pre-defined procedure of Lisp which applies a given function (with variable number of arguments) to a given list of values.

Let's talk about the 'if special form and conditionals later.

Last modified 7 years ago Last modified on Dec 15, 2013, 4:40:25 AM
Note: See TracWiki for help on using the wiki.