Version 5 (modified by schiptsov, 9 years ago) ( diff )


There are some very basic concepts to understand.


  1. The concepts and notations from Math are too complicated for a programming language.
  2. Using indentation to represent nested elements of a structure is a brilliant idea, but we cannot use it without breaking unity.

List is what a kid will say if one ask him what he like to eat - just N words:


If we wish to write down those words it will form a sequence:

apples oranges grapes

According to the rules of writing language we should place comas between elements of a list:

apples, oranges, grapes

So far, so good, but what if we want to refer to this list an unique artifact, an unique object. How could we distinguish it from another sequences of words?

In the realm of writing language we could say it in two sentences:

This kid likes fruits: apples, oranges, grapes. We will bring some next time.

or we can use parenthesis to explain something without breaking a sentence:

This kid likes fruits (apples, oranges, grapes) so we will bring some next time.

What is the more obvious way to convey the structure of a sentence - using colons and comas or using parenthesis?

I believe, parenthesis is more natural and more readable. Notice, that with parenthesis comas become redundant.

(apples oranges grapes) 

Less symbols, less noise - more clarity.

This will not look so silly if you think how children, and especially autistic children, cope with writing language. When they cannot get complex rules they just drop them, because inability to grasp gives them unpleasant, physical sensation.

It is OK to use colons, comas and dots in the end to convey a structure of a sentence, but parenthesis, like a borders, are more natural and less complicated way to do so. But we cannot put each sentence into parenthesis? Can't we?

Consider the following:

This is notation of a set from Math.

fruits: {apples, oranges, grapes}

This is expression in a pseudo-code.

fruits = {apples, oranges, grapes}

This is a list structure.

define fruits (apples oranges grapes)

Same information, same structure, different symbols.

Now, how could we abstract this definition as an unique object? Exactly the same ways as before - by just adding two parenthesis:

(define fruits (apples oranges grapes))

Don't you see? This notation is less noisy, it feels natural. Isn't it good-enough?

Sophisticated enough person could point out, that in the previous two notations we are also conveying the type of what "fruits" is a name of. It is a set.

The questions are: is it absolutely necessary to always specify the exact type of every object? Should everything always be an object? Isn't it better to say that it is just a generalization - a list (a sequence) or a table, or a tree?

This is how to define a matrix in GNU Octave or Mathlab:

m1 = [1, 2, 3; 4, 5, 6; 7, 8, 9]

And this is list notation:

(define m1 ((1 2 3)
            (4 5 6)
            (7 8 9)))

Again, in the first example we convey the type (a matrix) withing the syntax. But we can add just one more word to convey the same information:

(define-matrix m1 ((1 2 3)
                   (4 5 6)
                   (7 8 9)))

or we can use annotations when it is necessary:

(define m1 ((1 2 3) (4 5 6) (7 8 9)) :matrix)

Why it bother? Because with this unified notation we could think of a matrix as a table with tree rows each one of which is a list of three elements, and this is the most natural way of thinking - by decomposing into a smaller, familiar parts.

There is a sentence

We have some fruits: 4 apples, 6 oranges and 1 kg of grapes.

with a list of very familiar number - noun pairs like "10 dollars":

4 apples
5 oranges
1 kg of grapes

We may be not consciously aware of it, but there are pairs which is so familiar to us that we don't notice them.)

So, lets try to represent it in the form of a list of pairs

fruits: apples - 4, oranges - 6, grapes - 1 kg.

or, more formally:

fruits: {apple: 6, orange: 6, grapes: 1kg}

and if we add indentation and remove redundant symbols:

  apple:  4
  orange: 6
  grapes: 1kg

it will look like a table with tree rows. (nice to meet the YAML notation)

This is too good - indentation conveys the structure, but we lost clarity and unity. If we reformat this expression as a sentence we will still have the meaning and the structure, but it will be much less readable, because it requires an effort to see the boundaries.

Which which sybmols are better, colons or dashes? The answer is - none of them.

(fruits ((apple  6)
         (orange 6)
         (grapes 1kg)))

How to read this expression? This is a list of pairs or it is a table of three rows, each row is a pair, or its each row is a list of two elements.

What is the type of this expression? Well, it is a table, and it is an associative list, and it is a list of pairs, and it is list of lists.

What is underlying machine representation of this? Lets call it a bunch of tags and pointers for now.)

Now lets consider some functions

y = x^2


f(x) = x^2 

Some famous function have their own names such as sin or cos or exp or sqrt but the wast majority are nameless, anonymous, too ordinary.

The way to represent a function without name is by writing down its body:

x * x

or in the prefix notation

(* x x)

So, an expression (of a function) is mere a list of symbols. Symbols are just names of (tags or pointers to) something, their meaning are their values - what is behind a name.

To evaluate an expression is to get the value of that expression. For a symbol (a name) it will be its value (a number or a list - another expression). There is no limit. It is called recursion.

There is some special notation in Math for anonymous functions - the lambda notation. Because it is boring to write Greek symbols on qwerty-keyboard, people are using the word lambda instead with the same meaning:

(lambda (x)
	(* x x))

which means "there is a procedure without a name, with one parameter named 'x' which multiplies it by itself".

This lambda expression then (which is a mere list of lists) is the special form to define a procedure without a name.

Now, how to evaluate a procedure without a name? The only way is just to write it down here and then invoke it with an actual value (3):

((lambda (x)
	 (* x x))

This is the small miracle of the prefix notation - this is why we write (* x x) instead of (x * x).

The reason is that the body of a procedure (which is a list of expressions) always comes as the first element (car) of the invocation expression (which is a list itself).

The rest (cdr) of the invocation expression is the list of values of parameters of a procedure. This list could be infinite (it is limited only by hardware) or even undefined (to be produced at the time of invocation on demand).

Now lets say it: To square something is to multiply it by itself.

; the definition of square:
;         \     |
(define (square x) (* x x))
;                  /  |  \

Which is just an abbreviation for: define a procedure named square to be a multiplication of the parameter named x by itself:

;              WITH AN ARGUMENT
;                  /   NAMED X
(define square
        (lambda (x) (* x x)))
;          |           |
;                      |
;             THAT RETURNS THE
;             RESULT OF
;             MULTIPLYING X BY X

The word "square" here is just the name to be called, the tag to be searched, the symbolic pointer to the actual lambda expression to evaluate.

Everything is a list of expressions (of lists). This is the symbolic representation (using the prefix notation) of.. of everything. Code is a list of expressions, expressions are lists of lists themselves. Code is data.

This is called the unity of the language.

Note: See TracWiki for help on using the wiki.