It is rather difficult to find any other subject with such amount of nonsense written about it (only Java is way ahead). Let us try to add a bit of our own.

Let us define some very basic functions first:

Priorities

```infixr 9  .
infixr 5  ++
infixl 4  <\$
infixl 1  >>, >>=
infixr 0  \$
```

Constant function

```const :: a -> b -> a
const x _  =  x
```

Function composition

```(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g  =  \x -> f (g x)
```

Function application

```(\$) :: (a -> b) -> a -> b
f \$ x  =  f x
```

## List

Now we must clean a few misconception related to the most fundamental data structure - a List, or to be more technical, single-linked list.

This is what a List really is.

```data [] a = [] | a : [a]
```

It is a recursive data structure made out of "conses". Although no one uses this term anymore, it reflects the nature of a List, and a procedure of adding an element to the head of the (empty) list is still named `cons` which stands for "constructor" from the times of first Lisps.

Giving only this good-enough definition of a list we could define a few fundamental functions for manipulating Lists.

Cons is rather tricky. It is so fundamental that its code is really part of a language run-time, and is implementation specific, so we just write a stub here.

```infixr 5  :

(:) :: a -> [a] -> [a]
(:) x []  =  [x]
(:) x xs  =  Implementation specific
```

Append

```(++) :: [a] -> [a] -> [a]
(++) []     ys  =  ys
(++) (x:xs) ys  =  x : xs ++ ys
```

Mapping

```map :: (a -> b) -> [a] -> [b]
map _ []     =  []
map f (x:xs) =  f x : map f xs
```

Folding

```foldr :: (a -> b -> b) -> b -> [a] -> b
foldr _ y []     =  y
foldr f y (x:xs) =  f x (foldr f y xs)
```

As one could see, we need nothing extra to work with classic Lists.

Now lets try to analyze all the stuff they believe must be added to the definition of what is a List.

## Functor

```class  Functor f  where
fmap :: (a -> b) -> f a -> f b
(<\$) :: a -> f b -> f a
(<\$) = fmap . const
```

What is the meaning of this? This is a declaration of so-called typeclass which defines a new "complex" Abstract Data Type, or ADT for short. The meaning of this declaration is: to be a Functor is to have a procedure named `fmap` defined, such that it conform to the type signature `(a -> b) -> f a -> f b` or to be more precise, `(a -> b) -> (f a -> f b)`. The second procedure `(<\$)` is optional and the default implementation (which could be re-defined) is given.

Now the most important question: What for?. The idea is this. Functor is a generalization of uniform "mapping" of a given procedure over a "container" or an aggregate (lets not discuss what a "container" means for now). The advantage is that only this `fmap` procedure "knows" the actual implementation of this particular "container", how to "walk over" it and how to "get" an element out of it. The "user" of an instance of the Functor typeclass is supposed to call `fmap` supplying a "transformer" procedure which is defined for a single element and `fmap` will uniformly apply this procedure to each element of a given "container".

So, a List is also a Functor if the mapper procedure `fmap` is defined (and it is just the classic `map`).

```instance Functor [] where
fmap = map
```

So what is an advantage over just using `map`? It is rather subtle. The compiler will type-check the code and refuse to compile an expression if a procedure supplied to `fmap` as its first argument is of a different type than `(a -> b)`, or when its second argument is not a list.

Now a List is also a Functor because there is a procedure named `fmap` defined for this particular implementation of a List. It is not List's essential property, it is just an additional rule of having a "mapper function" defined, it "accidentally" confirms to.

Now let's take another perspective. It is possible to develop so-called "high level" procedures which could perform an operation (a transformation) of entire container, no matter whatever it is. It could be a List, but it also could be a Tree, or a Set. This is the biggest selling point.

The point is that there is no magic or a free lunch. One have to implement the actual `map` or `treeMap` or `setMap` or whatever, so, again, the whole thing is just about some additional type-checking by compiler.

## Applicative

This is a complete-enough declaration of Applicative typeclass.

```class Functor f => Applicative f where
pure  :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
```

What is the meaning? To be an Applicative is to be a Functor (which means having a procedure named `fmap` defined) plus to have two additional procedures with given type signatures defined - `pure` and `(<*>)`.

To understand the purpose of these procedures look at type signatures. Logically, the first procedure is used to lift a given value of some type into a new context. Technically it is application of a type-constructor (which is just a special kind of procedure) to a value.

The best example to illustrate what is going on is using a well-knows List type-constructor - `[]`.

Consider a uniform mapping of a procedure over a List, treating it as an instance of Functor

```λ> fmap (\x -> x * x) [1,2,3]
[1,4,9]
λ>
```

The equivalent expression in terms of an instance of Applicative would be:

```λ> [(\x -> x * x)] <*> [1,2,3]
[1,4,9]
λ>
```

or more idiomatically

```λ> pure (\x -> x * x) <*> [1,2,3]
[1,4,9]
λ>
```

Indeed, there are laws which must hold for relation between an Applicative and a Functor, given that former is a subclass of the later:

```-- fmap f xs  ==  pure f <*> xs
-- fmap f xs  ==  liftA f xs
```

where liftA is just

```liftA :: Applicative f => (a -> b) -> f a -> f b
liftA f a  =  pure f <*> a
```

Why on Earth should we "lift" a pure function on values into a world (context) of Lists in order to map it over another List? What is the point?

Well, one is the same as in previous abstraction - type-checker will catch more errors of mismatching type signatures.

Another point is more subtle. This control abstraction is an attempt to generalize a sequential application, or abstract out the way to apply a sequence of procedures to a given container (in a given context).

```λ> [id, (+1), (\x -> x * x)] <*> [2]
[2,3,4]
λ>
```

So, Applicative instance for Lists is:

```instance Applicative [] where
pure x    = [x]
fs <*> xs = [ f x | f <- fs, x <- xs ]
```

Which is, indeed a nice control abstraction.

## Currying

Now let's take a break and consider some features of Haskell which gives to these control abstractions even more sense.

Logically, every function in Haskell has exactly one argument. Those that seems to have more than one actually a curried functions.

A curried procedure takes its first argument and returns another procedure to be applied to the next argument.

With this understanding we could say that `fmap` procedure takes a function defined on values as its argument and returns a procedure defined on Lists (its domain and range are Lists) or whatever container it happen to be defined to implement.

The type signature for `fmap` should rather look like this: `fmap :: (a -> b) -> (f a -> f b)` So, by adding extra parenthesis (we would get a Lisp!)

```λ> (fmap (\x -> x * x)) [1,2,3]
[1,4,9]
λ>
```

We emphasize that actually Haskell applies fmap to the given lambda procedure first, and then applies the returned procedure to a given list.

In Lisp this could be explicitly defined like this:

```(define (fmap f)
(lambda (xs)
(map f xs)))
```

and explicit procedure application will look like

```1 ]=> ((fmap (lambda (x) (* x x))) '(1 2 3))
;Value 13: (1 4 9)
```

This technique is called partial evaluation (or partial application in terms of Haskell).

Now we realize that `fmap` makes more sense - it "lifts", when partially applied, a pure function defined on values into a context of "aggregates" or containers.

We could also say that partially applied `fmap` creates a closure over a given pure function on values to be applied later to a given aggregate. And everything will type-check.

Such partially applied procedure produces a thunk or a promise and delayed evaluation is yet another, separate topic, also called lazy application.

The real cleverness of Haskell is that it does it implicitly and we need not to define everything explicitly as we must do in Lisps. Is it that good is quite another question.

Let me first recite a passage form Typeclassopedia: In the end, despite all the hoopla, Monad is just another typeclass. Yes, it is just another Abstract Data Type defined using Haskell typeclass notation.

Here it is in a full glory.

```class Applicative m => Monad m where
return :: a -> m a
(>>=)  :: m a -> (a -> m b) -> m b
(>>)   :: m a -> m b -> m b
fail   :: String -> m a

pure = return
m >> k  =  m >>= \_ -> k
fail s  =  error s
```

Actually, only `return` and `(>>=)` are required for a minimum complete definition of what a Monad is. There are also laws which must hold for a Monad, more about them later.

First of all, it looks much like Applicative, and indeed, `return` procedure is exactly the `pure` procedure, both type-signature, and implementation.

Of course it has the same meaning and purpose - to "lift" a value into a realm (context) of what an instance of Monad happen to be.

The difference between them is quite subtle (and hence these names are different). The `return` procedure is supposed to be used inside a supplied "action" procedure (to "lift" the returned value "back" into a Monadic context). Sound terrible, so let's ask usual how ans why questions.

Logically it is a-la pipe operator in shell scripts `m | f`

Technically, it is "wraps" each expression into a context (which means just applying a type-constructor procedure to a value) so type-checker (compiler) could ensure that there is enforced segregation between pure and impure code (which has side-effects). There are no other benefits of Monad as a control abstraction, only costs.

Indeed, separation of a code which causes "side effects" is a great idea in itself. So, a Monad is just a abstraction which allow the compiler to perform necessary type-checking to ensure that "impure" functions never "leave" its context, whatever it is, IO, for example but it could easily be just `[]`.

Logically, it is like a Pair of an initial state (whatever it might be - self-evaluating value or a function), and a transition function, which takes a value(s) from the evaluation of the initial state, and produces a new state using the `return` procedure.

When curried, `(>>=)` captures its left argument - the Monad and returns a procedure, which accepts a special kind of procedure (a transition procedure) which must use the `return` function to yield the result, lifting it "back" into another Monad, whatever it was.

The second argument to `(>>=)` (a transition procedure) must be a procedure of type `a -> m b`.

So, to be a Monad only one more procedure must be defined - `(>>=)`, and, the Monad type class itself could be nothing, but:

```class Applicative m => Monad' m where
(>>=) :: m a -> (a -> m b) -> m b
```

This is the instance of a Monad typeclass for Lists.

```instance  Monad []  where
return x  =  [x]
m >>= k   =  foldr ((++) . k) [] m
m >> k    =  foldr ((++) . (\_ -> k)) [] m
fail _    =  []
```

Now let's play.

```λ> [] >>= return
[]
λ>
```

Magic!

```λ> [1,2,3] >>= return . id
[1,2,3]
λ>
```

which is, of course, nothing but

```λ> pure id <*> [1,2,3]
[1,2,3]
λ>
```

or exactly

```λ> map id [1,2,3]
[1,2,3]
λ>
```

The expression

```λ> [1,2,3] >>= return . (+1)
[2,3,4]
λ>
```

is rather a wired way to say

```λ> map (+1) [1,2,3]
[2,3,4]
λ>
```

and this

```λ> [1,2,3] >>= \x -> return \$ x * x
[1,4,9]
λ>
```

is just this

```λ> map (\x -> x * x) [1,2,3]
[1,4,9]
λ>
```

So, the question is What for?

Well, to be make sure that everything type-checks. To be able to use "high order" procedures like `(>>=)` and others `(>>)`, `ap`, etc. without "knowing" the actual implementation of this or that particular instance of a Monad typeclass.

Nevertheless, each time a new instance of a Monad to be defined it is necessary to implement `return`, and `(>>=)` in terms of the underlying representation.

For classic Lists these are just `[]` and `map`. There is absolutely no magic.

Let me left you with my axiom, which was inspired by the classic Lisps: Mostly-functional programming with high-order procedures and lists (and occasionally arrays and hash-tables) is good-enough.