# Abstract Data Types

An abstract type is a type equipped with a *set of operations* on its values, which are the only operations applicable to that type.

Such type forms an Abstraction Barrier such that its representation can be changed — perhaps to a more efficient one — without affecting the rest of the program.

`cons`

, `car`

, `cdr`

`pair?`

of the type `Pair`

is the canonical example of an ADT.

## Abstractions

Appropriate *abstractions* (the ones that reflects certain aspects of reality, of what is) is the secret of the trade. From right abstractions comes right data-structures, and from these right algorithms follow.

Abstractions are obtained from the process of modeling aspects of reality. Modeling is a part of the scientific method.

The first principle about abstractions - they must not be disconnected from reality, from what is, like Hegelian nonsense. Abstraction means generalization, not some "higher level" bullshit from "the universe of ideas". Abstractions must be concrete. An *atom* is a good abstraction, *sequence* is good abstraction. "Higher dimensions" is bullshit.

## Abstract Data Types or ADTs.

Abstract data types are the very *essence* of programming. In contrast to so-called "machine types" which define values according with hard-wired encoding of this or that hardware architecture, abstract data types are used by programmer to *describe and represent* real-world concepts and data.

The power of a programming language mostly comes from the ability to define ADTs and procedures over them.

ADTs are abstractions described in terms of a programming language, with a set of procedures "closed over them". The procedures are of three different kinds: *constructors, selectors* and *predicates*.

Values of a certain type must follow (or satisfy) a set of conventions and constraints - how values of this type are *represented* and *interpreted*.

The precise definition is - **a type is defined by a set of possible operations (procedures) over its values**. Constructors and selectors define *representation* aspect, predicates - *interpretation*.
To understand the ideas *behind* data types we have to pay attention to details and spell out all the nuances of real-world phenomena we are trying to capture.

For example, we could say that a list is *either* an "empty list" - a list containing zero elements - nothing, emptiness, *or* it is a "cons cell" (or a pair) of an element and "another list". Now lets try to define these "abstract" relationships in terms of a "concrete" programming language.

In Standard ML we could write it this way:

datatype List = Empty | Cons of int * List

This almost literally means - a List is either Empty or it is a Cons of int and another List.

This expression happen to be a definition of so-called Algebraic Data Type but this in NOT the point. What we are studying here is more general concept of an ADT to which an Algebraic Type are just a "specialization". So, let's forget about it for a while.

Here we have defined a `List`

of `Int`

s. According to the semantics of Standard ML, `Empty`

and `Cons`

are so-called "data-constructors". Logically, these are *procedures* which returns a piece of data "type-tagged" with the tag `List`

. We could say that they do *wrap a value into a List-container* or *lift a value into the world of Lists* while technically we just "tag a value with an additional marker*, like (cons 'type x).
*

The `Empty`

constructor requires no arguments, while `Cons`

requires exactly two - the first one of the type `Int`

and the second must be of type `List`

. They both returns a value of the same type `List`

.

Having just this we could define a value of type `List`

:

val x = Cons (4, Cons (23, Cons (2008, Empty)))

Think about this as a nested procedure calls - Empty, then Cons of 2008 and that value returned by Empty, and then Cons 23 to the value of that, and then Cons 4 to it.

The clever idea in the ML family of languages is that they have a DSL - a sub-language for describing data types, which also defines "data-constructors" and "templates for pattern-matching" instead of "selectors", which must be explicitly defined (as ordinary procedures) by a programmer. The very same idea we will see in Haskell.

In Common Lisp we have a DSL (a bunch of macros) for defining structures, which implicitly creates "selectors", "constructors" and "helper procedures for a generalized setter" (`setf`

macro) for a given type. The same idea, generalized.

We do not have to think about the *representation*, only about *logic*. The "data-constructors" are *hiding (encapsulating) all the implementation details of how a value of the type List is represented*, which does not concern us.

Now we could define the `append`

procedure for our type `List`

in Standard ML language:

fun list_append (xs, ys) = case xs of Empty => ys | Cons (x, xs') => Cons (x, list_append (xs', ys))

And here is `append`

for built-in lists:

fun append (xs, ys) = case xs of [] => ys | x::xs' => x :: append(xs', ys)

In both cases we are using *pattern-matching against data-constructors* (`[]`

and `::`

in the case of built-in ML lists).

Now we could define a *polymorphic type* - a generic list - a list of any type.

datatype 'a List = Empty | Cons of 'a * 'a List

where `'a`

is a "type-parameter", a *placeholder* for any other type, such as Int or Real.

In ML and Haskell all the lists must be of values of the same type. This property has a special name - *homogeneous* lists.

Now let's consider Scheme's built-in Lists. The naive `append`

function would have almost the same shape:

(define (append xs ys) (cond ((null? xs) ys) (#t (cons (car xs) (append (cdr xs) ys)))))

Instead of pattern-matching which performs "type-checking" and "destructuring" (variable binding) implicitly we are explicitly using a *conditional* `cond`

with the "predicate" function (`null?`

) and "selectors" (`car`

and `cdr`

).

The ML's version of `append`

is much more elegant, but **to understand** ADTs we have to learn to define and use "constructors", "selectors" and "predicates" for each ADT.

So, for the built-in List type there are "constructors" - `cons`

and `list`

, "selectors" `car`

and `cdr`

, "predicates" - `null?`

and `list?`

.

More idiomatic version of `apppend`

:

(define (append xs ys) (if (null? xs) ys (cons (car xs) (append (cdr xs) ys))))

Defining a generic List

data List a = Nil | Cons a (List a)

and the naive `append`

for built-in Lists in Haskell

append [] ys = ys append [x:xs] ys = x : append xs ys

or

[] ++ ys = ys (x:xs) ++ ys = x : (xs ++ ys)

**Note:**See TracWiki for help on using the wiki.