Abstract Data Types
An abstract type is a type equipped with a set of operations on its values, which are the only operations applicable to that type.
Such type forms an Abstraction Barrier such that its representation can be changed — perhaps to a more efficient one — without affecting the rest of the program.
pair? of the type
Pair is the canonical example of an ADT.
Appropriate abstractions (the ones that reflects certain aspects of reality, of what is) is the secret of the trade. From right abstractions comes right data-structures, and from these right algorithms follow.
Abstractions are obtained from the process of modeling aspects of reality. Modeling is a part of the scientific method.
The first principle about abstractions - they must not be disconnected from reality, from what is, like Hegelian nonsense. Abstraction means generalization, not some "higher level" bullshit from "the universe of ideas". Abstractions must be concrete. An atom is a good abstraction, sequence is good abstraction. "Higher dimensions" is bullshit.
Abstract Data Types or ADTs.
Abstract data types are the very essence of programming. In contrast to so-called "machine types" which define values according with hard-wired encoding of this or that hardware architecture, abstract data types are used by programmer to describe and represent real-world concepts and data.
The power of a programming language mostly comes from the ability to define ADTs and procedures over them.
ADTs are abstractions described in terms of a programming language, with a set of procedures "closed over them". The procedures are of three different kinds: constructors, selectors and predicates.
Values of a certain type must follow (or satisfy) a set of conventions and constraints - how values of this type are represented and interpreted.
The precise definition is - a type is defined by a set of possible operations (procedures) over its values. Constructors and selectors define representation aspect, predicates - interpretation. To understand the ideas behind data types we have to pay attention to details and spell out all the nuances of real-world phenomena we are trying to capture.
For example, we could say that a list is either an "empty list" - a list containing zero elements - nothing, emptiness, or it is a "cons cell" (or a pair) of an element and "another list". Now lets try to define these "abstract" relationships in terms of a "concrete" programming language.
In Standard ML we could write it this way:
datatype List = Empty | Cons of int * List
This almost literally means - a List is either Empty or it is a Cons of int and another List.
This expression happen to be a definition of so-called Algebraic Data Type but this in NOT the point. What we are studying here is more general concept of an ADT to which an Algebraic Type are just a "specialization". So, let's forget about it for a while.
Here we have defined a
Ints. According to the semantics of Standard ML,
Cons are so-called "data-constructors". Logically, these are procedures which returns a piece of data "type-tagged" with the tag
List. We could say that they do wrap a value into a List-container or lift a value into the world of Lists while technically we just "tag a value with an additional marker, like
(cons 'type x).
Empty constructor requires no arguments, while
Cons requires exactly two - the first one of the type
Int and the second must be of type
List. They both returns a value of the same type
Having just this we could define a value of type
val x = Cons (4, Cons (23, Cons (2008, Empty)))
Think about this as a nested procedure calls - Empty, then Cons of 2008 and that value returned by Empty, and then Cons 23 to the value of that, and then Cons 4 to it.
The clever idea in the ML family of languages is that they have a DSL - a sub-language for describing data types, which also defines "data-constructors" and "templates for pattern-matching" instead of "selectors", which must be explicitly defined (as ordinary procedures) by a programmer. The very same idea we will see in Haskell.
In Common Lisp we have a DSL (a bunch of macros) for defining structures, which implicitly creates "selectors", "constructors" and "helper procedures for a generalized setter" (
setf macro) for a given type. The same idea, generalized.
We do not have to think about the representation, only about logic. The "data-constructors" are hiding (encapsulating) all the implementation details of how a value of the type
List is represented, which does not concern us.
Now we could define the
append procedure for our type
List in Standard ML language:
fun list_append (xs, ys) = case xs of Empty => ys | Cons (x, xs') => Cons (x, list_append (xs', ys))
And here is
append for built-in lists:
fun append (xs, ys) = case xs of  => ys | x::xs' => x :: append(xs', ys)
In both cases we are using pattern-matching against data-constructors (
:: in the case of built-in ML lists).
Now we could define a polymorphic type - a generic list - a list of any type.
datatype 'a List = Empty | Cons of 'a * 'a List
'a is a "type-parameter", a placeholder for any other type, such as Int or Real.
In ML and Haskell all the lists must be of values of the same type. This property has a special name - homogeneous lists.
Now let's consider Scheme's built-in Lists. The naive
append function would have almost the same shape:
(define (append xs ys) (cond ((null? xs) ys) (#t (cons (car xs) (append (cdr xs) ys)))))
Instead of pattern-matching which performs "type-checking" and "destructuring" (variable binding) implicitly we are explicitly using a conditional
cond with the "predicate" function (
null?) and "selectors" (
The ML's version of
append is much more elegant, but to understand ADTs we have to learn to define and use "constructors", "selectors" and "predicates" for each ADT.
So, for the built-in List type there are "constructors" -
cdr, "predicates" -
More idiomatic version of
(define (append xs ys) (if (null? xs) ys (cons (car xs) (append (cdr xs) ys))))
Defining a generic List
data List a = Nil | Cons a (List a)
and the naive
append for built-in Lists in Haskell
append  ys = ys append [x:xs] ys = x : append xs ys
 ++ ys = ys (x:xs) ++ ys = x : (xs ++ ys)