‎

#+TITLE Abstract Data Types

This is by far the most important concept (and related methodology) in programming.

This is a fundamental result, originally by Barbara Liskov (for which she received the Turing Award).

What she did, however, is clearly articulated and grouped together related concepts, while the original totions came from abstract mathematics.

The first well-known Abstract Data Type, so to speak, is what underlies the Set Theory - the formalized notion of a Set.

Abstract mathematics in general deliberately ignored (abstracted out) any notions of actual representation or even possible existence of abstract “objects” it considers.

To formalize notions form a Domain in pure mathematics first (using Sets and Logic) is an excellent strategy, and this is what any serious programmers of the Golden Age (70s, 80s) advocate.

However, classic mathematics is untyped or implicitly weakly typed, which means that the properties which constitute an abstract structure, or a whole family (or a class) of such structures, while stated precisely and formally, do not constitute a formal definition of a particular type (ADTs and Type-classes of Wadler are attempts to do typing systemically).

The abstract notion of a Set is defined only in terms of operations (which itself is a set of operators), deliberately leaving out everything else.

This is the same principle underlies any proper ADT - it is defined in terms of operations on the corresponding Set of values (which can be seen as an abstract partition on a Universe of all possible values).

So ADTs are Sets of values, but defined (such that) in terms of operations.

Type-classes extends this notion with nesting and extension or specialization by defining a Set of required interfaces to be implemented. It thus formalizes the notion of a Duck-typing.

To be a duck is to exhibit this and that behavior - to quack like a duck and walk like a duck.

The principle behind Abstract Data Types is clear and well-defined separation of

A Public Interface (corresponds to a signature in ML)
An Internal representations, together with a well-defined representation invariants and other constraints.
A “concrete” Data-Structure which underlay the type and actual implementation of corresponding algorithms.

Representation and implementation are defined in structures in ML.

Data dominates (another fundamental principle) which means that once a particular data structure is selected, all the algorithms and even the shape of the code (implementation) follows - a recursive nature of a data structure implies recursive functions that traverse them.

Similarly, a sum-type dictates its pattern-matching code and overall shape of the functions which manipulate values of a type.

Monads in Haskell are ADTs, defined as a class of types (a type-class), while each particular instance is precisely an ADT.

The type-class itself is just a named set of interfaces together with a class-subclass relation (or nesting). Semantics of a type-class is to state what it takes to be an X - which particular signatures must be implemented (by an ADT)

The second most fundamental concept of Modules is one level higher. A Module could contain one or more signatures and corresponding structures. In ML these could be separated in a .mli and .ml files.

In ML Modules are high-level abstractions and could be applied to one another to produce a “specialized” structure (which is another new module).

This application is based on the mathematical concept of a Functor which is a mapping that preserves the structure of an “object” (originally defined on directed graphs).