UP | HOME

Data and structures

There is some journey from higher abstractions down to Earth

When we talk about an one more that one (one and another one) values, all we can have is:

The fist two are symmetrical (one just rotate it or move an observer around), while an arrow is by definition has a direction (it is a direction abstracted out).

Nesting and recursion gives one tuples and linked lists. Associating labels (symbols) to the individual slots (instead of mere positions) gives records.

When using labels (names) one does not have to consider the actual ordering or any aspect of representation whatsoever. This is how the mind works.

Even more importantly, one does not have to think about other slots and completeness of the whole structure. Adding more items (to an |) or named slots (to an ,) change nothing in the previously defined pure expressions (the code).

This is how one deals with partial or incomplete knowledge. Once one’s knowledge improves with experience, the corresponding structures get augmented. This is how the brain and a continuous refinement in general work.

Pattern-matching makes tuples way more convenient, but does not solve the position-access (the need to know the particular ordering) problem.

When there are two arrows in both directions (actually 4, - plus two identity arrows) then this is another common pattern (an emergent abstract structure).

Cartesian product

  • Arbitrary chosen “Origin” (this is a philosophically important principle)
  • Axies are equally-spaced “rulers”, or abstractly - Number Lines
  • Every Number on a line is a distance from the “Origin” (distances are “real”, numbers are not)
  • 2 Coordinate axis (superimposed by the mind “rulers”) cover an abstract plane
  • All possible pairs cover every single point on a plane
  • 3 mutually orthogonal (or perpendicular) cover a whole Space (it happens that a motion can be factored into 3 independent components)
  • Cartesian Product \(x \times y \times z\) covers every point in a Space

Abstract N-dimensional spaces

  • Dimensions are products of the mind and does not exist
  • However it is useful to apply these abstractions in a systematic way
  • Just as within an Abstract Plane (or a Number Line) other abstract axis (rulers) can be imagined.
  • A “ruler” or an axis can be superimposed on any attribute (aspect)
  • This is how we get n-tuples to define (and represent) almost everything
  • Each value ranges over a certain finite Set - with certain bounds

Labeled Cartesian Product

  • Once we label components of a tuple with symbols we get a record

(here a tuple or a record defines a whole set of possible values)

  • Just as a tuple, a record “covers” (ranges over) a whole space of “points”

Generalized labeled Product Type

When we say { name = "" } we cover a linear (1D) space of all people ever lived, who had a name. Socrates is here, and George W. Bush.

Notice that this is a Type (a Set of all possible values) not the individual names (values).

If we add a { social_security_number = "" } slot to out type we will restrict the subset (and will get a different image on a Universe of discourse), disqualifying Socrates and many others.

This is how we select (or filter) a particular sub-spaces (subsets) from the “whole”.

Structure or a Table

  • By defining a type we partition the “Universe of Discourse”.
  • When we specialize a record we get a table (or a structure) which is a type of its own together with an actual implementation
  • It is an actual representation for a subset of individual values.

A table is an Ordered Sequence of Rows

  • It is a representation of an abstract sub-space (of a subset).
  • Rows represent individual “points” in a sub-space.
  • Table is used to store a collection of rows (in a particular order)

Product-types

And that is basically it. We superimpose rulers, group together “coordinates” or “distances” and partition “points” into sub-spaces using what we call product types.

This is how we categorize and group “things” in our minds (Sets are the most general abstraction).

We also need Sum-types (disjoint unions) to systematically represent possible distinct outcomes (which form a Set too).

Relations

An entity can be described as a set of its attributes.

A relation can be viewed as a set of arrows (ordered pairs) from an id to each of the values.

So, mathematically, a relation is just a set of ordered pairs represented as an n-tuple.

  • Each attribute is a set of values - has its particular domain.
  • The order of attributes is immaterial (as it should be)

A table is a representation for some type of particular relation.

Each row of a table is a record or an instance of a particular type of relation.

All rows (records) in a table from a Set, while each field ranges over a set of possible values. So everything is typed.

Schemas

When we define a schema for a table we establish a higher-kinded structural typing for what is itself a type (a Set of all possible values).

Only tables of the same kind (same structure) can be used \(\cup,\cap\) operations.

“Select”

Select corresponds to the notion of “such that” and filtering.

Can be nested.

  • Selects commute under composition (order in a composition is irrelevant) (this is an instance of a “universal construction”)
  • Select distributes over \(\cup, \cap\) (this is an intuitive property)

Author: <schiptsov@gmail.com>

Email: lngnmn2@yahoo.com

Created: 2023-08-08 Tue 18:42

Emacs 29.1.50 (Org mode 9.7-pre)