The 20-80 Rule
This generalized rule has a simple algorithmic explanation:
- take a list of items
- sort it by the most important characteristic
- reverse it (assuming it was sorted in ascending order)
- take the first 1/5, or the top 20%: take (length xs `div` 5) xs
Even better, you may create a few differently sorted lists (using different properties), then:
- merge the sorted lists
- remove duplicates
The resulting list is a template for a priority queue.
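The whole recipe can be sketched in Haskell (the helper names top20 and priorityTemplate are mine, not standard):

```haskell
import Data.List (sortOn, nub)
import Data.Ord (Down (..))

-- Take the top 20% of a list, ranked by some scoring function.
top20 :: Ord b => (a -> b) -> [a] -> [a]
top20 score xs = take (length xs `div` 5) (sortOn (Down . score) xs)

-- Merge several differently-ranked shortlists and remove duplicates,
-- yielding a template for a priority queue.
priorityTemplate :: Eq a => [[a]] -> [a]
priorityTemplate = nub . concat
```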
The 1-99 rule for programming
99% of the information about programming on the web is bullshit - low-effort, narcissistic noise. Here is the 1%.
Immutability
Once a value is bound to a symbol it stays bound forever. Mathematicians got this right intuitively.
A new binding with the same symbol (name) can be introduced, shadowing the prior binding.
This principle applies both to names and to data structures, and corresponds to the fundamental (for Life itself) properties of molecules (as building blocks of Life).
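A minimal Haskell sketch of shadowing: the inner binding hides the outer one, but nothing is ever mutated:

```haskell
-- The closure captures the first binding of x; the inner let introduces
-- a new binding that shadows it without destroying it.
shadowing :: (Int, Int)
shadowing =
  let x = 1
      capture = \() -> x   -- closes over the first binding of x
  in let x = 2             -- shadows, does not overwrite
     in (capture (), x)
```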
One more than one
- A, B (same as B, A) - an AND (rotation symmetry, ordering is immaterial)
- A | B (same as B | A) - an OR (rotation symmetry, ordering is immaterial)
- A -> B (not the same as B -> A; having both defines an isomorphism)
- Cartesian Products correspond to record-types.
- Disjoint Unions correspond to variant types.
- Implications correspond to function types.
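In Haskell notation the three connectives look like this (Shape and area are illustrative names):

```haskell
-- Product (AND): a record-type is a Cartesian product of its fields.
data Point = Point { px :: Int, py :: Int }

-- Sum (OR): a variant type is a disjoint union of alternatives.
data Shape = Circle Int | Square Int

-- Implication: a function type A -> B.
area :: Shape -> Int
area (Circle r) = 3 * r * r   -- crude integer stand-in for pi * r^2
area (Square s) = s * s
```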
Pure functions
The great languages of the past (Common Lisp, Scheme and CLU) used to generate data-constructors and slot-selectors for every record-type you define. These are pure functions with respect to the first argument, which is a structured value of the corresponding Abstract Data Type. Same input - same output. Always.
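Haskell does the same: defining a record-type generates a data-constructor and slot-selectors, all of them pure (the Account type is illustrative):

```haskell
-- The definition generates the data-constructor Account and the
-- slot-selectors owner and balance - pure functions, for free.
data Account = Account { owner :: String, balance :: Int }

acct :: Account
acct = Account { owner = "Ann", balance = 100 }
```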
Partial functions
The individual clauses of a pattern-matching expression are partial functions, which are defined on a particular subset of the domain.
When one pattern-matches on a sum-type one implicitly defines partial functions.
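A sketch: each clause below is a partial function, defined only on the values built with its data-constructor:

```haskell
data Tree a = Leaf a | Node (Tree a) (Tree a)

leftmost :: Tree a -> a
leftmost (Leaf x)   = x           -- defined only on Leaf values
leftmost (Node l _) = leftmost l  -- defined only on Node values
```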
Sum and product types
The sum- and product-types (variants and records) are all you need. A language must have proper support for sum-types; they are NOT just C enums.
A language must have the A | B notation and universal (everywhere) pattern-matching on individual data-constructors.
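A sketch of why sum-types are not C enums: each alternative carries its own payload, and pattern-matching dispatches on the data-constructors (Result and withDefault are illustrative names, a simplified cousin of Haskell's Either):

```haskell
-- Unlike an enum tag, each alternative carries a payload of its own type.
data Result e a = Err e | Ok a

withDefault :: a -> Result e a -> a
withDefault d (Err _) = d
withDefault _ (Ok x)  = x
```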
Liskov’s ADTs
Abstract data types hide implementations and representations behind “abstract” interfaces, and thus introduce and define (within a module) an abstraction barrier.
This is the most fundamental principle, orthogonal to functions, as fundamental as a cell-membrane.
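A Haskell sketch of such a barrier, assuming a module whose export list names the type Stack but not its data-constructor, so clients can never touch the representation:

```haskell
-- In a real module the export list would be (Stack, empty, push, pop),
-- hiding the Stack data-constructor and thus the list representation.
newtype Stack a = Stack [a]

empty :: Stack a
empty = Stack []

push :: a -> Stack a -> Stack a
push x (Stack xs) = Stack (x : xs)

pop :: Stack a -> Maybe (a, Stack a)
pop (Stack [])       = Nothing
pop (Stack (x : xs)) = Just (x, Stack xs)
```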
GADTs
Parameterization by the type of a value in a “slot”.
This is related to the notion that “sums” are duals of “products”.
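A minimal GADT sketch: each data-constructor fixes the type parameter, i.e. parameterizes the type by the value in the slot, so the evaluator is well-typed by construction:

```haskell
{-# LANGUAGE GADTs #-}

-- Each constructor pins down the type parameter of Expr.
data Expr a where
  IntLit  :: Int  -> Expr Int
  BoolLit :: Bool -> Expr Bool
  Add     :: Expr Int -> Expr Int -> Expr Int
  If      :: Expr Bool -> Expr a -> Expr a -> Expr a

eval :: Expr a -> a
eval (IntLit n)  = n
eval (BoolLit b) = b
eval (Add x y)   = eval x + eval y
eval (If c t e)  = if eval c then eval t else eval e
```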
Duality of sums and products
The abstract notion of duality between A, B and A | B captures a fundamental aspect of Reality - to an external observer any two things are either together or apart, parts of a whole or unrelated. In the physical world this notion is entirely based on locality (and proximity).
Notice that “flipping the arrows” (to introduce a duality) is an abstract operation with nothing that corresponds to it in the real world. This is where one stops.
Data-constructors and data-selectors (which are functions A -> B) together define an isomorphism.
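A sketch of that isomorphism for a simple pair-type: the data-constructor and the two selectors convert back and forth without loss:

```haskell
data Pair a b = Pair { first :: a, second :: b } deriving (Eq, Show)

to :: (a, b) -> Pair a b          -- built from the data-constructor
to (x, y) = Pair x y

from :: Pair a b -> (a, b)        -- built from the data-selectors
from p = (first p, second p)
```

Both round trips are identities: from . to == id and to . from == id.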
Curry-Howard correspondence
Gentzen’s Natural Deduction formalism corresponds to the Simply-Typed Lambda Calculus.
It confirms that “everything” boils down to the same fundamental notions (generalized by an external observer).
There is even a higher-level notion of an isomorphism between whole independently discovered formalisms.
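The correspondence in miniature: a well-typed program is a proof of the proposition its type spells out - here, transitivity of implication:

```haskell
-- A term of this type proves (A -> B) -> (B -> C) -> (A -> C).
compose :: (a -> b) -> (b -> c) -> (a -> c)
compose f g = g . f
```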
GHC
GHC’s internal representation (IR) is based on a single sum-type with fewer than 10 data-constructors. This is enough to implement System F-omega (in particular, a set of “extensions” to classical first-order logic).
This proves (by example) that everything is reducible to this core (actually, to the Simply-Typed Lambda Calculus formalism - the basis for System F), and even to the original, untyped Lambda Calculus, which has just 3 constructions.
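The three constructions of the untyped Lambda Calculus fit in one tiny sum-type; the naive evaluator below (not capture-avoiding, a toy only) hints at how little machinery is needed:

```haskell
-- Variables, abstractions, applications - that's all of it.
data Term
  = Var String
  | Lam String Term
  | App Term Term
  deriving (Eq, Show)

-- Naive substitution (ignores variable capture - fine for a toy).
subst :: String -> Term -> Term -> Term
subst n v (Var m)   = if n == m then v else Var m
subst n v (Lam m b) = if n == m then Lam m b else Lam m (subst n v b)
subst n v (App f a) = App (subst n v f) (subst n v a)

-- Reduce to weak head normal form.
whnf :: Term -> Term
whnf (App f a) = case whnf f of
  Lam n b -> whnf (subst n a b)
  f'      -> App f' a
whnf t = t
```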
Type-classes
So, the core of any decent language must have:
- ADTs (data-abstraction)
- Sum-types (disjoint unions of values)
- Product-types (structures or records)
- Type-Classes (set-subset relations with nesting for interfaces)
Interfaces are just type-signatures.
This covers the 99% of programming.
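A sketch of the “set-subset with nesting” reading of type-classes, using hypothetical class names: the superclass constraint makes every MyOrd type also a MyEq type.

```haskell
-- The interface is just type-signatures.
class MyEq a where
  eq :: a -> a -> Bool

-- The superclass constraint nests the interfaces:
-- the set of MyOrd types is a subset of the set of MyEq types.
class MyEq a => MyOrd a where
  lte :: a -> a -> Bool

data Size = Small | Large

instance MyEq Size where
  eq Small Small = True
  eq Large Large = True
  eq _     _     = False

instance MyOrd Size where
  lte Large Small = False
  lte _     _     = True
```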
OO as a DSL
Just functions, Algebraic Types and related ADTs are enough to build all the OO features as a DSL (based on record-types or structs with both data and code “slots”). CLOS was the proof by example, and Smalltalk, on the implementation side, is also just message-receiving structured closures.
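A sketch of the OO-as-DSL idea: an “object” is a record with a data slot and a code slot (a closure over its own state); the Counter name is illustrative.

```haskell
data Counter = Counter
  { value :: Int      -- data slot
  , incr  :: Counter  -- code slot: a closure producing the "next" object
  }

mkCounter :: Int -> Counter
mkCounter n = Counter { value = n, incr = mkCounter (n + 1) }
```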
Message-passing
The only way to do it right is share-nothing (full closures) together with asynchronous message passing.
Life evolved these stable “architectural patterns”.
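A share-nothing sketch using Haskell channels: the worker owns its state and is reachable only through asynchronous messages.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)

main :: IO ()
main = do
  inbox  <- newChan
  outbox <- newChan
  -- The worker shares nothing with the caller; it only receives
  -- and sends messages over the channels.
  _ <- forkIO $ do
    n <- readChan inbox
    writeChan outbox (n * 2 :: Int)
  writeChan inbox 21
  reply <- readChan outbox
  print reply   -- prints 42
```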
In-wire representation
Life passes “physical structures” around as asynchronous messages (along with electrical signals through synapses, which are our “message-buses”).
The principle of passing “in-wire representations” around is universal and eliminates all the shared-state problems.
No wonder this is how the internet works - the data, requests and responses are stateless, and this is the principle which makes everything work.
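A sketch of an in-wire representation: the sender serializes, the receiver reconstructs a private copy, so nothing is ever shared (show/read stand in for a real wire format):

```haskell
data Request = Get String | Put String Int
  deriving (Show, Read, Eq)

encode :: Request -> String   -- to the wire
encode = show

decode :: String -> Request   -- a fresh private copy on the other side
decode = read
```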
Layered structure
The architecture of complexity is a hierarchy of layers, each of which communicates only with the layer one level below it.
Again, this has been evolved as a stable and universal “architecture”.
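A toy sketch of strict layering - each layer calls only the one directly below it (the layer names are illustrative):

```haskell
physical :: String -> String
physical = id                                    -- lowest layer

transport :: String -> String
transport msg = physical ("[frame]" ++ msg)      -- talks only to physical

application :: String -> String
application msg = transport ("[packet]" ++ msg)  -- talks only to transport
```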