
Understanding C++

Equality is a symmetric relation.

Assignment deserves its own operator, :=, to suggest all the required notions of an imperative assignment: LHS and RHS values, lifetimes, who will call destructors and when, etc.

“Functions” are procedures, but the flawed terminology has been inherited from C. “Bindings” are assignments, which are actually writes to memory locations.

In general, all the “natural” mathematical intuitions and assumptions are broken in imperative languages with destructive assignment statements. One has to realize this and learn a new set of basic assumptions and rules.

All of them are just a little bit better than PHP (the fractal of bad design, you know) by having at least some uniformity and some form of standardization.

At least every single annoying verbose explicit detail of the fucking mess which we call C++ is defined in the standard.

Another important thing is that C++ changes nothing (muh non-breaking the compatibility meme of an industrial language), only piles up more crap on top.

The “good” part is that the compilers (at least clang++ and g++) are well-designed, layered and robust, changing incrementally, not ruining what already works and “orthogonally” adding hardware optimizations (even Google now uses modern compilers – clang).

And, yes, C++ is an industrial language, unfortunately. It requires a very special mindset (to ignore and to feel nothing about all the verbose, redundant crap, inhuman syntax, and just to write down the fucking code).

Generalization over hardware abstractions

The basic machine model of C and C++ is based on an abstraction of a computer hardware architecture, rather than some form of mathematics. This explains so many annoying kludges, like having 0 as false.

C++ extends C with references, classes and generics, while remaining backward-compatible with the C ABI. The C language, in turn, is a set of slightly higher-level, generalized abstractions over a machine (hardware architectures), which was a true enlightenment at its time.

What Fortran did for programming mathematics, C did for programming hardware – both introduced a set of convenient higher-level abstractions.

While the C++ standard tries to define a whole coherent model, C is just a small set of close-to-hardware, “almost transparent” abstractions (and this is the real philosophy and beauty of C).

This model includes “hardware types” (which have an actual machine representation and a corresponding set of CPU instructions) – how a machine represents its “core” types – ints, floats, and pointers.

Basically, any hardware platform is abstracted as a model with the oversimplified notions of memory, pointers and arrays (contiguous chunks of memory), the machine-word size and endianness, the stack, and procedure calling conventions.

An OS adds to it the full ABI, which defines what exactly is passed and returned in registers and on the stack, how interrupts are handled, etc.

C and C++ do destructive assignments (over-writing of memory locations) instead of bindings. Just as machine instructions do, an assignment copies a value.

Assignments are not bindings to the same value. x = y is copying (of the value of y), not binding (shadowing) of x (to the value of y), as in math.

C pointers and C++ references are used to explicitly specify any form of sharing.

A variable is just a name (symbol) given to a memory location. It is not a pure symbol-to-value immutable binding – no mathematical standard conventions here. “Variables” are placeholders, “functions” are procedures, numbers are only within a particular range, integers overflow, etc.

This is exactly what it means to be a pure-imperative, procedural language. OO makes everything even worse.

Good C and C++ compilers map simple statements directly to a few machine instructions.

Procedures are implemented using standard ABIs for passing the values of formal parameters on the stack. The clever early C++ hack just adds an extra pointer to an “object” when calling methods (which are just procedures with an extra argument).

“Just right” mathematical abstractions

A value is a sequence of bits interpreted according to a type (using a machine representation).

A “variable” (as in math) binds a value of some type (set of values * set of operations).

In an imperative language, it is some area in RAM that stores a value.

Types and standard functions define a vocabulary of implementation. Unlike math, all our abstractions require implementations and representations.

A type can be thought of as a Set of Values (together with a Set of Operations defined on it) – just as the mathematical abstraction of a Set itself.

One could also think of a mathematical system, which is a Set together with one or more operations and the closure property. This is essentially how higher-kinded types are defined.

The type of a value or an “object” determines the set of operations applicable to it and its layout in memory.

Just as in First-order logic values can have attributes or properties (and thus predicates on these), types (like Sets) can be defined by interfaces – particular sets of function signatures (together with “laws” and invariants stated informally elsewhere).

These interfaces are subsets of possible operations, and implementing them makes an “object” operationally an instance of a type defined by a required interface.

All the multiple traits, type-classes and whatnot – everything can be boiled down to Sets and ADTs. Both traits and type-classes are ADTs of ADTs, if you will.

To be precise – such that it implements this particular set of signatures, which is the mathematical essence of what we call Duck-Typing.

The “designers” of imperative languages, however, were mostly mathematically ignorant (the best ones were too focused on a machine) and just fucked everything up. Imperative languages break almost all mathematical properties one could naturally expect.

Implicit coercions and truncations break the fundamental closure property, destructive over-writes (assignments) break the referential transparency property, and without these there is no mathematics. References to disappearing entities are just bullshit.

Again, the Algol lineage is just a bunch of non-mathematical (imperative) abstractions over machine abstractions. This is the key for the right understanding.

Standard Libraries

A standard library of any decent language is idiomatic code to read. It also shows how good the language is.

The C++ standard library, including STL, is written in itself, and it is a mess.

The -std=c++XX argument

First of all, strings in C++11, C++14 and C++17 are not the same. Idiomatic C++ code will compile but won’t link (due to the subtle differences in name mangling).

In theory, one has to recompile all the dependencies and system libraries with the same -std=c++XX flag. Projects which do vendoring at least try to compile everything with the same flags (while still linking against system libs and the stdlib).
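A hypothetical sketch of the discipline (the file names are made up; the point is that every translation unit and the final link use the same flag):

```shell
# All translation units and the link step agree on -std:
g++ -std=c++20 -c liba.cpp -o liba.o
g++ -std=c++20 -c main.cpp -o main.o
g++ -std=c++20 liba.o main.o -o app

# Mixing standards across objects (say, liba.cpp built with -std=c++11
# and main.cpp with -std=c++20) may still compile, but can fail at link
# time -- or, worse, at run time -- due to subtle ABI/mangling differences.
```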

Google nowadays compiles its Chromium with -std=c++20 (so should you) and vendors almost everything, including libc++.

In practice, good systems like Gentoo cannot force, let’s say, -std=c++20 by default (the most reasonable choice – no breaking changes from 17) because of stale projects, and this illustrates yet another fundamental problem with C++ (the “not-breaking the compatibility” meme results in a dependency hell).

So, what should one do? Well, vendor and re-compile everything with the same flags – the way Google does it.

Author: <schiptsov@gmail.com>

Email: lngnmn2@yahoo.com

Created: 2023-08-08 Tue 18:39

Emacs 29.1.50 (Org mode 9.7-pre)