On bullshit
The general theory of bullshit
When one asks a random stranger about something, what one gets back is bullshit.
Unless one is an expert - let’s say, a well-trained mathematician - one cannot call bullshit. One does not even know that one has been bullshitted.
This is OK when what one asks about is some observable (by multiple observers) property of a locality - directions, roads, etc.
With socially constructed abstractions it is almost always bullshit. Socially-agreed-upon and thus preserved and maintained bullshit.
Now, what is the advantage of some software system that has “indexed” all the socially constructed, preserved and maintained bullshit?
One can ask a random question and get a non-random, obviously not unrelated answer, given in a confident, authoritative “tone”.
It is as if a child (with a still-empty “semantic storage”) were asking an adult with a mild neurological disorder, one who never bothers to validate or even doubt what he “knows”.
This is a weak metaphor. A better one would be reading public forums without the understanding of an expert.
In the few areas where I am an expert, what I read is almost always bullshit. Very few people bother to give a well-formulated, correct answer, augmented with justifications and underlying reasons. Most of the time it is just an emotional opinion according to one’s beliefs.
Now, what is good about indexing uninformed opinions based on beliefs (and sorting them in a sophisticated way)? Would one use this as a “knowledge base” or as any kind of trustworthy source?
I would certainly not. Even mathematical books are full of shit (when they talk about non-facts).
There is no shortcut around carefully maintaining one’s inner “map” of the territory, and no substitute for a valid and adequate “map”. Well, it seems it is OK to ask random strangers for directions - until it isn’t.
At the very least, one should get the right understanding of what exactly these language models are. They are just “indexing” textual information (of questionable quality), and the resulting abstract structure, while it definitely captures “something”, does not capture any “knowledge” - only information, which is not the same thing.
So, yes, it may be thought of as a different kind of “Google” (a different kind of indexing), but it indexes mostly bullshit (in principle) in a more sophisticated way.
When it generates a seemingly coherent and linguistically correct response, it uses very different heuristics than a reasonable person would (who encodes for verbal communication what he already knows). It operates on a level entirely removed from (and therefore unrelated to) any semantic knowledge.
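To make that concrete, here is a minimal sketch - a toy bigram model in Python; the corpus and every name in it are illustrative assumptions, nothing like a production system - of generating fluent-looking text purely from token co-occurrence statistics, with no semantics anywhere:

```python
import random
from collections import defaultdict

# A toy corpus; a real model ingests terabytes of scraped text.
corpus = ("the map is not the territory . "
          "the territory is what actually is . "
          "the map is a representation of the territory .").split()

# Record which token follows which - pure surface statistics,
# with no notion anywhere of what any token means.
following = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev].append(nxt)

def generate(start="the", length=12):
    out = [start]
    for _ in range(length):
        candidates = following.get(out[-1])
        if not candidates:
            break
        # Sample the next token in proportion to observed frequency - that is all.
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate())  # grammatical-looking output, derived from counts alone
```

A real model replaces this counting table with billions of learned parameters, but the principle - next-token statistics over text, not validated knowledge - is the same.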
Last but not least, what we have here is the same old problem of “false prophets”, but in a much more subtle and “dangerous” form. A related problem is that of over-confident but grossly unqualified doctors and other kinds of mere impostors.
How does one know this isn’t bullshit? And this is the fundamental problem.
WTF am I reading?
With the arrival of these ChatGPTs we have, literally, to go back to where the ancients started and ask all the fundamental questions over again.
“What is Truth?” (An accurate verbal or written description of one or another aspect of actual reality, without unnecessary abstractions.) “What is a statement of fact?” And most importantly - “How do you know?”
Logicians and mathematicians spent ages on these questions, and the answer is, it seems, that we know very little but talk a lot of bullshit.
With the adoption of social media we are drowning in “peer-reviewed” bullshit. The problem is - how can you call bullshit?
No one is labelling for you what the actual fuck you are reading:
- an unwarranted assumption
- an unvalidated hypothesis
- someone’s opinion (based on a belief)
- a “scientific consensus” (socially constructed bullshit)
- mere uninformed guessing
- a refined religious dogma
- a carefully crafted propaganda
- Freudian abstract “psychology”
- Hegelian abstract bullshit
- everything else that is not a statement of fact
At any given moment one could visit, let’s say, /g/catalog
or just modern HN
and, believe me, 99% of what one would read is utter bullshit.
The field of Computer Science, and of Programming Languages in particular, is especially telling, because there is a striking difference between expressed opinions and actual representations and implementations (which have certain observable and measurable characteristics) executed by actual computers.
The fact that there are always actual representations and implementations is what distinguishes computing from bullshitting, and only when one knows them can one “actually see” the utter bullshit being so confidently posted on these social platforms.
There is no reason to believe that the situation is any better in any other field of knowledge. We are lucky in CS to have lots of objective metrics, not mere opinions.
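For instance (a minimal sketch; the workload and the names are illustrative assumptions, not a rigorous benchmark), instead of trading opinions about which representation is “better”, one can simply measure:

```python
import timeit

# The same million keys, stored in two different representations.
setup = """
keys = list(range(1_000_000))
as_list = keys          # membership test is a linear scan
as_set  = set(keys)     # membership test is a hash lookup
probe = 999_999
"""

t_list = timeit.timeit("probe in as_list", setup=setup, number=100)
t_set  = timeit.timeit("probe in as_set",  setup=setup, number=100)

# Observed, measured characteristics of the representations themselves -
# no opinion survives contact with these numbers.
print(f"list lookup: {t_list:.4f}s for 100 probes")
print(f"set lookup : {t_set:.4f}s for 100 probes")
```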
What is an expert, then?
An expert is a person (or a system) who has managed to build and maintain the most accurate inner representation (an “inner map”) of some or other aspect of what actually is (which we call Reality).
How we know that the representation is accurate is the most fundamental question, and one too complicated to answer here.
What we do know is that there is a certain methodology for establishing the truth (for discovering what is real), which is called “experimental science”, and some limited inference processes, applicable to statements of (previously established) fact expressed in symbolic form. This is, literally, all we have got.
In short, it is all about the correct representation, whatever it may be.
What current large language models produce is definitely NOT a correct representation of anything, except, perhaps, a sophisticated snapshot of all the current socially constructed bullshit, taken from a particular “angle” and already outdated.
This is at least a less wrong analysis of what we have.