Language Models

There are fundamental flaws in all current language models, which are based on distances between words (or n-grams).

First, when one downloads the entire internet and builds a huge representation, one never captures any true aspects of the underlying reality, only some current set of beliefs about isolated aspects of it. In other words, it is always utter bullshit: sectarian, religious, conspiratorial, or just plain bullshit.

It is worth reiterating: word distances (and therefore the most likely next word in a sequence) cannot, in principle, capture anything real, in exactly the same way that statistics based on mere observations cannot capture the underlying “mechanics” of a process.

Yes, one may prompt a GPT-3 model and get seemingly intelligent answers, but what one actually gets is just likelihood, or what most people would say.
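
To make this concrete, here is a minimal sketch of what “the most likely next word” means in practice: a toy bigram counter (the corpus below is invented purely for illustration, not anyone’s actual training data) that “answers” by majority vote over whatever the text happened to say.

    # Toy bigram "language model": predict the next word by majority vote.
    # The corpus is invented; real models are vastly larger, but the
    # principle of "most likely continuation" is the same.
    from collections import Counter, defaultdict

    corpus = ("the earth is flat . the earth is flat . the earth is flat . "
              "experts say the earth is round . the earth is round").split()

    # For each word, count which words follow it and how often.
    followers = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        followers[prev][nxt] += 1

    def most_likely_next(word):
        """Return the most frequent continuation seen in the corpus."""
        return followers[word].most_common(1)[0][0]

    print(most_likely_next("is"))  # -> "flat": the majority answer, not the true one

The toy model returns “flat” simply because the majority of the corpus says so; the minority expert voice is statistically invisible.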

This, however, is almost always bullshit. What most people would say is never the right thing, because real expertise is rare and hard to achieve, and the voices of actual experts are always lost in the noise.

Language models capture the current shape of the noise.

The idea that they capture an underlying hidden semantic structure is nonsense. They capture its current distortion.

Second, human language is not used by humans as a medium for accurate description of aspects of reality (a mere audio encoding or verbatim transmission of an inner representation of accumulated and scientifically verified knowledge). Not even textbooks are adequate. Everything that has been written about model-based economics, for example, turns out to be sectarian bullshit.

So, science is still the only methodology (however slow and costly) for arriving at some approximation of true knowledge: being less wrong, given the partial and imperfect information we have.

The belief that a language model can provide an insight, let alone the truth, is deeply delusional and contradicts these fundamental principles. But it allows some very fluent bullshitters, like the fluent Karpathy, to have a decent living and make a lot of money.

The only thing that Karpathy got right is that the vastly complex structure which emerges as the result of training and optimization captures more than a human could ever comprehend (the shape of the noise).

Last but not least, they finally arrived, backwards, at the same principle: that the data sets had better not contain any distortions of reality, which is exactly how biological organisms came into being before any language-based abstractions.

Another side note is the fact that these complex artifacts are “fragile”: just one node being overwritten turns a dog into an ostrich, which indirectly proves that the model is just a vastly complex “snapshot” of the current noise, and the underlying reality is buried even deeper underneath.
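
A minimal sketch of that fragility (a hand-made linear classifier with invented weights; a real network has billions of parameters, but the mechanism of a single overwritten number flipping the output is the same):

    # Toy linear classifier over two labels; the weights are invented and
    # stand in for a trained network's parameters.
    import numpy as np

    labels = ["dog", "ostrich"]

    W = np.array([
        [ 2.0, -1.0, 0.5],   # "dog" row
        [-1.5,  1.0, 0.2],   # "ostrich" row
    ])

    x = np.array([1.0, 0.3, 0.8])  # features of an image that is, in fact, a dog

    print(labels[int(np.argmax(W @ x))])  # -> "dog"

    W[0, 0] = -2.0  # overwrite a single "node" (one weight)

    print(labels[int(np.argmax(W @ x))])  # -> "ostrich"

Nothing about dogs or ostriches is “understood” here; the label is a side effect of a particular configuration of numbers, which is the sense in which the model is a snapshot.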

Author: <schiptsov@gmail.com>

Email: lngnmn2@yahoo.com

Created: 2023-08-08 Tue 18:41

Emacs 29.1.50 (Org mode 9.7-pre)