
AI: From Homer to ChatGPT

Tags: research, musings

AI is omnipresent. With speculations on its capabilities running
rampant, it is time to slow down a bit and assume a broader perspective.
This is an article I wrote for a general audience, so if you are already
in the know, please forgive me for taking some shortcuts for the sake of
exposition.

Automatons In The Iliad

As it turns out, AI has been with us for much, much longer than one might think. Already
Homer describes attendants of
Hephaestus, the god of
blacksmiths and artisans, in the
Iliad:

Grasping a thick staff he limped from the forge, supported by
servants made of gold, fashioned like living girls, who attended
swiftly on their master. As well as the use of their limbs they had
intellect, and the immortals gave them skill in subtle crafts.

Somewhat tellingly, we would refer to such contraptions as “automatons,”
a term that derives from the Ancient Greek word αὐτόματος (automatos),
composed of roots meaning “self” and “thinking.” This already provides
some insight into what we humans hope AI will be.

The Mechanical Turk And The Difference Engine

Moving from myth to mechanics, we find the 18th century to be the heyday
of all types of clever automatons, including filigree music boxes (which
our modern mind would probably not directly associate with anything AI)
or contraptions like the infamous “Mechanical Turk”.
Created by a certain Wolfgang von
Kempelen, this
machine was advertised as being able to play chess against a human
player. Unfortunately, but perhaps not surprisingly, it turned out to be
a cleverly designed fake: The box essentially had enough space to hide
a human player. Before people found out, the “Mechanical Turk” was
a sensation (and rightfully so) since it appeared to mechanize that
which used to be solely under the purview of humans.

All in all, however, the creativity of that epoch was restricted by the
available technology. Thus, most automatons from that epoch, regardless
of how cleverly designed and crafted they might have been, are just too
specific to be considered AI. Nevertheless, there is an interesting
anecdote by none other than Charles Babbage, creator of the difference
engine, one of the precursors of the modern computer. Babbage wanted to
tabulate polynomial functions faster and more precisely; when presenting
his machine, he recorded the following exchange:1

On two occasions I have been asked, “Pray, Mr. Babbage, if you put
into the machine wrong figures, will the right answers come out?” […]
I am not able rightly to apprehend the kind of confusion of ideas that
could provoke such a question.
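To give a flavour of what the difference engine actually mechanized: a polynomial can be tabulated using nothing but repeated additions via the method of finite differences. The following sketch is purely illustrative (Babbage worked with gears and columns of digits, not code), and the polynomial and its initial differences are made-up example values.

```python
# Tabulating p(x) = 2x^2 + 3x + 1 with the method of finite differences:
# once the initial differences are known, every further value needs only additions.

def tabulate(initial_differences, steps):
    """Return successive polynomial values using repeated addition only."""
    diffs = list(initial_differences)  # [p(0), Δp(0), Δ²p(0), ...]
    values = []
    for _ in range(steps):
        values.append(diffs[0])
        # Update each difference by adding the next higher-order one.
        for i in range(len(diffs) - 1):
            diffs[i] += diffs[i + 1]
    return values

# For p(x) = 2x^2 + 3x + 1 at x = 0, 1, 2, ...:
# p(0) = 1, Δp(0) = p(1) - p(0) = 5, Δ²p = 4 (constant for a quadratic).
print(tabulate([1, 5, 4], 6))  # [1, 6, 15, 28, 45, 66]
```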

Overpromising & Underdelivering?

Arriving finally in our own times, AI and its promises were rediscovered.
In 1956, leading scientists met at the Dartmouth Workshop, widely
considered to mark the birth of AI as a modern research field. Like so
many of our own research proposals, the initial brief was wildly
optimistic and assumed that about two months (!) would be sufficient to
reach the following aims:2

[…] The study is to proceed on the basis of the conjecture that every
aspect of learning or any other feature of intelligence can in principle
be so precisely described that a machine can be made to simulate it. […]
We think that a significant advance can be made in one or more of these
problems if a carefully selected group of scientists work on it together
for a summer.

Despite giving rise to many new ideas (including expert systems and the
first “artificial” neural networks), the workshop did not manage to
bring AI into the public eye. This changed briefly in 1997, when Garry
Kasparov played chess against “Deep Blue”—and lost. However, as much as
this was a milestone for rule-based systems and AI in general, “Deep
Blue” was a research cul-de-sac; modern neural networks operate on
different paradigms, and “Deep Blue” was
not even close to a general-purpose device.

The Revolution Has Good Graphics

Skipping over a couple of years, the breakthrough finally happened
around the year 2012. Mostly driven by the availability of modern
graphics cards, i.e., Graphics Processing Units (GPUs), neural
networks essentially became feasible overnight.3 The primary idea of
such networks is to combine many small computational units, dubbed
“neurons,” each of which receives an input signal, modifies it slightly,
and passes it on to other neurons via trainable weights (more about
those later). By arranging neurons hierarchically in different layers,
an input signal (such as an image) can be iteratively transformed
into an output signal (such as the probability that the image
contains a dog).
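To make this layered picture slightly more concrete, here is a minimal sketch of such a forward pass, assuming NumPy is available; the layer sizes, the random weights, and the “dog probability” interpretation are arbitrary choices for illustration, not a faithful model of any real vision network.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(inputs, weights, biases):
    """One layer of 'neurons': weight the inputs, add a bias, apply a nonlinearity."""
    return np.maximum(0.0, inputs @ weights + biases)  # ReLU activation

# A toy network: 4 input features -> 3 hidden neurons -> 1 output "dog probability".
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

x = rng.normal(size=(1, 4))                  # a (very small) stand-in for an image
hidden = layer(x, W1, b1)                    # first layer transforms the input signal
logit = hidden @ W2 + b2                     # second layer produces a single score
probability = 1.0 / (1.0 + np.exp(-logit))   # squash into [0, 1]
print(probability.item())
```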

To train such neural networks, we require vast amounts of data.
Ideally, this data should be of high quality and diverse (whatever that might
mean for a given application). A very simplistic approximation of what
happens during training is that the network is shown data together with
the expected output. Its internal weights are then adjusted such that
the actual output matches the expected output more closely. By repeating
this countless times with a sufficiently large dataset, the hope is that
at some point, the network will have picked up the underlying pattern
and will generalize to hitherto-unseen data (a minimal sketch of this
loop follows below). Despite decades of progress, training such a
neural network remains something of an art, and numerous interesting and
unexpected phenomena arise during training. This did not
stop the empirical progress, and after several “victories” in the field
of computer vision, neural networks started transforming other fields.
Already in 2018, three visionary AI researchers of the 1980s—Yann LeCun,
Yoshua Bengio,
and Geoffrey Hinton—received the “Turing Award,”
one of the highest honors in computer science. Together with other
pioneers like Jürgen Schmidhuber and
Fei-Fei Li, these
researchers could be considered the godparents of modern AI.
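The sketch below shows the training loop described above in its most reduced form: a single trainable weight and bias are nudged until the actual output matches the expected output of a made-up linear pattern. Real networks have millions or billions of such weights, but the principle is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: inputs x and the "expected output" y = 3*x + 2, plus a little noise.
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 2.0 + 0.05 * rng.normal(size=100)

w, b = 0.0, 0.0          # the "internal weights", initially uninformative
learning_rate = 0.1

for step in range(500):
    prediction = w * x + b                 # show the network the data
    error = prediction - y                 # compare with the expected output
    # Nudge the weights so the actual output matches the expected output better.
    w -= learning_rate * np.mean(error * x)
    b -= learning_rate * np.mean(error)

print(round(w, 2), round(b, 2))  # close to 3.0 and 2.0: the pattern was picked up
```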

Pay Attention!

The next revolution happened thanks to the Attention
Mechanism, which provided a neural
network with a way to assign different parts of its inputs different
weights, denoting their importance for a specific task. The resulting
architecture, the transformer, turned out to be of general use,
enabling, among other things, improved translation engines. This made
neural networks capable of processing inputs in the form of natural
language, thus encroaching on humanity’s primary domain, viz.,
our capability of wielding language. Again, through vast amounts of
data, often painstakingly annotated by humans, OpenAI managed to create
one of the first generally usable large language models, dubbed
ChatGPT. The true genius of this model is that it makes a plethora of
tools available through language. No more arcane command-line inputs or
scripts, but instead incantations, i.e., prompts, that remind me very
much of magic spells…
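For the technically inclined, the core of the attention mechanism can be sketched in a few lines, assuming NumPy; the three-token example and its random vectors are made up, and real transformers add learned projection matrices, multiple attention heads, and many stacked layers on top of this.

```python
import numpy as np

def softmax(scores):
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention: weight each input by its relevance."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # how strongly does each token "match"?
    weights = softmax(scores)                # importance assigned to every other token
    return weights @ values                  # weighted mix of the inputs

# Three "tokens", each represented by a 4-dimensional vector (made-up numbers).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))
output = attention(tokens, tokens, tokens)   # self-attention over the sequence
print(output.shape)                          # (3, 4): one updated vector per token
```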

Whatever our own opinions on AI, the transformation is already happening
now, and large language models are ubiquitous. The use of language as
the primary way of communication makes models appear “intelligent,”
despite the mechanisms being essentially the same as in the previous
decades, involving “just” a larger amount of data and compute. Whether
this is a difference in degree or a difference in kind remains to be
shown. However, there is at least one major distinction from
special-purpose machines like the aforementioned difference engine: Such
machines typically make no mistakes in the calculations for which they
were constructed (setting aside the fact that floating-point
calculations are still hard to do with finite memory). This precision
comes at the price of being fundamentally limited. Large language
models do not exhibit such limitations a priori, but all their
cognition is based on their inputs. If ChatGPT reads that people write
about Switzerland having mountains, it will “learn” said fact, but it
would equally be “happy” with learning that Switzerland is an
almost-planar country. Hence, a lot of human intervention is required to
teach AI the “right” things—and we should not be content with that
type of training happening behind closed doors! Moreover, given the
fundamentally stochastic nature of these models, it is not guaranteed
that a model will always describe the right things (the toy example
below illustrates this). Unwittingly, it may hallucinate new “facts,”
including references to non-existent articles
or books. Some researchers thus believe that modern large language
models will ultimately suffer the fate of “Deep Blue” and turn out to be
too specific to be of long-term general use. In a recent WSJ
article, Yann LeCun even discourages aspiring
PhD students from working on large language models!
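As a toy illustration of this stochastic nature: a language model predicts a probability distribution over possible next words and then samples from it, so repeated runs need not agree. The words and probabilities below are entirely made up.

```python
import numpy as np

rng = np.random.default_rng()

# A made-up distribution over possible next words after "Switzerland has many ...".
next_words = ["mountains", "lakes", "banks", "plains"]
probabilities = [0.6, 0.25, 0.13, 0.02]

# The model samples from this distribution, so repeated runs need not agree,
# and low-probability continuations (here: "plains") are never fully ruled out.
for _ in range(5):
    print(rng.choice(next_words, p=probabilities))
```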

The Future Is Here And (Sort Of) Evenly Distributed

Whatever the future may hold, seemingly “modern” AI is now obviously
here, and it is a technology that demands a lot from us. It challenges
our beliefs, our systems, and our ways of living. Like any other
technology, it has the potential for good and for evil. Unlike other
technologies, however, AI is much more seductive since it promises
short-term cognitive shortcuts that may result in long-term
deficits.4 Moreover, modern large language models pretend to be
persons, but at least for now, this is most likely5 just a facade and
there is a lot of cognition but no consciousness. This does not prevent
many users from treating their large language model as a person, though,
and from trusting its utterances. Such trust can be misplaced and turn
out to be fatal. As a society, we thus have to ask ourselves how we want
to deal with AI. Do we treat it like something out of Pandora’s box or do we
rather want to consider it as something straight from the horn of
Amalthea? Regardless of our stance, everyone should at least understand
the basics of this technology, lest the new relationship between human
and machine turns out to be a toxic one.

(This is an extended and translated version of my essay on AI,
appearing in UNIVERSITAS.)

“Passages from the Life of a Philosopher,” 1864 ↩︎

This makes me feel less bad when reflecting upon my own research
proposals. ↩︎

Readers familiar with neural networks will have to pardon me for
glossing over a lot of details here. ↩︎

While AI is famously compared to mechanical calculators, I think
this is a half-baked argument at best. We still need to teach
students basic arithmetical skills so that they may hopefully
understand what calculations are all about. Even a TI-82
will not solve your homework if all you do is mash buttons at
random. ↩︎

At least I hope so, because I find the idea of an enslaved
conscious entity to be loathsome. ↩︎

You can give me anonymous
feedback or a
tip.
Moreover, follow me on X (formerly Twitter),
Bluesky,
or Mastodon
to get notified about new posts. Unless specified otherwise, all content
has been created by Bastian Rieck and is licensed under a
Creative Commons Attribution 4.0 International License.

