Ask HN: Could the plateau in AI scaling just be a "false decline"?


This Hacker News post explores a hypothesis: the perceived performance plateau in AI development, particularly in large language models, may be only a temporary "complexity dip" rather than a true limit. The author argues that increased complexity can initially degrade performance before giving rise to superior emergent capabilities.


The Premise: There has been a lot of talk lately about the possibility that AI development (as we currently know it) is approaching a plateau. While I don't personally agree with this hypothesis, it is undeniably a common sentiment in the industry right now, so it’s worth investigating.

We have seen that increasing the number of parameters, or "scaling up" a neural network, doesn't always yield immediate, linear improvements. With certain versions of ChatGPT, many users perceived a degradation in performance even though the underlying model had presumably grown more complex.
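For context on why returns flatten even without any dip: empirical scaling-law work (e.g., Kaplan et al., 2020) found that language-model loss tends to fall as a power law in parameter count, so each 10x of scale buys a roughly constant multiplicative improvement, not an additive one. A minimal sketch, with the constants treated as illustrative rather than fitted:

```python
# A minimal sketch of power-law scaling, assuming loss ~ (N_c / N)^alpha.
# The constants echo Kaplan et al. (2020) but are used here purely for
# illustration, not as fitted or authoritative values.

N_C = 8.8e13    # illustrative "critical" parameter count
ALPHA = 0.076   # illustrative scaling exponent

def loss(n_params: float) -> float:
    """Loss falls smoothly, but ever more slowly, as parameters grow."""
    return (N_C / n_params) ** ALPHA

# Each 10x in parameters shaves only ~16% off the loss (10**-0.076).
for n in (1e9, 1e10, 1e11, 1e12):
    print(f"{n:.0e} params -> loss {loss(n):.3f}")
```

Plotted on a linear axis, that curve already looks like a plateau long before any hard ceiling, which is part of why plateaus and dips are so easy to conflate.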

My Theory: Is it possible that we are seeing a "complexity dip"? In other words, could there be a phase where increasing complexity initially causes a drop in performance, only to be followed by a new phase where that same complexity allows for superior emergent properties?

To simplify, let’s imagine a hypothetical scale where we compare "Complexity" (parameters/compute) vs. "Performance." For example:

LLM: ChatGPT 3 // Complexity Level 1 // Performance 0.2

LLM: ChatGPT 3.5 // Complexity Level 10 // Performance 0.5

LLM: ChatGPT 4 // Complexity Level 100 // Performance 0.75

LLM: ChatGPT 4.2 // Complexity Level 1,000 // Performance 0.6 (the "False Plateau": performance degradation)

LLM: ChatGPT 4.2X // Complexity Level 10,000 // Performance 0.5 (further degradation due to unmanaged complexity)

LLM: ChatGPT 6 // Complexity Level 100,000 // Performance 0.8 (the "breakthrough": new abilities emerge)

LLM: ChatGPT 7 // Complexity Level 1,000,000 // Performance 0.99 (potential AGI / peak performance)

The Risk: The real problem here is economic and psychological. If we are currently in the "GPT-4.x" phase of this example, the industry might stop investing because the returns look negative. We might never reach the "GPT-6" level simply because we mistook a temporary dip for a permanent ceiling.
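To make that trap concrete, here is a toy sketch that encodes the hypothetical table above and applies a naive "stop investing the first time performance drops" rule. The numbers are the made-up figures from my example, and the stopping rule is a strawman of my own, not anyone's actual investment policy:

```python
# The hypothetical complexity/performance curve from the example above.
# All numbers are the post's illustration, not measurements.
curve = [
    ("ChatGPT 3",    1,         0.20),
    ("ChatGPT 3.5",  10,        0.50),
    ("ChatGPT 4",    100,       0.75),
    ("ChatGPT 4.2",  1_000,     0.60),   # the "False Plateau"
    ("ChatGPT 4.2X", 10_000,    0.50),   # unmanaged complexity
    ("ChatGPT 6",    100_000,   0.80),   # emergent breakthrough
    ("ChatGPT 7",    1_000_000, 0.99),   # potential peak
]

def last_model_funded(points):
    """Naive policy: halt scaling the first time performance regresses."""
    for prev, cur in zip(points, points[1:]):
        if cur[2] < prev[2]:
            return prev  # returns look negative here, so investment stops
    return points[-1]

name, complexity, perf = last_model_funded(curve)
print(f"Naive policy halts at {name} (complexity {complexity:,}, "
      f"performance {perf}); the 0.99 peak is never funded.")
```

Under this rule the curve is abandoned at complexity 100, two full dips short of the breakthrough, which is exactly the failure mode I'm worried about.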

I’m curious to hear your thoughts. Have we seen similar "dips" in other complex systems before a new level of organization emerges? Or is the plateau a hard physical limit?


