Sparks of AGI

March 31, 2023

(Updated: Nov 23, 2024)

Microsoft published a paper about GPT-4, with the (literal) headline claim that it showed sparks of artificial general intelligence. The paper includes an approach to evaluate the model’s abilities, and many examples.

Jorge I Velez put together an excellent write-up on LessWrong (or his Substack).

One topic discussed is inner dialog:

“[S]ince the training data is essentially a linear exposition of the solution, a model trained on this data has no incentive to engage in an “inner dialogue” where it revisits and critically evaluates its own suggestions and calculations. Second, the limitation to try things and backtrack is inherent to the next-word-prediction paradigm that the model operates on. It only generates the next word, and it has no mechanism to revise or modify its previous output, which makes it produce arguments “linearly”.”

In contrast, when humans answer questions, they can have an internal reasoning process before speaking or taking action. We can think about the problem and then examine our own thoughts, while the model has no ability to go back and alter what it has already output. You can envision a two-step process where a language model generates a draft answer that is then added to the prompt before it emits the final response; effectively this is what is happening in various ChatGPT prompt chains already.
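The two-step process can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `answer_with_draft`, the prompt wording, and `toy_model` (a stand-in for a real language model call) are all hypothetical names made up for this example, not anything from the paper or from a real API.

```python
from typing import Callable


def answer_with_draft(question: str, model: Callable[[str], str]) -> str:
    """Two-step answer: draft first, then feed the draft back in for review."""
    # Step 1: the model produces a first-pass answer.
    draft = model(f"Question: {question}\nDraft answer:")

    # Step 2: the draft is appended to the prompt, so the model can "see"
    # its own earlier output before it commits to a final response.
    review_prompt = (
        f"Question: {question}\n"
        f"Draft answer: {draft}\n"
        "Check the draft for mistakes, then give the final answer:"
    )
    return model(review_prompt)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: wrong on the draft, right on review."""
    if "Check the draft" in prompt:
        return "4"
    return "5"


if __name__ == "__main__":
    print(answer_with_draft("What is 2 + 2?", toy_model))
```

The model itself is unchanged here; the "inner dialogue" lives entirely in how the prompts are chained.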

Extending this: some popular theories of mind describe our neural processes as many separate parts vying for attention, with choices between them made at a sufficiently high level. Our inner dialogue lets us make decisions and critically evaluate our own suggestions and calculations: it is a space for those subagents to operate.

In a slide deck, Yann LeCun sticks to his guns re “fundamental architecture change”, namely:

  • Abandon generative models
    • in favor of joint-embedding architectures
  • Abandon probabilistic models
    • in favor of energy-based models
  • Abandon contrastive methods
    • in favor of regularized methods
  • Abandon Reinforcement Learning
    • in favor of model-predictive control

Meanwhile, Karpathy is more on the “string them together” side:

“1 GPT call is a bit like 1 thought. Stringing them together in loops creates agents that can perceive, think, and act, their goals defined in English in prompts. For feedback / learning, one path is to have a “reflect” phase that evaluates outcomes, saves rollouts to memory, loads them to prompts to few-shot on them. That is the “meta-learning” few-shot path. You can “learn” on whatever you manage to cram into the context window.”
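A rough sketch of that loop, to make the idea concrete. Everything named here is an assumption made up for illustration: `run_agent`, the prompt layout, and the `toy_model` and `toy_env` stand-ins are hypothetical, and a real agent would replace them with actual model calls and a real environment.

```python
from typing import Callable, List


def run_agent(
    goal: str,
    model: Callable[[str], str],
    act: Callable[[str], str],
    episodes: int = 3,
    max_examples: int = 2,
) -> List[str]:
    """Think, act, reflect; past rollouts are loaded back in as few-shot text."""
    memory: List[str] = []
    for _ in range(episodes):
        # Load the most recent rollouts into the prompt (the context window
        # is finite, so only the last `max_examples` are kept).
        examples = "\n".join(memory[-max_examples:])
        action = model(f"Goal: {goal}\nPast rollouts:\n{examples}\nNext action:")

        # Act in the environment and observe the outcome.
        outcome = act(action)

        # "Reflect" phase: a second model call evaluates what happened.
        reflection = model(
            f"Goal: {goal}\nAction: {action}\nOutcome: {outcome}\nReflection:"
        )

        # Save the rollout so later iterations can few-shot on it.
        memory.append(f"action={action} outcome={outcome} reflection={reflection}")
    return memory


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call."""
    if prompt.rstrip().endswith("Reflection:"):
        return "noted"
    # Change behaviour based on how many rollouts are in the prompt.
    return f"try-{prompt.count('action=')}"


def toy_env(action: str) -> str:
    """Hypothetical environment: only the third attempt succeeds."""
    return "ok" if action == "try-2" else "fail"


if __name__ == "__main__":
    for rollout in run_agent("open the door", toy_model, toy_env):
        print(rollout)
```

Nothing in the model's weights changes between iterations; whatever "learning" happens is carried by the text in `memory` that gets loaded back into the prompt.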