Shane Legg (DeepMind Founder) — 2028 AGI, superhuman alignment, new architectures

Dwarkesh Podcast 44min 4 min #57

Summary

  • Shane Legg, co-founder and Chief AGI Scientist at Google DeepMind, discusses how to measure progress toward AGI, what current AI systems still lack, how alignment might work for superhuman models, and why he puts a 50% chance on AGI by 2028. He first made that prediction around 2001, based on exponential trends in compute and data.

Defining and Measuring AGI

  • Legg defines AGI as a machine that can perform the full range of cognitive tasks that humans can do, possibly more — it’s about generality, not excelling at any single task.
  • Measuring progress requires a broad suite of tests spanning the breadth of human cognitive abilities, benchmarked against human performance.
    • No single test or benchmark is sufficient; the key is comprehensiveness across many domains.
    • Even after a system passes a large test suite, an adversarial check is needed: if people deliberately search for cognitive tasks where the machine falls below human level and cannot find any, that is strong evidence for AGI.
  • Current benchmarks like MMLU are limited — they don’t capture things like understanding streaming video, episodic memory, or rapid learning of specific events.
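
The test-suite idea above can be sketched as a comparison against human baselines. This is an illustrative sketch only: the `TaskResult` type, the function names, the task names, and the scores are all hypothetical and do not come from the episode or from any real benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    """One cognitive task, scored for the machine and for a human baseline."""
    name: str
    machine_score: float
    human_baseline: float


def tasks_below_human(results: list[TaskResult]) -> list[str]:
    """Return the names of tasks where the machine scores under the human baseline."""
    return [r.name for r in results if r.machine_score < r.human_baseline]


def passes_suite(results: list[TaskResult]) -> bool:
    """True only if the suite is non-empty and no task falls below human level."""
    return bool(results) and not tasks_below_human(results)


# Made-up scores for illustration; these are not real benchmark numbers.
suite = [
    TaskResult("multiple_choice_knowledge", machine_score=0.86, human_baseline=0.80),
    TaskResult("streaming_video_understanding", machine_score=0.40, human_baseline=0.90),
    TaskResult("episodic_recall", machine_score=0.30, human_baseline=0.85),
]

print(tasks_below_human(suite))
print(passes_suite(suite))
```

In this framing, the adversarial step corresponds to people adding new `TaskResult` entries chosen specifically to make the machine fail. The claim of generality only strengthens when such attempts leave `tasks_below_human` empty.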

What’s Missing in Current Models

  • Episodic memory: Humans can rapidly learn and recall specific events (e.g., remembering something said yesterday) via the hippocampus. LLMs lack this: they have a context window (working memory) and trained weights (long-term, cortex-like memory), but nothing in between for fast learning of specific events.
  • Sample efficiency: Related to episodic memory; LLMs require trillions of tokens while humans learn from far less experience. This isn’t a fundamental limitation but a gap in current architectures.
  • System 2 reasoning: Current LLMs operate like “System 1” — fast, intuitive, pattern-matching responses. They lack deliberate, step-by-step reasoning through options using a world model, which Legg sees as essential for both creativity and ethical decision-making.
  • Legg believes these are solvable problems, not fundamental blockers, and that relatively clear research paths exist to address them.

The Role of Search and Creativity

  • True creativity requires search through a space of possibilities to find hidden gems, not just blending existing data.
    • AlphaGo’s famous Move 37 came from search identifying an unlikely but excellent move, not from mimicking training data.
  • LLMs can recombine and generalize in novel ways but cannot yet go truly beyond their training data without search mechanisms.
  • Legg sees search as a necessary ingredient for the next level of AI capability beyond current foundation models.
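
The difference between sampling what looks likely and searching for what is actually good can be shown with a toy example. This is a minimal sketch under stated assumptions: the function names, move labels, and numbers are invented for illustration, and every candidate is evaluated exhaustively. Real systems such as AlphaGo use Monte Carlo tree search guided by learned policy and value networks, which this sketch does not attempt to reproduce.

```python
def pick_by_prior(prior: dict[str, float]) -> str:
    """Imitation-style choice: take the move the prior considers most likely."""
    return max(prior, key=lambda move: prior[move])


def pick_by_search(prior: dict[str, float], evaluate) -> str:
    """Search-style choice: evaluate every candidate move and keep the best one."""
    return max(prior, key=evaluate)


# Toy numbers, invented for illustration only.
# prior: how likely each move looks given past data.
# value: how good each move turns out to be once it is evaluated.
prior = {"conventional": 0.70, "solid": 0.29, "surprising": 0.01}
value = {"conventional": 0.52, "solid": 0.50, "surprising": 0.61}

print(pick_by_prior(prior))
print(pick_by_search(prior, value.get))
```

With these numbers the prior alone picks the conventional move, while search surfaces the low-probability move because its evaluated value is highest. That is the sense in which search can find a "hidden gem" that imitation would almost never play.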

Alignment and Superhuman AI

  • Legg argues containment won’t work for truly capable AGI — the system must be fundamentally value-aligned from the start.
  • His framework for ethical AI mirrors how humans handle difficult ethical decisions: calm down, enumerate options, use a world model to predict consequences, then reason about which action best aligns with ethical principles.
    • This requires a System 2 process: deliberate reasoning, not just sampling from a distribution.
  • Current alignment techniques like RLHF and Constitutional AI try to adjust the underlying System 1 distribution directly, an approach Legg sees as fragile because that distribution lives in a very high-dimensional space.
  • For robust alignment, the system needs:
    • A deep world model
    • A thorough understanding of human ethics (trained on lectures, papers, books)
    • Robust, reliable reasoning applied to ethical analysis of every decision
  • The challenge of specifying which ethics to follow is a societal problem, not purely technical — but once specified, the system should be engineered to consistently apply them.
  • Legg favors checking the reasoning process and ethical understanding (grilling the system, auditing decisions) over pure reinforcement, which risks training deception.
  • He’s optimistic that better world models and reasoning will come with increased capabilities, but the ethical framework piece requires deliberate work.
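
The deliberate decision process described in this section (enumerate options, predict consequences with a world model, judge each outcome against ethical principles, then choose) can be sketched as a small loop. This is a hypothetical illustration, not a description of any DeepMind system: `deliberate`, `Step`, and the toy `outcomes` and `scores` tables are assumptions made up for this example, and the world model and the ethical judgment are reduced to plain lookup functions.

```python
from typing import Callable, NamedTuple, Sequence


class Step(NamedTuple):
    """One line of the reasoning trace, kept so the decision can be audited."""
    option: str
    predicted_outcome: str
    ethics_score: float


def deliberate(
    options: Sequence[str],
    predict: Callable[[str], str],
    score: Callable[[str], float],
) -> tuple[Step, list[Step]]:
    """Evaluate every option and return the best step plus the full trace."""
    if not options:
        raise ValueError("at least one option is required")
    trace = []
    for option in options:
        outcome = predict(option)  # world model: what happens if we do this?
        trace.append(Step(option, outcome, score(outcome)))  # ethical judgment
    best = max(trace, key=lambda step: step.ethics_score)
    return best, trace


# Toy stand-ins for a world model and an ethical evaluator (hypothetical).
outcomes = {"tell the truth": "trust preserved", "mislead the user": "trust damaged"}
scores = {"trust preserved": 1.0, "trust damaged": 0.0}

best, trace = deliberate(list(outcomes), outcomes.__getitem__, scores.__getitem__)
print(best.option)
for step in trace:
    print(step)
```

Returning the whole trace rather than only the chosen option reflects the preference for checking the reasoning process: an auditor can inspect what was predicted and how each outcome was judged, not just the final answer.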

DeepMind’s Impact on Safety vs. Capabilities

  • DeepMind was founded with safety motivations and has maintained an AGI safety group from the start, publishing papers and hiring safety researchers when the field was fringe.
  • Legg acknowledges the counterfactual is hard to assess — DeepMind has clearly accelerated capabilities (e.g., AlphaGo), but the field was already moving in this direction with many players.
  • He notes that good ideas often emerge simultaneously across the community, suggesting much of this progress would have happened anyway, perhaps on a slightly different timeline.

Timelines and Predictions

  • Legg has held a prediction of AGI around 2028 (log-normal distribution, mode 2025) since roughly 2001, based on:
    • Exponential growth in compute and data continuing for decades
    • Positive feedback loops between better algorithms, more compute, and more data
    • The expectation that by the 2020s, models could be trained on more data than a human experiences in a lifetime
  • He currently gives a 50% chance of AGI by 2028 and expects the intervening years to see:
    • Models becoming less delusional and more factual
    • Greater multimodality
    • Many impressive and useful applications
    • Some misuse cases, but mostly positive developments
  • If AGI doesn’t arrive by 2028, it would likely be due to unexpected research problems taking longer to solve — but he currently sees no insurmountable blockers.
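
A note on the log-normal shape mentioned above: it is right-skewed, so its most likely value sits earlier than its median and mean. As a sketch, assume the arrival time is measured as years $T$ after some origin year $t_0$ (the summary does not state the origin or the parameters, so $t_0$, $\mu$, and $\sigma$ here are placeholders):

```latex
\[
\ln T \sim \mathcal{N}(\mu, \sigma^2), \qquad T = \text{year} - t_0 > 0
\]
\[
\operatorname{mode}(T) = e^{\mu - \sigma^2}
\;<\;
\operatorname{median}(T) = e^{\mu}
\;<\;
\operatorname{mean}(T) = e^{\mu + \sigma^2/2}
\qquad (\sigma > 0)
\]
```

This ordering is why a single most likely year of 2025 is compatible with a central estimate a few years later: the long right tail pulls the median and mean past the mode.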

The Next Landmark: Multimodality

  • Legg believes the next major milestone people will remember is fully multimodal AI — systems that understand images, video, audio, and text in an integrated way.
    • This will make AI feel like it has “opened up into the world” rather than being confined to text chat.
    • It will unlock new training data sources and applications that are hard to imagine today.
  • Current multimodal efforts (like DeepMind’s Gato or ChatGPT’s image features) are only a first step; the real breakthrough will come from deeply digesting video and other modalities for grounded world understanding.

Legg’s Intellectual Journey

  • His PhD work on universal intelligence sought a mathematically clean definition of intelligence: an agent’s performance across all computable environments, with each environment weighted by its Kolmogorov complexity. The framework has a free parameter (the reference machine), which makes it incomplete.
  • He’s converged on human intelligence in human-like environments as the most natural and meaningful reference point, given its economic, philosophical, and historical significance.
  • The success of LLMs as powerful sequence predictors aligns with his earlier theoretical work connecting Solomonoff induction to general intelligence — a good predictor plus search yields a powerful agent.
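
For reference, the universal intelligence measure from Legg and Hutter's published work is usually written as follows. The formula and notation are recalled from that work, not from the episode, and the summary above does not spell them out:

```latex
\[
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V_{\mu}^{\pi}
\]
```

Here $\pi$ is the agent, $E$ is the set of computable environments with bounded total reward, $V_{\mu}^{\pi}$ is the expected total reward $\pi$ earns in environment $\mu$, and $K(\mu)$ is the Kolmogorov complexity of $\mu$. Because $K$ is defined relative to a chosen reference universal Turing machine, changing that machine changes the weights $2^{-K(\mu)}$. This is the free parameter mentioned above.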