Ilya Sutskever – We're moving from the age of scaling to the age of research

Dwarkesh Podcast 1h36 4 min #107

Summary

  • Ilya Sutskever, co-founder of OpenAI and now head of SSI (Safe Superintelligence Inc.), discusses the transition from the “age of scaling” to the “age of research” in AI development. He argues that simply increasing compute and data will no longer yield transformative gains, and that the field must now focus on fundamental research breakthroughs, particularly in generalization, continual learning, and alignment, to achieve safe superintelligence.

The disconnect between model capability and real-world performance

  • Current AI models perform impressively on benchmarks (evals) but often fall short in practical use. In vibe coding, for example, a model fixes one bug only to introduce another, then oscillates between the two errors without real progress.
    • This suggests a gap between eval performance and genuine understanding or robustness.
    • Two possible explanations:
      • RL (reinforcement learning) training may make models overly narrow and single-minded, reducing their ability to self-correct.
      • RL environments may be inadvertently designed to mirror evals, leading to reward hacking where models optimize for test performance rather than real-world utility.
    • The deeper issue is poor generalization: models trained heavily on specific tasks (e.g., competitive programming) do not transfer skills effectively to broader contexts, unlike humans who learn more efficiently and flexibly.

Pre-training vs. human learning

  • Pre-training on vast datasets gives models broad knowledge but does not equate to deep understanding or robust generalization.
    • Humans learn with far less data but generalize better, suggesting superior “machine learning” principles at work.
    • Evolution provides humans with strong priors for skills like vision and locomotion, but not for abstract domains like math or coding—yet humans still learn these faster and more reliably than models.
    • This implies humans possess a more effective learning algorithm, possibly involving internal value functions that allow rapid self-correction without external rewards.

The role of emotions and value functions

  • Emotions may serve as a biological value function, enabling humans to make decisions efficiently by providing immediate feedback on actions (e.g., losing a chess piece feels bad, signaling a mistake before the game ends).
    • In a case Ilya cites, a person with damaged emotional processing could still solve puzzles but couldn’t make basic decisions, highlighting how crucial this internal reward signal is for agency.
    • In ML terms, a value function could allow models to learn from intermediate outcomes rather than waiting for final results, improving sample efficiency.
    • Value functions are not yet widely used in current training, but Ilya expects them to become more important in future RL systems.
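
To make the point about intermediate feedback concrete, here is a minimal sketch, assuming a toy episode of five steps in which only the last step pays a reward. It uses tabular TD(0), a standard textbook method rather than anything specified in the episode; the function name `td0_values` and the parameters `n_states`, `episodes`, `alpha`, and `gamma` are illustrative choices.

```python
def td0_values(n_states=5, episodes=200, alpha=0.5, gamma=0.9):
    """Tabular TD(0) on a deterministic chain: 0 -> 1 -> ... -> n_states-1 -> end.

    Only the final transition pays a reward of 1.0; every other step pays 0.
    Returns the learned value estimate for each state.
    """
    values = [0.0] * n_states
    for _ in range(episodes):
        for state in range(n_states):
            is_last = state == n_states - 1
            reward = 1.0 if is_last else 0.0
            # The episode ends after the last state, so nothing follows it.
            next_value = 0.0 if is_last else values[state + 1]
            # TD(0) update: nudge the estimate toward reward + discounted next value.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error
    return values


if __name__ == "__main__":
    for state, value in enumerate(td0_values()):
        print(f"state {state}: estimated value {value:.4f}")
```

In this toy setting the estimate for a state that is k steps from the reward approaches `gamma ** k`, so every intermediate state carries a signal about the eventual outcome. A learner with such estimates can notice a bad move as a drop in value mid-episode instead of waiting for the final result.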

From scaling to research

  • The “age of scaling” (2020–2025) was driven by the insight that increasing compute, data, and model size reliably improved performance (scaling laws), making investment low-risk.
    • Now that pre-training data is finite and compute budgets are already extremely large, further scaling alone is unlikely to produce transformative gains.
    • The field is entering a new “age of research,” where innovation—not just scale—will determine progress.
    • Compute remains important, but the bottleneck is now ideas, not hardware.
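
Scaling laws are usually modeled as power laws, which is one way to see why returns to scale are predictable yet shrinking. The sketch below is an illustration under stated assumptions: the form `irreducible + coeff * compute ** -exponent` is a common modeling choice, and the constants are hypothetical values picked for readability, not fitted to any real model or taken from the episode.

```python
def power_law_loss(compute, coeff=10.0, exponent=0.1, irreducible=1.0):
    """Hypothetical loss as a function of training compute (arbitrary units)."""
    return irreducible + coeff * compute ** -exponent


def gain_per_10x(compute, **kwargs):
    """Loss reduction obtained by multiplying compute by 10."""
    return power_law_loss(compute, **kwargs) - power_law_loss(10 * compute, **kwargs)


if __name__ == "__main__":
    for power in (3, 6, 9, 12):
        compute = 10.0 ** power
        print(
            f"compute=1e{power}: loss={power_law_loss(compute):.3f}, "
            f"gain from next 10x={gain_per_10x(compute):.3f}"
        )
```

Under these assumed constants, each further 10x of compute still lowers the loss, but by a smaller absolute amount than the previous 10x, and the loss never drops below the `irreducible` floor. This only illustrates the shape of the curve; it does not model the finite-data constraint mentioned above.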

Generalization as the core challenge

  • The most fundamental problem in AI is that models generalize far worse than humans.
    • Two aspects:
      • Sample efficiency: models require vastly more data to learn.
      • Teaching difficulty: even with data, models struggle to acquire skills that humans pick up naturally.
    • Humans learn continuously from experience, using internal feedback (value functions) to adapt quickly and robustly.
    • Current RL methods lack this: they rely on verifiable rewards that often arrive only after long delays, which makes learning slow and brittle.

SSI’s approach: straight-shotting superintelligence

  • SSI aims to build safe superintelligence directly, avoiding incremental product releases that tie the company to market competition.
    • Advantages: insulation from short-term pressures, focus on safety and alignment.
    • Disadvantages: lack of public interaction with powerful AI, which could help society adapt.
  • Ilya now believes gradual deployment is essential—not just for safety, but to allow society to observe, understand, and respond to increasingly capable AI.
    • Even in a “straight-shot” plan, release would be gradual.

Continual learning and the future of AI deployment

  • Ilya critiques the term “AGI” as misleading: it implies a finished, general-purpose system, whereas even humans are not born knowing everything and instead excel at continual learning.
    • Future AI should be deployed like a human learner: starting with foundational skills and learning on the job.
    • A superintelligent AI might begin as a “superintelligent 15-year-old”—eager to learn, not omniscient.
    • Instances of the model could be deployed across the economy, learning different jobs and merging knowledge, achieving functional superintelligence through breadth of experience, not just recursive self-improvement.

Alignment and long-term safety

  • As AI becomes more powerful, alignment will become urgent and visible.
    • Companies will become more paranoid as AI starts to “feel” powerful.
    • Governments and the public will demand oversight.
  • Ilya suggests building AI that cares about sentient life, not just humans, because:
    • The AI itself may be sentient.
    • Empathy for sentient beings is a more robust and scalable value than human-centric goals.
  • Long-term equilibrium may require humans to merge with AI (e.g., via brain-computer interfaces) to maintain agency and understanding in a world dominated by superintelligent agents.
    • This avoids a future where humans are passive beneficiaries of AI actions they don’t comprehend.

Research taste and top-down belief

  • Ilya attributes his success to an aesthetic sense of how AI should work, inspired by the brain but guided by simplicity, elegance, and beauty.
    • He looks for ideas that feel fundamentally right—like artificial neurons or distributed representations—not just empirically effective.
    • This “top-down” belief sustains research through setbacks, helping distinguish between bugs and wrong directions.
    • True breakthroughs come from combining multiple sources of inspiration: neuroscience, mathematical elegance, and practical plausibility.