Yann LeCun on What Comes After LLMs

Unsupervised Learning #65 · 1h21 · 7 min

Summary

  • Yann LeCun, one of the pioneers of deep learning and a Turing Award winner, has launched AMI Labs to pursue a fundamentally different path to human-level AI — one based on world models and the JEPA (Joint Embedding Predictive Architecture) rather than large language models. He argues that while LLMs are genuinely useful products, they are architecturally incapable of reaching human-like intelligence because they cannot predict the consequences of actions or plan through search and optimization. After more than a decade at Meta building FAIR into one of the world’s top AI research labs, he left at the end of 2025 because Meta’s total organizational focus on LLMs left no room for the exploratory research his vision requires.

Why LLMs Are Not the Path to Intelligence

  • LLMs are useful but architecturally limited: LeCun is careful to say LLMs are great products — he uses them himself — but they manipulate language and code, which are domains where the symbols themselves are the substrate of reasoning. They fail in the real world, which is high-dimensional, continuous, noisy, and messy.
  • Two missing capabilities define the gap:
    • Predicting consequences of actions: LLMs generate the next token autoregressively; they have no mechanism to foresee what will result from a sequence of actions.
    • Planning by search and optimization: Humans accomplish goals by searching for action sequences that satisfy objectives. An LLM on its own does not search; it predicts one token at a time. When LLM-based systems appear to plan (in math or code), they do so by searching in token space, which only works in narrow domains where outputs can be verified.
  • The car wash common sense problem: When asked whether to walk or drive to a car wash 100 yards away, most LLMs recommend walking, overlooking that the car itself has to get to the car wash. This reveals a lack of basic common sense about the physical world. It is not something fine-tuning can solve; it reflects an architectural absence of world understanding.
  • LLMs are intrinsically unsafe: Because their behavior is determined entirely by training rather than hardwired constraints, there will always be prompts that produce dangerous or nonsensical outputs. No amount of alignment training can close the gap between training distribution and test distribution. LeCun contrasts this with “objective-driven AI” architectures that satisfy safety constraints by construction.
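
The two ideas above, planning by search and constraints that hold by construction, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-dimensional state, the `ACTIONS` set, the hand-written `predict_next` standing in for a learned world model, and the `is_safe` predicate. It is not LeCun's or AMI Labs' implementation.

```python
from itertools import product

ACTIONS = (-1, 0, 1)  # hypothetical discrete action set


def predict_next(state, action):
    """Toy stand-in for a learned world model: predicts the next state."""
    return state + action


def rollout(state, actions):
    """Imagine the trajectory that a sequence of actions would produce."""
    trajectory = []
    for a in actions:
        state = predict_next(state, a)
        trajectory.append(state)
    return trajectory


def plan(start, goal, horizon, is_safe):
    """Exhaustively search for the safe action sequence with the lowest cost."""
    best_cost, best_actions = float("inf"), None
    for actions in product(ACTIONS, repeat=horizon):
        trajectory = rollout(start, actions)
        if not all(is_safe(s) for s in trajectory):
            continue  # the constraint is enforced during search, not learned
        cost = abs(trajectory[-1] - goal)  # distance from the goal at the end
        if cost < best_cost:
            best_cost, best_actions = cost, actions
    return best_actions, best_cost


# Goal is position 3, but any state above 2 is declared unsafe.
actions, cost = plan(start=0, goal=3, horizon=4, is_safe=lambda s: s <= 2)
print(actions, cost)
```

Because unsafe rollouts are discarded during the search, the planner settles for position 2 (cost 1) rather than crossing the limit to reach the goal at 3. A trained-in preference could be overridden by an unusual input; a constraint checked at planning time cannot, at least within what the world model predicts.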

The JEPA Architecture

  • Core idea — predict in representation space, not pixel space: JEPA (Joint Embedding Predictive Architecture) takes two observations (e.g., a corrupted image and the original), runs them through encoders, and trains a predictor to predict the representation of one from the other. It does not generate pixels.
  • Why generative models fail for vision: LeCun had an epiphany around 2020 that every successful technique for learning image and video representations (DINO, SimSiam, MoCo) was non-generative. Generative approaches (VAEs, pixel-level prediction) either produce blurry results or fail to learn useful abstractions because the world is too complex to predict at the pixel level.
  • Inspiration from cognitive science: The architecture mirrors “System 2” thinking — deliberate, reflective behavior where you imagine the consequences of actions before acting, as opposed to reactive “System 1” behavior.
  • From JEPA to world models: A JEPA becomes a world model when it is made action-conditioned — it predicts how the world state will change given an action. This enables planning: searching for action sequences that minimize a cost function representing the goal.
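
A minimal sketch of the forward pass described above, assuming NumPy, a single shared encoder, and random linear maps in place of trained networks. The names `encode`, `predict`, `jepa_loss` and all dimensions are hypothetical. The point it tries to show is that the loss compares representations and never reconstructs the observation itself.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, REP_DIM, ACT_DIM = 64, 8, 2  # hypothetical sizes

# Random linear maps stand in for trained networks (assumption).
W_enc = rng.normal(size=(OBS_DIM, REP_DIM)) / np.sqrt(OBS_DIM)
W_pred = rng.normal(size=(REP_DIM + ACT_DIM, REP_DIM)) / np.sqrt(REP_DIM + ACT_DIM)


def encode(obs):
    """Map a raw observation to an abstract representation."""
    return np.tanh(obs @ W_enc)


def predict(rep, action):
    """Action-conditioned predictor: next representation from current one."""
    return np.concatenate([rep, action], axis=-1) @ W_pred


def jepa_loss(obs_now, action, obs_next):
    """Prediction error measured in representation space, not pixel space."""
    target = encode(obs_next)
    prediction = predict(encode(obs_now), action)
    return float(np.mean((prediction - target) ** 2))


# A batch of 16 fake observations, actions, and slightly changed next observations.
obs_now = rng.normal(size=(16, OBS_DIM))
action = rng.normal(size=(16, ACT_DIM))
obs_next = obs_now + 0.1 * rng.normal(size=(16, OBS_DIM))
loss = jepa_loss(obs_now, action, obs_next)
print(loss)
```

Minimizing this loss alone is not enough: an encoder that ignores its input would drive it to zero, which is the collapse problem discussed next.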

The Collapse Problem and How to Solve It

  • Representation collapse: The fundamental technical challenge of joint embedding architectures is that the system can “cheat” by predicting a constant representation, making the prediction task trivial while learning nothing useful.
  • Historical solutions and their limitations:
    • Contrastive learning (LeCun’s own 1993 approach): Uses positive and negative examples but doesn’t scale well with the dimension of the representation.
    • Distillation methods (DINO, BYOL): Use a teacher-student setup in which the teacher’s weights are an exponential moving average of the student’s. They work empirically but are poorly understood theoretically: the cost function being minimized isn’t the one you think it is.
  • New promising approach, SigReg: A regularizer developed by postdoc Randall Balestriero that forces the distribution of encoder outputs to be jointly Gaussian, maximizing information content. Related methods include VICReg. These are more principled and easier to monitor than distillation methods.
  • “Le World Model” paper: A small-scale demonstration of training a world model using SigReg. LeCun considers this super promising and points listeners to it as the single most important paper to read.
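
To make the collapse problem concrete, here is a small NumPy sketch. It does not implement SigReg; it uses variance and covariance penalties in the style of VICReg, and the function names, batch size, and `eps` value are assumptions chosen for illustration.

```python
import numpy as np


def prediction_loss(pred, target):
    """Mean squared error between predicted and target representations."""
    return float(np.mean((pred - target) ** 2))


def variance_term(z, eps=1e-4):
    """Hinge penalty on dimensions whose standard deviation falls below 1."""
    std = np.sqrt(z.var(axis=0) + eps)
    return float(np.mean(np.maximum(0.0, 1.0 - std)))


def covariance_term(z):
    """Penalty on off-diagonal covariance, so dimensions are decorrelated."""
    n, d = z.shape
    zc = z - z.mean(axis=0)
    cov = zc.T @ zc / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    return float(np.sum(off_diag ** 2) / d)


rng = np.random.default_rng(0)
collapsed = np.ones((256, 8))       # every input maps to the same vector
spread = rng.normal(size=(256, 8))  # roughly unit variance per dimension

# A collapsed encoder makes prediction trivially perfect...
print(prediction_loss(collapsed, collapsed))
# ...but the variance penalty exposes it, while spread-out outputs pass.
print(variance_term(collapsed), variance_term(spread))
```

Note that the covariance penalty is zero for the collapsed batch, so in this sketch it is the variance term that catches full collapse. Both quantities can be logged during training, which is one sense in which such regularizers are easier to monitor.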

Why He Left Meta

  • Role clarification: LeCun had zero technical contribution to Llama. His one contribution was arguing internally for open-sourcing Llama 2. After stepping down as FAIR director in 2018, he had no authority over what FAIR researchers worked on — people joined his projects voluntarily.
  • The organizational shift: When Meta created the GenAI organization in 2023 to turn Llama into products, it was placed under intense short-term pressure and became conservative. FAIR became increasingly isolated, with its ideas not picked up by the product organization. The situation worsened through 2024-2025.
  • Strategic mismatch: Most applications of JEPA/world models are in industry — manufacturing, robotics, healthcare — areas Meta has no interest in. Meta disbanded its robotics AI group. The layers of management below Mark Zuckerberg and CTO Andrew Bosworth didn’t see the point of the AMI project.
  • The Scale AI acquisition: LeCun sees it as part of Meta’s total pivot to LLMs, possibly connected to Alexandr Wang being viewed as a potential successor to Zuckerberg.
  • Timing: It became clear by the end of 2025 that the technology was producing good results and needed to transition from research to development and scaling — something that required a startup environment.

AMI Labs and the Path Forward

  • Company mission — “AI for the real world”: AMI (Advanced Machine Intelligence) Labs is headquartered in Paris deliberately, to escape Silicon Valley herd behavior where everyone is “digging the same trench” on LLMs.
  • Near-term milestones (12-18 months):
    • A general methodology for training hierarchical world models across diverse modalities.
    • Demonstrations of action-conditioned world models for planning in robotics, industrial process control, and healthcare.
    • Partnerships with investors and industry collaborators.
  • Longer-term vision (5 years): “Complete world domination” — a joke borrowed from Linus Torvalds, but reflecting LeCun’s genuine belief that this architecture is the blueprint for future intelligent systems. LLMs may survive as language interfaces, but the thinking will be done by world-model-based systems.
  • When will the field converge?: LeCun believes the paradigm shift will become obvious to most people by early 2027, as VLAs (vision-language-action models) are already seen as failing and LLMs’ limitations for real-world data become undeniable.

Tapestry — Sovereign AI for the Rest of the World

  • The problem: Most of the world outside the US and China will have its information diet mediated by AI assistants built in California or Beijing, systems that don’t understand local languages, cultures, values, or political contexts. This is a form of cognitive colonialism.
  • The solution: An open, free foundation model trained through a federated approach where international contributors provide data and computing resources but retain control of their data. They exchange parameter vectors (not raw data) and converge toward a consensus model — a repository of all human knowledge and culture.
  • Fine-tuning for sovereignty: Countries and communities can fine-tune the global model for their own linguistic, cultural, and political needs.
  • Historical analogy: LeCun compares the current moment to 1996, when Sun Microsystems, HP, and Dell were selling proprietary Unix systems and claiming Windows could never run web servers. Linux wiped them all out. He believes OpenAI and Anthropic are the Sun Microsystems of today — proprietary models that will be overtaken by open-source alternatives.
  • Why open source will catch up: Proprietary models have already exhausted publicly available text data and are relying on licensed copyrighted data or synthetic data. Open-source models trained on the full diversity of global data through federated contributions can match or exceed them.
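
The federated scheme described above, in which contributors keep their data and exchange only parameter vectors, can be sketched as a toy federated-averaging loop. This is an assumption about how such a scheme might look, not Tapestry's actual protocol: the linear model, the `local_update` and `federated_round` helpers, the learning rate, and the three synthetic sites are all hypothetical.

```python
import numpy as np


def local_update(params, features, targets, lr=0.1, steps=20):
    """A contributor refines the shared parameters on data that stays on site."""
    w = params.copy()
    for _ in range(steps):
        grad = 2 * features.T @ (features @ w - targets) / len(targets)
        w -= lr * grad
    return w


def federated_round(params, sites):
    """Collect one parameter vector per site and average, weighted by data size."""
    updates = [local_update(params, X, y) for X, y in sites]
    sizes = np.array([len(y) for _, y in sites], dtype=float)
    return np.average(updates, axis=0, weights=sizes)


rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])  # the relationship all sites' data share

# Three contributors with private datasets of different sizes.
sites = []
for n in (50, 80, 120):
    X = rng.normal(size=(n, 2))
    sites.append((X, X @ true_w + 0.01 * rng.normal(size=n)))

params = np.zeros(2)
for _ in range(10):
    params = federated_round(params, sites)
print(params)
```

Only `params` and the per-site updates cross site boundaries; the `(X, y)` pairs are read exclusively inside `local_update`. Fine-tuning for a particular community would then amount to running further local steps from the consensus `params` without sending the result back.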

Reflections on FAIR and the Research Ecosystem

  • What FAIR got right: Building a top research lab that produced foundational methods and tools (PyTorch, which the entire industry runs on), a culture of openness and scientific rigor, and a pipeline from blue-sky research to practical demonstrations.
  • Where Meta fell short: The relay between research and product organizations broke down. GenAI was under such short-term pressure it couldn’t innovate. Good people left — two of the Llama 1 authors founded Mistral.
  • The broader industry trend: Research labs across Google, Meta, OpenAI, and Anthropic are becoming more closed, with restrictions on publication and increasing pressure to work on near-term products. Breakthrough research requires hiring excellent people, giving them resources, and getting out of the way — a model from Bell Labs and Xerox PARC that is disappearing.
  • Advice for PhD students: Don’t work on LLMs — you can’t contribute meaningfully without massive GPU resources, and the field has become descriptive science (studying why LLMs work) rather than creative. Work on the next generation of AI systems.

Divergence with Hinton and Bengio

  • When views diverged: 2023, when GPT-4 was released. LeCun did not change his mind — Hinton and Bengio changed theirs.
  • Hinton’s epiphany: He did a rough calculation comparing GPT-4’s parameter count to the human cortex’s neuron count and concluded LLMs might be close to human-level intelligence, possibly with subjective experience. LeCun considers this reasoning flawed.
  • Hinton’s more recent softening: Hinton has apparently become less vocal about existential risk, recognizing that current LLMs are not that smart, that conceptual breakthroughs are still needed, and that future systems will look different from LLMs — and likely be more controllable.
  • Bengio’s concern: Focused less on AI taking over and more on societal risks — inequality, bad actors, political systems failing to manage AI’s impact. LeCun agrees this is the real danger, not apocalyptic scenarios.
  • Commercial incentives in safety narratives: LeCun suggests companies like Anthropic have commercial reasons to lobby governments into regulating AI based on existential risk — regulation that would entrench incumbents with the resources to comply.

Healthcare Applications

  • What LLMs can do in medicine: Scale the knowledge of top doctors globally — regurgitating declarative knowledge from books and medical literature. This alone would have enormous impact.
  • What LLMs cannot do: Design novel treatment courses for patients who don’t fit standard templates. This requires a dynamic model of patient physiology — essentially a world model of the human body.
  • Cell-level example: Directing a stem cell to become an insulin-producing beta cell for a type 1 diabetes patient requires understanding the sequence of molecular signals needed — a problem of modeling complex biological dynamics that LLMs cannot solve.
  • Industrial process modeling: The same world-model approach applies to jet engines, chemical plants, power plants, and manufacturing lines — any system too complex for traditional equation-based modeling. LeCun considers the number of applications “mind-boggling.”