This Cosmologist Discovered Something Strange... — Theories of Everything

Vitaly Vanchurin, a cosmologist, proposes that the universe is best modeled as a neural network: not merely simulated by one, but fundamentally described by its learning dynamics. In this view the process of learning (optimization) is not just a tool for modeling physics; it is the physics, and the laws we observe arise from the network’s evolution under its learning algorithm.
- Key insight: Traditional physics uses variational principles (e.g., minimizing action), but neural networks go further—they include the entire trajectory of learning toward equilibrium. This learning process doesn’t vanish after training; it’s part of the system’s dynamics.
- Not just approximation: While neural networks are universal function approximators (making their ability to reproduce known physics unsurprising), Vanchurin’s claim is stronger: the learning itself—the adjustment of weights via algorithms like stochastic gradient descent or Adam—is what generates physical law.
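The contrast between an equilibrium condition and a full learning trajectory can be made concrete with a toy example. The sketch below is only an illustration under simple assumptions (a quadratic loss, plain gradient descent with optional noise; the names `train`, `loss` and `grad` are hypothetical and not taken from Vanchurin's work). The minimum of the loss plays the role of the stationary point picked out by a variational principle, while the recorded trajectory is the extra dynamical information the learning picture retains.

```python
import numpy as np

def loss(w):
    # Toy quadratic loss with its minimum (the "equilibrium") at w = 0.
    return 0.5 * np.sum(w ** 2)

def grad(w):
    # Gradient of the quadratic loss above.
    return w

def train(w0, lr=0.1, steps=50, noise=0.0, seed=0):
    """Run (optionally noisy) gradient descent and return the whole trajectory."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    trajectory = [w.copy()]
    for _ in range(steps):
        # noise > 0 mimics the stochastic part of stochastic gradient descent
        g = grad(w) + noise * rng.standard_normal(w.shape)
        w = w - lr * g
        trajectory.append(w.copy())
    return np.array(trajectory)

if __name__ == "__main__":
    traj = train([1.0, -2.0])
    print("start:", traj[0], "end:", traj[-1], "final loss:", loss(traj[-1]))
```

A variational description would report only the end point; the array returned by `train` keeps every intermediate state as well.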
Learning algorithms produce spacetime and quantum mechanics:
- Covariant gradient descent, of which adaptive optimizers such as Adam can be viewed as special cases, implicitly defines a curved metric on the space of trainable variables. This curvature isn’t put in by hand; it emerges because curved space makes learning more efficient.
  - Thus, spacetime curvature exists because it optimizes the universe’s ability to learn.
- From this framework, key equations emerge:
  - Klein-Gordon (scalar fields) arises relatively easily.
  - Dirac equation (fermions) requires specific antisymmetric constraints in the network structure—harder to derive, still incomplete.
  - Einstein’s field equations are not yet fully derived, but curved space and spacetime emerge naturally from efficient learning dynamics.
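One way to see how an optimizer such as Adam can be read as defining a metric is to write its update as a preconditioned gradient step. The sketch below uses the standard Adam update rule; interpreting `sqrt(v_hat) + eps` as a diagonal metric is an illustrative assumption here, and the covariant gradient descent construction discussed in the episode may differ in its details. The function name `adam_step` and the test loss are hypothetical.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update, written as theta - lr * g^{-1} * m_hat."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2     # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Assumption for illustration: treat this as the diagonal of a metric g_ii.
    metric_diag = np.sqrt(v_hat) + eps
    theta = theta - lr * m_hat / metric_diag
    return theta, m, v, metric_diag

if __name__ == "__main__":
    # Anisotropic quadratic loss L = 0.5 * (100 x^2 + y^2), gradient = (100 x, y).
    theta = np.array([1.0, 1.0])
    m = np.zeros(2)
    v = np.zeros(2)
    g = np.array([100.0, 1.0]) * theta
    new_theta, m, v, metric = adam_step(theta, g, m, v, t=1)
    print("step taken:", theta - new_theta)
    print("diagonal metric:", metric)
```

In this toy case the raw gradients differ by a factor of 100 between the two directions, yet the rescaled step has nearly the same size in both, which is the sense in which the statistics of the gradients set the local geometry of the update.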
Two complementary approaches to emergence:
1. Lattice-like discrete space: Neurons arranged in a graph/hypergraph structure resemble lattice field theory. The fermion doubling problem (spurious extra fermion species that appear when fermions are placed on a lattice) remains unsolved here.
2. Continuous parameter space: Trainable variables live in a smooth, continuous space where particles emerge as localized excitations (like solitons or strings). Curvature arises from the learning algorithm (e.g., Adam), not from discretized geometry.
  - This second approach avoids lattice artifacts and explains geometry as a consequence of optimization efficiency.
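The fermion doubling mentioned for the lattice approach can be illustrated with the textbook one-dimensional case. This is a minimal sketch of the standard naive-discretization argument, not of anything specific to the neural network framework; the function names and the choice of eight sites are assumptions for illustration. Replacing the derivative by a symmetric finite difference turns the continuum momentum k into sin(k a)/a, which vanishes at k = 0 and again at the edge of the Brillouin zone, k = π/a.

```python
import numpy as np

def naive_lattice_momentum(k, a=1.0):
    # Momentum-space form of the symmetric finite difference on a lattice
    # with spacing a; the continuum limit (a -> 0) recovers k itself.
    return np.sin(k * a) / a

def count_zero_modes(n_sites=8, a=1.0):
    """Count allowed momenta on a periodic lattice where the naive operator vanishes."""
    n = np.arange(n_sites)
    k = 2 * np.pi * n / (n_sites * a)       # allowed momenta with periodic boundaries
    p = naive_lattice_momentum(k, a)
    return int(np.sum(np.isclose(p, 0.0, atol=1e-12)))

if __name__ == "__main__":
    print("massless modes on an 8-site lattice:", count_zero_modes(8))
```

With an even number of sites the count is two rather than the single massless mode of the continuum theory; the extra one is the doubler.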
Quantum mechanics emerges from classical learning systems:
- By integrating out the fast, non-trainable variables (the neuron states) and keeping the slow, trainable ones (weights and biases) under a maximum entropy production principle, one derives the Madelung equations, a hydrodynamic form of quantum mechanics.
- True quantum behavior (complex phases, linearity) requires access to a reservoir of neurons (a “bath”)—analogous to a grand canonical ensemble.
  - This allows linear Schrödinger dynamics to emerge from an underlying nonlinear classical system.
  - The complex phase corresponds to discrete, unobservable changes in neural configuration that leave dynamics invariant—providing a physical interpretation of ħ.
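For reference, the Madelung equations named above can be written in their standard textbook form. The notation is an assumption here rather than something taken from the episode: ρ is a density, S a phase with velocity field v = ∇S/m, V an external potential, and the wavefunction is ψ = √ρ exp(iS/ħ).

```latex
% Polar decomposition of the wavefunction
\psi = \sqrt{\rho}\, e^{iS/\hbar}, \qquad \mathbf{v} = \frac{\nabla S}{m}

% Continuity equation for the density
\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\, \mathbf{v}) = 0

% Hamilton-Jacobi equation with the quantum potential Q
\frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} + V + Q = 0,
\qquad
Q = -\frac{\hbar^2}{2m}\, \frac{\nabla^2 \sqrt{\rho}}{\sqrt{\rho}}
```

Together these two real equations are equivalent to the linear Schrödinger equation for ψ wherever ρ is nonzero, provided the phase S is allowed to be multivalued in units of 2πħ, which connects to the point above about discrete, unobservable changes.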
Natural selection operates at all scales:
- Configurations of neural networks that minimize the loss function survive; others don’t. This applies not only to organisms but to subatomic particles.
  - Electrons, for example, may represent stable, optimized solutions—“self-driving cars” navigating electromagnetic fields using shared, efficient software.
  - Particle properties (mass, charge) could be the result of long-term learning/optimization, not arbitrary constants.
Observers must be unified with physics:
- Vanchurin argues that quantum mechanics and general relativity cannot be unified without also incorporating observers—not as afterthoughts, but as fundamental components.
  - The measurement problem in quantum mechanics and the measure problem in cosmology both stem from poorly defined observer roles.
  - In his model, every subsystem is an observer—constantly learning its environment. Consciousness isn’t required for observation; learning efficiency is.
Consciousness as learning efficiency:
- Vanchurin proposes a tripartite model of intelligence/consciousness based on three measurable quantities:
  1. Learning speed (rate of loss decay) → what he tentatively calls consciousness.
  2. Asymptotic performance (how low the loss can go) → depth of knowledge.
  3. Stability (fluctuations around minimum) → robustness of memory/performance.
- These are not fixed traits but dynamic properties of any learning system—from electrons to humans.
- Qualia and hidden variables: He raises the possibility of a “hidden space” of neurons that are not in physical space, possibly enabling non-local interactions (e.g., dreams, coma states); this remains speculative.
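The three quantities in the tripartite model (learning speed, asymptotic performance, stability) can all be read off a recorded loss curve. The sketch below is one crude way to estimate them, assuming the loss decays roughly exponentially toward a plateau; the function name `learning_metrics`, the tail and fit fractions, and the estimator itself are illustrative assumptions, not a procedure described in the episode.

```python
import numpy as np

def learning_metrics(losses, tail_fraction=0.2, fit_fraction=0.5):
    """Estimate (decay rate, asymptotic loss, fluctuation) from a loss curve."""
    losses = np.asarray(losses, dtype=float)
    n_tail = max(2, int(len(losses) * tail_fraction))
    tail = losses[-n_tail:]
    asymptote = tail.mean()     # asymptotic performance: how low the loss goes
    stability = tail.std()      # stability: fluctuations around the plateau

    # Learning speed: slope of log(excess loss) over the early part of training.
    n_fit = max(2, int(len(losses) * fit_fraction))
    excess = losses[:n_fit] - asymptote
    mask = excess > 0           # the logarithm needs positive values
    if mask.sum() < 2:
        return float("nan"), asymptote, stability
    steps = np.arange(n_fit)[mask]
    slope = np.polyfit(steps, np.log(excess[mask]), 1)[0]
    return -slope, asymptote, stability

if __name__ == "__main__":
    t = np.arange(400)
    synthetic = 0.1 + np.exp(-0.05 * t)     # plateau 0.1, decay rate 0.05
    rate, floor, noise = learning_metrics(synthetic)
    print("rate:", rate, "asymptote:", floor, "fluctuation:", noise)
```

On noisy real curves the log-linear fit is sensitive to the estimated plateau, so this should be read as a way of making the three quantities concrete rather than as a robust measurement.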
Free energy principle connection:
- Karl Friston’s free energy principle (organisms minimize surprise) aligns with Vanchurin’s framework but is phenomenological.
- Vanchurin offers a microscopic foundation: free energy emerges from underlying neural network dynamics via renormalization group (RG) flow across scales.
  - Different organisms may have different effective loss functions (e.g., a physicist quantizing gravity vs. a bacterium tracking light), but all derive from the same microscopic learning rules.
Second law of learning:
- Analogous to the second law of thermodynamics, but with caveats: while global entropy tends to increase, local decreases (e.g., life, structure formation) are possible and expected in learning systems.
- In simple limits, learning efficiency ∝ Laplacian of free energy, but in critical, non-Gaussian regimes (like the cosmic web or brain), this breaks down—requiring non-perturbative methods.
Epistemological stance: Vanchurin emphasizes radical doubt—questioning all assumptions, including one’s own models. He credits his advisor Alex Vilenkin with the advice: “One day create, the next day destroy.” This drives his transparency about limitations and contradictions in his framework.
Open questions and future work:
- Full derivation of the Standard Model Lagrangian remains out of reach.
- The origin of the initial loss function (the universe’s “objective”) is unspecified; the learning is unsupervised, with no externally supplied target.
- Whether the universe is literally a neural network or just well-modeled by one is left open; Vanchurin insists he only claims it’s a powerful, compact descriptive framework—not ontological truth.
