-
Nvidia’s core business is transforming electrons into tokens. Jensen Huang frames the entire company around this: electricity goes in, AI-generated tokens come out, and Nvidia’s job is to make that transformation as efficient and valuable as possible. The “artistry, engineering, science, and invention” required to make each token more valuable than the last is, in his view, extremely hard to commoditize, even though the physical manufacturing is done by others (TSMC, SK Hynix, ODMs).
-
Nvidia’s philosophy: “do as much as necessary, as little as possible”
- Nvidia focuses on the parts of the stack that nobody else would build (CUDA, NVLink, domain-specific libraries like cuLitho) and partners on everything else.
- This extends to the supply chain: Nvidia doesn’t want to be a cloud provider or a financier, but it invests in and backstops ecosystem partners (CoreWeave, Nscale, Nebius) because those businesses wouldn’t exist without Nvidia’s support.
- The company has the largest AI ecosystem across all five layers of the AI stack (energy, chips, infrastructure, models, applications).
-
Supply chain moat: Nvidia’s scale lets it lock up scarce components years in advance
- Nvidia has ~$100B in reported purchase commitments (SemiAnalysis estimates $250B) with foundries, memory makers, and packaging suppliers.
- Upstream suppliers invest because they trust Nvidia’s downstream demand is large enough to absorb their output — a self-reinforcing cycle.
- Huang personally spends significant time educating upstream CEOs about the scale of the AI opportunity, which inspires them to invest.
- Past bottlenecks (CoWoS packaging, HBM memory) were resolved within 2–3 years by “swarming” them with investment and attention. Nvidia now tries to get ahead of bottlenecks years in advance (e.g., silicon photonics via Lumentum/Coherent investments).
- The hardest bottlenecks to scale are not chip-related but labor-related: plumbers, electricians, and construction workers needed to build data centers.
-
Nvidia can sustain 2x annual growth even at massive scale
- AI will account for ~60% of TSMC’s N3 node this year and ~86% next year. Huang argues that no bottleneck lasts more than 2–3 years because once a demand signal exists, capacity can be replicated.
- Nvidia compensates for supply constraints with extreme co-design: each generation (e.g., Hopper → Blackwell) delivers 30–50x efficiency gains by designing hardware and software around new algorithms (MoEs, disaggregation), rather than relying on Moore’s Law alone (~25%/year).
- The real long-term bottleneck is energy policy, not chip capacity. Building new energy infrastructure takes far longer than building fabs.
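To put the two figures above side by side, here is a minimal sketch that asks how many years of compounding at ~25%/year would be needed to reach a 30x or 50x gain. The function name `years_to_match` is illustrative, and the calculation assumes the 25% rate compounds annually, which the summary does not state explicitly.

```python
import math


def years_to_match(gain: float, annual_rate: float) -> float:
    """Years of annual compounding at `annual_rate` needed to reach `gain`."""
    if gain <= 0 or annual_rate <= 0:
        raise ValueError("gain and annual_rate must be positive")
    return math.log(gain) / math.log(1.0 + annual_rate)


MOORE_RATE = 0.25  # ~25%/year, the figure quoted in the notes (assumed to compound)

for gain in (30, 50):
    years = years_to_match(gain, MOORE_RATE)
    print(f"{gain}x takes about {years:.1f} years at 25%/year")
```

Under that assumption, a single 30–50x generational gain corresponds to roughly 15 to 18 years of 25%/year improvement.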
-
CUDA and the software ecosystem are Nvidia’s deepest moat, not just hardware specs
- CUDA’s install base is several hundred million GPUs, spanning every cloud and on-prem deployment and going back many generations.
- Even AI labs that write their own kernels (OpenAI with Triton, Anthropic with custom stacks) build on top of CUDA because the ecosystem is so rich and well-validated: when something breaks, they want it to be their code, not the platform’s.
- Nvidia dedicates enormous engineering resources to co-optimizing with AI labs, often delivering 2–3x speedups to their models.
- Performance per TCO (total cost of ownership) and performance per watt are, in Huang’s view, unmatched: no competitor has demonstrated better results on public benchmarks like InferenceMAX or MLPerf.
-
Competitors (TPUs, Trainium, ASICs) are not a structural threat
- Google’s TPUs power Claude and Gemini, but Huang attributes Anthropic’s TPU use to its unique financial arrangement with Google (a multi-billion-dollar investment in exchange for compute usage), not to technical superiority.
- Nvidia’s addressable market is far broader than any ASIC because accelerated computing serves molecular dynamics, fluid dynamics, data processing, and many non-AI workloads.
- Programmability matters: new architectures (hybrid SSMs, diffusion-autoregressive fusion) require a generally programmable platform, which TPUs lack.
- ASIC margins are also very high (~65%), so the cost savings vs. Nvidia’s ~70% margins are minimal.
- Nvidia’s revenue share is growing, not shrinking, even as competitors exist.
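The margin comparison above can be made concrete with a small worked calculation. This is a sketch under a strong assumption that is not in the source: both suppliers have the same unit cost for comparable performance, so the only difference in price comes from gross margin. The function names are illustrative.

```python
def price_for_margin(unit_cost: float, gross_margin: float) -> float:
    """Selling price that yields `gross_margin` (as a fraction) on `unit_cost`."""
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross_margin must be in [0, 1)")
    return unit_cost / (1.0 - gross_margin)


def buyer_saving(margin_from: float, margin_to: float) -> float:
    """Fractional price reduction when switching suppliers at equal unit cost."""
    # The unit cost cancels out, so only the two margins matter.
    return 1.0 - (1.0 - margin_from) / (1.0 - margin_to)


NVIDIA_MARGIN = 0.70  # ~70%, as quoted in the notes
ASIC_MARGIN = 0.65    # ~65%, as quoted in the notes

saving = buyer_saving(NVIDIA_MARGIN, ASIC_MARGIN)
print(f"Price reduction at equal unit cost: {saving:.1%}")
```

With these inputs the price difference works out to about 14%. It does not account for differences in performance, software effort, or actual manufacturing cost.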
-
Nvidia is investing in AI labs (OpenAI, Anthropic) but won’t become a hyperscaler
- Nvidia invested ~$30B in OpenAI and ~$10B in Anthropic, but only after reaching sufficient scale to do so. Earlier, it couldn’t make the multi-billion-dollar commitments that Google and AWS could.
- The company backstops CoreWeave (up to $6.3B) and invests in neoclouds to ensure the ecosystem thrives, but deliberately avoids becoming a cloud provider itself — consistent with “do as little as possible.”
- Nvidia doesn’t pick winners: it invests in all major foundation model companies because it couldn’t predict which would succeed (early Nvidia itself was considered the least likely to survive among 60 graphics companies).
-
Allocation is first-come, first-served, not highest bidder
- Nvidia sets a price and sticks to it regardless of demand spikes. Allocation goes to whoever places a purchase order first and has their data center ready.
- Stories about executives “begging” for GPUs at dinner are exaggerated — they just needed to place an order.
- Huang sees being dependable and foundational as a core competitive advantage: customers can “bet the farm” on Nvidia delivering next-generation platforms every single year.
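The allocation rule described above can be sketched as a simple queue. This is an illustrative model only, not Nvidia’s actual process: the `Order` fields, the `allocate` function, and the choice to skip orders whose data center is not ready are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Order:
    customer: str
    placed: date            # purchase-order date (hypothetical field)
    units: int
    datacenter_ready: bool  # whether the buyer can take delivery


def allocate(orders: list[Order], supply: int) -> dict[str, int]:
    """Fill ready orders in purchase-order date order until supply runs out.

    Price never appears here: position in the queue depends only on when
    the order was placed, not on how much the buyer would pay.
    """
    allocation: dict[str, int] = {}
    for order in sorted(orders, key=lambda o: o.placed):
        if supply <= 0:
            break
        if not order.datacenter_ready:
            continue  # assumption: unready orders are skipped, not blocking
        granted = min(order.units, supply)
        allocation[order.customer] = allocation.get(order.customer, 0) + granted
        supply -= granted
    return allocation


orders = [
    Order("C", date(2025, 1, 3), 100, True),
    Order("A", date(2025, 1, 1), 100, True),
    Order("B", date(2025, 1, 2), 100, False),
]
print(allocate(orders, supply=150))
```

In this example A is filled in full, B is skipped because its data center is not ready, and C receives the remaining 50 units.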
-
On selling chips to China: Huang strongly opposes export controls
- China already has enormous compute capacity: it manufactures 60%+ of the world’s mainstream chips, has abundant energy (including empty powered data centers), and has 50% of the world’s AI researchers. Denying them Nvidia chips doesn’t prevent them from having AI capabilities — it just forces them onto domestic alternatives.
- Huawei’s Ascend 910C is roughly Hopper-class (7nm), and Huawei just had the largest year in its history. It has plenty of logic and HBM2 memory. While bandwidth lags, it can gang chips together and use silicon photonics to compensate.
- The real risk of export controls: conceding the world’s second-largest computing market means Chinese AI developers build on a Chinese tech stack, not the American one. Since China is the largest contributor to open source AI models, if those models are optimized for Huawei rather than Nvidia, the American ecosystem loses global influence.
- Huang’s analogy: export controls on AI chips are like export controls on microprocessors or DRAM — they sound protective but actually accelerate foreign competitors’ independence. He points to the US telecommunications industry, which was “policied out” of global markets, as a cautionary tale.
- On the cybersecurity concern (e.g., Mythos finding zero-days): Huang argues the solution is researcher dialogue and building AI safety ecosystems (open source, open models), not compute denial. He also notes Mythos was trained on “fairly mundane” compute that China already possesses.
- Huang’s core argument: the US should compete everywhere and win everywhere across all five layers of the AI stack. Conceding any layer — especially chips — is a disservice to American technology leadership. The US should have the best and most chips domestically while also competing globally.
-
Why Nvidia doesn’t make multiple chip architectures in parallel
- Nvidia simulates alternative architectures (wafer-scale, Dojo-style, non-CUDA) and finds them provably worse for current workloads.
- The company recently acquired Groq to serve a new market segment: premium, low-latency inference, where customers will pay a high average selling price (ASP) per token for faster response times, even at lower throughput. This expands the Pareto frontier rather than replacing the core architecture.
- If Nvidia had more money, it would invest more in its current architecture rather than diversifying.
-
If deep learning hadn’t happened, Nvidia would still be very large
- The fundamental premise of Nvidia is accelerated computing: general-purpose CPUs can’t efficiently scale for many workloads (molecular dynamics, seismic processing, fluid dynamics, image generation, computational lithography, quantum chemistry).
- CUDA and the GPU architecture democratized deep learning, but the mission of bringing accelerated computing to science and engineering would remain.
- A significant portion of GTC is non-AI: computational lithography, quantum chemistry, data processing, and other domain-specific acceleration work.
Jensen Huang – Will Nvidia’s moat persist?
Dwarkesh Podcast • 1h43 → 5 min • #116