What has PMF Today, Google is Cooking & GPT Wrappers are Winning | With Latent Space — Unsupervised Learning

This episode is a crossover between two AI podcasts: Unsupervised Learning (hosted by Jordan and Jacob from Redpoint Ventures) and Latent Space (hosted by Swyx and Alessio Fanelli). The conversation covers the biggest surprises, overhyped and underhyped trends, product-market fit, defensibility, and unanswered questions in AI as of early 2025. All four participants are deeply embedded in the AI ecosystem as podcasters, investors, and community builders, and the discussion reflects a practitioner-level view of where the industry stands.

Reflecting on the Past Year in AI

The transition from pre-training to inference-time scaling was suspiciously neat
- Right after Ilya Sutskever’s “pre-training is dead” talk, powerful new models dropped, making it feel like a planned handoff from pre-training scaling to inference-time compute as the new scaling law.
- In reality, labs like OpenAI had been working on this (e.g., Strawberry for ~2 years), but from the outside the timing looked almost conspiratorial.
DeepSeek was the biggest open-source story, but its significance is debated
- DeepSeek surprised people by closing the gap with closed-source models, particularly on reasoning tasks, and was the first open model to release full reasoning traces (R1).
- However, some argue DeepSeek was more about efficient execution of known techniques than fundamental innovation, and that the broader “open source is catching up” narrative is really just “DeepSeek caught up.”
- There are signs DeepSeek may stop open-sourcing, and the rest of the open-source ecosystem has largely been distilling from DeepSeek rather than innovating from scratch.
- Enterprise adoption of open-source models remains low (~5% and declining), with most enterprises preferring the most powerful available model for use-case discovery.
GPT wrappers went from mocked to dominant
- The consensus has shifted from dismissing “GPT wrappers” to recognizing that wrapper companies (like Perplexity, Cursor) are where the real value and product-market fit live.
- Pre-product-market-fit differentiation is a ridiculous thing to optimize for; the key is building something people want.
Low-code/no-code builders missed the AI builder market
- Companies like Zapier, Airtable, Retool, and Notion—which should have had the DNA, distribution, and fast-follow capability—failed to capture the AI-native builder market.
- The AI-native companies (e.g., Cursor, Replit, Lovable) succeeded because they built from scratch without being tied to existing product paradigms, while incumbents simply bolted AI onto their existing bases.
Apple Intelligence was a major disappointment
- Despite being in a prime position for personal AI, Apple Intelligence launched with numerous failures, including inaccurate text message summaries and an AI-generated summary of a BBC notification that falsely reported a shooting.
- However, Apple’s Private Cloud Compute (PCC)—bringing on-device security guarantees to cloud-based LLM inference—is underhyped and could be significant for enterprise and privacy-sensitive workloads.

Overhyped and Underhyped Trends

Overhyped: Agent frameworks
- The agent framework space is chasing workloads that are still in too much flux for any framework to stabilize around. It’s compared to the jQuery era of JavaScript—useful helpers, but not the foundational layer.
- The counterargument: it may be too early for frameworks entirely; the real opportunity is in protocols (like MCP) rather than frameworks, similar to how XMLHttpRequest enabled Ajax and modern web apps.
Overhyped: New model training companies
- Despite the capital intensity and the dominance of a few large labs, new model training companies keep emerging. The consensus is that this is largely not a good opportunity unless tied to unique data (robotics, biology, materials science) or specific vertical use cases.
Underhyped: Memory and stateful AI
- Neither OpenAI’s Agents SDK nor Google’s agent definitions include memory as a core feature. Yet memory—storing knowledge graphs, facts about users, and learning on the job—is critical for making agents genuinely smarter.
- MCP’s initial server lineup included a memory server, which is a good starting point, but memory should be a standard, normalized part of the AI stack.
- Stateful systems are also inherently interesting to VCs because they resemble databases, which have proven monetization paths.
Underhyped: Private Cloud Compute and multi-tenant security
- As AI workloads move to the cloud, enterprises need single-tenant security guarantees in multi-tenant GPU environments. Apple’s PCC architecture is a leading example of solving this problem.

What Has Product-Market Fit Today

The established PMF categories are coding agents, customer support agents, and deep research
- Coding agents (Cursor), support agents (Sierra, Decagon), and deep research (OpenAI, Gemini, Perplexity versions) are the three form factors with clear, proven demand.
- OpenAI’s Deep Research launch is estimated to have generated billions in revenue, driven by the $200/month tier upgrade from the $20 ChatGPT plan.
Customer support is a natural first wave because it was already outsourced
- Companies had already accepted lower performance for cost savings via BPOs, making them willing to accept AI agents that are “good enough.”
- However, this also means price competition may be fiercest here, since customers already view it as a cost center.
- The next wave—AI that increases topline revenue (e.g., AI go-to-market, outbound sales)—may prove more defensible because the ROI story is cleaner.
Voice AI and scheduling/intake agents are emerging
- In home services, companies miss 50% of incoming calls. Even an AI that handles 75% of calls effectively represents massive revenue upside.
- You don’t need 100% accuracy to deliver enormous value in these contexts.
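The arithmetic behind the missed-call claim can be sketched as follows. This is a rough illustration, assuming the AI only picks up calls that would otherwise go unanswered and that "handles 75% effectively" refers to those calls. The call volume of 1,000 is a made-up number, not a figure from the episode, and the function name is illustrative.

```python
def answered_calls(total_calls, miss_rate, ai_handle_rate):
    """Return (calls answered by staff, calls recovered by the AI).

    Assumption: the AI is only applied to calls staff would have missed.
    """
    human_answered = total_calls * (1 - miss_rate)
    missed = total_calls * miss_rate
    ai_recovered = missed * ai_handle_rate
    return human_answered, ai_recovered


# Hypothetical volume of 1,000 calls; rates taken from the bullets above.
human, recovered = answered_calls(total_calls=1000, miss_rate=0.50, ai_handle_rate=0.75)
uplift = recovered / human

print(f"answered by staff: {human:.0f}")          # 500
print(f"recovered by AI:   {recovered:.0f}")      # 375
print(f"uplift in handled calls: {uplift:.0%}")   # 75%
```

Under these assumptions an imperfect agent still raises the number of handled calls by three quarters, which is the sense in which 100% accuracy is not required.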
Up-and-coming categories: outbound sales, hiring/recruiting, education, finance, screen sharing, summarization, and personal AI
- Education is a massive opportunity but faces institutional resistance (e.g., teachers’ unions, unprepared educators).
- Personal AI is harder to monetize but remains a compelling long-term category.

The Model Company Dilemma: Build Products or Sell APIs?

Model companies are under financial pressure to monetize faster, pushing them into product territory (OpenAI’s Deep Research, Operator; Anthropic’s coding agents).
The “friend-enemy” dynamic is emerging: Cursor (valued at $10B) and Anthropic may increasingly compete in coding; OpenAI’s search product puts it in competition with Perplexity and Google.
The key question: Can having the best model let you cold-start a product and win, or do product-layer companies maintain their advantage even when switching to slightly worse models?
Vertical vs. general-purpose models: Bloomberg built a finance-specific model, only to find that general-purpose models from OpenAI/Google outperformed it on finance tasks. The data pipeline and team survived, but the model was discontinued. General-purpose models continue to dominate, and it remains an open question whether vertical models can compete on quality.

Google’s Position

Google is “cooking” right now—shipping strong models (Gemini 2.0 Flash wins daily bake-offs for tasks like news summarization) and gaining usage momentum.
But Google struggles with fragmentation: Google Cloud vs. Vertex AI vs. AI Studio vs. Gemini creates too many brands and friction for developers.
The fact that Google’s own thinking models could not be used in Cursor is cited as an example of how small integration barriers prevent adoption, even when the underlying technology is excellent.

Defensibility at the App Layer

Network effects are underprioritized by AI founders
- Chai Research (a Character AI competitor) has no proprietary models but has built a network of people submitting models to be run—a marketplace model that creates a defensible choke point between users and model providers.
- Brand compounds network effects: in as little as 6–9 months, a company can become synonymous with an entire category, making it the default choice in every customer room.
The “thousand small things” defensibility model
- Early AI app defensibility narratives (unique data, proprietary models) were a head fake.
- Real defensibility looks more like traditional SaaS: UX design, product velocity, breadth of product surface area, and the speed at which a company adopts each new model generation.
- Every 6 months, a new model release is an existential event—if you’re not first to integrate it, someone else will be.
ACVs are actually increasing in some categories despite predictions of pricing compression, because companies pay a premium to be the recognized brand.

Infrastructure and Investment Perspectives

The “LLM OS” layers are where the interesting infra value lies: code execution, memory, search, and security—not bare-metal GPU serving.
GPU serving is capital-intensive and tends toward cost-plus economics, which is not where you want to be in AI. Applications let you charge for utility, not cost.
Cybersecurity is a major opportunity: AI-enabled offense requires AI-enabled defense. Areas like email security, identity, red teaming, and binary inspection benefit from semantic understanding that LLMs enable beyond traditional syntax-based rules.
OpenAI is absorbing startup categories: every checkbox in the ChatGPT custom GPT builder represents a startup (search, code execution, etc.), and OpenAI is increasingly offering these as APIs.
Categories the hosts struggle with: fine-tuning companies (hard to scale as a standalone business), AI DevOps/anomaly detection (may just be traditional anomaly detection rebranded), and voice real-time infra (promising but timing uncertain).

Unanswered Questions with Large Implications

Can RL work in non-verifiable domains?
- RL works well in verifiable domains (math, coding). But can it work for law, marketing, sales conversations, or other domains without clear right answers?
- If not, we may end up with fully autonomous AI coders and scientists but still need humans as “taste makers” for basic tasks like writing sales emails—a bizarre and uneven future.
How will we scale compute to meet the “rule of nines” reliability demands?
- Going from 90% to 99% reliability requires an order of magnitude more compute; 99% to 99.9% requires another order of magnitude, and this cycle repeats every 2–3 years.
- Is Nvidia’s CUDA moat sustainable? Competitors (AMD, AWS Trainium/Inferentia, Microsoft, Meta) are all developing chips, but Nvidia’s GPUs are general-purpose by design and deeply entrenched.
- The bet is that transformer architecture is stable enough to bake into dedicated silicon, but no challenger has made a real dent yet.
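The "order of magnitude per nine" claim above can be written as a small calculation. This is a minimal sketch, assuming a strict 10x compute multiplier for every additional nine of reliability, which is the episode's rule of thumb rather than a measured law. The function names and the `cost_per_nine` parameter are illustrative.

```python
import math


def nines(reliability):
    """Number of nines in a reliability figure, e.g. 0.99 -> about 2."""
    return -math.log10(1 - reliability)


def compute_multiplier(start, target, cost_per_nine=10.0):
    """Relative compute needed to move from `start` to `target` reliability.

    Assumption: each extra nine costs `cost_per_nine` times more compute.
    """
    return cost_per_nine ** (nines(target) - nines(start))


# Starting from 90% reliability, each extra nine multiplies compute by ~10.
for target in (0.99, 0.999, 0.9999):
    print(f"90% -> {target:.2%}: ~{round(compute_multiplier(0.90, target))}x compute")
```

Read this way, three more nines on top of 90% imply roughly a thousandfold increase in compute, which is why the scaling question is tied to the chip discussion.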
Agent authentication is an emerging critical problem
- When an AI agent (like OpenAI’s Operator) accesses a website on your behalf, how does the site know it’s an agent acting for you and not you directly?
- This is described as effectively needing “new SSO for agents”—a fundamental piece of infrastructure that doesn’t yet exist.
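One way to picture the missing piece is a signed delegation token that an agent presents alongside its request. The sketch below is purely hypothetical: the token fields, the shared-secret HMAC scheme, and the names `issue_delegation` and `verify_delegation` are assumptions made for illustration, not an existing standard or anything described in the episode.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical key shared by an identity provider and the website.
SECRET = b"demo-shared-secret"


def issue_delegation(user, agent, scope, ttl_seconds=300, now=None):
    """Create a token stating that `agent` may act for `user` within `scope`."""
    now = time.time() if now is None else now
    claims = {"user": user, "agent": agent, "scope": scope, "exp": now + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_delegation(token, required_scope, now=None):
    """Return the claims if the token is authentic, unexpired, and in scope."""
    now = time.time() if now is None else now
    payload, _, signature = token.partition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None  # tampered or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < now or required_scope not in claims["scope"]:
        return None  # expired, or the agent was not granted this scope
    return claims


token = issue_delegation("alice", "booking-agent", ["calendar:write"])
claims = verify_delegation(token, "calendar:write")
print(claims["user"], "via", claims["agent"])
```

The point of the sketch is that the site learns both who the request is for and which agent is making it, instead of seeing something indistinguishable from the user. A real system would likely use asymmetric keys and an agreed standard rather than a shared secret.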

Quickfire Round

Dream podcast guest: Andrej Karpathy (for Swyx and Alessio) — he legitimized the “AI engineer” role and was instrumental in building Latent Space’s audience. For Jacob, it’s the OpenAI story itself, which deserves an “Acquired”-style deep dive.
Best news source for staying up to date: The Latent Space Discord (curated by Swyx), where every important link across AI, developer tools, creator economy, and macro is posted and organized by channel. Also, in-person conversations in SF remain irreplaceable.
Plugs: Latent Space (latent.space, YouTube), AI Engineer World’s Fair (June 2025), and Unsupervised Learning (YouTube, Redpoint Ventures).
