Software engineering with LLMs in 2025: reality check (at LDX3 by LeadDev) — The Pragmatic Engineer

In mid-2025, there is a striking gap between the hype from AI company executives and the day-to-day experience of software engineers using AI tools. The speaker, who writes The Pragmatic Engineer, conducted informal interviews with engineers at AI dev tool startups, big tech companies, and AI startups, as well as with independent engineers, to get a realistic picture of how LLMs are being used in software engineering. The findings show that adoption is real and growing but uneven, and that the most meaningful signals come not from CEOs but from experienced engineers who are quietly becoming more productive and more excited than they have been in years.

AI dev tool startups: heavy internal usage, but with a bias

AI dev tool companies report extremely high internal adoption of their own tools, though they have a clear incentive to show strong numbers.
- Anthropic engineers adopted Claude Code (a CLI-based coding agent) immediately when given access, and 90% of Claude Code’s own codebase is now written using Claude Code.
- Claude Code launched publicly on May 22, 2025; on day one, usage increased 40%, and within less than a month, total usage was up 160%.
- Windsurf reports that 95% of its code is written using its own agent or autocomplete features.
- Cursor estimates roughly 40–50% of its code is written with its tools, which the speaker notes is a more candid figure.
Anthropic open-sourced the Model Context Protocol (MCP) in late 2024, a protocol that lets IDEs or agents connect to external systems like databases, GitHub, Google Drive, and Puppeteer through a standard interface.
- MCP saw early adoption by smaller companies from December 2024 to February 2025, then major support from OpenAI, Google, and Microsoft by March–April 2025.
- Thousands of MCP servers are now estimated to exist, making it a significant emerging standard for connecting AI agents to tools and data.
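To make the "standard interface" concrete, here is a minimal sketch of the message shapes an MCP client sends, assuming the JSON-RPC 2.0 framing and the `tools/list` and `tools/call` methods from the public MCP specification. The tool name `query_database` and its arguments are hypothetical, and the transport (stdio or HTTP) is left out entirely.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC 2.0 expects each request to carry an id.
_ids = count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request dict; params is omitted when not given."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def list_tools_request():
    """Ask a server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(name, arguments):
    """Ask a server to run one named tool with the given arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


# Hypothetical tool name and arguments, for illustration only.
request = call_tool_request("query_database", {"sql": "SELECT 1"})
print(json.dumps(request, indent=2))
```

Because every server answers the same two calls, an agent can discover and invoke a database, GitHub, or Google Drive integration without tool-specific client code.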

Big tech: deep integration, cautious culture, and quiet preparation

Google has a fully custom internal stack (Borg instead of Kubernetes, its own repo and code review tool, and Cider, a VS Code-based cloud IDE integrated across all internal services).
- LLMs are now deeply integrated into Cider: autocomplete, chat-based coding, AI-powered code review (Critique), and LLM-enhanced code search.
- A former Googler noted that about a year ago, AI tools were barely used internally; now they are everywhere.
- Google’s approach is described as cautious and slow, focused on getting tools right so engineers trust them.
- Internal tools like NotebookLM, an LLM prompt playground, and a knowledge-base search engine (MoMA) are widely used.
- One unattributed quote suggests that org-specific GenAI tooling is proliferating partly because leadership rewards visible AI adoption with more funding.
- Most strikingly, a source close to Google SREs said they are preparing for 10 times the volume of code entering production, scaling up infrastructure, deployment pipelines, code review tooling, and feature flagging accordingly.
Amazon is less publicly associated with AI, but internally nearly all developers use Amazon Q Developer Pro, especially for AWS-related coding.
- Engineers report it has improved significantly over the past year and is surprisingly unknown outside Amazon.
- Claude is also used internally for writing tasks like PR/FAQ docs (Amazon’s six-pager format) and performance reviews.
- Because Amazon has been API-first since a famous 2002 mandate from Jeff Bezos requiring all teams to expose data and functionality through service interfaces, it is trivially easy to attach MCP servers to existing internal tools.
- Most internal tools and websites at Amazon already have MCP support, and developers are automating ticketing systems, emails, and internal workflows at scale, though this is rarely discussed publicly.

AI startups: mixed results, with some niches left behind

incident.io, an on-call platform evolving into an AI-first company, reports that its engineering team is heavily using AI and sharing tips internally via Slack.
- One engineer discovered that well-defined tickets can be passed to an agent for a useful first pass, and shared this with the team.
- Another engineer’s favorite technique is prompting for options rather than single answers (e.g., “give me three ways to write this code” or “what are possible explanations for this error?”).
- The entire team became regular Claude Code users within days of its public launch.
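The "ask for options" technique above can be captured in a small prompt template. This is only a sketch: the function name `options_prompt` is hypothetical, and the request for trade-offs and a recommendation is an added assumption, not something the incident.io engineer described.

```python
def options_prompt(task: str, n: int = 3) -> str:
    """Build a prompt that asks for several alternatives instead of one answer."""
    if n < 2:
        raise ValueError("ask for at least two options")
    return (
        f"Give me {n} distinct ways to {task}. "
        # Assumption: asking for trade-offs makes the options easier to compare.
        "For each option, list its trade-offs, "
        "then say which one you would pick and why."
    )


print(options_prompt("write this code"))
```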
A biotech AI startup (unnamed at their request) uses AI/ML to design proteins and has about 50–100 engineers running automated numerical pipelines on Kubernetes with Python and Rust.
- Despite experimenting with multiple LLMs including recent models, none have stuck; it is still faster for their engineers to write correct code from scratch than to review and fix LLM-generated code.
- They suspect their niche—building novel software that has never existed before—may be particularly poorly served by current LLMs.
- They use AI code review tools only intermittently and asked not to be named to avoid being labeled as AI skeptics.

Seasoned independent engineers: a genuine inflection point

Armin Ronacher (creator of Flask, founding engineer at Sentry) published an article titled “AI Changes Everything” stating that he now prefers working with an AI agent as a “virtual programming intern” over being an engineering lead, which would have been unbelievable to him six months earlier.
- He credits three things: the quality of Claude Code, extensive LLM use that wore down his initial resistance, and the fact that the agent runs the code itself and gets feedback from the results, which sidesteps much of the hallucination problem.
Peter Steinberger (creator of PSPDFKit, iOS internals expert) wrote “The Spark Returns,” saying he hasn’t been this excited by technology in decades.
- He feels languages and frameworks matter less now because switching is so easy with AI assistance; he is coding in languages like TypeScript he would never have touched before.
- He estimates 10–20x more output from a capable engineer and notes that many of his tech friends are staying up late coding because they are so engaged.
- He is also seeing burnt-out developers return to building things.
Birgitta Böckeler (distinguished engineer at Thoughtworks) views LLMs as a tool that works at any abstraction level, from low-level assembly-like code to high-level human language, which makes them a lateral move across the stack rather than just a new layer on top.
Simon Willison (co-creator of Django, independent open-source developer and blogger whose work is followed by Andrej Karpathy) says coding agents actually work now, running in loops with compilers and tests, and that model improvements over the past six months have reached a tipping point where they are becoming genuinely useful.
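The "running in loops with compilers and tests" pattern can be sketched as follows. This is a toy illustration under stated assumptions, not how Claude Code or any real agent is implemented: `fake_model` is a hypothetical stand-in for an LLM call, and `run_checks` uses `exec` plus one assertion in place of a real compiler and test suite.

```python
from typing import Callable, Optional


def run_checks(source: str) -> Optional[str]:
    """Run the candidate and one tiny test; return an error message, or None on success."""
    namespace: dict = {}
    try:
        exec(source, namespace)             # stands in for the compile step
        assert namespace["add"](2, 3) == 5  # stands in for the test suite
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def agent_loop(propose: Callable[[Optional[str]], str], max_iters: int = 3) -> Optional[str]:
    """Ask for a candidate, check it, and feed any failure back until it passes."""
    feedback: Optional[str] = None
    for _ in range(max_iters):
        candidate = propose(feedback)
        feedback = run_checks(candidate)
        if feedback is None:
            return candidate  # checks passed
    return None  # gave up after max_iters attempts


def fake_model(feedback: Optional[str]) -> str:
    """Hypothetical model: the first attempt is wrong, the retry after feedback is right."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


result = agent_loop(fake_model)
print(result)
```

The point of the loop is that a wrong first attempt is caught by the checks rather than by a human reviewer, which is why running code and tests matters more than the quality of any single generation.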

Open questions that remain

Why are founders and CEOs more excited than most engineers? The speaker observes that the most enthusiastic adopters at AI tooling companies tend to be founders and PMs, not senior engineers. Public CEO claims (e.g., Microsoft’s “30% of code written by AI,” Anthropic’s “90% in 6 months”) may reflect financial incentives rather than ground truth.
How mainstream is AI usage really? A show of hands at the event suggested 60–70% of attendees use AI tools at least weekly. DX’s survey of 38,000 developers found the median organization has about 50% of developers using AI weekly (not daily), and the top companies reach about 60%. Most of the speaker’s interview subjects were above the median, suggesting selection bias.
How much time is actually saved? Peter Steinberger estimates 10–20x more output, but DX’s survey finds developers save roughly 3–5 hours per week on average. It is unclear whether that saved time translates into more output or different work.
Why does it work better for individuals than organizations? Multiple sources confirm these tools are great for individual developers but not yet effective at the organizational level, where coordination, code review, and complex codebases create friction.
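A quick back-of-the-envelope calculation shows how far apart the two time-savings figures above are. The sketch assumes a 40-hour work week (not stated in the talk) and assumes that every saved hour is reinvested in the same kind of work; both function names are hypothetical.

```python
def throughput_multiplier(hours_saved: float, week_hours: float = 40.0) -> float:
    """Output multiplier if work that took week_hours now takes week_hours - hours_saved."""
    if not 0 <= hours_saved < week_hours:
        raise ValueError("hours_saved must be in [0, week_hours)")
    return week_hours / (week_hours - hours_saved)


def hours_saved_for(multiplier: float, week_hours: float = 40.0) -> float:
    """Weekly hours that would have to be saved to reach a given output multiplier."""
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    return week_hours * (1 - 1 / multiplier)


for hours in (3, 5):
    print(f"{hours} h/week saved -> {throughput_multiplier(hours):.2f}x output")

for mult in (10, 20):
    print(f"{mult}x output -> {hours_saved_for(mult):.0f} h/week saved")
```

Under these assumptions, 3–5 hours saved per week corresponds to roughly 1.08x to 1.14x output, while 10–20x output would mean saving 36 to 38 of 40 hours. The survey average and the individual estimate are therefore describing very different regimes, not two measurements of the same effect.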

The bigger picture: a step change comparable to historic shifts

Martin Fowler reviewed the speaker’s analysis and offered his assessment: he thinks LLMs will change software development at a level comparable to the shift from assembly to high-level programming languages, with the critical difference that LLMs introduce non-determinism, which is a first in computing.
Kent Beck, a veteran software engineer with 52 years of experience, said he is having more fun programming now than at any point in his career, because LLMs let him be ambitious again without the fatigue of constantly learning new frameworks and migrating between technologies.
- He is currently building a Smalltalk server with language server integration, something he always wanted to do.
- He compared the current moment to three previous shifts he has lived through: the move from mainframes to microprocessors, the rise of the internet, and the arrival of smartphones.
- His key insight: the landscape of what is cheap and expensive has fundamentally shifted—things that were previously assumed to be too hard or too expensive are now trivially cheap, and engineers need to be trying them.
The speaker’s overall takeaway is that a real step change is underway, driven not by executive hype but by experienced engineers finding genuine new leverage. The call to action is to experiment more, understand what has become cheap, and learn what works and what doesn’t.
