Developer Experience at Uber with Gautam Korlam — The Pragmatic Engineer

Gautam Korlam spent nearly a decade at Uber, rising from Android engineer #8 to Principal Engineer, working on developer experience and internal tooling at massive scale. He’s now co-founder of Gitar, an agentic AI startup automating code maintenance. This episode covers Uber’s unique engineering stack (monorepo, submit queue, DevPods), how developer productivity was measured, the rise of vibe coding and AI tools, and what skills will matter for engineers in an AI-driven future.

Uber’s Engineering Stack at Scale

Monorepo adoption
- Uber started with hundreds of separate repos (networking, experimentation, analytics each had their own), which made dependency upgrades painful — every team had to manually bump versions across the chain.
- iOS moved to a monorepo first (notably while also migrating to Swift), and Android followed by freezing the codebase for a weekend and migrating everyone at once.
- Later, the Java and Go monorepos were migrated to Bazel as the build system.
- The main benefit: standardized code meant a centralized team could update a networking library or Google Play Services for everyone at once, rather than coordinating across hundreds of repos.
- The main tradeoff: teams could no longer break an API without immediately accounting for every consumer. Some teams pushed back, arguing builds would be slower, but the macro productivity gain outweighed individual team slowdowns.
- Companies like Amazon and Netflix make multi-repo work with tooling that manages “golden version sets” — essentially a monorepo without being one — but Gautam argues monorepo is generally easier if you invest in tooling.
SubmitQueue
- A system to guarantee a green main branch by serializing commits and testing them in combination before merging.
- At Uber’s scale (roughly one commit per minute across thousands of engineers), running tests sequentially on each commit would take forever, and not testing cross-dependencies meant constant breakage.
- The SubmitQueue used ML models to estimate which changes might cause failures, speculatively tried paths that might be green, and backtracked when needed.
- It was novel enough that Uber published an academic paper on it. Very few companies have anything similar even today.
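
The idea of testing changes in combination and backtracking on failure can be illustrated with a toy queue. This is a minimal sketch, not Uber's algorithm: `land_changes` and `passes` are hypothetical names, the ML-based failure prediction is not modeled, and the speculation is reduced to "assume the whole pending batch is green, then halve the batch when it is not".

```python
from typing import Callable, List, Sequence


def land_changes(
    queue: Sequence[str],
    passes: Callable[[List[str]], bool],
) -> List[str]:
    """Return the changes that land, in order.

    `passes(changes)` is a hypothetical stand-in for running CI on main
    with `changes` applied. Every prefix that lands has been tested
    together with everything landed before it, so main stays green.
    """
    landed: List[str] = []
    pending = list(queue)
    while pending:
        # Speculate that the whole pending batch is green together.
        size = len(pending)
        while size > 0 and not passes(landed + pending[:size]):
            size //= 2  # backtrack to a smaller prefix of the batch
        if size == 0:
            # Even the first pending change fails on top of main: reject it.
            pending.pop(0)
        else:
            landed.extend(pending[:size])
            pending = pending[size:]
    return landed


# Toy CI: any combination containing change "c" is red.
result = land_changes(["a", "b", "c", "d"], lambda combo: "c" not in combo)
print(result)  # ['a', 'b', 'd']
```

When most changes are green, a batch lands after a single test run instead of one run per commit, which is the property that matters at roughly one commit per minute.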
DevPods (Cloud Developer Environments)
- Containerized development environments running in the cloud, with pre-indexed code and pre-warmed build caches.
- Engineers could spin up a DevPod in seconds with a single command — no bootstrap scripts, no environment setup.
- Key innovation: all DevPods used the same home directory path (/home/user), so shared indices didn’t need path denormalization (which took ~1 hour with JetBrains’ shared index approach). This enabled ~6 second boot times.
- Multi-tenant and timezone-aware: machines were provisioned close to developers (e.g., India vs. US) to minimize latency for build cache uploads/downloads.
- Supported multiplexing — running multiple DevPods on different features simultaneously, each with warm caches.
- The challenge with cloud dev environments generally is that teams often start with the container solution rather than understanding the developer workflow first. Uber’s approach was to map the workflow and then adapt the container to it, pushing background updates every few hours transparently.
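
Why a fixed home directory helps can be seen with a toy index keyed by absolute path. This is only a sketch under assumptions: real IDE indexes are far more complex, and `build_index`, `rebase_index`, the repo layout, and the fingerprints are all hypothetical. It shows that an index built under one root can be reused as-is by any machine with the same root, while a different root forces a rewrite of every entry.

```python
from pathlib import PurePosixPath
from typing import Dict


def build_index(root: str, files: Dict[str, str]) -> Dict[str, str]:
    """Toy index: absolute file path -> content fingerprint."""
    return {str(PurePosixPath(root) / rel): fp for rel, fp in files.items()}


def rebase_index(index: Dict[str, str], old_root: str, new_root: str) -> Dict[str, str]:
    """Rewrite every key for a different checkout root.

    This per-entry rewrite is the work that identical roots avoid.
    """
    old = PurePosixPath(old_root)
    new = PurePosixPath(new_root)
    return {
        str(new / PurePosixPath(path).relative_to(old)): fp
        for path, fp in index.items()
    }


files = {"src/app.go": "abc123", "src/util.go": "def456"}  # made-up data

# Built once, centrally, under the shared root.
shared = build_index("/home/user/repo", files)

# Same root on every DevPod: the shared index is usable without changes.
devpod_index = shared

# A laptop with its own root would need every entry rewritten first.
laptop_index = rebase_index(shared, "/home/user/repo", "/Users/alice/repo")
```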
Local Developer Analytics (LDA)
- A daemon running on developers’ machines that collected system metrics (CPU, memory) and integrated deeply with CLI tools and IDEs.
- Tracked which files engineers edited most, which files had the most bugs, and where developers dropped off in the funnel (e.g., PRs failing to create due to build errors).
- This was novel at the time — even large companies didn’t have this kind of end-to-end developer observability. It powered dashboards and deep analytics that Uber still uses today.
- Uber worked with vendors like JetBrains to upstream some of these needs into products everyone now uses.
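
The funnel drop-off analysis described above can be sketched in a few lines. The stage names and the session data here are assumptions for illustration, not LDA's actual schema: each session is reduced to the furthest stage it reached, and the function reports how many sessions reached each stage and what fraction was lost since the previous one.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical funnel stages, in order.
STAGES = ["edit", "build_ok", "pr_created", "landed"]


def funnel(sessions: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (stage, sessions_reached, drop_off_from_previous_stage).

    `sessions` holds the furthest stage each session reached.
    """
    furthest = Counter(sessions)
    rows: List[Tuple[str, int, float]] = []
    previous = None
    for i, stage in enumerate(STAGES):
        # A session that got further than `stage` also reached `stage`.
        reached = sum(furthest[s] for s in STAGES[i:])
        drop = 0.0 if not previous else 1 - reached / previous
        rows.append((stage, reached, drop))
        previous = reached
    return rows


# Made-up data: 10 sessions, 3 of which built fine but never produced a PR.
sessions = ["edit"] * 2 + ["build_ok"] * 3 + ["pr_created"] * 1 + ["landed"] * 4
for stage, reached, drop in funnel(sessions):
    print(f"{stage:<11} reached={reached:>2} drop_off={drop:.1%}")
```

The stage with the largest drop-off is where tooling investment is likely to pay off first.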

Measuring Developer Productivity

Starting with sentiment
- Uber began with NPS surveys. When measurement efforts started, developer NPS was around -50; by the time Gautam left, it was around +8.
- Surveys are essential because even with perfect metrics, if development “feels” wrong, you need to know.
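
For readers unfamiliar with the scale, here is a worked example assuming the standard NPS definition (0 to 10 ratings, promoters score 9 or 10, detractors score 0 to 6, and the result is the percentage of promoters minus the percentage of detractors). The response distributions are invented to land on -50 and +8; they are not Uber's survey data.

```python
from typing import Sequence


def nps(scores: Sequence[int]) -> float:
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("need at least one survey response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# 1 promoter, 3 passives, 6 detractors out of 10 responses.
early = [10] + [8] * 3 + [3] * 6
print(nps(early))  # -50.0

# 9 promoters, 9 passives, 7 detractors out of 25 responses.
later = [9] * 9 + [7] * 9 + [5] * 7
print(nps(later))  # 8.0
```

A score of -50 means detractors outnumber promoters by half of all respondents, so moving to +8 implies a large shift in sentiment even though the absolute number looks modest.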
What was measured
- Build times, time to code review, time spent in meetings, focus time, and diffs per engineer.
- Time to code review was the biggest bottleneck — P90s could be days, especially across time zones (e.g., Amsterdam waiting for SF reviews, adding ~10 hours of delay).
- CI time was a blip compared to review time.
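
A P90 like the one mentioned above can be computed from raw wait times as follows. This is a sketch using the nearest-rank percentile method, which is one of several common definitions; the wait times are made up and `percentile` is a hypothetical helper. It shows how a couple of slow cross-timezone reviews dominate the P90 while leaving the median untouched.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hours from "PR opened" to "first review" (made-up data).
review_wait_hours = [1, 2, 2, 3, 4, 5, 6, 8, 30, 52]

print(percentile(review_wait_hours, 50))  # 4
print(percentile(review_wait_hours, 90))  # 30
```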
How metrics were used
- Aggregated at a high level to find bottlenecks and guide investment, never for individual performance reviews or promotion decisions.
- Zero diffs in a year would trigger a conversation with the manager to understand what was going on, but wasn’t treated as a blanket performance signal.
- Teams focused on unblocking: auto-approving PRs with minor semantic changes, auto-landing approved PRs to avoid rebase conflicts, nudging reviewers.
The risk of misuse
- Diffs per engineer can be misleading — someone writing docs, influencing strategy, or unblocking other teams may have fewer diffs but high impact.
- Gautam emphasizes that metrics should flow from what engineers actually complain about, not from blindly adopting an industry framework.

Gautam’s Career Journey: Eng II to Principal Engineer

Rapid growth at a rocket ship
- Joined Uber in 2014 as Android engineer #8, when there were no unit tests and integration testing was done on physical phones.
- Skipped a level early (entry to senior) because the first year demanded shipping fast.
- Got promoted 4-5 times over 9 years, eventually reaching Principal Engineer — one of only a few dozen out of ~2,000-3,000 tech employees.
Strategy for advancement
- Found a niche (developer platform/tooling) that others didn’t want to do, which made him the go-to person as the company grew.
- Every 2 years, did introspection: am I challenged enough? Then deliberately pushed into new areas (mobile → CI/build → organization-wide efforts).
- Built social capital by helping people immediately — dropping everything to debug someone’s environment issue, holding office hours, and being the “encyclopedia” of where to find things.
- Sought mentorship from senior engineers to understand how Principal Engineers operate, including the business side of engineering.
Principal Engineer archetypes at Uber
- Depth in a particular area, breadth across many areas, internal influence, or external influence.
- Gautam focused on depth (developer productivity) plus breadth from talking to many teams across the organization.
- Past senior levels, it’s less about pure coding and more about understanding how engineering meets business, relationship management, and creative problem-solving.
- His relationship with his manager was closer to that of peers than of boss and report: the principal engineer load-balances the technical work while the manager handles organizational unblocking.

Running Developer Experience Like a Product Team

Treating developers as customers
- The platform team of ~10 engineers supported ~1,000 engineers with published SLAs, on-call rotations, office hours, and escalation paths — just like a vendor serving external customers.
- They were “customer obsessed” — if a build fails, the developer feels the same frustration as a rider whose Uber doesn’t show up.
- Reliability and latency guarantees were treated as seriously as production SLOs.
Key practices
- Published SLAs and reviewed them regularly, with incident management and retrospectives when missed.
- Focused on the developer funnel: where do people drop off? Onboarding was a major pain point — outdated docs led to hours of environment setup, which DevPods eliminated.
- Automated fixes wherever possible: if a linter failure could be auto-fixed, it should be, because developers under shipping pressure will merge anyway if it’s not blocking.
- “Golden path” thinking: rather than giving developers raw tools, the team curated a standardized workflow and optimized it relentlessly.

AI’s Impact on Software Development

Current state of AI tools
- Autocompletion (e.g., Cursor) removes grunt work and lets engineers think more and type less, but doesn’t replace the thought process or taste for what the outcome should be.
- Most tools are point solutions — either IDE autocomplete or CI code review — with no cross-cutting understanding of the full SDLC.
- Gitar’s bet: an agentic AI that spans the IDE, code review, deployment, and production, understanding how a character typed in the IDE affects code review, deploy, and incidents.
Vibe coding
- Prototyping by focusing on how the system should behave rather than the exact implementation, iterating fast with agentic loops.
- Works well for prototyping because you can always refactor later if you have the right abstractions.
- At enterprise scale, unconstrained vibe coding risks breaking abstraction layers — it needs guardrails.
- Most people using AI for coding are still in the prototype phase.
The 70% problem
- AI tools get you started but can get confused and stuck at the tricky 30% — this is where experienced engineers with strong CS fundamentals and system-level knowledge shine.
- Senior engineers who adopt these tools become dramatically more productive because they can delegate tasks they used to give to junior engineers and verify the output.
- Junior engineers will thrive because they bring fresh ways of working with these tools and aren’t biased by old patterns — but they still need fundamentals to understand when things go wrong.
The rise of the general engineer
- 20 years ago, roles were siloed (DBA, Web Master, Java engineer). Then specialization increased (frontend, mobile, data, ML). AI is reversing this — engineers can jump across domains because AI handles the framework-specific details.
- The differentiator shifts from “how you wrote the code” to “taste and understanding of the end user.”

Skills to Stay Relevant in an AI-Driven Future

Taste and end-user experience
- If building software becomes easy, the differentiator is understanding what matters to the customer and crafting great UX.
- Examples: WhatsApp (~50 employees, $19B acquisition), Instagram (~13 people, 30 million users) — small, efficient teams that deeply understood their product.
System-level and scalability knowledge
- Understanding how software scales, which parts are efficient vs. inefficient, and how to maintain it over years.
- Designing for maintainability: pluggable, composable, replaceable components so migrations can happen without code freezes or downtime.
Business understanding
- AI agents are based on expert knowledge — they can’t tell you what matters to your specific business or user segment.
- Engineers who deeply understand their business can tell code that is merely AI-generated apart from code that actually makes the business thrive.
Pattern of working at companies that build at scale or with taste
- Experience matters: working at companies known for groundbreaking work builds muscles in scale and taste that compound over time, especially when combined with increasingly powerful AI tools.

Rapid Fire

Favorite programming language: Rust — the type safety makes it hard to have certain classes of bugs, and once you treat the borrow checker as a friend, it unlocks a lot. (Despite spending most of his career in Java and Go, and once saying he’d never work in Java.)
AI tool he uses most: Jimmy (Gitar’s own agentic tool) for end-to-end work, Cursor for autocomplete, Claude directly for Q&A and deeper research.
Book recommendation: Head First Design Patterns — old but excellent for understanding how to layer abstractions, when to use which pattern, and how to design software for maintainability.
