Code security for software engineers

The Pragmatic Engineer 1h7 5 min #57

Summary

  • Johannes Dahse, VP of Code Security at Sonar and a security expert with 20 years of experience, explains what every software engineer should know about writing secure code — from foundational basics to how AI is reshaping the threat landscape.

How Johannes got into security

  • His computer was infected by the Sasser worm, which sparked curiosity about how attackers gain access.
  • He studied IT security in Germany, played capture-the-flag hacking competitions, then moved into professional penetration testing and vulnerability hunting tooling before joining Sonar.

Penetration testing

  • Penetration testing simulates a real attack: a company hires you as a hacker to find vulnerabilities within a defined scope and time frame.
  • Blackbox testing: no prior knowledge of the system; you approach it as an external attacker would.
  • Whitebox testing: you have access to source code and internal information.
  • The process involves automated reconnaissance (port scanning, endpoint mapping) followed by manual probing to find exploitable issues.

Who owns code security

  • Code security (bugs and vulnerabilities in code) should be owned by developers, since they are the only ones who write and change code.
  • Application security (the broader field including compliance, penetration tests, threat monitoring, cryptography, authentication logic) should be owned by security teams.
  • Security teams act like a platform team: they provide specialized expertise and tooling, but the day-to-day responsibility for code-level issues sits with developers.
  • Historically, security was owned by security teams and done as a final audit before quarterly releases. Today’s faster release cadences (multiple times per day, AI-assisted development) make that model unworkable.

What is code security

  • Secure code is code free of issues that an attacker could exploit, which goes beyond classic vulnerabilities like SQL injection.
  • It includes bugs that create unintended states (null pointer exceptions, memory corruption and buffer overflows in C/C++) and logic errors (e.g., an upload feature that accepts a web shell disguised as a profile picture).
  • At its core, a code security issue is a form of technical debt: something developers forgot or misspecified.
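
The profile picture logic error can be made concrete with a minimal sketch. It assumes an upload handler that receives a filename and the raw bytes; the function name, the ALLOWED_SIGNATURES table, and the choice of PNG and JPEG are illustrative assumptions, not something from the episode. Checking both the extension and the leading file signature closes the obvious gap, though it is not a complete defense on its own.

```python
import os

# Assumption: only PNG and JPEG profile pictures are accepted.
# Values are the well-known leading bytes ("magic numbers") of each format.
ALLOWED_SIGNATURES = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
}


def is_acceptable_profile_picture(filename: str, content: bytes) -> bool:
    """Return True only if the extension is allowed AND the bytes look like that format."""
    extension = os.path.splitext(filename)[1].lower()
    signature = ALLOWED_SIGNATURES.get(extension)
    if signature is None:
        return False  # e.g. "avatar.php" is rejected outright
    # Reject files whose content does not match the claimed type,
    # such as a script renamed to "shell.php.png".
    return content.startswith(signature)
```

A real handler would typically also store the file under a server-generated name outside any directory that executes scripts.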

Code security basics every developer should know

  • Really understand what your code is doing. Security experts find issues by looking for corner cases and edge cases the developer overlooked.
  • Never trust input. Validate and sanitize all user input: GET/POST parameters, cookies, file uploads, and even seemingly harmless fields like video titles, which users can modify.
  • Classic issues that are still prevalent: SQL injection, cross-site scripting (XSS), hardcoded secrets (API tokens, passwords, crypto keys).
  • Attackers crawl public GitHub repos for leaked secrets, and secrets persist in git history even after deletion.
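
To illustrate "never trust input" for the SQL injection case, here is a minimal sketch using Python's built-in sqlite3 module. The users table, the function names, and the payload are made up for the example. The unsafe version builds the query by string formatting; the safe version passes the input as a bound parameter so it is treated as data, never as SQL.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)]
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is concatenated into the SQL text.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized query: the driver binds the value separately from the SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    db = make_demo_db()
    payload = "x' OR '1'='1"  # attacker-controlled input
    print("unsafe:", find_user_unsafe(db, payload))  # leaks every row
    print("safe:  ", find_user_safe(db, payload))    # matches nothing
```

The same idea applies to other databases: use the placeholder syntax of the driver or ORM instead of building SQL strings by hand.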

Advanced security challenges

  • Cryptography, authentication logic, access control, and password reset functionality are areas where things commonly go wrong.
  • The key advice: don’t reinvent the wheel — use well-vetted, community-trusted frameworks and libraries.
  • Dependency/package poisoning: attackers compromise a maintainer’s account and inject malicious code into a popular package. Downstream users are affected automatically the next time they pull the update.
  • Defense: use Software Composition Analysis (SCA) tools that check your dependency manifest files against databases of known vulnerabilities (CVEs) and warn you about malicious or vulnerable packages.
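
The core idea behind an SCA check can be sketched in a few lines, assuming a requirements-style manifest with exact pins (name==version). The advisory entry, the package name examplelib, and the identifier EXAMPLE-2024-0001 are invented for illustration; real tools pull from curated vulnerability feeds and handle version ranges and transitive dependencies.

```python
from typing import NamedTuple


class Advisory(NamedTuple):
    package: str
    vulnerable_versions: frozenset
    advisory_id: str


# Hypothetical advisory data; a real SCA tool queries a vulnerability database.
ADVISORIES = [
    Advisory("examplelib", frozenset({"1.0.0", "1.0.1"}), "EXAMPLE-2024-0001"),
]


def parse_manifest(text: str) -> dict:
    """Parse 'name==version' lines, ignoring comments and unpinned entries."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def audit(manifest_text: str, advisories=ADVISORIES) -> list:
    """Return one finding per pinned dependency with a known advisory."""
    pins = parse_manifest(manifest_text)
    findings = []
    for advisory in advisories:
        pinned = pins.get(advisory.package)
        if pinned in advisory.vulnerable_versions:
            findings.append(f"{advisory.package}=={pinned}: {advisory.advisory_id}")
    return findings
```

Calling audit("examplelib==1.0.1\nother==2.0\n") reports the pinned examplelib version, while a manifest pinning examplelib==1.0.2 yields no findings.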

The CVE program

  • CVE (Common Vulnerabilities and Exposures) is a catalog of publicly known vulnerabilities, maintained by MITRE, a nonprofit that runs the program with US government funding.
  • Over 200,000 CVEs exist, with ~50 new ones reported daily. It’s a full-time job to track manually.
  • Practical advice: don’t try to keep up with CVEs yourself — use an SCA tool to automate detection and, more importantly, get guidance on fixing them.

Common security tools

  • IDE linting: basic syntactic and semantic checks as you type, but limited in security coverage.
  • Static Application Security Testing (SAST): transforms your entire codebase into a graph model (functions, if/else branches, variable assignments) and simulates data flow to trace how user input propagates to security-sensitive operations. Techniques include taint analysis and symbolic execution. This automates the “follow the input” process even across very long, complex code paths.
  • Secret detection: scans for hardcoded credentials in code and configuration files.
  • Infrastructure as Code scanning: treats CI/CD configs, GitHub Actions, and infrastructure definitions as code to be analyzed.
  • Software Composition Analysis (SCA): checks dependencies against known vulnerability databases.
  • Dynamic Application Security Testing (DAST): tests a running application from the outside by sending malicious payloads and observing behavior (error messages, delays, crashes). More suited to security teams than developers due to longer feedback loops.
  • Fuzzing: mutates complex inputs (file formats, protocols), for example by flipping bits, and feeds them to the program to find crashes. Works well for embedded software, C/C++, and binary processing.
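
The bit-flipping idea behind fuzzing can be shown with a toy sketch. The parse_record function is a deliberately buggy parser invented for this example: it trusts a length byte without checking it against the actual input size. The fuzzer here flips every single bit of a seed input in turn, which keeps the run deterministic; production fuzzers usually mutate randomly and use coverage feedback to guide the search.

```python
def parse_record(data: bytes) -> bytes:
    """Toy parser: first byte is the payload length, the rest is the payload.

    Deliberate bug: the length byte is trusted without a bounds check.
    """
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = bytearray()
    for i in range(length):
        payload.append(data[1 + i])  # IndexError when the length byte lies
    return bytes(payload)


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    mutated = bytearray(data)
    mutated[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(mutated)


def fuzz(seed: bytes) -> list:
    """Try every single-bit mutation of seed and collect inputs that crash."""
    crashes = []
    for bit_index in range(len(seed) * 8):
        candidate = flip_bit(seed, bit_index)
        try:
            parse_record(candidate)
        except ValueError:
            pass  # graceful rejection is fine, not a crash
        except Exception:
            crashes.append(candidate)  # unexpected failure: worth investigating
    return crashes


if __name__ == "__main__":
    for crashing_input in fuzz(b"\x03abc"):
        print("crash with input:", crashing_input)
```

In Python the bug surfaces as an unhandled IndexError; the equivalent unchecked read in C or C++ would be an out-of-bounds memory access, which is why fuzzing pays off most there.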

The State of Code Security report (Sonar)

  • Sonar analyzed 8 billion lines of code from 1 million developers across 40,000 organizations.
  • Finding: roughly 1 security issue per 1,000 lines of code.
  • Top issues: SQL injection, XSS, hardcoded passwords, path traversal, and surprisingly, insecure regular expressions that can cause denial-of-service attacks.
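
The insecure regular expression finding is easier to see with a small sketch. The pattern ^(a+)+$ is a classic example of nested quantifiers that can backtrack exponentially on input that almost matches; the names VULNERABLE and SAFER and the input sizes are chosen for this illustration only. Both patterns accept the same strings, but the rewritten one has no nesting to backtrack through. Timings depend on the machine and the regex engine, so none are stated here.

```python
import re
import time

# Nested quantifier: many ways to split a run of "a" between inner and outer "+".
VULNERABLE = re.compile(r"^(a+)+$")
# Equivalent language, no nesting, so no exponential backtracking.
SAFER = re.compile(r"^a+$")


def time_match(pattern: re.Pattern, text: str) -> float:
    """Return the seconds spent attempting a match."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start


if __name__ == "__main__":
    # A run of "a" followed by "b" forces the engine to exhaust all splits.
    for n in (12, 16, 20):
        text = "a" * n + "b"
        slow = time_match(VULNERABLE, text)
        fast = time_match(SAFER, text)
        print(f"n={n}: vulnerable={slow:.4f}s safer={fast:.6f}s")
```

If an attacker controls the input, a few dozen characters can be enough to tie up a worker, which is what turns such a pattern into a denial-of-service issue.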

Code quality and security are deeply connected

  • Poor-quality code (spaghetti code, unreadable code) makes security issues harder to spot during code reviews and harder to fix later.
  • With AI-generated code often being lower quality, this becomes a bigger security problem: more code means more potential issues, and bloated code is harder to review and maintain.
  • The old principle that “code is a liability” still applies — concise, well-structured code is easier to secure.

AI’s impact on code security

  • AI-generated code risks: LLMs produce more verbose code to solve problems (e.g., GPT-5’s reasoning mode), which increases the attack surface. Studies of popular LLMs show distinct “personalities” in the types and frequency of issues they produce.
  • New vulnerability types: prompt injection is becoming the new code injection — as text becomes code, manipulating an LLM’s system prompt or instructions is analogous to injecting malicious code.
  • Typosquatting on hallucinated packages (sometimes called slopsquatting): AI may suggest a library name that doesn’t exist; if an attacker has registered that name on npm/Maven Central, the AI-assisted developer unknowingly pulls in a malicious package.
  • Verification is the new bottleneck: writing code is no longer the challenge; verifying that AI-generated code is secure at scale is.
  • AI can enhance deterministic tools: AI can feed knowledge about millions of libraries and frameworks into static analysis engines, improving their accuracy. AI is also very good at fixing specific, well-scoped security issues when given a small code snippet.
  • AI reviewing AI-generated code has limitations: it’s non-deterministic (inconsistent results), which doesn’t work well for quality gates that need to be consistent across teams. It’s analogous to students grading their own homework.
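
One deterministic guard against AI-suggested package names is to check each suggestion against a list of dependencies the team has already vetted. The sketch below assumes such an allowlist exists; APPROVED_PACKAGES, the status labels, and the 0.8 similarity cutoff are illustrative choices. It uses difflib from the standard library to flag names that are suspiciously close to an approved one.

```python
import difflib
from typing import Optional, Tuple

# Hypothetical allowlist of dependencies the team has vetted.
APPROVED_PACKAGES = {"requests", "numpy", "flask"}


def classify_dependency(
    name: str, approved=APPROVED_PACKAGES
) -> Tuple[str, Optional[str]]:
    """Classify a suggested package name before anyone installs it.

    Returns (status, lookalike), where status is "approved",
    "possible-typosquat", or "unknown".
    """
    normalized = name.strip().lower()
    if normalized in approved:
        return ("approved", None)
    # Near-misses of approved names deserve extra suspicion.
    close = difflib.get_close_matches(normalized, sorted(approved), n=1, cutoff=0.8)
    if close:
        return ("possible-typosquat", close[0])
    return ("unknown", None)
```

For example, classify_dependency("reqeusts") returns ("possible-typosquat", "requests"), and a name with no approved lookalike comes back as "unknown" so a human can review it before it is added.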

Developer machines as an attack vector

  • Developer machines are high-value targets: compromising one can lead to supply chain attacks (backdooring a popular dependency).
  • With AI coding assistants and MCP servers running locally, there’s a new threat: a malicious MCP server could instruct an agent to add a backdoor or exfiltrate credentials (e.g., passing a malicious Jira ticket to the agent).

Common misconceptions about the security industry

  • Security is a product you can buy: the industry often sells tools that promise security, but real security must be built into the development process, not bolted on.
  • Security requires elite hackers: in reality, most of the work is fixing bugs, not building sophisticated exploits.
  • Perfect security is achievable: it isn’t. Even the most popular, well-maintained open-source projects with bug bounty programs still have vulnerabilities found in them.

When is security “good enough”?

  • Use automated tools as basic hygiene (like locking doors and windows).
  • Do an initial assessment to find and fix your most critical issues.
  • As you add features, use automation to ensure you’re not introducing new vulnerabilities or technical debt.
  • Re-assess periodically (e.g., quarterly) to track progress.
  • Keep an eye on evolving threats (currently: LLMs and prompt injection) and resources like the OWASP Top 10.

Johannes’s favorite secure language

  • Newer languages like Go are more secure by design, having learned from the mistakes of older languages.
  • Established languages like Java are also quite secure and widely used in enterprises.