
Google Cloud Security launches CodeMender for bug fixes

Mon, 6th Oct 2025

Google Cloud Security has introduced CodeMender, an artificial intelligence agent for fixing software vulnerabilities. It has already contributed 72 security fixes to open-source projects.

The tool is designed to find the root causes of flaws, generate patches and validate those changes before passing them to human reviewers. The 72 fixes were contributed over the past six months, some of them to projects with codebases of up to 4.5 million lines.

Software vulnerabilities remain difficult for developers to identify and repair, particularly in large, widely used open-source projects. The launch reflects a broader push across the technology sector to use generative AI not only to write code but also to test, audit and amend it.

CodeMender builds on earlier Google work on AI-assisted vulnerability discovery, including Big Sleep and OSS-Fuzz. As AI systems improve at identifying flaws, the volume of findings could outpace the ability of human developers alone to address them.

How it works

CodeMender uses Gemini Deep Think models to operate as an autonomous debugging and repair agent. It is paired with analysis and validation tools that examine code before changes are made and test those changes afterwards to reduce the risk of regressions.

The validation layer is central to the system's design. Patches are surfaced for human review only after checks indicate that the proposed fix addresses the underlying issue, works as intended and follows project style rules.

To support that process, Google developed a set of program analysis tools, including static analysis, dynamic analysis, differential testing, fuzzing and SMT solvers. The system also uses multiple specialist agents to review different aspects of a proposed change, including a critique system that compares original and modified code to identify possible regressions.
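Differential testing, one of the techniques named above, can be illustrated with a minimal Python sketch. It is not Google's tooling: the clamp functions, the input ranges and the `differential_test` helper are all hypothetical names chosen for illustration. The idea is to run the original and the modified code on the same generated inputs and treat any disagreement as a possible regression.

```python
import random


def original_clamp(value, low, high):
    """Reference behaviour: clamp value into the range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


def patched_clamp(value, low, high):
    """Hypothetical rewrite that is meant to behave identically."""
    return max(low, min(value, high))


def buggy_clamp(value, low, high):
    """Hypothetical rewrite that forgets the lower bound (a regression)."""
    return min(value, high)


def differential_test(reference, candidate, trials=1000, seed=0):
    """Return the generated inputs on which the two implementations disagree."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    mismatches = []
    for _ in range(trials):
        low = rng.randint(-100, 100)
        high = rng.randint(low, 200)  # guarantees low <= high
        value = rng.randint(-500, 500)
        if reference(value, low, high) != candidate(value, low, high):
            mismatches.append((value, low, high))
    return mismatches


print("patched mismatches:", len(differential_test(original_clamp, patched_clamp)))
print("buggy mismatches:", len(differential_test(original_clamp, buggy_clamp)))
```

An empty mismatch list is evidence of equivalence on the sampled inputs only, not proof, which is presumably one reason such checks are combined with other analyses and human review.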

Patch examples

Google described cases in which the system handled vulnerabilities whose causes were not obvious from an initial crash report. In one example, a heap buffer overflow was traced not to the reported crash location but to incorrect stack management of XML elements during parsing.

In another case, the agent produced what Google described as a non-trivial patch for a complex object lifetime issue. That work involved altering a custom system for generating C code within the affected project.

Beyond reactive patching, CodeMender has also been used to rewrite existing code with security safeguards in mind. Google deployed the tool to apply -fbounds-safety annotations to parts of libwebp, an image compression library previously linked to a serious security flaw.

With those annotations in place, the compiler adds bounds checks intended to stop attackers from exploiting buffer overflows or underflows to execute arbitrary code. Google said the earlier libwebp heap buffer overflow known as CVE-2023-4863, which was used in a zero-click iOS exploit, would have been rendered unexploitable in annotated parts of the project.
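The effect of such a bounds check can be modelled conceptually in Python. This is only an analogy under stated assumptions: -fbounds-safety is a C compiler extension, and the flat `bytearray` "heap", the `BoundsError` exception and both write helpers below are hypothetical stand-ins, not the compiler's actual mechanism.

```python
class BoundsError(Exception):
    """Stands in for the trap a bounds-checked build would trigger."""


def make_heap():
    """Model adjacent memory: a 4-byte buffer followed by another object's bytes."""
    return bytearray(b"\x00" * 4 + b"SAFE")


def unchecked_write(heap, base, index, value):
    """Model a raw pointer write: nothing stops index from leaving the buffer."""
    heap[base + index] = value


def checked_write(heap, base, count, index, value):
    """Model a pointer that carries its element count and is checked on access."""
    if not 0 <= index < count:
        raise BoundsError(f"index {index} is outside [0, {count})")
    heap[base + index] = value


# Without the check, writing at index 5 of a 4-byte buffer corrupts the neighbour.
heap = make_heap()
unchecked_write(heap, base=0, index=5, value=ord("X"))
print(bytes(heap[4:]))

# With the check, the same write is stopped before any memory is modified.
heap = make_heap()
try:
    checked_write(heap, base=0, count=4, index=5, value=ord("X"))
except BoundsError as error:
    print("write rejected:", error)
print(bytes(heap[4:]))
```

In the checked version the program stops at the faulty access, so a bug of this kind becomes a crash rather than a way to overwrite neighbouring data.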

Google also said the agent can deal with errors introduced by its own changes. In tests it described, the system corrected compilation failures and used a judging tool configured for functional equivalence to check whether revised code still behaved as expected.
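A self-correction loop of this kind can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the candidate sources, the `area` function and the input-comparison judge are hypothetical, and Google has not published how its judging tool decides equivalence. The sketch rejects candidates that fail to compile, then accepts the first one whose behaviour matches a reference on a fixed set of inputs.

```python
def reference_area(width, height):
    """Behaviour the revised code is expected to preserve."""
    return width * height


# Hypothetical successive attempts, as an agent might produce them.
CANDIDATES = [
    "def area(width, height)\n    return width * height\n",   # missing colon
    "def area(width, height):\n    return width + height\n",  # compiles, wrong result
    "def area(width, height):\n    return height * width\n",  # compiles, equivalent
]

TEST_INPUTS = [(0, 0), (1, 5), (3, 4), (7, 2)]


def judge_equivalent(candidate, reference, inputs):
    """Stand-in judge: equivalent if outputs match on every test input."""
    return all(candidate(*args) == reference(*args) for args in inputs)


def first_acceptable(candidates, reference, inputs):
    """Return (attempt number, source) of the first candidate that passes, or None."""
    for attempt, source in enumerate(candidates, start=1):
        try:
            code = compile(source, f"<candidate {attempt}>", "exec")
        except SyntaxError:
            continue  # a real agent would feed the error back into the next attempt
        namespace = {}
        exec(code, namespace)
        if judge_equivalent(namespace["area"], reference, inputs):
            return attempt, source
    return None


result = first_acceptable(CANDIDATES, reference_area, TEST_INPUTS)
print("accepted attempt:", result[0] if result else None)
```

Here the first attempt is discarded at the compile step, the second at the equivalence step, and the third is accepted.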

Cautious rollout

Google is taking what it describes as a cautious approach to reliability. All patches generated by CodeMender are currently reviewed by human researchers before being submitted to upstream open-source projects.

That process matters for maintainers, many of whom must weigh the benefit of fast security fixes against the risk that automated changes could introduce new bugs or subtly alter software behaviour. Human review remains standard practice in security-sensitive projects, even as automation takes on a larger role in testing and remediation.

Patches have already been submitted to several critical open-source libraries, and many have been accepted. Google said it is increasing the pace gradually while collecting feedback from maintainers and the wider open-source community.

For Google, the project also highlights a more ambitious aim: shifting AI tools from identifying vulnerabilities to carrying out a larger share of the repair work. If such systems prove dependable at scale, they could change how maintainers manage ageing codebases and recurring classes of software flaws.

"Currently, all patches generated by CodeMender are reviewed by human researchers before they're submitted upstream," said Raluca Ada Popa and Four Flynn of Google Cloud Security.
