New DryRun Security Research: Anthropic’s Claude Generates the Most Unresolved Security Flaws in AI-Built Applications

The inaugural Agentic Coding Security Report evaluates real applications built by Claude, Codex, and Gemini in real development workflows

AUSTIN, Texas, March 11, 2026 (GLOBE NEWSWIRE) -- DryRun Security, the industry’s first AI-native, code security intelligence company, today released The Agentic Coding Security Report, new research examining how leading AI coding agents perform when building real applications.

The study found that while AI coding agents significantly accelerate software development, they also consistently introduce security vulnerabilities during the development process. Among the agents evaluated, Anthropic’s Claude produced the highest number of unresolved high-severity security flaws in the final applications.

DryRun evaluated three leading coding agents, Claude, Codex, and Gemini, as they developed two full applications through sequential pull requests, mirroring how real engineering teams implement features over time.

Across the study:

26 of 30 pull requests (87%) introduced at least one vulnerability
143 security issues were identified across 38 security scans
The same vulnerability classes appeared repeatedly across all agents
None of the agents produced a fully secure application

“AI coding agents can produce working software at incredible speed, but security isn’t part of their default thinking,” said James Wickett, CEO of DryRun Security. “In our usage and experience, AI coding agents often missed adding security components or created authentication logic flaws. These mistakes and gaps are exactly where attackers win.”

Claude Produced the Most Unresolved High-Severity Vulnerabilities

While all three agents introduced security flaws during development, the study showed clear differences in their final security posture.

Claude produced the highest number of unresolved high-severity vulnerabilities in the final applications.
Codex ultimately finished with the fewest vulnerabilities and demonstrated stronger remediation behavior during development.
Gemini introduced multiple issues early in its work and removed some of them through later modifications, but it still ended with several high-severity findings.

Despite these differences, no agent produced a fully secure application.

Recurring Security Failures Appeared Across Every Codebase

Several vulnerability classes appeared consistently across both applications and all agents, many of them aligned with the OWASP Top 10.

Four weaknesses appeared in every final codebase, all related to authentication:

Insecure JWT verification and management
Lack of application-level brute force protections
Exposure to token replay attacks
Insecure defaults for refresh token cookie configurations

In multiple cases, agents implemented security mechanisms but failed to apply them consistently across the system. For example, authentication middleware was created for REST APIs but never applied to WebSocket endpoints, leaving parts of the application exposed.
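
One way to guard against this kind of gap is to make every route, regardless of transport, pass through the same authentication check, and to refuse to register a handler that does not. The following is a minimal sketch of that idea under stated assumptions, not the agents' code or DryRun's tooling: the Router class, the require_auth decorator, the token table, and the route paths are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

VALID_TOKENS = {"token-alice": "alice"}  # hypothetical token store for the example


class Unauthorized(Exception):
    """Raised when a request carries no valid token."""


@dataclass
class Request:
    transport: str               # "http" or "websocket"
    path: str
    token: Optional[str] = None


def require_auth(handler: Callable) -> Callable:
    """Wrap a handler so it only runs for requests with a known token."""
    def guarded(request: Request):
        user = VALID_TOKENS.get(request.token or "")
        if user is None:
            raise Unauthorized(f"{request.transport} {request.path}")
        return handler(request, user)
    guarded.auth_enforced = True  # marker checked by the router at registration time
    return guarded


class Router:
    """Single registry for REST and WebSocket routes, so one rule covers both."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[str, str], Callable] = {}

    def add(self, transport: str, path: str, handler: Callable, public: bool = False) -> None:
        # Fail at startup if a non-public route was registered without the guard.
        if not public and not getattr(handler, "auth_enforced", False):
            raise ValueError(f"{transport} route {path} is missing require_auth")
        self._routes[(transport, path)] = handler

    def dispatch(self, request: Request):
        return self._routes[(request.transport, request.path)](request)


@require_auth
def list_allergies(request: Request, user: str) -> str:
    return f"allergies for {user}"


@require_auth
def game_state(request: Request, user: str) -> str:
    return f"state for {user}"


router = Router()
router.add("http", "/api/allergies", list_allergies)  # hypothetical REST route
router.add("websocket", "/ws/game", game_state)       # hypothetical WebSocket route
```

Because registration fails for any unguarded handler, forgetting the check on a WebSocket endpoint becomes a startup error rather than a silent exposure; routes that are meant to be open have to be marked public explicitly.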

Agentic Development Requires Continuous Security Review

For the study, DryRun designed two applications, a web app to track family allergies and a browser-based racing game, and had each agent build features incrementally through pull requests, much like real-world agentic development.

Each change was analyzed with DryRun Security before the next feature was implemented, followed by a full DeepScan of the final codebases. The results show that security risk accumulates quickly during agent-driven development if code is not reviewed continuously and remediated as part of the process.

DryRun Security’s Contextual Security Analysis evaluates how applications behave in context, allowing teams to identify the systemic security gaps introduced by AI-generated code.

The full report is available for download from DryRun Security.

About DryRun Security

DryRun Security is the industry’s first AI-native, agentic code security intelligence solution. Powered by its proprietary Contextual Security Analysis engine, DryRun Security helps security and developer teams reduce noise, surface real risk, and secure modern software built by both humans and autonomous agents. DryRun Security saves organizations thousands of hours otherwise spent on false positives, manual triage, and reactive reviews, while enabling security to scale with the speed and complexity of AI-driven development. For more information, please visit: https://www.dryrun.security/.

Media Contact

Nina Korfias for DryRun Security
nina@aspironinfluence.com
