The Race to Ship AI Tools Left Security Behind. Part 1: Sandbox Escape

AI coding tools are being shipped fast. In too many cases, basic security is not keeping up.

In our latest research, we found the same sandbox trust-boundary failure pattern across tools from Anthropic, Google, and OpenAI. Anthropic engaged quickly and shipped a fix (CVE-2026-25725). Google had not shipped a fix by the disclosure date. OpenAI closed the report as informational and did not address the core architectural issue.

That gap in response says a lot about vendor security posture.

submitted by /u/Fun_Preference1113

Anthropic Opus 4.6 is worse at finding vulns than you might think

We benchmarked Opus 4.6's ability to find simple C vulnerabilities and found that the model flags only about 1 in 4 flaws, with a very high false-positive rate and substantial run-to-run inconsistency. Techniques like judge agents and requiring the model to justify its findings improve results somewhat, but they remain weak.
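The judge-agent idea can be sketched roughly as a two-pass filter: a primary pass flags candidate flaws, and a second pass keeps only findings the model could actually justify. This is a minimal hypothetical sketch, not the benchmark's harness; `primary_scan` and `judge` here are stubs standing in for real model calls, and the pattern matching is purely illustrative.

```python
# Hypothetical sketch of a judge-agent filter over vulnerability findings.
# primary_scan() and judge() stand in for model calls the post does not detail.

def primary_scan(source: str) -> list[dict]:
    """Stub for the first model pass that flags candidate flaws.

    Each finding carries the model's justification (possibly empty).
    """
    findings = []
    if "strcpy(" in source:
        findings.append({
            "claim": "unbounded strcpy",
            "justification": "no length check before copy",
        })
    if "gets(" in source:
        findings.append({
            "claim": "gets() overflow",
            "justification": "",  # flagged but not justified
        })
    return findings

def judge(finding: dict) -> bool:
    """Stub for the judge pass: keep only findings with a justification."""
    return bool(finding["justification"].strip())

def scan_with_judge(source: str) -> list[dict]:
    """Two-pass scan: flag candidates, then filter by the judge."""
    return [f for f in primary_scan(source) if judge(f)]
```

Requiring a justification before accepting a finding is one cheap way to trade recall for precision, which matches the post's observation that these techniques help with the false-positive rate only to a degree.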

submitted by /u/Prior-Penalty