Anthropic researchers announced that they have developed an AI model that shows a “strikingly capable” ability to identify and exploit software vulnerabilities, raising fresh concerns about cybersecurity risks and defenses.
The model, Claude Mythos Preview, has identified thousands of “high-severity” vulnerabilities, including some in “every major operating system and web browser,” the company said in a blog post. If it fell into the wrong hands, “the fallout — for economies, public safety, and national security — could be severe.”
Anthropic said it is not releasing Mythos publicly but only sharing it with a small group of partners as an “urgent attempt to put these capabilities to work for defensive purposes.” The effort, called Project Glasswing, includes $100 million in usage credits for Mythos and $4 million in donations to open-source security groups.
Researchers said Mythos can autonomously discover and exploit zero-day vulnerabilities — previously unknown software flaws — across major operating systems and web browsers. In internal testing, the system identified subtle bugs, including one in OpenBSD that had gone undetected for 27 years, and was able to chain multiple vulnerabilities into working exploits without human assistance.
The model also demonstrated the ability to reverse-engineer software, bypass security protections, and escalate privileges to gain full system access. In one case, it independently developed a remote code execution exploit against a FreeBSD server that gave unauthenticated users root access.
Researchers said the model's offensive capabilities emerged from general improvements in reasoning and coding rather than from explicit training on offensive security tasks.