Bipko Biz Digital News


Can Anthropic Keep Its Exploit-Writing AI Out of the Wrong Hands?

May 27, 2026  Twila Rosenbaum

Anthropic's Mythos model promises major innovations in vulnerability management and security red-teaming, but questions remain about how defenders can keep threat actors from taking full advantage. The artificial intelligence firm unveiled Claude Mythos Preview on April 7, 2026, describing it as a general-purpose large language model (LLM) that "performs strongly across the board, but it is strikingly capable at computer security tasks." According to Anthropic's blog post, Mythos can identify and exploit zero-day vulnerabilities in "every major operating system and every major Web browser" at user direction, including subtle and difficult-to-detect flaws. One exploit even targeted a 27-year-old vulnerability in OpenBSD that has since been patched.

Some of these vulnerabilities are complex, but the company says one does not need to be a security engineer to properly prompt the model. In one demonstration, Mythos Preview wrote a Web browser exploit that chained together four vulnerabilities, writing a complex JIT heap spray that escaped both renderer and OS sandboxes. It autonomously produced local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR bypasses. It also autonomously wrote a remote code execution exploit on FreeBSD's NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets.

Enter Project Glasswing: Anthropic Mythos for Cyber Defenders

It is likely in anticipation of misuse that Anthropic introduced "Project Glasswing," a new initiative launched this week in partnership with companies like Apple, AWS, Microsoft, Palo Alto Networks, and CrowdStrike. The company claimed Project Glasswing could fundamentally "reshape cybersecurity" and that this would be "an urgent attempt to put these capabilities to work for defensive purposes." In practical terms, Anthropic has extended Mythos Preview access to a group of more than 40 organizations to scan and secure first-party and open source systems. Lee Klarich, chief product and technology officer of Palo Alto Networks, called early Mythos Preview results "compelling" in a LinkedIn blog post.

In addition to granting limited access to partners, Anthropic is committing $100 million in Mythos Preview usage credits to Project Glasswing, as well as $4 million in direct donations to open source security organizations. Forrester senior analyst Erik Nost told Dark Reading that the move is good PR for Anthropic, as the company is basically saying its AI is so good that it can reshape cybersecurity and software development. He added that it also calls attention to the vulnerability detection gaps that the industry has dealt with for 30 years.

Keeping Mythos Preview Out of the Wrong Hands

Nost explains that there are controls in place to ensure Mythos stays in the right hands, though it has become "a race [for defenders] to remediate and patch before other AIs, in the wrong hands, discover these zero-days and rapidly write exploits." He emphasizes that it is a call to action, a heads-up to defenders that vulnerability management practices are about to change significantly.

Julian Totzek-Hallhuber, senior principal solution architect at Veracode, notes that because there is no clear answer for how these tools can stay out of attacker hands, defenders should assume the capability will proliferate and prepare accordingly. This means investing in detection rather than prevention alone, identifying the behavioral signatures of AI-assisted exploitation, and adopting zero-trust architecture, aggressive patching cycles, and anomaly-based detection.

Melissa Ruzzi, director of AI at AppOmni, puts it more bluntly: "No one can ever keep anything 100% out of attackers' hands. The best that can be done is to make it more difficult for them to get access to it." This sentiment resonates across the cybersecurity community, where the dual-use nature of powerful tools has always been a challenge. Legitimate security tools, such as the penetration testing framework Cobalt Strike, often end up being used maliciously.

Mythos' potential comes with a caveat: While the early Anthropic examples of discovered vulnerabilities are compelling, two data points do not make a pattern. Totzek-Hallhuber emphasizes that "Anthropic controls both the model and the narrative; independent replication is impossible when the model isn't publicly available." He adds, "Until independent researchers with access can run their own evaluations, healthy skepticism is the appropriate posture. This is, frankly, another consequence of the restricted access model: the claims can't be tested, so they can't be fully trusted or refuted."

Dark Reading contacted Anthropic to ask for statistics regarding false positives and error rates; the vendor did not respond by press time. The lack of independent verification raises questions about the true scope of Mythos' capabilities. Even so, the threat landscape is evolving rapidly, and AI-driven vulnerability discovery could dramatically accelerate the pace at which zero-days are found and weaponized.

Historically, vulnerability research has been a manual, time-intensive process. Security researchers spend months reverse-engineering software, hunting for subtle flaws in memory management, race conditions, or cryptographic implementations. The rise of AI models like Mythos threatens to compress that timeline from months to minutes. For defenders, this means traditional patch cycles of 30 to 90 days are no longer sufficient. Attackers leveraging similar AI capabilities could exploit vulnerabilities before vendors even issue a fix.

Project Glasswing represents an attempt to tip the scales back toward defenders by providing early access to the same AI power. Yet the initiative is not without its critics. Some argue that by concentrating such powerful technology within a select group of partners, Anthropic may inadvertently create a privileged class of defenders while leaving smaller organizations and open source projects vulnerable. The $4 million in donations to open source security organizations is a step in the right direction, but it pales in comparison to the scale of the problem.

The broader implications extend beyond cybersecurity. If AI models can autonomously discover and exploit software vulnerabilities, they could be used to compromise critical infrastructure, financial systems, or even military networks. Nation-state actors are likely already experimenting with similar techniques, and the democratization of such capabilities could lead to an escalation in cyber conflict.

For now, the cybersecurity community watches Anthropic's moves closely. The company has a reputation for prioritizing AI safety, but the launch of Mythos Preview and Project Glasswing suggests a shift toward actively deploying offensive capabilities in the name of defense. Whether this approach will succeed remains to be seen, but one thing is clear: the race between attack and defense is entering a new, AI-driven phase.


Source: Dark Reading News

