Anthropic Keeps New AI Model Private After Finding Thousands of Security Flaws

Anthropic has withheld its most powerful AI model, Claude Mythos Preview, from public release after the model uncovered thousands of cybersecurity vulnerabilities across major platforms. Through Project Glasswing, the company is sharing the model exclusively with leading tech organisations and committing $100 million to help secure critical infrastructure.

In a move that signals a dramatic shift in how frontier AI capabilities are deployed, Anthropic has chosen to withhold its most advanced artificial intelligence model from public release after the model detected thousands of previously unknown cybersecurity vulnerabilities across major operating systems and web browsers. Instead of launching it commercially, the company quietly distributed access to the organisations responsible for securing the world’s digital infrastructure.


What Happened: Project Glasswing and Claude Mythos Preview

The model in question is called Claude Mythos Preview, and it represents Anthropic’s most capable system to date. During internal testing, Mythos demonstrated an extraordinary ability to identify security flaws — uncovering thousands of vulnerabilities spanning every major OS and browser ecosystem currently in use.

Rather than treating this as a product launch opportunity, Anthropic established an initiative dubbed Project Glasswing. The program channels Mythos’s capabilities directly to the companies and foundations that build and maintain critical software. Launch partners read like a who’s who of global technology: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks.

Beyond that initial cohort, access has been extended to more than 40 additional organisations that develop or maintain essential software infrastructure. To back the effort financially, Anthropic is committing up to $100 million in usage credits so these partners can run Mythos against their own codebases.


Why This Matters: A New Paradigm for Responsible AI Deployment

This decision carries enormous significance for several reasons. First, it represents one of the clearest examples yet of an AI lab deliberately limiting the commercial potential of a breakthrough model because of security implications. In an industry often criticised for racing to ship products, Anthropic chose restraint — and that’s noteworthy.

Second, the sheer scale of the discoveries is staggering. Finding thousands of vulnerabilities across multiple platforms suggests that Mythos operates at a level of code analysis that far exceeds what human security researchers or existing automated tools can accomplish in comparable timeframes. For context, the CVE database — the global standard for tracking publicly disclosed cybersecurity vulnerabilities — typically catalogues around 25,000 to 30,000 new entries per year. A single AI model contributing thousands of findings in a concentrated period would represent a meaningful fraction of annual global disclosure.
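To make that fraction concrete, here is a back-of-the-envelope calculation. Anthropic has not published an exact count, so the 5,000-finding figure below is a placeholder assumption, set against the midpoint of the ~25,000–30,000 annual CVE range cited above:

```python
# Illustrative arithmetic only: what share of a typical year's CVE
# disclosures would "thousands" of AI-found vulnerabilities represent?
ai_findings = 5_000                  # hypothetical count; not confirmed
annual_cves = (25_000 + 30_000) / 2  # midpoint of the ~25k-30k range

share = ai_findings / annual_cves
print(f"{share:.0%} of a typical year's CVE disclosures")  # roughly 18%
```

Even at the conservative end, a single model's output would rival the combined yearly throughput of a large slice of the global security-research community.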

Third, this approach could redefine the economics of vulnerability research. If you’ve read our piece Microsoft Open-Source Toolkit Secures AI Agents at Runtime, you’ll know that bug bounty programs and penetration testing remain expensive and labour-intensive. An AI model capable of doing this work at scale could save organisations billions in breach-related costs.


Background: Anthropic’s Safety-First Reputation

Anthropic was founded in 2021 by former OpenAI researchers, including siblings Dario and Daniela Amodei. The company has consistently positioned itself as the safety-conscious alternative in the frontier AI race, publishing detailed responsible scaling policies and investing heavily in alignment research. Its Claude family of models competes directly with OpenAI’s GPT series and Google’s Gemini.

The decision to withhold Mythos from public access is consistent with that ethos. Releasing a model that can identify zero-day vulnerabilities at scale would be a double-edged sword — invaluable for defenders, but potentially catastrophic if accessed by malicious actors who could exploit those same flaws before patches were deployed.

This tension sits at the heart of what security researchers call the “offensive-defensive asymmetry” problem. Attackers need to find just one exploitable weakness. Defenders need to find and patch all of them. An AI that tips the balance toward defenders — but only if its capabilities are carefully controlled — is exactly the kind of tool that demands a restricted distribution model.
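The asymmetry described above can be illustrated with a toy probability model (a sketch for intuition, not anything Anthropic has published): if defenders independently find and patch each of n flaws with probability p before attackers do, the attacker needs only one flaw to survive.

```python
# Toy model of the offensive-defensive asymmetry: the attacker wins
# if at least one of n flaws remains unpatched, assuming each flaw
# is independently patched in time with probability p.
def attacker_success(n: int, p: float) -> float:
    """Probability that at least one of n flaws survives patching."""
    return 1 - p ** n

# Even a 99% per-flaw patch rate collapses as flaw counts grow;
# the flaw counts here are illustrative, not Anthropic's figures.
for n in (10, 100, 1000):
    print(f"n={n}: attacker succeeds with probability "
          f"{attacker_success(n, 0.99):.3f}")
```

With 1,000 latent flaws, a 99% per-flaw defence still leaves the attacker a near-certain win, which is why a tool that finds flaws wholesale shifts the balance so sharply toward defenders.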


The Strategic Implications for Anthropic and the Industry

From a business perspective, Project Glasswing is a masterstroke even if it sacrifices short-term revenue. Consider the strategic benefits:

  • Deep integration with critical infrastructure: By embedding Mythos into the workflows of AWS, Microsoft, Google, and others, Anthropic creates dependency and trust that no marketing campaign could buy.
  • Goodwill with regulators: As governments worldwide draft AI legislation, demonstrating responsible restraint earns credibility that will matter when policy decisions are made.
  • Data feedback loops: Every vulnerability that partner organisations confirm and patch generates valuable training signal that will make future Anthropic models even more capable.
  • Competitive differentiation: Neither OpenAI nor Google DeepMind has announced anything comparable in scope, giving Anthropic a unique position in the enterprise security market.

The inclusion of JPMorganChase is particularly telling. Financial institutions face relentless cyberattacks and operate under some of the strictest regulatory frameworks on Earth. Their participation signals that Mythos has already demonstrated value beyond the tech sector.


What Comes Next

Several questions remain unanswered. Will Anthropic eventually release Mythos publicly, perhaps after the most critical vulnerabilities have been patched? How will the company verify that partner organisations are actually acting on the findings in a timely manner? And what happens when adversaries inevitably develop their own vulnerability-hunting AI models without similar ethical guardrails?

The broader industry will also be watching to see whether this collaborative, quasi-philanthropic approach becomes a template. If a $100 million commitment to defensive cybersecurity yields measurable results — fewer breaches, faster patches, a more secure internet — other AI labs will face pressure to follow suit.

There’s also the question of whether governments will seek to mandate this kind of responsible disclosure framework. The Cybersecurity and Infrastructure Security Agency (CISA) in the United States has already signalled interest in AI-assisted vulnerability detection as part of its broader mission.


The Bottom Line

Anthropic’s decision to keep Claude Mythos Preview out of the public eye is one of the most consequential moves in the AI industry this year. By choosing security over spectacle, the company is making a calculated bet that the most powerful AI models shouldn’t always be products — sometimes they should be tools wielded quietly by the people best positioned to protect everyone else. Whether that bet pays off will depend on execution, transparency, and whether the rest of the industry is willing to follow a path that prioritises collective defence over individual profit.
