The controversy began last week with a widely read column by Thomas Friedman in The New York Times, which diverged from his usual geopolitical analysis to spotlight what he termed a “stunning advance in artificial intelligence” with equally profound implications. Friedman, expressing considerable alarm, characterized Anthropic’s decision to restrict public access to Mythos as a “terrifying warning sign.” He feared that if such a tool became widely available, the sophisticated ability to hack critical infrastructure systems—once the exclusive domain of highly skilled private-sector experts and intelligence organizations—could fall into the hands of every criminal actor, terrorist organization, and nation-state, regardless of size or resources. This sentiment resonated across numerous major news outlets, with one particularly anxiety-provoking headline asking whether Mythos was an “AI nightmare waiting to happen.” However, a closer examination reveals a more nuanced reality, prompting a re-evaluation of how AI advancements, particularly those announced by the developers themselves, are consumed and interpreted by the public and media.
The Genesis of AI in Cybersecurity: A Chronology of Concern
Concerns about the dual-use nature of artificial intelligence, particularly large language models (LLMs), in the realm of cybersecurity are not new; they have been a persistent theme among security researchers since the nascent stages of consumer-grade LLMs. The potential for AI to both defend and attack digital systems has been a topic of extensive discussion, research, and, at times, apprehension within the cybersecurity community.
Early Explorations (Pre-2024): Even before the mainstream adoption of powerful LLMs, academic research explored the theoretical applications of AI in vulnerability detection and exploitation. Early studies focused on using machine learning for anomaly detection, malware analysis, and automated penetration testing. However, these systems often required extensive human oversight and were far from autonomously discovering complex vulnerabilities.
The GPT-4 Benchmark (2024): A significant milestone that brought these concerns to the forefront was a splashy study published by IBM researchers in 2024. Their research focused on the capabilities of OpenAI’s GPT-4 in attacking security vulnerabilities. The findings were stark: GPT-4 successfully exploited a remarkable 87% of the vulnerabilities it was presented with, a dramatic leap compared to the near 0% success rate of its predecessor, GPT-3.5. This study served as an early and potent indicator of LLMs’ burgeoning capabilities in offensive security, leading the IBM researchers to conclude, “Our findings raise questions around the widespread deployment of highly capable LLM agents.” While this research primarily assessed an LLM’s ability to write code to exploit known vulnerabilities, it undeniably established a precedent for the efficacy of advanced AI in malicious cybersecurity tasks.
Anthropic’s Opus 4.6 (Pre-Mythos): Anthropic, a prominent AI research company, had already demonstrated similar capabilities with its earlier models. In the release notes for their Opus 4.6 LLM, Anthropic’s security team highlighted the model’s success in finding “over 500 exploitable 0-day [vulnerabilities], some of which are decades old.” A “0-day” vulnerability refers to a flaw unknown to the vendor, making its discovery particularly significant. This announcement, made well before Mythos, showcased Anthropic’s models’ ability to discover vulnerabilities from scratch, not just exploit them. The language used then strikingly prefigured the recent Mythos announcement, differing primarily in the scale of vulnerabilities found—“over 500” for Opus 4.6 compared to “thousands” for Mythos. This historical context underscores that the core capability attributed to Mythos is not a sudden emergence but rather an evolution of existing AI functionalities.
The Claude Mythos Unveiling: Claims and Public Reception
Anthropic’s official press release regarding Claude Mythos presented a picture of unprecedented AI prowess. The company announced that Mythos would not be made available to the general public, citing profound concerns about its potential misuse. Instead, access would be limited to a carefully selected consortium of business partners. The justification for this restricted release was unequivocal: “AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities.” The press release further elaborated on Mythos’s alarming effectiveness, stating it “has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser.”
This pronouncement, amplified by Friedman’s influential column, triggered an immediate wave of public and media alarm. The specter of superintelligent AI rapidly outstripping human capabilities, especially in a domain as critical as cybersecurity, proved highly unsettling. Mainstream media, often eager to report on groundbreaking (or seemingly groundbreaking) technological shifts, largely echoed the narrative presented by Anthropic. Headlines ranged from cautious concern to outright apprehension, reflecting a widespread acceptance of the company’s claims at face value. The collective anxiety was palpable, painting a picture of an imminent paradigm shift in global cybersecurity, one where the advantage might irrevocably shift from defenders to attackers.
However, amidst this wave of concern, a counter-narrative began to emerge from the cybersecurity and AI research communities, urging a more critical and evidence-based assessment of Anthropic’s claims.
Deconstructing the “Stunning Advance”: A Critical Examination
The critical analysis of Claude Mythos hinges on a few key questions: Is this truly a new capability, or an incremental improvement? How robust are Anthropic’s claims, and what do independent assessments reveal?
Novelty vs. Iteration: As established by the historical context, the ability of LLMs to find and exploit vulnerabilities is not a novel phenomenon. Researchers have been actively exploring and demonstrating this capability for several years. Mythos, therefore, represents an iteration rather than a revolutionary leap. The primary difference highlighted by Anthropic is the sheer scale—“thousands” of vulnerabilities found, an increase from Opus 4.6’s “over 500.” While a quantitative increase is notable, it doesn’t necessarily signify a qualitative shift in the underlying AI capability or a sudden, unexpected emergence of a new skill.
Benchmark Performance: Anthropic provided quantitative data to support Mythos’s capabilities, stating it scored 83.1% on a “well-known cybersecurity benchmark.” For comparison, Opus 4.6 scored 66.6% on the same test. A 16.5-percentage-point increase is a substantial gain and indicates solid progress. However, as many AI researchers and practitioners caution, benchmark results must be interpreted with a critical eye. Benchmarks are often specific, sometimes narrow, tests that models can be “tuned” to pass. High scores on a benchmark do not always translate directly to real-world efficacy, especially in complex and dynamic fields like cybersecurity. The types of vulnerabilities tested, the environment, and the scope of the benchmark are crucial details often omitted or downplayed in public announcements. It is plausible that Mythos represents an optimized version of its predecessor, refined to excel on specific metrics, rather than a fundamentally different class of AI.
Independent Expert Scrutiny: The most compelling counter-arguments to Anthropic’s narrative have come from independent security researchers and AI commentators. Gary Marcus, a prominent AI critic, highlighted the skepticism within the cybersecurity community in a widely circulated Substack post. Marcus compiled responses from security researchers who took a closer look at the types of exploits Anthropic reported Mythos had discovered. The consensus among these experts was largely unimpressed.
Further findings from other independent assessments echoed this sentiment:
- Quality vs. Quantity: Many of the “thousands” of vulnerabilities reportedly found by Mythos were described as low-severity, theoretical, or already known issues, rather than critical, practical 0-days. The distinction between finding a “vulnerability” and finding an exploitable, high-impact, novel vulnerability is significant.
- False Positives: Like any automated tool, LLMs can generate a high number of false positives—identifying potential flaws that are not actually exploitable or are trivial in nature. The sheer volume of findings might be less impressive if a significant portion requires human sifting and validation, diminishing the “superhuman” claim.
- Real-World Applicability: Critics questioned whether the discovered vulnerabilities translated into practical, real-world exploits that could bypass existing security measures. Often, theoretical vulnerabilities require specific conditions or further human ingenuity to weaponize effectively.
- Lack of Transparency: A recurring concern was the proprietary nature of Mythos and the limited access, which prevented comprehensive independent auditing and verification of its claims. Without open access to the model and its findings, it is difficult for the broader security community to validate the scope and impact of its capabilities.
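The triage problem the critics describe can be made concrete with a small sketch. The records and thresholds below are hypothetical illustrations, not Anthropic’s actual findings; the point is simply how quickly a headline count shrinks once low-severity, already-known, and non-exploitable reports are filtered out.

```python
# Hypothetical illustration: triaging a raw list of AI-reported "vulnerabilities".
# None of these records are real findings; they only show why a raw count
# can overstate practical impact.

raw_findings = [
    {"id": 1, "severity": "low",      "known_cve": None,            "exploitable": False},
    {"id": 2, "severity": "high",     "known_cve": "CVE-2014-0160", "exploitable": True},   # already known
    {"id": 3, "severity": "medium",   "known_cve": None,            "exploitable": False},  # theoretical only
    {"id": 4, "severity": "critical", "known_cve": None,            "exploitable": True},   # the interesting case
    {"id": 5, "severity": "low",      "known_cve": None,            "exploitable": False},
]

def triage(findings):
    """Keep only novel (no existing CVE), practically exploitable, high-impact reports."""
    return [
        f for f in findings
        if f["severity"] in ("high", "critical")
        and f["known_cve"] is None
        and f["exploitable"]
    ]

novel_high_impact = triage(raw_findings)
print(len(raw_findings), "raw findings ->", len(novel_high_impact), "novel high-impact")
# 5 raw findings -> 1 novel high-impact
```

Even in this toy example, four of the five reported findings fall away under basic triage, which is the crux of the quality-versus-quantity objection.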
The Anthropic Irony: Perhaps the most potent piece of evidence undermining the narrative of Mythos’s infallible vulnerability detection came just days before its announcement. Anthropic accidentally leaked the source code for Claude Code, another of its products. In a stark demonstration of real-world security challenges, independent security researchers swiftly identified “serious vulnerabilities” within the leaked code. This incident raised critical questions: If Mythos is so adept at finding and exploiting vulnerabilities, why was it seemingly not employed to scrutinize Anthropic’s own software before its public exposure? This oversight provided a tangible example where Anthropic’s internal security practices, or perhaps the application of their own advanced AI tools, appeared to fall short of the extraordinary capabilities attributed to Mythos.
Broader Implications and the “Hype Cycle” of AI
The saga of Claude Mythos serves as a compelling case study in the broader dynamics of AI development, corporate messaging, media consumption, and public perception.
AI Company Messaging and the “Existential Dread” Product: The pattern observed with Mythos is not isolated. Many AI companies, particularly those operating in the highly competitive and capital-intensive frontier of large language models, have developed a tendency to frame incremental progress as revolutionary or even existentially significant. As AI commentator Mo Bitar aptly observed in a recent video, Anthropic’s model rollouts sometimes resemble Apple iPhone launches—annual iterations with minor improvements. “Except here,” Bitar added, “the product is existential dread.” This strategic messaging can serve multiple purposes: attracting investment, influencing regulatory bodies (by emphasizing safety concerns that might necessitate specific policies), and solidifying market leadership through perceived innovation. The narrative of “superintelligence arriving faster than expected” creates urgency and reinforces the idea that these companies are at the vanguard of a profound technological shift.
Media Responsibility and Scrutiny: The initial media reaction to Mythos highlighted a critical challenge in contemporary journalism: the need for robust skepticism and independent verification when reporting on claims made by powerful technology companies. The immediate adoption of Anthropic’s narrative by many outlets, without sufficient critical inquiry or consultation with independent experts, contributed to the widespread public alarm. This episode underscores the importance of a journalistic approach that prioritizes factual scrutiny, context, and diverse expert opinions over sensationalism or uncritical dissemination of corporate press releases. In a rapidly evolving field like AI, the media plays a crucial role in calibrating public understanding and preventing unnecessary fear or unfounded hype.
The Legitimate Dual-Use Dilemma: While the claims surrounding Mythos may have been overstated, the underlying concern about the dual-use nature of advanced AI in cybersecurity remains profoundly legitimate. Powerful LLMs do possess the capability to assist in both offensive and defensive cybersecurity operations. They can analyze vast amounts of code, identify patterns indicative of vulnerabilities, and even generate malicious code. This inherent capability necessitates ongoing research into AI safety, ethical AI development, and robust regulatory frameworks to mitigate potential harms. The challenge lies in distinguishing between genuine, verified advancements that necessitate urgent attention and exaggerated claims that contribute to a cycle of hype and fear.
The Call for Independent Verification and Transparency: The Mythos incident reinforces the imperative for independent verification of AI capabilities. Given the profound societal implications of artificial intelligence, particularly in sensitive domains like cybersecurity, reliance solely on claims from the developing companies is insufficient. There is a growing consensus within the AI and cybersecurity communities that robust, transparent, and peer-reviewed evaluations of AI models are essential. This would involve allowing trusted third-party researchers to access and rigorously test models, validate performance metrics, and assess real-world impacts. Such transparency would not only foster greater public trust but also provide a more accurate basis for policy discussions, risk assessments, and responsible AI development.
In conclusion, while large language models undoubtedly pose significant new challenges and opportunities for cybersecurity, the narrative surrounding Anthropic’s Claude Mythos appears to be more a tale of incremental progress magnified by strategic corporate messaging and amplified by an eager media. It is a potent reminder that in the fast-paced world of artificial intelligence, separating genuine breakthroughs from mere hype demands a critical, data-driven approach, a healthy dose of skepticism, and independent verification.




