UK gov’s Mythos AI tests help separate cybersecurity threat from hype

Last week, Anthropic made headlines by announcing a restricted launch of its Mythos Preview model, designed specifically for cybersecurity tasks. The limited release goes to a select group of critical industry partners, giving them time to adapt to a model described as exceptionally proficient at computer security challenges. Now the UK government’s AI Security Institute (AISI) has released an initial assessment of Mythos's capabilities in cyber-attack scenarios, providing an independent check on Anthropic’s claims.

The AISI evaluation indicates that Mythos performs comparably to other advanced AI models on individual cybersecurity tasks. What sets Mythos apart is its ability to chain those tasks together into the complex, multi-step attacks required to breach real systems.

Since early 2023, AISI has been testing AI models on tailored Capture the Flag (CTF) challenges designed to measure performance on cybersecurity tasks. Earlier models like GPT-3.5 Turbo struggled with even basic tasks, but performance has improved significantly over time. Mythos Preview now completes over 85% of the Apprentice-level CTF tasks, a new high in AISI’s testing history.

Despite these results, competing models such as OpenAI’s GPT-5.4 and Codex 5.3 and Anthropic’s own Opus 4.6 have shown similar accuracy, falling within 5 to 10 percent of Mythos across the CTF difficulty tiers. This raises questions about the necessity of the protective measures surrounding the Mythos Preview launch.

Where Mythos stood out in AISI’s evaluation was a specific test known as “The Last Ones” (TLO). This exercise simulates a 32-step data extraction attack on a corporate network, requiring the AI to chain multiple actions across various hosts and network segments. AISI estimates that such an operation would typically take a trained human around 20 hours, which points to the potential efficiency of Mythos in real-world scenarios.

Source: Ars Technica

Published on: Apr 14, 2026, 19:15
