UK gov’s Mythos AI tests help separate cybersecurity threat from hype

Last week, Anthropic made headlines by announcing a restricted launch of its Mythos Preview model, designed specifically for cybersecurity tasks. The limited release goes to a select group of critical industry partners, giving them time to adapt to a model described as exceptionally proficient at computer security challenges. The UK government’s AI Security Institute (AISI) has now released an initial assessment of Mythos’s capabilities in cyber-attack scenarios, providing an independent check on Anthropic’s claims.

The AISI evaluation indicates that Mythos performs comparably to other advanced AI models on individual cybersecurity tasks. What sets Mythos apart is its ability to chain those tasks together, which lets it execute the complex, multi-step attacks required to breach systems.

Since early 2023, AISI has been testing AI models with tailored Capture the Flag (CTF) challenges designed to measure performance on cybersecurity tasks. Earlier models such as GPT-3.5 Turbo struggled with even basic tasks, but performance has improved significantly over time. Mythos Preview now completes over 85% of the Apprentice-level CTF tasks, a new high in AISI’s testing history. Even so, competing models such as GPT-5.4, Codex 5.3, and Anthropic’s own Opus 4.6 have shown similar accuracy, falling within 5 to 10 percent of Mythos across the CTF difficulty tiers. That raises questions about whether the protective measures surrounding the Mythos Preview launch are necessary.

Where Mythos stood out in AISI’s evaluation was a specific test known as “The Last Ones” (TLO). This exercise simulates a 32-step data extraction attack on a corporate network, requiring the AI to chain multiple actions across various hosts and network segments. AISI estimates that such an operation would typically take a trained human around 20 hours to accomplish, which gives a sense of how much work the model is being asked to automate.

Source: Ars Technica

Published on: Apr 14, 2026, 19:15

