Frontier AI has crossed a major capability threshold in security research, identifying vulnerabilities in production software at a speed, scale, and cost no human team can match. Anthropic’s Claude Mythos and OpenAI’s GPT-5.5-Cyber are gated distributions because public release was judged too risky. This month, Google’s Threat Intelligence Group confirmed the first case of a threat actor weaponizing an AI-developed zero-day.
This article summarizes what frontier AI has demonstrated for software security, considers whether hardware will follow, and what might be done about it. In this context, the term “hardware” refers specifically to semiconductor chips, the parts where security exposure is most relevant.
Frontier AI capabilities and their impact on cybersecurity
Mythos achieves 93.9% on SWE-bench, the standard benchmark of real-world software-engineering tasks drawn from open-source GitHub issues. It achieves 73% on the “expert-level cybersecurity tasks” on which every prior LLM (large language model) scored zero. It is the first model to complete the UK AI Security Institute’s 32-step “The Last Ones” network-takeover range end-to-end.
Mythos’s security findings span several categories of software. Among others, it found bugs in OpenBSD and the Linux kernel. It found 271 zero-day bugs in the popular Firefox web browser. In cryptographic libraries, it found bugs in TLS, SSH, and AES-GCM. And in firmware, it found chains that allowed root access on smartphones.