Google’s threat intelligence group published an advisory this month attributing a previously unknown 2FA bypass to an LLM-assisted workflow: the first confirmed case of AI being used in the wild to discover and weaponise a zero-day, not just to write phishing copy. The vulnerability sits in an unnamed web-based system administration tool, requires valid user credentials, and stems from what Google’s analysts described as a “hard-coded trust assumption in semantic logic”. Google worked with the affected vendor on coordinated disclosure and remediation before publishing.
How Google knew an LLM wrote the exploit
The attribution is unusual because Google didn’t catch a model in the act; it caught one in the artefact. The Python exploit script came with educational docstrings, structured textbook-style formatting, and (the giveaway) hallucinated CVSS scores referencing CVE IDs that don’t exist. Per the advisory, those are signatures of training-corpus-style output, not human exploit-dev tradecraft. Google explicitly stated there is “no evidence to suggest that Google’s Gemini AI tool was used”. The advisory names no other model, which by elimination leaves Claude, GPT-class models, and open-weight forks as the most likely candidates.
“AI is already accelerating vulnerability discovery, reducing the effort needed to identify, validate, and weaponize flaws.” — Ryan Dewhurst, Head of Threat Intelligence, watchTowr
That quote matches what Schneier and others have been writing since January, when published research first showed agentic AI systems chaining open-source tooling to compromise multi-host networks without human intervention.
Why this one matters more than the noise
We have spent eighteen months reading “AI lowers the bar for attackers” think-pieces. This is different in two specific ways. First, the bug class — a semantic-logic trust assumption — is exactly the kind of issue static analysers and pattern-matching scanners miss but an LLM trained on auth-flow source code can reason about. Second, the artefact survived contact with reality: it actually exploited the target in production, in volume, against multiple victims, before remediation. The novelty isn’t that an LLM helped someone write a phish; it’s that an LLM helped someone find a real, exploitable trust boundary, and the resulting code worked at scale.
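To make the bug class concrete, here is a purely hypothetical sketch of what a hard-coded trust assumption in an auth flow can look like. The advisory does not disclose the actual flaw, so everything in it (the `Session` type, the `X-Internal-Client` header, both function names) is invented for illustration and should not be read as a description of the affected tool.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical session state; not taken from any real product."""
    user: str
    password_ok: bool = False
    otp_ok: bool = False


def is_authenticated_flawed(session: Session, headers: dict) -> bool:
    # Assumed flaw: the code trusts that only internal tooling ever sends
    # this header, so it skips the second factor when the header is present.
    # Nothing here is a dangerous function call, so a pattern-matching
    # scanner has little to flag; the bug is in what the branch means.
    if headers.get("X-Internal-Client") == "1":
        return session.password_ok
    return session.password_ok and session.otp_ok


def is_authenticated_fixed(session: Session, headers: dict) -> bool:
    # The second factor is required regardless of client-supplied input.
    return session.password_ok and session.otp_ok
```

In this sketch an attacker who already holds valid credentials only needs to add one header to skip the OTP step, which mirrors the "requires valid user credentials" condition in the advisory without claiming to reproduce it.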
What this means for practitioners
Two operational shifts worth making now:
- Treat second-factor mechanisms as discoverable targets. If your 2FA implementation has been stable for five years without review, the assumption that “no one will look at this” no longer holds. Schedule an auth-flow review with a security firm that has seen AI-generated exploit patterns.
- Watch for the artefact signatures in your own logs. Hallucinated CVE references, oddly clean Python in attacker payloads, and docstring-heavy malware are all weak signals worth feeding to your detection engineering team. None is a high-confidence indicator on its own, but in aggregate they shift the prior.
For the broader research context, the Schneier piece mentioned above traces the same trajectory back to its January 2026 starting point.