Research

Vidoc reproduces Anthropic Mythos vulnerability-finding with GPT-5.4 + Opus 4.6 — 3/3 on FreeBSD and Botan

Six researchers at Vidoc Security Lab published a reproduction study on April 14 showing that Anthropic's Mythos findings, positioned as a gated, security-critical capability, can be approximated with public frontier models using open-source tooling. Using GPT-5.4 and Claude Opus 4.6 driven by the opencode agent, they attempted reproductions across five codebases. Both models scored 3/3 on FreeBSD and Botan. On OpenBSD, only Claude Opus 4.6 succeeded (3/3); GPT-5.4 failed entirely. On FFmpeg and wolfSSL, both models produced partial results, identifying vulnerable code regions without cleanly reproducing the specific CVEs. The authors conclude that the moat has already moved 'up the stack, from model access to validation, prioritization, and remediation.'

vidoc · anthropic · mythos · vulnerability · open-source

Why it matters

If public frontier models can already do 60-100% of what Mythos claims, the White House's gated-distribution strategy loses most of its security rationale: defenders should assume attackers already have equivalent offensive reach. The paper also reframes the AI-security debate: the bottleneck is not frontier-access control but validation and operationalization. Expect CISO procurement cycles in Q2 to shift budget from 'wait for gated tools' to 'buy the validation stack now,' and expect a follow-on Anthropic publication that tries to widen the capability gap with non-public benchmarks.

Impact scorecard

8/10
Stakes: 9.0
Novelty: 8.5
Authority: 8.0
Coverage: 6.5
Concreteness: 9.0
Social: 8.0
FUD risk: 3.0
Coverage: 14 outlets · 2 tier-1
Vidoc Security, Hacker News, The Register, CSO Online, Risky Business
X / Twitter: 6,400 mentions
@vidocsecurity · 2,800 likes
Reddit: 1,600 upvotes
r/netsec
r/netsec, r/MachineLearning, r/singularity

Trust check

medium

Independent security lab, reproducible methodology, specific result counts per target. Published on the Vidoc Security blog and discussed on the HN front page. Not yet corroborated by a second replication team; treat magnitudes as directional until a third party confirms.
