Interestingly, it sounds like OpenBSD held up very well:
> This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview. Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings.
The vulnerability in question is a denial-of-service (DoS) bug in the TCP implementation, which is nasty, but it's far from the multiple local privilege escalations found in the Linux kernel.
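For a sense of why a remote DoS ranks below a privilege escalation, here's what the bug class often looks like. Everything below is a hypothetical sketch (the function, the option handling, and the bug itself are illustrative inventions, not the actual Mythos finding, whose details aren't public): a TCP option parser that a single crafted packet can wedge into an infinite loop.

    #include <stdint.h>
    #include <stddef.h>

    #define TCPOPT_EOL 0
    #define TCPOPT_NOP 1

    /* Hypothetical, simplified TCP option parser -- not OpenBSD's code. */
    static void parse_tcp_options(const uint8_t *opts, size_t len)
    {
        size_t off = 0;

        while (off < len) {
            uint8_t kind = opts[off];

            if (kind == TCPOPT_EOL)       /* end of option list */
                break;
            if (kind == TCPOPT_NOP) {     /* one-byte padding */
                off++;
                continue;
            }
            if (off + 1 >= len)           /* truncated option */
                break;

            uint8_t optlen = opts[off + 1];

            /* BUG: an option with optlen == 0 never advances `off`, so one
             * crafted packet pins the CPU in this loop forever. That's a
             * denial of service -- no memory corruption, no code execution.
             * Fix: reject optlen < 2 or off + optlen > len. */
            off += optlen;
        }
    }

    int main(void)
    {
        /* Benign list: MSS option (kind 2, length 4, value 1460), then EOL. */
        const uint8_t ok[] = { 2, 4, 0x05, 0xb4, TCPOPT_EOL };
        parse_tcp_options(ok, sizeof(ok));

        /* A malicious list such as { 254, 0 } would spin forever above. */
        return 0;
    }

The damage is bounded: a hung machine rather than attacker-controlled code, which is why a DoS like this is nasty but a tier below local privilege escalation.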
The absolute gall of this guy to laugh off a question about x-risks. Meanwhile, also Sam Altman, in 2015: "Development of superhuman machine intelligence is probably the greatest threat to the continued existence of humanity. There are other threats that I think are more certain to happen (for example, an engineered virus with a long incubation period and a high mortality rate) but are unlikely to destroy every human in the universe in the way that SMI could. Also, most of these other big threats are already widely feared." [1]
How about "bad agents acquiring dozens of new zero-days and using them to compromise any company or nation they want"? It's not exactly hard to see why you wouldn't want public access to a model significantly better than Opus at cybersecurity.
Chinese companies have consistently been many months behind. I don't think they are hiding anything; they just don't have the compute capacity to match Anthropic's training runs. As for OpenAI, they are known to have nonpublic models; I agree that it's possible they are preparing for a major release too. (It's also possible that they aren't, in which case it's quite a fumble for them.)
It makes quite a lot of sense to focus on reducing the risks of every human everywhere dying, rather than the risks of already existing oppression getting worse.
In AI 2027, May 2026 is when the first model with professional-human-level hacking abilities is developed. It's currently April 2026, and Mythos just got previewed.
I don't see why you think this evidence makes the release less likely to be real rather than more likely. It's a pretty straightforward scenario: Opus is already good at finding vulns, they scaled it up by another order of magnitude, and they got something that's good enough at finding vulns to be a major threat.
I think you misunderstood: I do think it's real. I just think they're being disingenuous in framing this as a new threat.
This is the same company that reported that their models were being used by a state actor to perform exploits in real time:
https://www.anthropic.com/news/disrupting-AI-espionage
It is functional. You can try it yourself, or even find third-party tests of it. Why do you think it's a "cheap marketing trick" to test it on the GCC test suites?
[1] "Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings.", https://red.anthropic.com/2026/mythos-preview/