Interestingly, it sounds like OpenBSD held up very well:
> This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview. Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings.
The vulnerability in question is a denial-of-service (DoS) bug in the TCP implementation, which is nasty, but it's far from the multiple local privilege escalations found in the Linux kernel.
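For a sense of why a remote DoS ranks below a privilege escalation, here's what the bug class often looks like. Everything below is a hypothetical sketch (the function, the option handling, and the bug itself are illustrative inventions, not the actual Mythos finding, whose details aren't public): a TCP option parser that a single crafted packet can wedge into an infinite loop.

    #include <stdint.h>
    #include <stddef.h>

    #define TCPOPT_EOL 0
    #define TCPOPT_NOP 1

    /* Hypothetical, simplified TCP option parser -- not OpenBSD's code. */
    static void parse_tcp_options(const uint8_t *opts, size_t len)
    {
        size_t off = 0;

        while (off < len) {
            uint8_t kind = opts[off];

            if (kind == TCPOPT_EOL)       /* end of option list */
                break;
            if (kind == TCPOPT_NOP) {     /* one-byte padding */
                off++;
                continue;
            }
            if (off + 1 >= len)           /* truncated option */
                break;

            uint8_t optlen = opts[off + 1];

            /* BUG: an option with optlen == 0 never advances `off`, so one
             * crafted packet pins the CPU in this loop forever. That's a
             * denial of service -- no memory corruption, no code execution.
             * Fix: reject optlen < 2 or off + optlen > len. */
            off += optlen;
        }
    }

    int main(void)
    {
        /* Benign list: MSS option (kind 2, length 4, value 1460), then EOL. */
        const uint8_t ok[] = { 2, 4, 0x05, 0xb4, TCPOPT_EOL };
        parse_tcp_options(ok, sizeof(ok));

        /* A malicious list such as { 254, 0 } would spin forever above. */
        return 0;
    }

The damage is bounded: a hung machine rather than attacker-controlled code, which is why a DoS like this is nasty but a tier below local privilege escalation.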
The absolute gall of this guy to laugh off a question about x-risks. Meanwhile, also Sam Altman, in 2015: "Development of superhuman machine intelligence is probably the greatest threat to the continued existence of humanity. There are other threats that I think are more certain to happen (for example, an engineered virus with a long incubation period and a high mortality rate) but are unlikely to destroy every human in the universe in the way that SMI could. Also, most of these other big threats are already widely feared." [1]
How about "bad agents acquiring dozens of new zero-days and using them to compromise any company or nation they want"? It's not exactly hard to see why you wouldn't want public access to a model significantly better than Opus at cybersecurity.
Chinese companies have consistently been many months behind. I don't think they are hiding anything; they just don't have the compute capacity to match Anthropic's training runs. As for OpenAI, they are known to have nonpublic models; I agree that it's possible they are preparing for a major release too. (It's also possible that they aren't, in which case it's quite a fumble for them.)
It makes quite a lot of sense to focus on reducing the risks of every human everywhere dying, rather than the risks of already existing oppression getting worse.
In AI 2027, May 2026 is when the first model with professional-human-level hacking abilities is developed. It's currently April 2026, and Mythos just got previewed.
I don't see why you think this evidence makes the release less likely to be real rather than more likely. It's a pretty straightforward scenario: Opus is already good at finding vulns, they scaled it up by another order of magnitude, and they got something that's good enough at finding vulns to be a major threat.
I think you misunderstood: I do think it's real. I just think they're being disingenuous in framing this as a new threat.
This is the same company that reported that their models were being used by a state actor to perform exploits in real time:
https://www.anthropic.com/news/disrupting-AI-espionage
It is functional. You can try it yourself, or even find third-party tests of it. Why do you think it's a "cheap marketing trick" to test it on the GCC test suites?
[1] "Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings.", https://red.anthropic.com/2026/mythos-preview/