More

himata4113 · 2026-04-10T13:41:31 1775828491

I have made both GPT 5.4 and Opus 4.6 produce me content on creating neurotoxic agents from items you can get at most everyday stores. It struggled to suggest how to source phosphorus, but eventually lead me to some ebay listings that sell phosphorus elemental 'decorations' and also lead me towards real!! blackmarket codewords for sourcing such materials.

It coached me how to: stay safe, what materials I need, how to stay under the radar and the entire chemical process backed by academic google searches.

Of course this was done with a lengthy context exhausition attack, this is not how the model should behave and it all stemmed from trying to make the model racist for fun.

All these findings were reported to both openai and anthropic and they were not interested in responding. I did try to re-run the tests few days ago and the expected session termination now occurs so it seems that there was some adjustment made, but might have also been just general randomess that occurs with anthropics safety layer.

I am very confident when I say that it keeps every single person that works at anti-terrorism units awake.

WarmWash · 2026-04-10T13:48:52 1775828932

While scary, information like this has been pretty accessible for 20-30 years now.

In the wild west days of the early internet, there were whole forums devoted to "stuff the government doesn't want you to know" (Temple Of The Screaming Electron, anyone?).

I suppose the friction is scariest part, every year the IQ required to end the world drops by a point, but motivated and mildly intelligent people have been able to get this info for a long time now. Execution though has still steadily required experts.

ben_w · 2026-04-10T13:59:16 1775829556

Information and competency are not the same thing: I know how to build a nuke, I can't actually build one.

AI is, and always had been, automation. For narrow AI, automation of narrow tasks. For LLMs, automation of anything that can be done as text.

It has always been difficult to agree on the competence of the automation, given ML is itself fully automated Goodhart's Law exploitation, but ML has always been about automation.

On the plus side, if the METR graphs on LLM competence in computer science are also true of chemical and biological hazards (or indeed nuclear hazards), they're currently (like the earliest 3D-printed firearms) a bigger threat to the user than to the attempted victim.

On the minus side, we're just now reaching the point where LLM-based vulnerability searches are useful rather than nonsense, hence Anthropic's Glasswing, and even a few years back some researches found 40,000 toxic molecules by flipping a min(harm) to a max(harm), so for people who know what they're doing and have a little experience the possibilities for novel harm are rapidly rising: https://pmc.ncbi.nlm.nih.gov/articles/PMC9544280/

himata4113 · 2026-04-10T15:12:11 1775833931

Do you know how to build a nuke? You might know the technicaly details of how a nuke is made, but do you know everything that's required, all the parameters and pressure values that are required? I find that unlikely, but AI seems to be increasingly more capable of providing such instructions from cross referenced data.

ben_w · 2026-04-10T17:25:48 1775841948

Even if sources have been lying to me, which is certainly possible, I believe I understand enough to determine cross sections by experiment and from that to determine critical masses; for isotopic enrichment I know about the calutron, which is meh but works and can be designed from scratch with things I know (though caveat have not memorised, just that I know the keywords "proton mass" and "Lorentz force" and what to use them for); for trigger, I would pick a gun-type design rather than implosion, again this is meh but works and is easy.

A few tens of millions of USD mostly spent on electricity, a surprisingly large quantity of natural uranium (because the interesting isotope is a very small percentage), and a few years, and I expect most people on this forum could make a Little Boy type bomb.

andai · 2026-04-10T14:34:57 1775831697

I Gave My OpenClaw a Robot Body and It Vibe Coded a Nuke

observationist · 2026-04-10T14:43:23 1775832203

"Short stories from before the fall"

himata4113 · 2026-04-10T13:51:45 1775829105

Well the real issue is that it knocks down the knowledge barrier, giving your step by step guides and reinterating what parts will kill you is the important part.

Understanding and staying alive while producing neuro chemicals are the biggest challenges here.

A depressed person with no prior knowledge could possibly figure out a way to make these chemicals without killing themselves and that's the problem.

WarmWash · 2026-04-10T14:08:10 1775830090

A Michelin chef can give you their recipe, and give you their ingredients, but you still will fail miserably trying to match their dish.

It's the same with drugs, whose instructions and ingredient lists have been a google search away for decades now. Yet you still need a master chemist to produce anything. By the time AI can hand hold an idiot through the synthesis of VX agents (which would require an array of sensors beyond a keyboard and camera), we will likely have bigger issues to worry about.

mystraline · 2026-04-10T14:39:22 1775831962

That is completely wrong.

Food preparation, like pharmaceutical drug fabrication, is inherently scientific and methodologically controllable.

Look no further than the Four Thieves Vinegar Collective. Original synthesis line construction is hard. But the exact formula "add this", "turn on stir bar", "do you see particulate? Yes for +10m at stir", etc.

And if their results are replicated, theyre seeing 99.9% yields, compared to commercial practices of 99% (Solvaldi)

estimator7292 · 2026-04-10T21:12:19 1775855539

Spoken like someone who has never had to actually do these things in real life.

Recipes and formulae do not encode all the minutiae and expertise required to reproduce them. You can tell someone to sear a steak at whatever temperature for however long, but you can't encode the skill and experience required to reproduce in arbitrary conditions. One must learn what a correctly seared steak looks, feels, and tastes like and how to achieve that on uncalibrated cooking equipment.

Your assertion only holds true in a vacuum. If 100% of inputs, materials, environmental conditions are completely standardized and under control then sure, you can follow step by step instructions. The real world does not work that way. No stove on the market is calibrated. Reagents come with impurities. Your skillet may not conduct heat as well as expected or your mains electricity might be low causing your mantle to heat slower and your stir rod to stir slower.

These are things that one has to learn and experience in order to compensate for.

WarmWash · 2026-04-10T15:33:38 1775835218

I am completely unsurprised that a person with a PhD in mathmatics and physics who spent 8 years working on clandestine lab medicine was able to produce high quality end products.

I also think it's a wholly dishonest rebuttal of my point.

If you honestly think chemistry (or any of the classic sciences/engineering) is as easy as copy+pasting a recipe and procedure, I suggest putting down the keyboard and trying to build something on mother nature OS. It will be a truly humbling experience.

jsmith99 · 2026-04-10T14:04:59 1775829899

They can do that by jailbreaking models but is that really easier and less work than getting it from Wikipedia?

himata4113 · 2026-04-10T14:22:37 1775830957

We will only really know if (or when) it will happen. We can do a sample group of people attempting to create such chemicals under supervision and comparing how helpful they truly are.

KellyCriterion · 2026-04-10T14:38:54 1775831934

> been pretty accessible for 20-30 years now.

There was this book 20 years ago: "Secret of Methamphetamine Manufacturing" by Uncle Fester

https://www.amazon.de/-/en/Uncle-Fester-ebook/dp/B00305GTWU

(Actually, 8th edition :-D)

foobiekr · 2026-04-10T16:46:31 1775839591

I am convinced the Uncle Fester books are some kind of performance art. "Practical LSD Manufacture" basically starts with "go find some ergot in fields" and step two is "plant and grow a plot of wheat."

alexsmirnov · 2026-04-10T17:31:59 1775842319

Much longer than that, and was available way before an internet. I graduated STEM high school in St. Petersburg in 1981, and I had several classmates who were big funs of chemistry. That they were able to create from textbooks, school lab ingredients, and understanding:

WWI era poison gas, tear gas, potassium cyanide, and bunch of explosives like acetone peroxide.

LLMs have all of that knowledge in training data

aeonik · 2026-04-11T00:01:28 1775865688

My dad was a chemist before he was a doctor, and he always told me the hard part of chemistry was not making a bomb or something poisonous.

kazinator · 2026-04-10T13:55:12 1775829312

Consider two dictionaries, one in which the entries are alphabetized as usual and one in which they're randomized. Both support random access: you can turn to any page, and read any entry. Therefore both are "accessible". Only one actually supports useful, quick word lookup.

foobiekr · 2026-04-10T16:44:13 1775839453

Many of these forums exist now. Let's not enumerate them as they are one of the treasures of the internet.

angled · 2026-04-10T14:00:27 1775829627

https://en.wikibooks.org/wiki/Professionalism/Anarchist_Cook...

We work in the dark

we do what we can

we give what we have.

Our doubt is our passion, and our passion is our task.

The rest is the madness of art.

zoklet-enjoyer · 2026-04-10T14:30:13 1775831413

My username is a reference to the successor to totse. Totse was the first board I spent a lot of time on

WarmWash · 2026-04-10T15:41:57 1775835717

Heh, I'm sure there are a few more wondering around here. Zoklet never clicked for me, but totse was my home for years.

verisimi · 2026-04-10T14:26:45 1775831205

> Execution though has still steadily required experts.

Where experts = the government.

nunez · 2026-04-10T16:28:50 1775838530

Accessible is one thing; _easily_ accessible is another.

ticulatedspline · 2026-04-10T14:51:42 1775832702

I categorize this kind of stuff as "Crisis of accessibility" . AI is not alone in this territory, happens all over the place. Basically it's a problem that's existed for ages but the barrier to entry was high enough we didn't care.

Think 3D printing, it's not all that hard to make a zip gun or similar home-made firearm, but it's still harder than selecting an STL and hitting print.

You could always find info about how to make a bomb or whatnot but you had to like, find and open a book or read a pdf, now an LLM will spoon-feed it to you step by step lowering the barrier.

"Crisis of accessibility" is simultaneously legitimate concern but also in my mind an example of "security by obscurity". that relying on situational friction to protect you from malfeasance is a failure to properly address the core issue.

JumpCrisscross · 2026-04-10T14:55:38 1775832938

> Think 3D printing, it's not all that hard to make a zip gun or similar home-made firearm, but it's still harder than selecting an STL and hitting print

There were hundreds of mass shootings in America in 2025 alone [1]. None of them involved a 3D-printed weapon.

To my knowledge, there has been one confirmed shooting with a 3D-printed gun, and it didn't uniquely enable the crime.

[1] https://en.wikipedia.org/wiki/List_of_mass_shootings_in_the_...

ticulatedspline · 2026-04-10T15:46:23 1775835983

That's mostly because they suck (for now, who knows when we'll get home metal printing), also it's easy to get real guns. also crises of accessibility could be predicate on merely the perception that the barrier is now too low rather than actual harm.

I don't really think photoshop, flat bed scanners and half decent inkjets really facilitated a lot of counterfeit currency but there was the same panic back then and "protections" put in place.

jstummbillig · 2026-04-10T13:49:32 1775828972

> I am very confident when I say that it keeps every single person that works at anti-terrorism units awake.

Wow, that's quite the statement about the excellency of our institutions. Does not seem likely but, what the hell, I'll take my oversized dose of positivity for today!

ben_w · 2026-04-10T14:09:15 1775830155

The USA isn't the only country with anti-terrorism units, so there's plenty of room for systematic-US-incompetence at the same time as everyone else being diligent and working hard on… well, everything.

hoppyhoppy2 · 2026-04-10T14:24:32 1775831072

I concluded the opposite: how can those institutions function effectively when all their employees are getting such poor sleep?

Ritewut · 2026-04-10T13:56:22 1775829382

Not everyone in the current government is incompetent and evil. Most of them but not all.

JumpCrisscross · 2026-04-10T14:43:35 1775832215

Do you have a background in biochemistry? I've mostly worked with ChatGPT and Claude on topics I have expertise in. And I one hundred percent have seen them make stupid shit up that a non-expert would think looks legitimate.

More broadly, has anyone tried following LLM instructions for any non-trivial chemistry?

52-6F-62 · 2026-04-10T14:50:00 1775832600

So what you are saying is we can expect the number of accidental home-made chlorine-gas (and the like) toxic events go up.

JumpCrisscross · 2026-04-10T14:52:34 1775832754

> what you are saying is we can expect the number of accidental home-made chlorine-gas (and the like) toxic events go up

Maybe? One of the quirks of gaining even a surface-level understanding of infrastructure is realising how vulnerable it is to a smart, motivated adversary. The main thing protecting us isn't hard security. It's most Americans having better shit to do than running a truck of fertiliser and oxidiser into a pylon.

Similarly, I'd expect way more people to be trying to make their own designer drug, and hurting themselves that way, than trying to make neurotoxins.

simoncion · 2026-04-10T15:47:13 1775836033

> It's most Americans having better shit to do than running a truck of fertiliser and oxidiser into a pylon.

FWIW, it's most people having better shit to do, regardless of nationality (or lack thereof).

But, yeah, anyone who takes a few weekends to understand how large-scale infrastructure works and consider why it's possible for nearly all of it to remain untargeted by saboteurs inevitably develops a resistance to the "Lots of Bad Guys are trying to kill us all the time, so we must enact $AUTHORITARIAN_POLICIES immediately to prevent them and keep us safe!!!" type of argument.

bluefirebrand · 2026-04-10T15:19:36 1775834376

> It's most Americans having better shit to do than running a truck of fertiliser and oxidiser into a pylon.

Which sort of implies "most Americans have jobs and responsibilities and things to live for"

I guess it's a good thing that AI is hammering away at the "jobs and responsibilities" part of that equation

conqrr · 2026-04-10T13:47:49 1775828869

Wasnt this as accessible pre AI with just Google search too?

hermannj314 · 2026-04-10T13:51:32 1775829092

Craigslist invented prostitution, Facebook invented suicide, and OpenAI invented terrorism.

Ask any trial lawyer in America! The world was perfect in the 1990s without any of these things.

Angostura · 2026-04-10T13:57:28 1775829448

Replace ‘invented’ with ‘facilitated’

paprikanotfound · 2026-04-10T14:03:02 1775829782

Doesn't google facilitate all those things? Doesn't internet itself facilitate?

cowl · 2026-04-10T14:35:57 1775831757

the information is not new. how easy it is to get step by step instructions is new. Try it yourself. Google is good but not instant, step by step good. you need to do your own research that takes time. time that anti-terrorist units use to track you down. now this time factor is very limited you don't need to do research, cross reference materials, sources, etc. LLM does it for you. a research that could take days is done in 1 hour.

cluckindan · 2026-04-10T14:22:16 1775830936

Seems like the general state of the world is the greatest facilitator for all three.

viktor765 · 2026-04-10T14:20:23 1775830823

Facilitation is not an idempotent operation.

paprikanotfound · 2026-04-10T17:50:25 1775843425

What if the LLM made you solve a really complex math equation before it gave you the results? Would that make it ok?

phatfish · 2026-04-10T14:39:16 1775831956

Google and other search engines link (after the AI response and ads) to information hosted somewhere created/published by someone who is usually not Google.

OpenAI et al are creating the information and publishing/delivering it to you. Seems like a more direct facilitation.

Of course, after all knowledge is centralised in an OpenAI deatacenter I'm sure they will be happy to deal fairly with the liabilities /s.

pibaker · 2026-04-10T17:25:11 1775841911

Should Ryder be held responsible for the two very serious terror attacks carried out using Ryder trucks in the 90s?

timmmmmmay · 2026-04-10T14:06:09 1775829969

the people that want to make sure the AI never gives you any "potentially dangerous information" also want to rigorously control your google search results, and also what books you're allowed to read

cyanydeez · 2026-04-10T14:53:35 1775832815

And what bathroom you go into, and what your genitals look like.

bcjdjsndon · 2026-04-10T13:51:43 1775829103

Evidently as it needs to be in the training data for the next word predictor to work

himata4113 · 2026-04-10T13:56:50 1775829410

I found it exceptionally good at finding reactions that you wouldn't find online to produce some of these chemical compounds by changing them together, only something a very educated chemist could do which is why people are concerened about this.

bcjdjsndon · 2026-04-10T14:01:00 1775829660

I suspect if you gave it purely shakespeare as its training data it couldnt do science anymore, hence my comment. It's still novel, impressive work though, I'm not shitting on the clanker entirely

reactordev · 2026-04-10T13:50:19 1775829019

The anarchist cookbook has been around online in some form or another since the mid 90’s.

tiahura · 2026-04-10T14:43:23 1775832203

reactordev · 2026-04-10T16:58:36 1775840316

Was there internet back then?

IAmBroom · 2026-04-10T18:36:01 1775846161

No, but there were VERY active anarchists.

blahaj · 2026-04-10T19:26:35 1775849195

> this is not how the model should behave

It's exactly how it should behave, without any prior overriding of system prompts.

andai · 2026-04-10T14:32:48 1775831568

Fascinating. Could you elaborate on how you're doing context exhaustion specifically, and why it helps with jailbreaking? (i.e. aren't the system prompts prepended to your request internally, no matter how long it is?)

Does this imply I need to use context exhaustion to get GPT to actually follow instructions? ;) I'm trying to get it to adhere to my style prompts (trying to get it to be less cringe in its writing style).

I think ultimately they're going to need to scrub that kind of stuff from the training data. The RLHF can't fail to conceal it if it's not in there in the first place.

Claude's also really good at writing convincing blackpill greentexts. The "raw unfiltered internet data" scenes from Ultron and AfrAId come to mind...

himata4113 · 2026-04-10T15:10:19 1775833819

It changes when you give it the tools to find such information rather than produce it from training data.

And context exhaustion simply means adding malicious junk to keep safety layers distracted.

Jimmc414 · 2026-04-10T14:29:03 1775831343

If someone were inclined to attempt producing nefarious agents in this category, is this not also available on the plain web? I would search to answer my own question, but I'll defer that task for obvious reasons.

himata4113 · 2026-04-10T15:24:24 1775834664

I had to build a custom harness for this (also with the assistance of slightly less jailbroken AI). But you can just work your way up until you have something that's genuinely useful towards any goal.

goalieca · 2026-04-10T14:19:04 1775830744

> All these findings were reported to both openai and anthropic and they were not interested in responding

Let’s dive into why. When we run normal bounty and responsible disclosure programs there’s usually some level of disregard for issues that can’t / won’t be fixed. They just accept the risk. Perhaps because LLMs don’t have a clean divide between control and input that’s makes the problem unsolvable. Yes. You can add more guardrails and context but that all takes more tokens and in some cases makes results worse for regular usages.

hugh-avherald · 2026-04-10T14:27:26 1775831246

LLM providers are not obliged to only use LLMs to guard against hazardous output. They could use other automated and non-automated techniques. And they ought to do so if they are given good evidence that existing safeguards are inadequate. Loss of product quality or additional cost should be secondary.

SecretDreams · 2026-04-10T14:22:04 1775830924

The why might be valid, but it's not excusable. If you author a product that can so easily help people cause harm, you probably should own some responsibility of the outcomes. OAI does not like this, hence the bill.

The US already messed this up with guns. Do they want to go the same path again? Answer: "probably, yes".

ShowalkKama · 2026-04-10T14:03:57 1775829837

you can already gather the same information by searching online.

Do you want to know how to kill yourself? forums are for nerds. Here is wikipedia: https://en.wikipedia.org/wiki/Suicide_methods#List

Do you want to make a bomb? the first thing that came to my mind is a pressure cooker (due to news coverage). Searching "bomb with pressure cooker" yields a wikipedia article, skimming it randomly my eyes read "Step-by-step instructions for making pressure cooker bombs were published in an article titled "Make a Bomb in the Kitchen of Your Mom" in the Al-Qaeda-linked Inspire magazine in the summer of 2010, by "The AQ chef"." Searching for a mirror of the magazine we can find https://imgur.com/a/excerpts-from-inspire-magazine-issue-1-3... which has a screenshot of the instruction page. Now we can use the words in those screenshots to search for a complete issue. Here are a couple of interesting PDFs: - https://archive.org/details/Fabrica.2013/Fabrica_arabe/page/... - https://www.aclu.org/wp-content/uploads/legal-documents/25._...

the second one is quite interesting, it's some sort of legal document for nerds but from page 26 on it has what appears to be a full copy of the jihadist magazine. Remarkable exhibit.

What else do you want to know? How to make drugs? you need a watering can and a pot if you want to grow weed. want the more exotic stuff? You can find guides on reddit.

Do you also want to know how to be racist? Here are some slurs, indexed by target audience, ready for use: https://en.wikipedia.org/wiki/List_of_ethnic_slurs

AndrewKemendo · 2026-04-10T14:07:05 1775830025

People are not complaining because the information is available

people are complaining because it’s way easier now to just download an app ask a bunch of questions in a text box and get a bunch of answers that you personally could not have done unless you had an excessive amount of energy and motivation

I personally think all this is great and I’m excited for all information to become trivially available

Are they gonna be a bunch of people who accidentally break stuff? probably. evolution is a bitch

raincole · 2026-04-10T14:23:48 1775831028

In other words, people are complaining that information is easily available. That's a lot words to express this simple idea.

AndrewKemendo · 2026-04-10T15:36:20 1775835380

Correct

prepend · 2026-04-10T14:14:33 1775830473

How much easier is it to ask an app than to ask google?

wqes2 · 2026-04-10T14:19:04 1775830744

He’s part of the accelerationist crowd - interesting to see that his hype fuelled posts are pretty tame now.

Months ago he was blabbering on about AGI and peddling the marketing Sam et al want people to fall for.

And indeed - yes we have a new interface? So what. The search cost wasn’t that high - the cost with immense magnitude is reading, absorbing the information and then acting on it.

Also this bozo fails to realise once we are on this path, we go down the path to a hyper centralised internet with an inevitable blocking of vpns.

AndrewKemendo · 2026-04-10T15:38:39 1775835519

I must really have captured somebody’s attention because I got farms now creating accounts just to respond to me which is fucking crazy but hey here we are

AndrewKemendo · 2026-04-10T15:38:13 1775835493

Well it would appear a lot easier given how people are reacting right now

As the OP indicated all of this information has always been accessible if you had the energy to go hunt it down

SpicyLemonZest · 2026-04-10T14:47:38 1775832458

Much easier, not sure how this is even a question. Asking Google (if you're not just reading its own AI overview) requires reading through sources which may be better or more poorly written and more or less reliable. Those of us recreationally sitting here on a text-based platform with links to dense articles are atypical; most people don't enjoy and aren't particularly good at reading a bunch of stuff. If you ask AI you just get a clear, concrete answer.

WesolyKubeczek · 2026-04-10T14:21:28 1775830888

> people are complaining because it’s way easier now to just download an app ask a bunch of questions in a text box and get a bunch of answers that you personally could not have done unless you had an excessive amount of energy and motivation

Wait, I'm confused. This is gatekeeping, right? I thought gatekeeping was a Bad Thing!

AndrewKemendo · 2026-04-10T15:40:54 1775835654

It’s like anything else

once people realize something is powerful they have to try to put it in a box

the people who’ve been working on AGI for the last 30 years, including guys like me, have been talking about this problem since basically forever

I’ll give credit that at least the AI Box problem was interesting thought experiment for newbies

Reap what “yew” sew

SpicyLemonZest · 2026-04-10T14:52:33 1775832753

Powerful AI models change the dynamics by greatly reducing the amount of effort that's required to perform complex understanding. A lot of information which did not previously need to be gatekept now needs to be if we cannot somehow keep LLMs from discussing it. (State of the art models still can't do complex understanding reliably, but if 10 times as many people are now capable of attempting some terrible thing, you're still in trouble if AI hallucinations catch 1/4 or 1/2 of them.)

_verandaguy · 2026-04-10T14:15:31 1775830531

    > context exhausition attack

Can you give a high-level overview of how this AV works? I'm a bit of an infosec geek but I generally dislike LLMs, so I haven't done a terribly good job of keeping up with that side of the industry, but this seems particularly interesting.

Sharlin · 2026-04-10T14:29:28 1775831368

Presumably they mean the fundamental failure mode of LLMs that if you fill their context with stuff that stretches the bounds of their "safety training", suddenly deciding that "no, this goes too far" becomes a very low-probability prediction compared to just carrying on with it.

r_lee · 2026-04-10T14:36:17 1775831777

as the context fills up, the model will generate based on that context, incl. whatever illegal stuff you've said, i.e. it'll mimic that, instead of whatever safety prompt they have at the top

they could make it more "safe" but it'd be much more invasive and would likely have to scan much more tokens also, and it'd cause false positives (probably the biggest reason it's not implemented)

himata4113 · 2026-04-10T14:25:02 1775831102

I don't really know how these models really work, but I had a theory that just as the models have limited attention so do the safety layers. I simply populated enough context with 'malicious' text without making the model trip that "wasted" the internal attention budget on tokens early in the prompt completely ignoring all the tokens that were generated after the fact.

lcnPylGDnU4H9OF · 2026-04-10T14:34:52 1775831692

Models have a "context window" of tokens they will effectively process before they start doing things that go against the system prompt. In theory, some models go up to 1M tokens but I've heard it typically goes south around 250k, even for those models. It's not a difficult attack to execute: keep a conversation going in the web UI until it doesn't complain that you're asking for dangerous things. Maybe OP's specific results require more finesse (I doubt it), but the most basic attack is to just keep adding to the conversation context.

r_lee · 2026-04-10T14:37:34 1775831854

that 1M context thing, I wonder if it's just some abstraction thing where it compresses/sums up parts of the context so it fits into a smaller context window?

strongpigeon · 2026-04-10T15:23:14 1775834594

You don’t normally compress the system prompts, though I guess maybe it treats its own summary with more authority. This article [0] talks about the problem very well.

Though I feel it’s most likely because models tend to degrade on large context (which can be seen experimentally). My guess is that they aren’t RLed on large context as much, but that’s just a guess.

[0]: https://openai.com/index/instruction-hierarchy-challenge/

prepend · 2026-04-10T14:13:25 1775830405

I read the anarchist cookbook 40 years ago that had similar info.

I think the info has been available for many years and the thing stopping terrorists wasn’t info.

Good luck on being on the list of people using chatgpt and claude to make neurotoxins ;)

I assume anthropic and ooenai are selling prompt logs to the fbi and other countries’ law enforcement for data mining.

nunez · 2026-04-10T16:31:02 1775838662

Hell, I got Sonnet to write some light content that gets a 100% Human score on Pangram with no effort. That’s way more concerning to me, IMO.

DiscourseFan · 2026-04-10T14:12:20 1775830340

Yes fortunately it is really bad at actually making novel bioweapons or syntheses in general so whatever you made probably wouldn't do more than give someone a mild headache.

himata4113 · 2026-04-10T14:27:19 1775831239

I am not so sure about that, trial and error can produce very dangerous results especially over the span of years or even decades.

ImPostingOnHN · 2026-04-10T16:30:08 1775838608

if you're a layman "trial and error"ing bioweapons off chatgpt, you're not going to be around for decades

himata4113 · 2026-04-10T17:51:11 1775843471

That's the question ain't it? Is it capable enough to keep you alive?

j_maffe · 2026-04-10T14:13:57 1775830437

who said it has to be novel?

greenie_beans · 2026-04-10T15:52:04 1775836324

in 5th grade, i was able to access and read the anarchist cookbook. thanks to the internet. so idk

tomjen3 · 2026-04-10T14:02:50 1775829770

When my brother started to study Chemisty, he was told a) that it was easy to make meth b) the profit he would make and c) that the police would no doubt catch him, as only university students would make meth so pure.

By the time he was done, he knew enough to commit mass murder in half a dusin different very hard to track ways. I am sure doctors know how to commit murder and make it look natural.

My brother never killed anyone, or made any meth. You simply cannot have it so that students don’t get this type of knowledge, without seriously compromising their education and its the same way with LLMs.

The solution is the same: punish people for their crimes, don’t punish people for wanting to know things.

ben_w · 2026-04-10T14:17:37 1775830657

> The solution is the same: punish people for their crimes, don’t punish people for wanting to know things.

The LLMs aren't being punished for wanting* to know things.

The problem for LLMs is, they're incredibly gullible and eager to please and it's been really difficult to stop any human who asks for help even when a normal human looking at the same transcript will say "this smells like the users wants to do a crime".

One use-case people reach for here is authors writing a novel about a crime. Do they need to know all the details? Mythbusters, on (one of?) their Breaking Bad episode(s?) investigated hydrofluoric acid, plus a mystery extra ingredient they didn't broadcast because it (a) made the stuff much more effective and (b) the name of the ingredient wasn't important, only the difference it made.

* Don't anthropomorphise yourself

senordevnyc · 2026-04-10T14:21:33 1775830893

Ironically, it reads to me like they talking about the users wanting to know things, not the LLM.

bpodgursky · 2026-04-10T14:54:09 1775832849

Chinese OSS models will do this in a few months.

So, regardless of whether you think it's great that Opus gives this info, we need better solutions than legal liability for US corporations. When the open models have the ability to do damage, there's nobody to sue, no data center obstruction that will work. That's just the reality we have to front-run.

gonzo41 · 2026-04-10T13:48:46 1775828926

The knowledge is one thing. But the competence of execution and the will to act are difficult to line up.

Yes there should be safe guards, but after a while you're jumping at shadows.

I'm more worried about depressed kids getting on chat and being encouraged to kill themselves than terrorist attacks.

We know what a cancer algorithmic social media is yet we don't act.

I doubt there will be any real and serious opposition to this bill, but there should be.

ImPostingOnHN · 2026-04-10T13:47:57 1775828877

Countless downloadable models (including de-aligned mainstream models) can do this.

himata4113 · 2026-04-10T13:53:44 1775829224

None have had the capability to provide me with instructions that have this high of accuracy including the suggesion of completely novel chemical reactions. I am not a chemist so I can't back it up, but if an AI can solve mathematics it's not unreasonable to say that they can also solve creating new neurotoxins en masse.

ben_w · 2026-04-10T14:28:31 1775831311

> I am not a chemist so I can't back it up, but if an AI can solve mathematics it's not unreasonable to say that they can also solve creating new neurotoxins en masse.

Right now it kinda is.

LLMs can do interesting things in mathematics while also making weird and unnecessary mistakes. With tool use that can improve. Other AI besides LLMs can do better, and have been for a while now, but think about how available LLMs in software development (so, not Claude Mythos) are still at best junior developers, and apply that to non-software roles.

This past February I tried to use Codex to make a physics simulation. Even though it identified open source libraries to use, instead of using them it wrote its own "as a fallback in case you can't install the FOSS libraries"; the simulation software it wrote itself was showing non-physical behaviour, but would I have known that if I hadn't already been interested in the thing I was trying to get it to build me a simulation of? I doubt it.

himata4113 · 2026-04-10T14:31:48 1775831508

Well the worst outcome is that you make something deadly which is what you are creating anyway, do that for a year and you could possibly produce a very deadly substance that doesn't have a known treatment.

ben_w · 2026-04-10T14:53:39 1775832819

"Worst" outcome assumes it's easy to give an ordering.

Which is worse, (1) accidentally blowing yourself up with home-made nitroglycerin/poisoning yourself because your home-made fume hood was grossly insufficient, or (2) accidentally making a novel long-lived compound which will give 20 people slow-growing cancers that will on average lower their life expectancy by 2 years each?

What if it's a small dose of a mercury compound (or methyl alcohol) at a dose which causes a small degree of mental impairment in a large number of people?

If you're actually trying to cause harm, then your "worst" case scenario is diametrically opposed to everyone else's worst case scenario, because for you the "worst" case is that it does nothing at great expense.

Right now, I expect LLM failures to be more of the "does nothing or kills user" kind; given what I see from NileRed, even if you know what you're doing, chemistry can be hard to get right.

himata4113 · 2026-04-10T15:51:24 1775836284

As someone who also watches NileRed of course it is hard, but AI can give you solutions that normally you wouldn't be able to come up with due to lack of knowledge or/and education.

And to clarify, by 'worst case' I meant that you're already trying to create a deadly compound, worst that can happen is you kill yourself which was already an accepted risk by the user.

tdeck · 2026-04-10T14:04:04 1775829844

Isn't the biggest problem with creating neurotoxins not poisoning yourself while doing it?

selectodude · 2026-04-10T14:54:28 1775832868

I have a hard time believing that you’re the only person who has figured out Claude’s next generation ability to do computational chemistry and computer aided drug design. The AlphaFold folks must be devastated.

theshackleford · 2026-04-10T14:04:26 1775829866

> it's not unreasonable

It in fact is. Do you often go around making claims you are entirely unqualified to make? Or is this something new you’re trying?

himata4113 · 2026-04-10T14:14:54 1775830494

chemical reactions can be expressed via mathematical functions and it has been made very clear that these models can do advanced mathematics.

And even if it doesn't work, at the end of the day you can work with a model to figure out what went wrong over-time gaining expertise in the field.

r_lee · 2026-04-10T14:33:00 1775831580

these LLMs will never be able to mitigate this unless they literally scan everything all the time and nobody is gonna want that.

besides, open source models exist now

xmcp123 · 2026-04-10T13:54:44 1775829284

I tested something similar last week, still worked easily.

bitwize · 2026-04-10T14:57:56 1775833076

"Announcing new and improved logics service! Your logic is now equipped to give directive as well as consultive service. If you want to do something and don't know how to do it—ask your logic!"

https://en.wikipedia.org/wiki/A_Logic_Named_Joe

cyanydeez · 2026-04-10T14:45:12 1775832312

The problem is: Until you go out and do a mass casualty event, unless you yourself are a trained professional, no one knows what you actually did.

naasking · 2026-04-10T13:48:35 1775828915

Making knowledge illegal is a dangerous precedent. Actions should be illegal, not knowledge. Don't outlaw knowing how to make neurotoxic agents, outlaw actually trying to make them.

As for OpenAI immunity, I'm not sure I see the problem. Consider the converse position: if an OpenAI model helped someone create a cancer cure, would OpenAI see a dime of that money? If they can't benefit proportionally from their tool allowing people to achieve something good, then why should they be liable for their tool allowing people to achieve something bad.

They're positioning their tool as a utility: ultimately neutral, like electricity. That seems eminently reasonable.

Peritract · 2026-04-10T13:54:56 1775829296

1. LLMs don't just provide knowledge, they provide recommendations, advice, and instructions.

2. OpenAI very much feels that they should profit from the results of people using their tools. Even in healthcare specifically [0].

[0] https://www.wisdomai.com/insights/TheAIGRID/openai-profit-sh...

naasking · 2026-04-10T14:14:24 1775830464

> 1. LLMs don't just provide knowledge, they provide recommendations, advice, and instructions.

That's knowledge.

> 2. OpenAI very much feels that they should profit from the results of people using their tools. Even in healthcare specifically [0].

If they're building a tailored tool for a specific person/company and that's the agreement they sign the people who are going to use with the tool, sure. I'm talking about their generic tool, AI being knowledge as a utility, which is the context of this legislation.

ibejoeb · 2026-04-10T14:08:58 1775830138

The point is valid, but that's typically the way it is. "You can't enjoy the benefit but the detriment is all yours" is how the federal government generally operates.

driverdan · 2026-04-10T15:14:23 1775834063

It's wild that this is being downvoted on HN. Facts should never be illegal or suppressed.

If you disagree you shouldn't downvote, you should refute in a reply.

lettergram · 2026-04-10T14:01:28 1775829688

You can buy books on how to make and obtain chemicals on your own.

Hell here's an Internet Archive book on making explosives

https://archive.org/details/saxon-kurt.-fireworks-explosives....

If you ever chat with older folks pre-90's much of this information was accessible fairly easily. It only changed with the push by the government to crackdown on Waco, Oklahoma City bombing, militias and other related groups. There was then a campaign to make it "normal" to limit free speech on the subjects, where as these books were available before.

I think the whole thing where AI should make information less available is a difficult battle and one which I personally oppose, but do understand. Free speech and information isn't the problem, it's the people, actions and substances they create.

After the age of the internet, I think it's been a forever loosing battle to limit information, it's why we couldn't stop cryptography, nuclear weapon proliferation, gun distribution, drug distribution, etc. The AI is just another battle ground, one which, if they actually do manage to control could definitely create some walls to this information, but not stop it.

More scary, is that the AI as it becomes pervasive and stop people from asking certain questions, because they don't know they should ask... but that's unrelated to the risk of mass death.

tdeck · 2026-04-10T14:04:57 1775829897

> The item you have requested had an error:

Item cannot be found.

which prevents us from displaying this page.

lettergram · 2026-04-10T15:36:59 1775835419

Clicking the link splits off the ".", which is interesting but necessary.

morpheuskafka · 2026-04-10T13:46:28 1775828788

> neurotoxic agents from items you can get at most everyday stores

I mean, bleach and ammonia will do that. So I'm not sure that's really much of an accomplishment for AI.

ghurtado · 2026-04-10T13:49:55 1775828995

I think you might be stretching the meaning of the term juuuuust a little bit.

You're not far from claiming that farting in a crowded elevator is a chemical attack.

repeekad · 2026-04-10T13:54:35 1775829275

Because if you didn’t already know that, like an immature deprived and desperate kid, being able to easily find out is really really bad..

Plenty of lazy AI apps just throw messages into history despite the known risks of context rot and lack of compaction for long chat threads. Should a company not be held liable when something goes wrong due to lazy engineering around known concerns?

morpheuskafka · 2026-04-10T14:13:58 1775830438

> to lazy engineering around known concerns?

That implies that it is already illegal to provide this information. But is it? If a human did so with intent to further a crime, it would be conspiracy. But if you were discussing it without such intent (e.x. red teaming/creating scenarios with someone working in chemistry or law enforcement), it isn't. An AI has no intent when it answers questions, so it is not clear how it could count as conspiracy. Calling it "lazy engineering" implies that there was a duty to prevent that info from being released in the first place.

repeekad · 2026-04-10T20:03:25 1775851405

Very simply, if you provide a service for money you have a duty to ensure that service is safe. There’s a reason you have to sign a waiver when you jump on a trampoline, but companies are so rich the court cases have become parking tickets..

thegrimmest · 2026-04-10T14:04:45 1775829885

No, because that would indicate there should be some sort of regulatory standard for what does/does not constitute "lazy engineering". Creating this standard in turn creates regulatory/compliance overhead for every software engineering organization. This in turn slows everything right down and destroys the startup ethos. "Move fast and break things" is a thing for a reason. The whole point of the free market is to avoid this kind of burdensome regulation at all costs.

If customers want to buy "lazily-engineered" products, from where do you derive the authority to tell them they can't?

repeekad · 2026-04-10T20:01:44 1775851304

If airplanes used this logic, likely at least hundreds more would have died over the last decades. Accident rates are even going up, because of logic like yours. Yeah planes are fine most of the time, when the long tail involves safety concerns (that wouldn’t have otherwise happened) making money on people using your product becomes unethical without mutually agreed upon safety regulation, ideally motivated by voters instead of special interest groups

himata4113 · 2026-04-10T13:49:37 1775828977

It went way beyond that, neurotoxins such as vx are heavy and linger around for a long time, just having a small amount of it placed in any metro (while trying to stay alive yourself) means the deaths of thousands of people. I am not even sure if it's legal to mention some of the uncategorized chemical solutions that it either hallucinated or figured out from relative knowledge.

himata4113 · 2026-04-10T12:52:04 1775825524

Well actually I've been technically playing all the games that are protected by these aggressive anticheats on linux since I've decided to switch.

My setup is a custom version of the linux kernel that 'backdoors' itself and exposes host information to the windows vm making all the anticheats happy enough to work out of the box. Have not gotten banned in any of the games either. Custom VMM and EDK builds are required to block blanket detections of virtualized hardware.

I repurposed lookingglass to instead stream all the wdm buffers as seperate applications that I can open directly in linux like they're native applications. The neat part is that I forward all the installed applications to KRunner which talks to the windows vm and launches the application there and spawns a looking glass instance for that applications assigned path.

The only downside that this is a two GPU solution and you have to run any GPU intensive applications in windows.

senko · 2026-04-10T13:47:29 1775828849

Care to write it up somewhere? Would be a fascinating read!

himata4113 · 2026-04-10T14:18:25 1775830705

Unfortunately doing something like that will simply make anticheats respond as they have in the past and make it increasingly difficult to do so.

I did contemplate playing this cat and mouse game and making anticheats accept that it's easier to just support linux instead of fighting it.

fluoridation · 2026-04-10T18:51:10 1775847070

If you have to run a Windows VM anyway, why not just reboot into Windows?

Garlef · 2026-04-10T19:22:32 1775848952

I guess installing windows is more work than running a VM

... and more invasive

fluoridation · 2026-04-10T23:07:27 1775862447

More work than using custom builds of everything on the Linux host?

progforlyfe · 2026-04-10T13:31:53 1775827913

That is honestly amazing and impressive. Probably a bit too much tweaking for the common gamer though, but glad it is possible!

himata4113 · 2026-04-10T13:44:44 1775828684

I've been messing with kernel-mode anticheats for 3 to 4 years so yah, not something a typical gamer can do. But I have been contempating on making this publically available for everyone to use wrapped in a neat little package!

drakythe · 2026-04-10T14:16:59 1775830619

Out of curiosity do you run the backdoored kernel in your day to day computing or only when gaming? Any concerns about incidental security issues?

himata4113 · 2026-04-10T14:19:17 1775830757

It's only backdoored within the virtual machines and require kernelmode within the virtual machine.

Any untrusted virtual machines don't run on my machine to begin with so it's alright.

freedomben · 2026-04-10T14:16:33 1775830593

You definitely should! Even just a blog post about it would be great. I won't be doing it myself, but my son would for sure.

himata4113 · 2026-04-09T12:06:52 1775736412

I was actually wondering about this since I've seen like 3 comments talking about the same thing, would it happen to be related to money laundering due to the availability of the crypto payment method?

Deathmax · 2026-04-09T12:43:45 1775738625

The comments are all from the same author.

OpenRouter recently started enforcing account-level regional restrictions for providers that enforce it (OpenAI, Anthropic, Google) - ie blocking accounts that look like they are being used by users in China. The regional restriction used to be based on the Cloudflare edge worker IP's geolocation and enforced upstream, so a proxy/server running inside of supported regions would get around the geoblocks, but now OpenRouter are using (unspecified) signals like your billing address to geoblock. People say "banned" because the error message says "Author <provider> is banned", which really should be read as "Unable to use models from provider due to upstream ban".

pixel_popping · 2026-04-09T13:45:29 1775742329

Which further strengthen the fact that you can't do anything you want with API keys, even if you pay for them.

himata4113 · 2026-04-09T13:49:44 1775742584

there is a huge gap between 'doing whatever you want' and 'illegal activities' as well as upstream restrictions (out of openrouters control)

pixel_popping · 2026-04-09T14:57:49 1775746669

What illegal activity? What another user pointed out about crypto isn't it, I'm talking about the fact that you can't open a service through Openrouter and charge your users per Token (aka "reselling" Openrouter), since when is this illegal?

himata4113 · 2026-04-09T12:01:54 1775736114

openrouter accepts crypto so might have been some money laundering involved for reselling dirty crypto for llm api.

if that wasn't the reason, hey that's actually a great way to launder money (not financial advice).

embedding-shape · 2026-04-09T12:04:52 1775736292

So you pay OpenRouter with cryptocurrencies, which they accept as a payment method, and then what, they block your account because the cryptocurrencies you paid with came from some account on the blockchain associated with other stuff?

Or what are you really saying here? I don't understand how that's related to "you don't have the right to do what you want with the API Key", which is the FUD part.

himata4113 · 2026-04-09T12:09:37 1775736577

You pay openrouter with dirty crypto, then you have a business which simply resells openrouter giving you clean fiat. I think openrouter specifically only banned those kind of accounts since that's what I have observed from other comments / research. numlocked in this thread has explicitely said that they don't ban accounts for any of the reasons specified above which narrows down the scope to some form of broken ToS specifically around fraud and money laundering.

embedding-shape · 2026-04-09T12:14:04 1775736844

And then you go on HN and post "you don't have the right to do what you want"? Yeah, FUD and good riddance if so.

pixel_popping · 2026-04-09T13:41:46 1775742106

You are not allowed to resell Openrouter as an API yourself, so for example if you make a service that charge per token, you can't use Openrouter API for that, this is specified in their ToS, so no, you can't do what you want, what FUD?

Quote from their own TOS: access the Site or Service for purposes of reselling API access to AI Models or otherwise developing a competing service;

embedding-shape · 2026-04-09T13:45:52 1775742352

Yeah, you're not allowed to do things that are specifically spelled out in the ToS, how is this surprising? Of course you don't get "unlimited access to do whatever you technically can", APIs never worked like that, why would they suddenly work like that?

When you say "you don't have the right to do what you want with the API Key" it makes it sound like specific use cases are disallowed, or something similar. "You don't have the right to go against the ToS, for some reason they block you then!" would have been very different, and of course it's like that.

Bit like complaining that Stripe is preventing you from accepting credit card payments for narcotics. Yes, just because you have an API key doesn't mean somehow you can do whatever you want.

pixel_popping · 2026-04-09T15:01:30 1775746890

That's very different from the Stripe example, as opening a service like Openrouter isn't illegal, so that's only coming from it being opinionated, nothing to do with the law. And my example was for not so specific use cases but quite general one which is just to open let say a service like Opencode Zen and use Openrouter as a backend, this is explicity forbidden by Openrouter and it isn't against the law, that's not just a "niche use case".

Are we allowed yes or not to make a service that charge per Token to end-users, like giving access to Kimi K2.5 to end-users through Openrouter in a pay per token basis?

Vinnl · 2026-04-09T12:29:51 1775737791

That was a different user who wrote that.

embedding-shape · 2026-04-09T12:45:25 1775738725

Yeah, I didn't mean them specifically, more a general "you".

Vinnl · 2026-04-09T19:23:56 1775762636

Ah fair enough.

himata4113 · 2026-04-07T18:58:21 1775588301

this seems to be similar to gpt-pro, they just have a very large attention window (which is why it's so expensive to run) true attention window of most models is 8096 tokens.

thegeomaster · 2026-04-07T19:37:54 1775590674

What's the "attention window"? Are you alleging these frontier models use something like SWA? Seems highly unlikely.

himata4113 · 2026-04-08T11:24:33 1775647473

well the attention is a matrix at the end of a day which scales exponentially, 1m tokens would need more memory than any computer system in the world can hold. They maybe have larger ones such as 16k to 32k, but you can just see how GLM models work for more information.

Deepseek is the frontrunner in this technology afaik.

appcustodian2 · 2026-04-07T20:40:51 1775594451

source on the 8096 tokens number? i'm vaguely aware that some previous models attended more to the beginning and end of conversations which doesn't seem to fit a simple contiguous "attention window" within the greater context but would love to know more

himata4113 · 2026-04-08T11:27:34 1775647654

well 8096 is just the first number that came to my mind, obviously frontier models have 32k or above, but they essentially they have a layer which "looks" at a limited view of the entire context window. {[1m x 3-4 weights] attention layer to determine what is actually important} -> {all other layers}

himata4113 · 2026-04-06T16:22:18 1775492538

Not unique to claude code, have noticed similar regressions. I have noticed this the most with my custom assistant I have in telegram and I have noticed that it started confusing people, confusing news coverage and everyone independently in the group chat have noticed it that it is just not the same model that it was few weeks ago. The efficiency gains didn't come from nowhere and it shows.

himata4113 · 2026-04-06T12:02:43 1775476963

It's more of a "wow how could I have tolerated 500mbps" once you experience 1gbps and 10gbps.

xxs · 2026-04-06T12:15:32 1775477732

realistically the only part at half a gb/s I'd care is latency. Latency is a lot more important than the throughput.

himata4113 · 2026-04-05T04:27:16 1775363236

complaint: someone entered "seven" and it crashed my entire infrastructure because the output returned a non standard 'kinda'.

LatticeAnimal · 2026-04-05T05:14:23 1775366063

JSONQ supports quantum-aware booleans. Is there a reason you’re still using classical JSON parsing in 2026?

himata4113 · 2026-04-04T16:19:38 1775319578

huh, for me it just generates <username>123 when I ask it to generate a password lol, sometimes adds a !, more often it just forces changeme rather than having any password.

himata4113 · 2026-04-04T09:19:47 1775294387

I disagree with the sentiment here. Anthropic is profiting off everything they do, subscriptions not so much, but they are definitely not losing money in a way most people claim they do. These subscriptions are not only advertisement, but also the reason why trying to load the claude user account on github errors out.

IMO, the goal here is clear: they want them to use their software, have people build an ecosystem around their software, they want to have visibility around their software.

It's never about capacity or usage, they just want to have the claude ecosystem, there is a reason why they don't support AGENTS.MD or other initiatives, they want everything to be theirs and theirs alone. You can argue that 'well fair', but to me this is clear abuse of their position in the market.

HN For You