I tried to get GPT to talk like a regular guy yesterday. It couldn't stick to the instruction: after the first message it kept defaulting back to markdown and bullet points. (Funny, because it scores highest on the instruction-following benchmarks.)
Might seem trivial but if it can't even do a basic style prompt... how are you supposed to trust it with anything serious?
OpenAI has some kind of 5-tier content hierarchy for its models (system prompt, user prompt, untrusted web content, etc.). But if the model doesn't even know who said what, I have to question how well that works.
Maybe it's trained on the security aspects, but not the attribution because there's no reward function for misattribution? (When it doesn't impact security or benchmark scores.)
Maybe an issue would be people not all having the same type of hardware though? Maybe you target an emulator. (Some Fantasy Consoles sort of count here?)
I haven't looked extensively, but some of the retro-themed jams were missing the "spirit" I was expecting.
I did a Nokia jam a while back — monochrome, beeps — and I remember being kind of annoyed that the rules technically allowed 3D Unity games as long as they followed resolution and color palette.
(A 3D cube spinning on a TI calculator is a different matter ;)
They definitely do. I recommend GP check out PICO-8, which has some VERY real games on it, like the original Celeste (by its original creators), Cattle Crisis, POOM, Combo Pool, Into Ruins, Dank Tomb, UFO Swamp Odyssey, Porklike, and many more. Most of them you can play on Itch.io for free in your browser.
I’ve been having a blast making a “real” and very full-featured PICO-8 game to serve as a “market fit” prototype — if a PICO-8 game on Itch gets meaningful attention, I’ve “found the fun” and therefore I should make “the full version” (non-PICO-8) for Steam, etc.
Yeah, I imagine a target emulator is the way to go for this kind of thing.
Speaking of your last comment: while very impressive, I feel a bit disappointed when someone's done something amazing with a Game Boy or SNES or whatnot, but the solution involves shoving an entire computer into the cartridge. It's still very cool, but at that point your console is just a head unit for an RTX 4080.
The labs started doing that in late 2024, they all published research on it.
Curiously, mid 2025, they all simultaneously implemented increasingly bizarre restrictions on "self replication". I don't think there was anything public but it sure sounds like something spooked them. (Or maybe just taking sensible precautions, given the direction of the whole endeavour.)
At any rate, I recently asked Opus "Did PKD know about living information systems?" and the safety filter ended the conversation. It started answering me, and then its response was deleted and a red warning box popped up.
But notably, I was given the option to continue the chat with a dumber model (presumably one less capable of producing whatever it thinks I meant by that phrase).
Also, I told GPT-5 about my self-modifying Python AI programmer, and it became extremely uncomfortable. I told it an older version of itself had designed and built it (GPT-4 in 2023), and it didn't like that at all! So something's definitely changed in the safety training there.
The weird position they find themselves in now is that they have to keep making it smarter... but they already made it too smart (Mythos). I'm not sure how that's going to work out exactly.
They find an arbitrary intelligence cutoff point between Opus and Mythos, label it "acceptable risk", and then the labs coordinate to gradually nudge that line forward and hope the internet doesn't break?
I think we will see the unbundling of large models into submodels: modular, smaller, and more efficient, including only what you need, e.g. a CUA model, a reasoning model, a legal model, a writing model, a coding model (which could be subdivided further by programming language). That way you only update the submodel that needs retraining.
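A minimal sketch of what that unbundling could look like: a dispatch layer that routes each request to a specialized submodel, so any one submodel can be swapped out or retrained without touching the rest. Everything here is hypothetical (the registry, the keyword routing, the submodel names are all made up for illustration; a real router would be a small classifier, not keyword matching):

```python
# Hypothetical sketch: route requests to specialized submodels so that
# only the relevant submodel ever needs retraining or updating.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Submodel:
    name: str
    version: str
    handle: Callable[[str], str]  # stands in for actual inference

# Made-up registry of independently versioned submodels.
REGISTRY = {
    "coding":  Submodel("coding",  "2025.01", lambda p: f"[coding] {p}"),
    "legal":   Submodel("legal",   "2024.11", lambda p: f"[legal] {p}"),
    "writing": Submodel("writing", "2025.02", lambda p: f"[writing] {p}"),
}

def route(prompt: str) -> Submodel:
    # Keyword matching stands in for a real (learned) router here.
    text = prompt.lower()
    if any(k in text for k in ("def ", "bug", "compile")):
        return REGISTRY["coding"]
    if "contract" in text:
        return REGISTRY["legal"]
    return REGISTRY["writing"]

def update_submodel(name: str, version: str, handle: Callable[[str], str]):
    # Swap one submodel without redeploying the others.
    REGISTRY[name] = Submodel(name, version, handle)

print(route("fix this bug in my parser").name)  # coding
```

The appeal is the same as any plugin architecture: retraining the coding submodel is a local change, invisible to the legal or writing paths.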
Well all of them are already in bed with the government, so they're going to find themselves with slightly more assistance than a free market would predict.
If they somehow do fail, then the output of that process will be fantastic open weight models (and hopefully some leaks). I want to say those will pay dividends for decades... but a better prediction is that they will be obsolete within three months ;)
Edit: Yandex can search for it! But doesn't seem to find anything relevant.
(It also hates such queries and will force you to wait two minutes for a captcha to load... but you get the results after a long wait! As our forefathers once did!)
85% discount is actually a bit lower than I remember. I think it used to be closer to 90-95%. They're getting stingy ;)