Nvidia put a lot of effort into making CUDA operational on their entire lineup, and they did it before deep learning even took off.
You do this not because you expect consumers with 5-year-old hardware to provide meaningful utilization, but as a demo ("let me grab my old gaming machine and do some supercomputing real quick") and a signal that you intend to stay the course. AMD management hasn't realized this even after various Nvidia people said that this was exactly why they did it. At some point the absence of that signal is itself a signal that the AMD compute ecosystem is an unreliable investment, no?
You got it right, I think. I'm sitting with two “AI Ready” Radeon AI Pro 9700 workstation cards, which are RDNA4, not CDNA. My experience is that my cards are not a priority. Individual engineers at AMD may care; the company doesn't. I have been trying since February to get ahold of anyone responsible for shipping tuned Tensile gfx1201 kernels in rocm-libs, which is used by Ollama. It's been three weeks since I raised enough hell on the Discord to get a response, but they still can't find “who” is responsible for Tensile tuning, and “if” they are even going to do it for the gfx12* cards.
Yeah, I own an AMD Instinct MI50 and I need to patch all of my applications (PyTorch, bitsandbytes, Blender, etc.) to get them to work, while Nvidia cards from the same generation are still mostly supported. But the better value and hardware are worth it.
LLMs are wordsmith oracles. A lot of effort went into coaxing interactive intelligence out of them, but the truth is that you could probably always have harnessed the base models directly to do very useful things. The instruct-tuned models give your harness even more degrees of freedom.
A while ago, the autoresearch[1] harness went viral, yet it's but a highly simplified version of AlphaEvolve[2][3][4].
In the cybersecurity context, you can envision a clever harness that probes every function in a codebase for vulnerabilities, then bubbles the candidates up to their call sites (probing whether the vulnerability can be triggered from there), and then all the way up to an interface (such as a syscall) where a potential exploit can be manifested. And those would be the low-hanging fruit; other vulnerabilities may require the interplay of multiple functions, or race conditions.
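A rough sketch of what I mean (everything here is hypothetical: the `ask_llm` callable, the call-graph shape, the prompts):

```python
def probe_codebase(functions, callers, ask_llm):
    """Bubble candidate vulnerabilities up the call graph toward interfaces.

    functions: {name: source}, callers: {name: [names that call it]},
    ask_llm: hypothetical callable (prompt) -> bool verdict.
    Returns (vulnerable_function, reachable_entry_point) pairs.
    """
    # First pass: probe every function in isolation for candidate bugs.
    candidates = [f for f, src in functions.items()
                  if ask_llm(f"Is this function vulnerable?\n{src}")]
    reachable = []
    for f in candidates:
        # Walk toward entry points (functions nobody calls, e.g. syscall handlers).
        frontier, seen = [f], set()
        while frontier:
            cur = frontier.pop()
            if cur in seen:
                continue
            seen.add(cur)
            if not callers.get(cur):
                reachable.append((f, cur))  # reached an external interface
            elif ask_llm(f"Can the issue in {f} be triggered via {cur}?"):
                frontier.extend(callers[cur])
    return reachable
```

With a stub in place of the LLM you can at least test the bubbling-up logic itself.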
If you were to design an entire ATC system from scratch (pilot interfaces, sensors everywhere in the airport and planes, etc.), it could be automated. But with pilots having to actually talk to ATC (and sometimes talk over each other with no feedback), instead of observing their status on a screen and pressing buttons for what they want to do, it seems like it will be quite hopeless for quite some time.
What you can probably do is create software that observes traffic, simulates it into the future, and notifies the human ATCs about risks. It might even be a good idea to try to digitize the exchanges so the ATCs talk less and press buttons more (which would feed into the simulation), with TTS for the legacy transmissions to pilots that don't have an updated interface. Given the regulation in that industry, it seems unlikely anyone competent enough to do it will have an incentive to even try.
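A toy version of that observe-and-extrapolate idea (straight-line kinematics, and the units, horizon, and separation threshold are all invented for illustration):

```python
import math

def predict_conflicts(aircraft, horizon=120.0, step=5.0, min_sep=5.0):
    """Extrapolate each aircraft's position linearly and flag pairs that
    come within min_sep (nautical miles) within the horizon (seconds).

    aircraft: list of (id, x, y, vx, vy) tuples; purely illustrative.
    Returns (id_a, id_b, t_seconds) alerts, earliest first.
    """
    alerts = []
    t = 0.0
    while t <= horizon:
        # Dead-reckon every aircraft forward to time t.
        pos = {a[0]: (a[1] + a[3] * t, a[2] + a[4] * t) for a in aircraft}
        ids = sorted(pos)
        for i, p in enumerate(ids):
            for q in ids[i + 1:]:
                if math.dist(pos[p], pos[q]) < min_sep:
                    alerts.append((p, q, t))
        t += step
    return alerts
```

A real system would use flight plans and uncertainty cones rather than straight lines, but the "simulate forward, alert early" loop is the same shape.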
> If you were to design an entire ATC system from scratch (pilot interfaces, sensors everywhere in the airport and planes, etc.), it could be automated.
Even then you'll probably run into long-tail distribution issues, similar to self-driving cars: 99.9% of situations are pretty standard, but once in a while something so unusual happens that it's not pre-programmed and requires some creativity to solve.
> What you can probably do is create software that observes traffic, simulates it into the future, and notifies the human ATCs about risks.
Fully agree. Some of the recent close calls really were "obvious" much earlier, meaning they were not caused by late course changes.
Inserting an undetectable 1-bit watermark into a multi megapixel image is not particularly difficult.
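For example, a keyed spread-spectrum LSB scheme, sketched deliberately naively (a real watermark has to survive recompression and rescaling; this one only survives flipping a minority of the keyed pixels):

```python
import random

def embed_bit(pixels, key, bit, n=1024):
    """Hide one bit across n pseudorandomly keyed pixel LSBs."""
    rng = random.Random(key)
    idx = rng.sample(range(len(pixels)), n)        # keyed pixel positions
    pattern = [rng.randint(0, 1) for _ in idx]     # keyed chip sequence
    out = list(pixels)
    for i, p in zip(idx, pattern):
        # Force the LSB to bit XOR the keyed pattern at this position.
        out[i] = (out[i] & ~1) | (bit ^ p)
    return out

def detect_bit(pixels, key, n=1024):
    """Majority vote over the same keyed positions recovers the bit."""
    rng = random.Random(key)
    idx = rng.sample(range(len(pixels)), n)
    pattern = [rng.randint(0, 1) for _ in idx]
    votes = sum((pixels[i] & 1) ^ p for i, p in zip(idx, pattern))
    return 1 if votes > n // 2 else 0
```

With a multi-megapixel image and only 1024 touched pixels, the change is statistically invisible without the key.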
If you assume competence on Google's part, they probably have two different watermarks: a sloppy one they offer an online oracle for, and one they keep in reserve for themselves (and law enforcement requests).
Also given that it's Google we are dealing with here, they probably save every single image generated (or at least its neural hash) and tie it to your account in their database.
The dual-watermark theory makes a lot of sense for defensive engineering. You always assume your outer layer will be broken, so you keep a second layer that isn't publicly testable. Same as defence in depth anywhere else. I'm curious - as new models are being built constantly and they're naturally non-deterministic, do you think it's possible for end users to prove that?
> I'm curious - as new models are being built constantly and they're naturally non-deterministic, do you think it's possible for end users to prove that?
How is the model relevant? The models are proprietary, and you never see any outputs that haven't been watermarked.
> Satoshi supported big blocks in his writings and empowered the pro-big block Gavin when he disappeared. Adam is a well known supporter of small blocks, ultimately the "winning" side of the debate. They are not the same person.
From the article:
Then, out of the blue, Satoshi appeared on the list with an email that neatly dovetailed with Mr. Back’s position. It was the first time Satoshi had been heard from in more than four years, other than a five-word post the previous year denying a Newsweek article’s claim to have unmasked him.
Many in the Bitcoin community questioned the new email’s authenticity since another of Satoshi’s email accounts had been hacked. But Mr. Back argued that the email sounded real. In a series of tweets, he called Satoshi’s observations “spot on” and “consistent with Satoshi views IMO” and took to quoting from the email.
Mr. Back was likely correct: To this day, there is no evidence to indicate the email was a forgery, and no other emails from that account have surfaced.
The Satoshi email sounded a lot like Mr. Back had in his posts during the preceding weeks, although no one took notice. Like Mr. Back, Satoshi argued that the Bitcoin network’s increasing centralization jeopardized its security. He called the big block proposal very “dangerous” — the same term Mr. Back had used repeatedly. He also used other words and phrases Mr. Back had used: “widespread consensus,” “consensus rules,” “technical,” “trivial” and “robust.”
At the end of the email, Satoshi denounced Mr. Andresen and Mr. Hearn as two reckless developers trying to hijack Bitcoin with populist tactics and added: “This present situation has been very disappointing to watch unfold.”
It also happened to be densely cited with hyperlinks.
The author has collected more than enough entropy to single out Mr. Back, especially when the anonymity set of who could be Satoshi is so small.
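The back-of-the-envelope arithmetic, with made-up base rates for how often each marker would appear in a random candidate's writing:

```python
import math

# Hypothetical base rates for each stylistic marker among plausible Satoshi
# candidates; each independent match contributes -log2(p) bits of evidence.
markers = {
    'calls big blocks "dangerous"':        0.05,
    'uses "widespread consensus"':         0.10,
    'uses "consensus rules"':              0.10,
    'same policy position, same weeks':    0.02,
}
bits = sum(-math.log2(p) for p in markers.values())
# Roughly 16-17 bits total: more than enough to single out one person
# in an anonymity set of at most a few thousand early cypherpunks.
```

The independence assumption is doing a lot of work here (people who share a position tend to share vocabulary), which is why any single marker proves little but the conjunction is hard to dismiss.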
It's either Back or someone who tried to frame him, long before Bitcoin was even remotely successful. Generally, framing someone like this is a poor strategy because it places you in the person's radius as opposed to being absolutely anyone.
It only makes sense to rent out tokens if you aren't able to get more value from them yourself.
I would go a step further and posit that when things appear close, Nvidia will stop selling chips (while appearing to continue by selling a trickle), and Google will similarly stop renting out TPUs. Both signals may be muddled by private chip production numbers.
> We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This stands in contrast to mainstream reasoning models that scale up compute by producing more tokens. Unlike approaches based on chain-of-thought, our approach does not require any specialized training data, can work with small context windows, and can capture types of reasoning that are not easily represented in words. We scale a proof-of-concept model to 3.5 billion parameters and 800 billion tokens. We show that the resulting model can improve its performance on reasoning benchmarks, sometimes dramatically, up to a computation load equivalent to 50 billion parameters.
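The core idea in toy form (the dimensions and architecture here are made up; the point is that the loop count `r`, not the token count, is what scales test-time compute):

```python
import torch
import torch.nn as nn

class LatentReasoner(nn.Module):
    """Toy sketch: embed the input, iterate a recurrent block r times in
    latent space, decode once. r can be raised at inference time to spend
    more compute without emitting any extra tokens."""
    def __init__(self, d=64, vocab=100):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.core = nn.Sequential(nn.Linear(2 * d, d), nn.GELU())
        self.head = nn.Linear(d, vocab)

    def forward(self, tokens, r=4):
        e = self.embed(tokens).mean(dim=1)   # crude pooled input embedding
        s = torch.zeros_like(e)              # latent reasoning state
        for _ in range(r):                   # unroll the recurrent block r times
            s = self.core(torch.cat([s, e], dim=-1))
        return self.head(s)
```

Same weights, same interface, arbitrary depth at test time; that's the contrast with chain-of-thought, where extra compute has to be spent as extra tokens.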
There are also interesting approaches to more directly compress a large document or an entire codebase into a smaller set of tokens without getting the LLM to wing it. For example, Cartridges: <https://hazyresearch.stanford.edu/blog/2025-06-08-cartridges>
They basically get gradient descent to optimize the KV cache while freezing the network.
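In toy form, assuming a model that exposes some way to compute a loss against an injected cache (the `loss_with_cache` API, shapes, and hyperparameters here are my invention, not the paper's):

```python
import torch

def train_cartridge(model, qa_batches, n_slots=256, d=64, steps=100):
    """Freeze the network; treat a small KV cache ("cartridge") as the only
    trainable parameters, optimized so the model can answer questions about
    a long document from the cache alone."""
    for p in model.parameters():
        p.requires_grad_(False)                       # network weights frozen
    kv = torch.randn(n_slots, d, requires_grad=True)  # the cartridge
    opt = torch.optim.Adam([kv], lr=1e-2)
    for _ in range(steps):
        for questions, answers in qa_batches:
            loss = model.loss_with_cache(questions, answers, cache=kv)  # assumed API
            opt.zero_grad()
            loss.backward()                            # gradients flow only into kv
            opt.step()
    return kv.detach()
```

The trained cartridge is tiny compared to the document's full KV cache, which is the whole point: amortize the compression once, reuse it across queries.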
There may be server workloads for which the L3 cache is sufficient; it would be interesting whether it makes sense, at scale, to build boards with just the CPU and no memory.
I imagine for such a workload you could always solder a small memory chip, which avoids both wasting L3 on rarely touched data and a non-standard boot process, so probably not.
Most definitely. I work in finance, and optimizing workloads to fit entirely in cache (and not use any memory allocations after initialization) is the de facto standard for writing high-perf / low-latency code.
Lots of optimizations happening to make a trading model as small as possible.
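The no-allocations-after-init discipline in its simplest form (Python only illustrates the shape of the pattern, not the cache behavior; in practice this would be C++ with a contiguous fixed-size buffer):

```python
import array

class Ring:
    """Fixed-capacity ring buffer: all memory is allocated up front, so the
    hot path only overwrites slots in place and never touches the allocator."""
    def __init__(self, capacity):
        self.buf = array.array('d', [0.0] * capacity)  # one upfront allocation
        self.cap = capacity
        self.n = 0

    def push(self, x):
        self.buf[self.n % self.cap] = x  # overwrite in place, no allocation
        self.n += 1
```

Small contiguous structures like this are also what lets the working set stay cache-resident in the first place.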