lauriewired's comments

1. Not that I can think of, due to the core split. It really has to be independent cores racing independent loads. Anything clever you could do with kernel modules, page-table-land, or dynamically reacting via PMU counters would likely cost microseconds, far more than the 10s-100s of nanoseconds you gain.

What I wished I had during this project is a hypothetical hedged_load ISA instruction: issue two requests to two memory controllers and drop the loser. That would let the strategy work on a single thread! Or, even better, integrate the behavior into the memory controller itself, which would be transparent to all software without recompilation. But you’d have to convince Intel/AMD/someone else :)
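A rough software-level sketch of the "issue two requests and drop the loser" idea, assuming Python threads as a stand-in for independent cores. The names `read_replica` and `hedged_read` are hypothetical, and the sleeps are made-up delays; this only shows the racing pattern, not anything at nanosecond scale.

```python
import concurrent.futures
import time

def read_replica(label, delay_s):
    """Hypothetical stand-in for a load served by one memory channel."""
    time.sleep(delay_s)  # pretend this copy took delay_s to respond
    return label

def hedged_read(replicas):
    """Issue every read at once and return whichever finishes first."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(read_replica, label, delay) for label, delay in replicas]
    done, _ = concurrent.futures.wait(
        futures, return_when=concurrent.futures.FIRST_COMPLETED
    )
    pool.shutdown(wait=False)  # do not block on the loser
    return next(iter(done)).result()

# Copy B answers sooner, so its result is the one that is kept.
winner = hedged_read([("copy A", 0.2), ("copy B", 0.01)])
```

The loser still runs to completion in the background here; a real hedged_load instruction would presumably discard it in hardware.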

2. It’s called a “smokeninja”. Fairly popular in product photography circles, it’s quite fun!


> Or, even better, integrating the behavior into the memory controller itself, which would be transparent to all software without recompilation.

Yeah it would be neat to just flip a BIOS switch and put your memory into "hedge" mode. Maybe one day we'll have an open source hardware stack where tinkerers can directly fiddle with ideas like this. In the meantime, thanks for your extensive work proving out the concept and sharing it with the world!


If you're able to do it at the memory controller level, would it be as simple as making two controllers always operate in lock-step, so their refresh cycles are guaranteed to be offset 50% from one another?
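A toy model of that 50% offset, assuming a fixed refresh interval and a fixed busy window per refresh. The two constants are illustrative assumptions, and the model ignores deferred refreshes entirely.

```python
T_REFI_NS = 7800  # assumed interval between refreshes, in nanoseconds
T_RFC_NS = 350    # assumed time one refresh keeps a channel busy

def in_refresh(t_ns, offset_ns, period_ns=T_REFI_NS, busy_ns=T_RFC_NS):
    """True if a channel whose refreshes start at offset_ns is busy at t_ns."""
    return (t_ns - offset_ns) % period_ns < busy_ns

def both_busy_count(offset_b_ns, horizon_ns=5 * T_REFI_NS):
    """Count the nanoseconds in which both channels are refreshing at once."""
    return sum(
        1
        for t in range(horizon_ns)
        if in_refresh(t, 0) and in_refresh(t, offset_b_ns)
    )

aligned = both_busy_count(0)                 # refreshes coincide
staggered = both_busy_count(T_REFI_NS // 2)  # offset by half a period
```

With these numbers the aligned case leaves both channels busy for the full refresh window every period, while the staggered case never has both busy at the same instant.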

Given that the controller can already defer refresh cycles, and the logic to determine when that happens sounds fairly complex, I suspect that might already be in CPU microcode.

...which raises the tantalizing possibility that this lockstep-mirrored behavior might also be doable in microcode.


Is there a reason you can think of why AMD, Intel etc. would not want to do this?

Really enjoyed the video and feel that I (not being in the IT industry) better understand CPUs and RAM now.


I cannot think of any reason they would not want to do it.

However, I do see at least 2 downsides to this method.

Number one, it needs at least 2x the memory. Memory has long been a large part of a computer's cost. But I could see some people saying 'whatever, buy 8x'.

The second is data coherency. In a read-only environment this would work very nicely. In a write environment this means 2x the writes, and you have to either wait for all of them to complete or somehow mark the lagging copies as not ready for the next group of reads. That would be fine if the read of that page came some time after the write, but it is another place where things could stall.
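To make the write-side concern concrete, here is a minimal sketch assuming each mirrored location carries a per-copy "ready" mark. `MirroredCell` and its methods are hypothetical names for illustration, not how any real controller works.

```python
class MirroredCell:
    """One logical value stored in several copies, each with a ready mark."""

    def __init__(self, copies=2):
        self.values = [None] * copies
        self.ready = [False] * copies
        self.pending = None
        self.writes_issued = 0

    def begin_write(self, value):
        # A new write invalidates every copy until it lands there.
        self.pending = value
        self.ready = [False] * len(self.values)

    def complete_write(self, copy_index):
        # Each copy costs its own write: N copies means N writes.
        self.values[copy_index] = self.pending
        self.ready[copy_index] = True
        self.writes_issued += 1

    def read(self):
        # Only copies marked ready may serve a read.
        for value, ok in zip(self.values, self.ready):
            if ok:
                return value
        raise RuntimeError("no copy is ready yet; the read must wait")
```

Between `begin_write` and the first `complete_write`, a read has nowhere to go, which is the stall described above; until every copy is ready there are also fewer copies to hedge across.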

Really liked her vid. She explained it very nicely. She exudes that sense of joy I used to have about this field.


Nope, there isn’t a tradeoff; median latency isn’t affected. I don’t think you understand the code. The p50 is identical between a single read and the hedged strategy.
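A small simulation of why the median can stay put, assuming a constant base latency with rare, independent stalls. All three constants are invented for illustration and are not measurements from the project.

```python
import random

BASE_NS = 80       # assumed latency of an unstalled read
STALL_NS = 300     # assumed extra delay when a read hits a stall
STALL_PROB = 0.05  # assumed chance that any one read is stalled

def one_read(rng):
    """Latency of a single read under the assumptions above."""
    return BASE_NS + (STALL_NS if rng.random() < STALL_PROB else 0)

def percentile(samples, q):
    """Simple percentile: the value at rank q in the sorted samples."""
    ordered = sorted(samples)
    return ordered[int(q * (len(ordered) - 1))]

def simulate(n=100_000, seed=1):
    """Return latencies for single reads and for 2-way hedged reads."""
    rng = random.Random(seed)
    single = [one_read(rng) for _ in range(n)]
    hedged = [min(one_read(rng), one_read(rng)) for _ in range(n)]
    return single, hedged

single, hedged = simulate()
p50_single, p50_hedged = percentile(single, 0.5), percentile(hedged, 0.5)
p99_single, p99_hedged = percentile(single, 0.99), percentile(hedged, 0.99)
```

Under these assumptions most reads are already unstalled, so taking the faster of two changes nothing at p50; the difference only shows up in the tail.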

The clflush is there because the technique targets data that will miss the cache anyway. If your working set fits in L1, you don’t need this.

Also, AWS Graviton instances absolutely do not expose per-channel memory controller counter PMUs. That’s why you have to use timing-based channel discovery.

The IBM z-system is neat! But my technique works on commodity hardware in userspace, and you can sacrifice only half the space if you accept 2-way instead of 8+ way hedging. It’s entirely up to you how many channel copies you want to use.
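The space side of that choice is simple arithmetic. A sketch, assuming stalls on different channels are independent and using an invented per-read stall probability; `hedge_tradeoff` is a hypothetical helper.

```python
def hedge_tradeoff(copies, stall_prob=0.05):
    """Return (usable fraction of memory, chance that every copy is stalled).

    stall_prob is an assumed, illustrative per-copy stall probability.
    """
    usable_fraction = 1 / copies
    all_stalled = stall_prob ** copies
    return usable_fraction, all_stalled

for copies in (1, 2, 4, 8):
    usable, all_stalled = hedge_tradeoff(copies)
    print(f"{copies}-way: {usable:.3f} of memory usable, "
          f"all copies stalled with probability {all_stalled:.2e}")
```

Going from 1 to 2 copies costs half the capacity, and each further doubling halves the usable space again.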

Your reply was quite rude, but I hope this is informative.


I was just trying to reconcile his reply with the charts. Have you tested how this scales down for smaller systems, as one might find on the management side of a network switch?


[flagged]


You were rude for absolutely no reason. You could point out where you think the article comes short and make suggestions on how to improve it. With this approach, you achieved nothing.

Being competent requires being knowledgeable AND getting things done. You might be knowledgeable, but you need to learn how to work with other people.


You were rude. Be nice or don't post.


I really hope you one day re-read these comments and understand just how horrible they are. For absolutely no reason.

So yeah, you will be 'tone-policed' because you're clearly a very rude person.


This is correct. In the majority of cases I have to rely on my own expertise.

It's useful for the automation of small repetitive tasks here and there. I was never expecting it to gain the traction that it did; anyone saying they expect it to replace reverse engineers (it won't) is wildly misunderstanding the original intent.

Quite trivial to create binaries that massively confuse LLMs!


I bet red herrings are effective


I wonder if renaming variables to all reference a single movie or book (go through the exe and rename each new variable to the next word or letter in Monty Python's Holy Grail) would do anything.


This is my own channel, but I made a 10+ part series on modern ARM assembly you may find interesting. I used CPUlator for the demonstrations, which is a nice way to inspect the memory as well as the individual registers as you are running a program.

All runs in the browser:

https://youtube.com/playlist?list=PLn_It163He32Ujm-l_czgEBhb...


Thanks for your work on this. I’ve bookmarked several of these videos and used them as reference.

Learning assembly with a really good visualizer or debugger in hand is highly underrated; just watching numbers move around as you run your code is more fun than it has any right to be.

I really like Justine Tunney’s blinkenlights program. (https://justine.lol/blinkenlights/)

A version of that for AArch64 / RISC-V would be really cool.


> If you can't make CPUs and you can't keep the internet up, where are you going to get the equipment for enough "private peering or Sat links" for the privileged?

Storage. You only need a few hundred working systems to keep a backbone alive. Electron migration doesn’t kill transistors if they are off and in a closet.

> You need CPUs to build optical media drives! If you can't build CPUs you're not using optical media in 30 years.

You don’t need to make new drives; there are already millions of DVD/Bluray devices available. The small microcontrollers on optical drives are on wide node sizes, which also make them more resilient to degradation.

> they're definitely f-ing going to have been able to repeat all the R&D to build a 68k CPU in 30 years (and that's assuming you've destroyed all the literature and mind-wiped everyone with any knowledge of semiconductor manufacturing).

If you read the post, the scenario clearly states “no further silicon designs ever get manufactured”. It’s a thought experiment, nothing more.


> If you read the post, the scenario clearly states “no further silicon designs ever get manufactured”. It’s a thought experiment, nothing more.

This kind of just breaks the thought experiment, because without the "why?" of this being vaguely answered, it makes no sense. How do you game out a thought experiment that starts with an assumption that humanity just randomly stops being humanity in this one particular way? What other weird assumptions are we meant to make?


If you don't like the rules of the game, you don't have to play it.


But, this is as if people said “well, I can’t carry the soccer ball in my hands, so I’ll carry it with my elbows instead.”


It's not that complicated, you just literally choose to not participate in the thought experiment and you move on with your life.


OK, no silicon. But we might be just fine after all. Just yesterday we had a story about bismuth transistors that are better in every way than silicon ones. Maybe a tad more expensive. There are plenty of other semiconductors out there too. We’ll have to adjust manufacturing, but it will probably mean skipping just one upgrade cycle. Even with a complete mind wipe it’s still not that bad if only silicon is out.


It takes a bit of curation, but I find substack's algorithm to be quite good at recommending other bloggers I'd be interested in.

It's also pretty trivial to find what writers other bloggers enjoy based on the "reads" list tab. My algorithm is:

-> Find blogger you like -> Check their substack "reads" for other writers -> Repeat
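That loop is essentially a breadth-first walk over "reads" lists. A minimal sketch, assuming the lists have already been collected by hand into a dictionary; `discover`, `reads_of`, and the writer names are all hypothetical, and no Substack API is involved.

```python
from collections import deque

def discover(start, reads_of, max_depth=2):
    """Walk outward from one writer through "reads" lists, breadth first."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        writer, depth = queue.popleft()
        if depth == max_depth:
            continue  # stop expanding past the chosen number of hops
        for other in reads_of.get(writer, []):
            if other not in seen:
                seen.add(other)
                found.append(other)
                queue.append((other, depth + 1))
    return found

# Hypothetical, hand-collected "reads" lists.
reads = {"a": ["b", "c"], "b": ["d", "a"], "c": [], "d": ["e"]}
suggestions = discover("a", reads)
```

The `seen` set keeps writers who read each other from sending the walk in circles.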


The three “flaws” that this post lists are exactly what the industry has been moving away from for the last decade.

Arm’s SVE and RISC-V’s vector extension are both vector-length-agnostic. RISC-V’s implementation is particularly nice: you only have to compile for one code path (unlike AVX, with its need for fat-binary if/else trees).
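A sketch of what vector-length-agnostic means in practice, written in Python rather than assembly. The loop asks for as many elements as the "hardware" offers on each pass, in the spirit of RISC-V's vsetvli; `vector_add` and `hw_vector_len` are hypothetical names for illustration.

```python
def vector_add(a, b, hw_vector_len):
    """Add two equal-length lists in chunks sized by the hardware vector length."""
    out = []
    i = 0
    n = len(a)
    while i < n:
        # Process as many elements as the hardware allows, or what is left.
        vl = min(hw_vector_len, n - i)
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
    return out

a = list(range(10))
b = [10] * 10
narrow = vector_add(a, b, hw_vector_len=4)   # small vector unit
wide = vector_add(a, b, hw_vector_len=16)    # large vector unit
```

The same loop produces the same result whatever the vector width, which is why one compiled code path is enough.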


I touch on this briefly in the video. Besides Claude Desktop, 5ire is a fairly model-agnostic local MCP client, and I'm sure there are others.

sama also recently mentioned ChatGPT Desktop is getting MCP client functionality "soon".

As for remote clients, Cloudflare has some really useful tooling, look at their "AI Playground".


That's just my natural speaking voice. I'm a small person, and everyone sounds different.

I'd be happy to focus on the tool, or the content of the channel, rather than how I sound.


It’s more like LLM-optional.

Malimite is first and foremost intended to be a tool to help Reverse Engineer iOS/Mac binaries, much like JADX for Android.

As it turns out, LLMs are quite good at “converting” C-Pseudocode into an approximation of the original Swift or Objective-C code. Therefore, you can optionally use the LLM extension to help analysis.

Of course, it’s not 100% accurate, but the output is significantly easier to read, and I find it saves hours of manual research.

