My videos would consist mostly of: “Chris is cursing msbuild again.” “Chris is losing a table tennis game to Mike again while he waits for the build to complete.” And so forth.
I find Codex superior in speed and equal in quality, so it’s my preference. But Claude Code made prettier UIs last time I tested. Codex produces Microsoft-grade UIs: very enterprise and ugly unless I actively steer it.
Dunno. I did C++ GUI programming back in the day. It wasn’t hard. It would’ve been hard to get light/dark mode and responsive design right, probably, but in my (albeit fuzzy) memory, it wasn’t any harder than building nice web applications. The hard part was just stupid C++ nonsense that Rust and Zig and other nice things are finally here to put out of its misery.
It reminds me of my younger self when I encountered inexplicable behavior in my own software, “I think I found a bug in Firefox!”
…
“Oh, nope. I forgot to add an event handler.”
Using C++ templates wrong in the year 2000 exposed me to real compiler bugs in the Microsoft C++ compiler at the time, the kind that would make the compiler crash.
LLMs aren't nearly mature or deterministic enough to earn that distinction. I've had an agent tell me it read a link I gave it, when actually it lied. I don't see how you could possibly compare that to a compiler where thinking "maybe it's a compiler bug" means you've almost certainly missed something.
The funny thing is that I was so sensitized to this behavior that when I actually found a hardware bug in a chip, it took me forever to convince myself that the problem wasn't actually my code.
Finally contacted the manufacturer's rep, expecting to be called an idiot, only to find out that "yeah, we know about that bug. It's going to be fixed in the next revision."
We formalized that as "if you didn't find a kernel bug yesterday, you didn't find one today either" (while implicitly glaring at the Java developer who kept blaming everything but his own code). The funny thing is that we actually had one guy who found two kernel bugs (spread over a couple of years, but still) while hunting down weird product issues. We didn't think the kernel was perfect, just that "you need to have exhausted the possibilities in your code before considering blaming the kernel" was well supported by evidence...
Claude Code has gotten so bad about this that I’ve stopped using it for code reviews. I may look into wiring Claude up to Codex as an alternative LLM just to compensate.
I think the issue is that I’m running Claude Code in a container so it sees that it is root, and becomes a lot more cautious. Not sure, though.