Our experience has been that without a good harness you don't really get much out of codex/claude. And you really need to spend time and energy figuring out why coding agents can't find bugs like you can.
Every week I see bugs (as an auditor) that our own harness (https://zkao.io/) can't find, and we have to figure out pretty interesting techniques in order to make the tool find them. Mind you, I'm talking mostly about cryptographic vulnerabilities, not just webapp bugs. So IMO it's going to make a lot of sense for companies to both have their own harness (as tptacek is talking about) and pay for services that focus on making a good harness from experience (and audit firms are going to be the best at doing this, as they see a lot of bugs and can spend time "teaching" their harness about these bugs).
On the other hand, you have to find equally good techniques to triage, because otherwise you just have some machinery that I call "vibe auditing" that produces enough false positives to tire out all the developers (who are already overwhelmed with crappy AI submissions in bug bounties and other AI tools that review all of their PRs).
At the end of the day, when your harness doesn't return any bugs, you're left wondering "does it mean there are no bugs?" We're basically back in this reputation game, where you want to use the best tool, or the best team (that knows what the best tools are), and need to figure out which one that is.
I read all the Persepolis comics a long time ago, and as far as I remember it was the first time I cried reading a comic. A beautiful work of art. I would recommend that anyone reading this comment order the first book.
What annoys me the most is that I can’t efficiently track my emails with the default. It’s unusable imo if you have a lot of emails. What I ended up doing was to disable read on preview and enable shortcuts, so you can navigate with vim shortcuts and have to manually mark emails as read.
I dislike this argument because it’s about limiting the most powerful technology we ever invented because it doesn’t fit well with how we established some social structures.
Solving one of the most famous Erdős problems, one that has remained unsolved for 80 years, without using tools like Lean but instead a giant reasoning block is quite a lot more than "kinda nothing".
What are you referring to when you refer to the technology of agriculture? Like John Deere's latest tractor? GMOs? The shift from hunter-gatherer to agrarian society?