Hacker News | zmj's comments

Mid-level scissor statement?


I wrote something fiction-ish about this dynamic last year: https://zmj.dev/author_assistant.html


At scale, every test is flaky.


Well, strictly anecdata but there was a time when I worked on code which for $reasons had no dedicated test environment, and externalities were not our problem according to management. So we were reaching out to the big bad world for testing, but we didn't own the network between it and us. I ended up writing tests for the network. Caught a lot of problems with the network. Didn't make any friends.

DISCLAIMER: I've been around IT for probably the majority of y'all's lifetimes, so I'm not saying this happens often. But just because something is fundamentally wrong doesn't mean that all fundamentally wrong things are the same. In my experience, they differ from each other more than the possible good ways of doing the same thing do. Don't conflate things without a good reason.


Oh no, a well-written unit test is 100% deterministic.

It's once you start doing tests in real environments with real databases and real networks that the unpredictability of the real world creeps in!
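A minimal sketch of the distinction (function and client names are hypothetical): the first path is fully deterministic because the dependency is a stub, while the commented-out variant would inherit every failure mode of a real network.

```python
from unittest import mock

def fetch_greeting(client):
    # Pure logic under test: formats whatever the injected client returns.
    return "hello, " + client.get_name().strip()

# Deterministic: the dependency is a stub, so there is exactly one possible outcome.
stub = mock.Mock()
stub.get_name.return_value = "  world "
assert fetch_greeting(stub) == "hello, world"

# The flaky variant would instead hit a real service, e.g. (hypothetical client):
#   fetch_greeting(HttpClient("https://example.internal/name"))
# and fail whenever the service or the network between you and it does.
```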


devcontainers, devcontainers, devcontainers


Totally, devcontainers are fantastic! In this agent sandboxing space there's also Leash, which in addition to Docker/Orbstack/Podman provides a sophisticated macOS-native system extension mode - https://github.com/strongdm/leash


I don't think containers are enough, especially for the security side of things.

Imo microVMs + devcontainers seem like a good fit though.


My experience with agents in larger / older codebases is that feedback loops are critical. They'll get it somewhere in the neighborhood of right on the first attempt; it's up to your prompt and tooling to guide them to improve it on correctness and quality. Basic checks: can the agent run the app, interact with it, and observe its state? If not, you probably won't get working code. Quality checks: by default, you'll get the same code quality as the code the agent reads while it's working; if your linters and prompts don't guide it towards your desired style, you won't get it.

To put that another way: one-shot attempts aren't where the win is in big codebases. Repeated iteration is, as long as your tooling steers it in the right direction.
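That iteration loop can be sketched like this; the agent callable and the check command are hypothetical stand-ins for whatever your tooling actually is:

```python
import subprocess

def run_checks(cmd):
    # Run the project's own feedback signal: tests, linters, a smoke script, etc.
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def iterate(agent_step, check_cmd, max_rounds=5):
    """Repeat until the checks pass or the budget runs out.

    agent_step is a hypothetical callable that takes the failing output
    and edits the codebase in response.
    """
    for _ in range(max_rounds):
        ok, output = run_checks(check_cmd)
        if ok:
            return True
        agent_step(output)  # feed the failure back in; this loop is where the win is
    return run_checks(check_cmd)[0]

# Trivial example: a no-op "agent" against a check that already passes.
assert iterate(lambda out: None, ["python", "-c", "pass"]) is True
```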


A friend of mine built something for this: https://howmuch.poiesic.com/


I wrote my first one of these today (making a tool for agents in a C# codebase). Pretty good experience, though AOT does still have some rough edges.


Yes-ish. It's worth keeping up with the rising tide of model capabilities, but it's not worth stressing over eliciting every last drop. Many of the specific techniques that add value today will be wasted effort with smarter models in a month or two.


devcontainers, without credentials to the git remote.
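A minimal sketch of that setup (name and image are illustrative). The container gets the working tree but no extra mounts, so no SSH keys or credential stores from the host; note that editor integrations like the VS Code Dev Containers extension may still forward git credentials by default, so that needs turning off on the host side too.

```jsonc
{
  "name": "agent-sandbox",
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
  // No extra mounts: ~/.ssh and credential helpers stay on the host.
  "mounts": [],
  "containerEnv": {
    // Fail fast instead of prompting if anything tries to authenticate to the remote.
    "GIT_TERMINAL_PROMPT": "0"
  }
}
```

With this, the agent can commit locally all it likes, but a push to the remote fails for lack of credentials.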


Those collections are more like copy-on-write than actually immutable. System.Collections.Frozen is the real thing.


Isn't Frozen something you do to a set or dictionary to say, I'm not going to add any more values, please give me a version of this which is optimized for lookup only?

