
zmj · 2026-01-22T05:32:28 1769059948

Mid-level scissor statement?

zmj · 2026-01-20T04:42:35 1768884155

I wrote something fiction-ish about this dynamic last year: https://zmj.dev/author_assistant.html

zmj · 2026-01-17T03:07:35 1768619255

At scale, every test is flaky.
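A rough sketch of the arithmetic behind that claim, assuming each run fails spuriously with a small, independent probability `p` (the rates below are made-up illustrations, not measurements): the chance of at least one spurious failure across `n` runs is `1 - (1 - p)^n`, which approaches 1 as `n` grows.

```python
def p_any_flake(p_single: float, n_runs: int) -> float:
    """Probability that at least one of n independent runs fails spuriously.

    p_single is an assumed per-run flake rate; independence is also an assumption.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be in [0, 1]")
    if n_runs < 0:
        raise ValueError("n_runs must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_runs


if __name__ == "__main__":
    # Hypothetical flake rate of 1 in 10,000 per run.
    for n in (1, 100, 10_000, 1_000_000):
        print(f"{n:>9} runs: {p_any_flake(1e-4, n):.4f}")
```

Even a test that flakes once in ten thousand runs is more likely than not to fail at least once after ten thousand runs.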

m3047 · 2026-01-17T21:04:55 1768683895

Well, strictly anecdata but there was a time when I worked on code which for $reasons had no dedicated test environment, and externalities were not our problem according to management. So we were reaching out to the big bad world for testing, but we didn't own the network between it and us. I ended up writing tests for the network. Caught a lot of problems with the network. Didn't make any friends.

DISCLAIMER: I've been around IT for probably the majority of y'all's lifetimes, so I'm not saying this happens often. But just because something is fundamentally wrong, doesn't mean that all fundamentally wrong things are the same. In my experience they differ more from each other than the possible good ways of doing the same thing. Don't conflate things without a good reason.

gwbas1c · 2026-01-17T13:46:38 1768657598

Oh no, a well-written unit test is 100% deterministic.

It's once you start doing tests in real environments with real databases and real networks that the unpredictability of the real world creeps in!
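A minimal sketch of what "deterministic" tends to mean in practice, assuming the usual trick of injecting the sources of nondeterminism (clock, randomness) rather than reaching for the real ones. The function names here are hypothetical, made up for illustration:

```python
import random
from datetime import datetime, timedelta, timezone
from typing import Callable


def make_token(rng: random.Random, length: int = 8) -> str:
    """Build a token from an injected RNG, so a seeded RNG gives repeatable output."""
    alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
    return "".join(rng.choice(alphabet) for _ in range(length))


def is_expired(issued_at: datetime, ttl: timedelta,
               now: Callable[[], datetime]) -> bool:
    """Check expiry against an injected clock instead of the wall clock."""
    return now() - issued_at >= ttl


def test_is_expired_with_fixed_clock() -> None:
    fixed = datetime(2026, 1, 17, tzinfo=timezone.utc)
    assert is_expired(fixed - timedelta(hours=2), timedelta(hours=1), lambda: fixed)
    assert not is_expired(fixed - timedelta(minutes=30), timedelta(hours=1), lambda: fixed)


def test_make_token_is_repeatable() -> None:
    # Same seed, same token: no dependence on global random state.
    assert make_token(random.Random(42)) == make_token(random.Random(42))
```

Swap the fixed clock for `datetime.now` and the seeded RNG for a real database or network call, and the same test starts depending on things it doesn't control.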

zmj · 2026-01-14T01:23:18 1768353798

devcontainers, devcontainers, devcontainers

bigwheels · 2026-01-14T01:33:44 1768354424

Totally, devcontainers are fantastic! In this agent sandboxing space there's also Leash, which in addition to Docker/Orbstack/Podman provides a sophisticated macOS-native system extension mode - https://github.com/strongdm/leash

binsquare · 2026-01-14T01:42:45 1768354965

I don't think containers are enough, especially for the security side of things.

IMO microVMs + dev containers seem like a good fit, though.

zmj · 2026-01-12T22:49:06 1768258146

My experience with agents in larger / older codebases is that feedback loops are critical. They'll get it somewhere in the neighborhood of right on the first attempt; it's up to your prompt and tooling to guide them to improve it on correctness and quality. Basic checks: can the agent run the app, interact with it, and observe its state? If not, you probably won't get working code. Quality checks: by default, you'll get the same code quality as the code the agent reads while it's working; if your linters and prompts don't guide it towards your desired style, you won't get it.

To put that another way: one-shot attempts aren't where the win is in big codebases. Repeated iteration is, as long as your tooling steers it in the right direction.
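Roughly the shape of that loop, as a sketch rather than any particular tool's API. `Agent`, `Check`, and `iterate` are hypothetical names; in practice the agent would be a model call and the checks would be your build, tests, and linters:

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class CheckResult:
    name: str
    ok: bool
    output: str  # what gets fed back to the agent on failure


# Hypothetical interfaces: a check inspects a candidate change; an agent
# produces a new candidate from the task plus the previous round's failures.
Check = Callable[[str], CheckResult]
Agent = Callable[[str, Sequence[CheckResult]], str]


def iterate(task: str, agent: Agent, checks: Sequence[Check],
            max_rounds: int = 5) -> Tuple[str, List[CheckResult]]:
    """Ask for a candidate, run every check, feed failures back, repeat."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    failures: List[CheckResult] = []
    candidate = ""
    for _ in range(max_rounds):
        candidate = agent(task, failures)
        results = [check(candidate) for check in checks]
        failures = [r for r in results if not r.ok]
        if not failures:
            break  # all checks pass; stop iterating
    # A non-empty failures list means the round budget ran out first.
    return candidate, failures
```

The first call gets no feedback, which is the one-shot case; everything after that is only as good as what the checks report back.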

zmj · 2026-01-10T14:14:13 1768054453

A friend of mine built something for this: https://howmuch.poiesic.com/

zmj · 2025-12-30T22:09:28 1767132568

I wrote my first one of these today (making a tool for agents in a C# codebase). Pretty good experience, though AOT does still have some rough edges.

zmj · 2025-12-30T03:35:52 1767065752

Yes-ish. It's worth keeping up with the rising tide of model capabilities, but it's not worth stressing over eliciting every last drop. Many of the specific techniques that add value today will be wasted effort with smarter models in a month or two.

zmj · 2025-12-27T15:05:24 1766847924

devcontainers, without credentials to the git remote.

zmj · 2025-12-25T22:43:13 1766702593

Those collections are more like copy-on-write than actual immutable. System.Collections.Frozen is the real thing.

rawling · 2025-12-26T12:07:06 1766750826

Isn't Frozen something you do to a set or dictionary to say, I'm not going to add any more values, please give me a version of this which is optimized for lookup only?
