
kaoD · 2026-05-23T07:43:59 1779522239

But that's not very informative.

Levenshtein distance is not only a well-understood problem, it's small, self-contained, and extremely well-represented in the training data. It's the kind of problem where even small/bad models can excel. The gold standard for those tasks is just "use a library", so no wonder the beefy models are expensive: you're chartering a commercial airplane to go grocery shopping.

My personal benchmarks are software engineering tasks (ideally spanning multiple packages in a monorepo) composed of many small decisions that, compounded, make or break the implementation and long-term maintainability.

That's where even frontier models struggle, which makes comparisons meaningful.

CraigJPerry · 2026-05-23T08:27:24 1779524844

>> many small decisions

It’s making guesses, not decisions. Framing them as decisions will lead you astray and waste time and tokens.

It’s vaguely productive to tell them a ton of relevant info upfront attempting to minimise their need for load bearing guesses. I say vaguely because obedience is generally only around the level where it's good enough to lull you into a false sense of security, not to actually be obedient.

It’s a bit more productive to use the various loop mechanisms (hooks, /goal etc) to evaluate each end of turn against guard rails and reject with clear instructions on what’s unacceptable. Obviously, if you only do this without front-loading the info, you’re likely to spend more tokens to reach a satisfactory end of iteration.

kaoD · 2026-05-23T08:48:15 1779526095

If I perfectly know all the guardrails I need, I don't need an LLM, only Prolog.

mark_l_watson · 2026-05-23T12:47:35 1779540455

While you are correct that something like Antigravity 2 + Opus 4.6 can handle large scale software engineering tasks, I would argue that it is usually (but not always) better "coding agent hygiene" to work on smaller code modules and as the human in the loop be a partner, not someone who prompts and then disengages.

Breaking code up into composable chunks has worked well for me over 50+ years as a professional software developer, and I can't get away from the idea that it is still usually the way to go using agentic coding tools.

kaoD · 2026-05-21T08:53:59 1779353639

> Whenever a central position is formed with power over something, even if it’s only a steering power, it will be sought out by power-hungry people and manipulated

The inevitable "iron law of oligarchy".

https://en.wikipedia.org/wiki/Iron_law_of_oligarchy

kaoD · 2026-05-20T22:44:17 1779317057

That's how you get date bugs.

kaoD · 2026-05-20T22:38:59 1779316739

Ah, that's where you're wrong. There is no long term. Investors want results now. "Later" is for the greater fools.

kaoD · 2026-05-20T09:20:00 1779268800

> I was surprised to find out how much hate there is for AI in art.

I'm surprised you're surprised.

Giant corpos steal work from millions of independent artists and the State ruled that IP laws didn't apply to them, only to us common mortals.

kaoD · 2026-05-18T22:35:20 1779143720

Is the token budget also there? I assume not, or they'd be multiple orders of magnitude negative.

kaoD · 2026-05-15T11:19:54 1778843994

> nobody is sitting their waiting / watching the LLM code anyway

My personal experience is that for production-grade code you need to steer the agent more often than not... so yes, at least some of us are watching the LLM code.

kaoD · 2026-05-10T14:30:59 1778423459

> Don't do that, and this problem evaporates.

Don't do that, and you solved nothing.

Either I'm missing what you mean, or half the comments here are missing the point of idempotency.

Let's say your server received this request twice within one minute:

    {
      items: [ { id: 123, amount: 1 } ],
      creditCardInfo: { ... }
    }

How can you tell from the server if that's a retry (think e.g. some reverse proxy crashed and the first request timed out, but the payment already went through to the user's CC)... or if the user is just trying to purchase another item 123 because they forgot they needed 2?

There is simply no way to make the requests idempotent without an idempotency key. The only way to tell both situations apart is to key the requests by some UID. The HTTP verb is irrelevant.
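
To make that concrete, here is a minimal Python sketch of a server that deduplicates by idempotency key. Everything in it (`PaymentServer`, `purchase`, the in-memory `seen` dict, the response shape) is hypothetical and only for illustration; a real service would persist the keys and expire them at some point.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentServer:
    """Toy in-memory server. All names here are hypothetical."""

    charges: list = field(default_factory=list)
    seen: dict = field(default_factory=dict)  # idempotency key -> stored response

    def purchase(self, idempotency_key: str, payload: dict) -> dict:
        # Same key means "this is a retry": replay the stored response
        # instead of charging the card again.
        if idempotency_key in self.seen:
            return self.seen[idempotency_key]
        # New key means "this is a new purchase", even if the payload
        # is byte-for-byte identical to an earlier one.
        self.charges.append(payload)
        response = {"status": "charged", "charge_no": len(self.charges)}
        self.seen[idempotency_key] = response
        return response


server = PaymentServer()
payload = {"items": [{"id": 123, "amount": 1}]}

first = server.purchase("key-a", payload)   # original request
retry = server.purchase("key-a", payload)   # retry after a timeout: no new charge
second = server.purchase("key-b", payload)  # user really wants another item 123
```

The payloads are identical in all three calls; only the key tells the retry apart from the second purchase.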

Did I misunderstand what you meant?

randallsquared · 2026-05-10T15:31:37 1778427097

In the case of a PUT or DELETE, the key is in the URI. /custs/12345/orders/20260510T153023.239 for example.

kaoD · 2026-05-10T15:42:56 1778427776

Yes, I understand that, but I'm not sure how that changes anything?

I mean: you still have the problem regardless of following HTTP verb semantics or not.

randallsquared · 2026-05-10T16:38:55 1778431135

In the case in the article, the request is being rebuilt again by the client, and may be slightly different. Typically, the server doesn't have to care about any of that if it's just "did we get something for this ID?" and either it did and errors (could be a 4xx or a 5xx depending on what it now has), or it didn't, and processes the request.

kaoD · 2026-05-10T17:01:48 1778432508

So what you propose is first you create the request payload and POST it, which generates a request-id-bound URL (but it does nothing stateful yet) and then you actually request to perform it? Because otherwise I don't see any difference.

randallsquared · 2026-05-10T18:02:37 1778436157

If you must use POST with an idempotency-key, then my suggestion would be to use it like you'd use a PUT: you generate a guaranteed unique URI on the client, according to a spec agreed upon with the server, along with some idempotency-key value (UUID or whatever per the recommendation of the RFC), and then POST the request to the generated URI. If you get back 200 or 201, great! If you get an error that indicates it might not have worked or you get nothing because of a partition, send the request again. If the server had already processed the first request, the second one should 400 or 409 or something, regardless of any differences between the first and second request. If there was some sort of partial processing, then some other permanent error for that particular ID or URI should convey that.

My original point, though, was that these semantics are well-understood for PUT, so just use PUT, or use POST (with the idempotency-key header) exactly as you would PUT.
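
A rough Python sketch of the PUT-style flow described above, under stated assumptions: `OrderServer`, `put` and `new_order_uri` are made-up names, the URI shape loosely follows the `/custs/12345/orders/...` example with a UUID in place of the timestamp, and 409 is just one of the possible "already processed" responses.

```python
import uuid


class OrderServer:
    """Toy in-memory server. All names here are hypothetical."""

    def __init__(self):
        self.orders = {}  # URI -> body of the first request seen for it

    def put(self, uri: str, body: dict) -> int:
        # The only question the server asks: did we already get
        # something for this URI? The body is never compared.
        if uri in self.orders:
            return 409
        self.orders[uri] = body
        return 201


def new_order_uri(customer_id: int) -> str:
    # The client generates the unique URI before sending anything.
    return f"/custs/{customer_id}/orders/{uuid.uuid4()}"


server = OrderServer()
uri = new_order_uri(12345)

status_first = server.put(uri, {"items": [{"id": 123, "amount": 1}]})
# The client rebuilt the request and it came out slightly different;
# the server still rejects it because the URI was already used.
status_retry = server.put(uri, {"items": [{"id": 123, "amount": 2}]})
```

With this shape, a client that sees 409 on a retry knows the first attempt landed, and a genuinely new order simply gets a fresh URI.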

kaoD · 2026-05-02T13:45:21 1777729521

I know the individual words in the description but I'm a bit confused about what this is.

What would I use Pollen for?

I'm not sure I understand the "seed" metaphor.

sambigeara · 2026-05-02T14:02:37 1777730557

Well, that’s a good question. I think the best answer for now is “we’ll see”?

I use it in place of Tailscale for some homelab applications. I’ve started to deploy other experiments on a “prod” cluster. The demo I showed shows how Pollen responds to a multi-step pipeline type application; two WASM seeds and a single egress communicating over the provided RPC mechanism (`pln://seed…` etc) whilst handling routing, back pressure and the like.

Right now, the workloads need to be stateless. I’m coming up with a story for state at the moment, which’ll likely start as some WAL-like convergent structure with thin (KV store etc) abstractions layered over it. Probably not dissimilar from the pattern underpinning the current CRDT gossip state.

kaoD · 2026-05-02T14:05:19 1777730719

Let's see if I got this right: so it's something like a private Yggdrasil Network (minus the IPv6 overlay?) meets self-distributing WASM-powered serverless functions? Plus some built-in functions for proxying/serving.

sambigeara · 2026-05-02T16:13:12 1777738392

Ha, at a hand-wavey level, yes? Like you say, there's no IPv6 overlay, each node just exposes its own primary UDP port which talks Pollen's mesh protocol. It uses a single QUIC transport, one QUIC connection per peer, and a combination of streams and datagrams for different bits serving both the control/data layers.

I'd say "WASM-powered serverless functions" is a reasonable analogy, if your serverless functions maintained a minimal number of live replicas at any one point. Also, of course, you're tied to the physical ceiling of the explicit hosts that are underpinning your cluster (N machines which are not dynamic like, say, lambdas are when they auto-provision to match demand).

And yeah, you can also `pln serve` arbitrary services which are then exposed to the cluster, but it's worth mentioning that these will of course not benefit from the inherent, organic autoscaling and locality mechanisms that come with the WASM blobs. I only added it in as a feature so I could retire my (basic) Tailscale usage.

Also, you can `pln seed` arbitrary blobs which can be `pln fetch`ed from other nodes. You can also `pln seed ./public my-site` a static webpage which you can reach from any node with `curl -H "Host: my-site" http://<node-addr>:8080/` (8080 being a configurable port).

kaoD · 2026-05-03T00:23:01 1777767781

And what's the public API/stdlib/bindings inside the WASM workers?

I've been thinking a lot about this today and I think you might have a hidden gem here. Where can I reach you to talk more about this?

Feel free to drop an email (address in my profile)

sambigeara · 2026-05-03T07:26:26 1777793186

Ha, thanks! I'll ping you an email.

> And what's the public API/stdlib/bindings inside the WASM workers?

Wazero (via Extism) carries the load here. As it stands, the runtime lifts three basic host functions into guest code which enable the RPC-like behaviour, injection of caller-context and basic logging[1], which are in turn referenced in the guest code like in the example[2].

In the reverse direction, guest code exposes its public APIs via build directives[3], which are handled by the runtime code[4].

Figured that concrete examples might be more helpful here (I hope the formatting works).

[1] https://github.com/Sambigeara/pollen/blob/567e85d5f1407932dd... [2] https://github.com/Sambigeara/pollen/blob/567e85d5f1407932dd... [3] https://github.com/Sambigeara/pollen/blob/567e85d5f1407932dd... [4] https://github.com/Sambigeara/pollen/blob/567e85d5f1407932dd...

sambigeara · 2026-05-02T17:05:27 1777741527

Failed to mention in my other reply: a "seed" because I envisioned, perhaps too poetically, "seeding" some generic computational unit into the cluster only for it to organically spread to other nodes in the cluster... sort of like pollen? Maybe.

kaoD · 2026-04-27T21:17:32 1777324652

> You have to join the Union, after all

Uh, how? This might be a country thing but you don't have to join any union in my country. You do, if they represent your interests. Big companies have multiple, competing unions, and the anarchists (which refuse state subsidies and are fully self-funded) are pretty good at what they do.

If you have to join a union isn't that essentially a racket?

satvikpendem · 2026-04-27T22:11:27 1777327887

Of course, and that's why some people are opposed to a union, but then others say you're just falling for propaganda or some such nonsense if you deign to have any, even small, criticism of unions.

Lonestar1440 · 2026-04-27T23:06:22 1777331182

> If you have to join a union isn't that essentially a racket?

Yes, this is a big part of my critique.

I'm no lawyer and can't provide a useful explanation of the "why", but literally every educator I know is in the Educator's union. Same with Cops and Nurses. I don't know any airline pilots but I understand it's the same way.

none2585 · 2026-04-27T22:05:22 1777327522

For some jobs it is mandatory to join the union, in America anyway.
