Hacker News | nkko's comments

Yes, this is frustrating, but it doesn’t occur in CC. I ran the conversation logs and the opencode source through an agent, and it identified an issue in opencode’s reasoning implementation for Zai models. Consequently, I stopped digging and opted to use CC instead.

What happened in 2022?

Starting in 2022, Statamic became the winner.

This was an interesting postmortem for me. As a regular user, February definitely felt a bit shakier than usual.

I ran into the unicorn error page a few times and had intermittent issues with PR pages, plus a bunch of weirdness. Nothing long-lived, but enough transient failures that my overall feeling was that things were degrading in general.


was listening to music while coming home with groceries, simultaneously juggling stuff to open the doors and change the track with Siri (the only use I have for Siri)


FWIW I work at Steel (not the OP). While we’ve been iterating on the “right shape” for agent tooling, I’ve been building a benchmark harness to measure how different surfaces affect real web task completion: raw API context, CLI-only, opinionated “skills” (structured outputs + artifact capture), and combinations.

If you’ve run agents on the open web, I’d love suggestions for nasty-but-representative workflows to include in the benchmark.
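To make the comparison concrete, here is a hypothetical sketch of what such a harness could look like: each "surface" is modeled as a callable that attempts a task, and the harness reports completion rate per surface over a shared task list. The `Task`/surface interface and all names here are made up for illustration, not Steel's actual code.

```python
# Hypothetical benchmark-harness sketch: compare agent "surfaces"
# (raw API context, CLI-only, skills, ...) on the same task list.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str
    # Predicate deciding whether the surface's output completed the task.
    check: Callable[[str], bool]


def run_benchmark(surfaces: Dict[str, Callable[[Task], str]],
                  tasks: List[Task]) -> Dict[str, float]:
    """Return the completion rate per surface over the shared task list."""
    rates: Dict[str, float] = {}
    for name, run in surfaces.items():
        done = sum(1 for t in tasks if t.check(run(t)))
        rates[name] = done / len(tasks)
    return rates


# Toy usage with stub surfaces standing in for real agent runs:
tasks = [Task("checkout-flow", lambda out: "ok" in out)]
surfaces = {
    "cli-only": lambda t: "ok",
    "raw-context": lambda t: "failed",
}
print(run_benchmark(surfaces, tasks))
```

The point of keeping the surface behind a plain callable is that the same task set can be replayed against every surface variant, so differences in completion rate are attributable to the surface rather than the tasks.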


This rings true, as I’ve noticed that with every new model update I’m leaving behind full workflows I’ve built. The article is really great, and I do admire the system, even if it is overengineered in places, but it already reads like last quarter’s workflow. Now letting Codex 5.3 xhigh chug for 30 minutes on my super long dictated prompt seems to do the trick. And I’m hearing 5.4 is a meaningfully better model. Also, for fully autonomous scaffolding of new projects towards the first prototype, I have my own version of a very simple Ralph loop that gets fed a gpt-pro super spec file.
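A "Ralph loop" in this sense is just re-feeding the same spec to the agent run after run until it reports the prototype is done. A minimal sketch, assuming a `run_agent` callable that wraps whatever agent CLI or API you use (a made-up interface, not the commenter's actual setup):

```python
# Minimal Ralph-loop sketch: keep handing the agent the same spec
# until its output contains a completion marker, or a cap is hit.
from typing import Callable


def ralph_loop(spec: str,
               run_agent: Callable[[str], str],
               done_marker: str = "DONE",
               max_iters: int = 50) -> int:
    """Return the iteration count on success, or -1 if the cap was hit."""
    for i in range(1, max_iters + 1):
        output = run_agent(spec)
        if done_marker in output:
            return i
    return -1
```

The loop itself carries no state; everything the agent needs to make progress has to live in the spec file and whatever workspace the agent mutates between runs, which is what makes such a dumb loop work at all.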


For no special reason, besides that I could, I’ve slop-coded this ephemeral-VM orchestrator for AI agents, which I use inside any agent to manipulate and maintain my coding VMs on Proxmox. It could probably make sense to simplify it further and move from Proxmox to something like this. Link: https://github.com/nibzard/agentlab


This is exciting. But I had to read and check everything twice to figure it out, as others have already commented. A strong feedback loop is the ultimate unlock for AI agents, and having twins is exactly the right approach.


YOOO thanks niko! Currently reworking lots of wording to make it easier to understand!


For sure! Just ask "why" enough times and you will find the root. The main issue is how many people actually do that, and how this is becoming even more critical now.


> Just ask enough times "why" and you will find the root.

Once was enough. The root is lack of concern for security, and for GH terms of use.


That magic has now moved to the ESP32.

