Hacker News | anilgulecha's comments

Incidentally pi stopped working today - under the Claude subscription ban for other harnesses. Awaiting a plugin that fixes it.

How is anthropic enforcing the ban, are there identifiers sent from harnesses?

From the Claude Code codemap leak, it seems like Claude Code is sending metadata about the binary that is sending HTTP requests

It's got a little Zig mystery blob that does the hashing. Messing with that would run afoul of DMCA anti-circumvention, right?

they put "minimal dumb DRM" into their client (supposedly, according to the people who leaked the linked source code, not me)

easy to circumvent

but it would fall under "circumventing security protections"/"hacking their API"/etc. And given the sometimes very unreasonable laws the US has in that area, they can use that to go after anyone providing a workaround.

Though that maybe won't work as well in the EU. I'm not sure how much the laws have been undermined in recent years, but we had laws which made it explicitly legal to circumvent DRM iff it's for the sake of producing compatibility (with some caveats).


I think the law just says that it's legal to circumvent DRM for compatibility - they don't define DRM or compatibility. It's one of those vague laws that you only know if it matters when it gets tested in court.

I think if you were going to send the same harness/prompt traffic as Claude Code, then you’d just use Claude Code. Alternatives generally are trying to do something different, thus are going to be easy to detect.

I wouldn’t imagine that fingerprinting them based on request patterns is very difficult.

until your account gets banned.

you can figure out the fingerprinting today, but if they change it tomorrow and wait 5 months to force-update everyone, they will catch you and ban you


They can just look at the system prompt or tool definitions.
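To make the idea concrete, here is a purely illustrative sketch (not Anthropic's actual mechanism) of how a provider could fingerprint clients from the parts of a request that a harness rarely changes, namely the system prompt and the tool definitions:

```python
import hashlib
import json

def fingerprint_request(system_prompt: str, tools: list) -> str:
    """Illustrative only: derive a stable hash from the system prompt
    and tool schema of a request. A real provider could compare this
    against known hashes of its official client."""
    canonical = json.dumps(
        {"system": system_prompt, "tools": tools},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

# Two harnesses with different prompts/tool schemas hash differently,
# even if they hit the same API endpoint.
official = fingerprint_request("You are Claude Code...", [{"name": "bash"}])
other = fingerprint_request("You are pi...", [{"name": "bash"}, {"name": "edit"}])
print(official != other)  # True
```

The function name and prompts here are made up; the point is only that stable request content makes for an easy server-side fingerprint.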

Just pay for credits :(

The best way to learn to build an agent is to learn and use pi.dev. The homepage is a masterclass in explaining the main loop with 4 tools (read, write, edit, bash) and how they enable any flow, plus Skills and AGENTS.md for how the agent can be guided.

If you understand the above, you're 80% of the way to building any purpose agent. System prompt, project prompt, and the tool call loop.
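The loop described above can be sketched in a few lines. This is a hypothetical minimal sketch, not pi's actual implementation; `llm()` stands in for any chat-completion API that can return tool calls:

```python
import subprocess

def read(path):
    with open(path) as f:
        return f.read()

def write(path, text):
    with open(path, "w") as f:
        f.write(text)
    return f"wrote {path}"

def edit(path, old, new):
    # Replace the first occurrence of `old` with `new` in the file.
    write(path, read(path).replace(old, new, 1))
    return f"edited {path}"

def bash(cmd):
    return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout

TOOLS = {"read": read, "write": write, "edit": edit, "bash": bash}

def agent(task, llm, max_steps=20):
    """llm(messages) returns either {"tool": name, "args": {...}}
    or {"done": answer}. The loop: call the model, run the tool it
    asks for, feed the result back, repeat until done."""
    messages = [{"role": "system", "content": "You are a coding agent."},
                {"role": "user", "content": task}]
    for _ in range(max_steps):
        action = llm(messages)
        if "done" in action:
            return action["done"]
        result = TOOLS[action["tool"]](**action["args"])
        messages.append({"role": "tool", "content": str(result)})
```

Everything else (skills, AGENTS.md, project prompts) is just extra context layered onto `messages` before this loop runs.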


this is really helpful!

FOSS is dead; long live FOSS.

FOSS arose around the core idea of liberating the software that ran on hardware, and was later sustained by the idea of a commons of commodity software we can build on. But with LLMs we have alternative pathways/enablement for the freedoms:

Freedom 0 (Run): LLMs troubleshoot environments and guide installations, making software executable for anyone.

Freedom 1 (Study/Change): LLMs help anyone make modifications, lowering the bar of technical knowledge required.

Freedom 2 (Redistribute): LLMs force redistribution by building specs and reimplementing if needed.

Freedom 3 (Improve/Distribute): Everyone gets the improvement they want.

As we can see, LLMs make these freedoms more democratic, beyond pure technical capability.

For those who cared only about these 4 freedoms, LLMs enable them in spades. But for those who additionally looked for the business, signalling, and community values of free software (I include myself in this), these were never guaranteed by FOSS, and we find ourselves figuring out how to make up for the losses.


Free software licensing applies to software made by humans; by definition, any LLM-generated code can only inherit the license of the prior work, since it is a derivative work. In the end, LLMs are just a source of legal trouble, and proprietary software companies will end up suing each other, while all pre-AI libre software will remain 100% legal to use.

I've been saying LLMs are more open than open source for some time...

I have to disagree. LLMs have shown that the only way to participate in the new software ecosystem is by leveraging an extremely powerful position that is created, backed, and maintained through the exploitation of capital, labor, and power (political, legal, corporate) at levels never really seen before. The model of the Cathedral and the Bazaar was not broken by LLMs; instead, the entire ecosystem was changed.

Now the software doesn't matter. The code doesn't matter. The hardware doesn't matter. Anyone can generate anything for anything, as long as they pay the fee. I think it can likely be argued that participation is now gated more than ever and will require usage of an LLM to keep up and maintain some kind of competition or even meager parity. Open weight models are not really a means of crossing the moat; none of the open weight models come close to the functionality, and all of them come from the same types of corporations that are releasing their models for unspecified reasons. The fact remains that the moat created by LLMs for open source software has never been larger.


I've been using this model for TTS... will check this out. For regular uses I find these fantastic, BTW.

IMO we need more sovereign systems like this (though this one is too simple IMO). Other sovereign systems are complex to deploy. If good FOSS commodity options come up, then we can expect hosting/deployment infra and companies to set up and offer it for non-self-hosters as well, à la WordPress.

That would simplify things, but in my opinion that is still too high a hurdle. I'm all about privacy and FOSS, but I don't self-host anything (except for my personal website).

I wish that more apps would instead move the logic to the client and use files on file syncing services as databases. Taking tasks as an example, if a task board were just a file, I could share it with you on Dropbox / Drive / whatever we both use, and we wouldn't need a dedicated backend at all.

The approach has limitations (conflict resolution, authorization, and latency are the big ones), but it is feasible and actually completely fine for lots of apps.
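The task-board example can be made concrete. This is a hypothetical sketch (the file path and schema are made up): the "backend" is just a JSON file in a synced folder, and all logic lives in the client:

```python
import json
from pathlib import Path

# Any folder a sync client watches (Dropbox, Drive, Syncthing...); assumed path.
BOARD = Path("~/Dropbox/team-board.json").expanduser()

def load_board():
    if BOARD.exists():
        return json.loads(BOARD.read_text())
    return {"columns": {"todo": [], "doing": [], "done": []}}

def add_task(board, column, title):
    board["columns"][column].append({"title": title})
    return board

def move_task(board, title, src, dst):
    task = next(t for t in board["columns"][src] if t["title"] == title)
    board["columns"][src].remove(task)
    board["columns"][dst].append(task)
    return board

def save_board(board):
    # Write to a temp file then rename, so the sync client never
    # observes a half-written board.
    tmp = BOARD.with_suffix(".tmp")
    tmp.write_text(json.dumps(board, indent=2))
    tmp.replace(BOARD)
```

Sharing the board is then just sharing the file; the conflict-resolution caveat from above shows up when two clients save concurrently and the sync service keeps one copy or forks a "conflicted copy".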


Except this is not FOSS. If it was open source, they would have chosen at least AGPL.

And also I don't think their architecture is any good for such a product.

For me personally, it would be sufficient to avoid it based on the license alone. But altogether it just looks very unappealing.


Since we have a single mind, we need "everything apps". This one shows various issues, but the idea of an everything app is damn good.

To the folks on the Kitten team: I'm working on TTS as a problem statement (for an application). What is the best model on the latency/cost frontier for inference? I'm currently settling for Gemini TTS, which allows for a lot of expressiveness, but ~150ms per word starts to hurt when the content is a few sentences.

My current best approach is wrapping around Gemini Flash native audio, having the model speak the text I send it, which gets my end-to-end latency under a second.

Are there other models at this or better pricing I could be looking at?


Not sure I understand what you're asking: there are faster than realtime models that you can use, locally or as APIs.


Claude models have made very good progress (see the BS benchmark), and that probably explains why they're leading now. Others will follow this precedent shortly, no doubt.

https://petergpt.github.io/bullshit-benchmark/viewer/index.v...


It's orthogonal IMO. YOLO or not is simply a sign of trust (or not) in the harness. Trust slightly affects cognition, but not much. My working hypothesis: exhaustion is the residue of the use of cognition.

What impacts cognition for me, and IMO for a lot of folks, is how well we end up defining our outcomes. Agents are tremendous at working towards the outcome (hence TDD red-green works wonderfully), but if you point them to a goal slightly off, then you'll have to do the work of getting them back on track, demanding cognition.

So the better you are at your initial research/plan phase, where you document all of your direction and constraints, the less effort is needed in the review.

The other thing impacting cognition is how many parallel threads you're running. I have defaulted to a major/minor system: at any time I have 1 major project (higher cognition) and 1 minor agent (lower cognition) going. That's the level at which managing this stays comfortable.


Has anyone implemented a system of Pi for a team? Basically consolidate all shared knowledge and skills, and work on things that the team together is working on through this?

Basically a pi with SSO frontend, and data separation.

If no one has - I have a good mind to go after this over a weekend.


There is a thing called Mercury that seems very promising. Check https://taoofmac.com/space/ai/agentic/pi for a list of pi-related things I'm tracking.


I have created a separate knowledge base in Markdown synced to git repo. Agents can read and write using MCP. Works fine!


And others pull regularly from the pool? How are knowledge and skills continuously updated? I was thinking these necessarily need to be server-side (like the main project under discussion) for it to be non-clunky for many users, but potentially git could work?

Like, let's take a company example: GitLab. If an agent had the whole GitLab handbook, then it would be very useful to just ask the agent what to do, and how, in a given situation. Modern pi agents can help build such a handbook with data fed in from all across the company.


1/ kb is updated on webhook for all agents ~instantly

2/ skills are not updated that fast (but can be if needed), prefer to have a slow update with review here
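The webhook-driven update in 1/ can be sketched with nothing but the standard library. This is an illustrative sketch, not the commenter's setup; the endpoint path and repo location are assumptions:

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

KB_DIR = "/srv/kb"  # local checkout of the knowledge-base repo (assumed path)

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Any push event from the git host hits this endpoint and
        # triggers a pull, so agents reading KB_DIR see updates ~instantly.
        if self.path == "/kb-webhook":
            try:
                subprocess.run(["git", "-C", KB_DIR, "pull", "--ff-only"],
                               check=False)
            except OSError:
                pass  # git missing or path absent; log this in a real setup
            self.send_response(204)
        else:
            self.send_response(404)
        self.end_headers()

# To run: HTTPServer(("0.0.0.0", 8080), WebhookHandler).serve_forever()
```

Skills could live in the same repo but behind a review gate (e.g. only pulled from a protected branch), matching the slower cadence in 2/.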


But TypeScript is already trained into every model and needs no additional work.

