Hacker News | ubutler's comments

ChatGPT already does this, albeit in limited circumstances, through the use of its sandbox environment. Asking GPT in thinking mode to, for example, count the number of “l”s in a long text may see it run a Python script to do so.
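The kind of sandbox script described above is trivial; a minimal sketch (the sample text is hypothetical, not from any actual ChatGPT session):

```python
# Count occurrences of a character with a short script, the way a model
# in a code sandbox can instead of counting "by eye" in its context.
text = "llama llama parallel"  # hypothetical stand-in for a long text
count = text.count("l")
print(count)  # → 7
```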

There’s a massive issue with extrapolating to more complex tasks, however: either you run the risk of prompt injection by granting your agent access to the internet or, more commonly, you see an exponential degradation in coherence over long contexts.


This is called a lexical innovation ;): https://en.wikipedia.org/wiki/Lexical_innovation

We'd argue it makes a lot of sense to appropriate 'graphitization' as a term for a model designed to transform data into knowledge graphs.


> Also, you really want to tell people how to access it and what it costs. Or put up a "call for quote" if your market is large Enterprise budgets.

Our pricing page can be found in our documentation here: https://docs.isaacus.com/pricing/prices. We're planning on making it more visible on our website; thanks for the feedback!


FWIW we're planning on releasing a self-hostable version on AWS Marketplace quite soon, followed by one on the Azure Marketplace. In both cases, deployments live entirely in your tenancy, are fully air-gapped (i.e., they can't access the internet), and your usage is unmetered.

We do already have a government-facing client using one of our self-hosted deployments given the privacy and security concerns the legal industry tends to have (rightfully in our view) around AI.


> Weirdly, the blog announcement completely omits the actual new context window size which is 400,000: https://platform.openai.com/docs/models/gpt-5.2

As @lopuhin points out, they already claimed that context window for previous iterations of GPT-5.

The funny thing, though, is that I'm on the business plan, and none of their models (GPT-5, GPT-5.1, GPT-5.2, GPT-5.2 Extended Thinking, GPT-5.2 Pro, etc.) can really handle inputs beyond ~50k tokens.

I know because, when I'm working with a really long Python file (>5k LoC), it often claims there is a bug: somewhere close to the end of the file, its view cuts off and the remainder reads as '...'.
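For a sense of why a >5k-line file brushes that limit, here's a back-of-the-envelope sketch using the common (and very rough) ~4-characters-per-token heuristic rather than any model's actual tokenizer; the 40-character average line length is an assumption:

```python
# Rough sanity check of input size before pasting a file into a chat UI.
# ~4 chars per token is an approximation, not a real tokenizer.
def approx_tokens(text: str) -> int:
    return len(text) // 4

# Hypothetical 5,000-line file with 40-character lines (newline included).
source = ("x" * 39 + "\n") * 5000
print(approx_tokens(source))  # → 50000
```

So an average-width 5k-line Python file lands right around the ~50k-token mark.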

Gemini 3 Pro, by contrast, can genuinely handle long contexts.


Why would you put that whole Python file in the context at all? Doesn't Codex work like Claude Code in this regard, using tools to find the correct parts of a larger file to read into context?


Here's a copy of the only known recording of a prank call to the Queen: https://www.youtube.com/watch?v=-YFFhc3XZDw


After having read the article in its entirety, I’m still not sure what Cybersyn is…


You might think of it as a nation-scale business intelligence system. It’s part of the case study at the end of Stafford Beer’s Brain of the Firm, an (over?)ambitious project cut short by the fall of Allende in the Chilean coup.


The name of a network of geographically distributed telex machines. Like a phone network, but for text? Uh, actually, I guess what they built is something like email? And the central hub had a simulator that could help with making decisions based on the data coming in from those machines.

They also had a very swell control room.


Personally, I like it. However, I like being able to comment and upvote more. At the same time, I'd be reluctant to say the least to hand over my login credentials. It could be quite cool to see this turned into a FOSS RES-style browser extension. Or maybe even a commercial product. I already paid for the HACK app.


We were disappointed to discover that, yes, Voyage, Cohere, and Jina all train on their API customers' data by default.

Voyage's terms say:

> you grant Voyage AI (and its successors and assigns) a worldwide, irrevocable, perpetual, royalty-free, fully paid-up, right and license to use, copy, reproduce, distribute, prepare derivative works of, display and perform the Customer Content: ... (iii) to train, improve, and otherwise further develop the Service (such as by training the artificial intelligence models we use).

Cohere's terms say:

> YOU GRANT US A ... RIGHT TO ... USE ... ANY DATA ... TO ... IMPROVE AND ENHANCE THE COHERE SOLUTION AND OUR OTHER OFFERINGS AND BENCHMARK THE FOREGOING, INCLUDING BY SHARING API DATA AND FINETUNING DATA WITH THIRD PARTIES ...

Jina's terms say:

> Jina AI shall, subject to applicable mandatory data protection requirements, be entitled to retain data uploaded to the Jina AI Systems or otherwise provided by the Customer or collected by Jina AI in the course of providing the Services and to use such data in anonymized/pseudonymized format for its business purposes including to improve its artificial intelligence applications.


This is the most interesting part of this article.


In my experience, maintaining a very popular software library, supporting open source, and blogging have all contributed to my success. Additionally, as someone who is now a founder seeking like-minded, highly skilled engineers, I see those as key signals of an attractive hire.

I can understand, though, that in a work environment where management is unlikely to be able to retain highly skilled talent, you may want 'low-profile' workers who aren't going to have as many competitors chasing after them...

