I think that basically they trained a new model but haven't finished optimizing it and updating their guardrails yet. So they can feasibly give access to some privileged organizations, but don't have the compute for a wide release until they distill, quantize, get more hardware online, incorporate new optimization techniques, etc. It just happens to make sense to focus on cybersecurity in the preview phase especially for public relations purposes.
It would be nice if one of those privileged companies could use their access to start building out a next level programming dataset for training open models. But I wonder if they would be able to get away with it. Anthropic is probably monitoring.
I think what they’re saying makes a lot of sense. If this can find thousands of vulnerabilities in browsers and OSes then this is giving those companies time to fix those bugs before they release the model, if they ever do.
I'm not sure that his problems are really over if a LOT of people were downloading a 2GB file. It would depend on the plan, especially if his server is in the US.
But maybe the European Hetzner servers still have really big limits even for the small plans.
But still, if people keep downloading, that could add up.
Thinking about this in the context of machine learning: we can discover the dimensions and the relationships between them through training over a set of examples.
What we generally get, though, is a network with extremely high dimensionality trained on many domains at once, at least for the commonly used ones like LLMs and VLMs.
We do have mixture of experts, which I guess helps compress things.
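To make the compression idea concrete, here is a toy sketch of mixture-of-experts routing: a learned gate picks the top-k experts per token, so only a fraction of the parameters is active for any given input. All numbers and the function names here are my own illustrative assumptions, not taken from any real model.

```python
# Toy top-k gating for a mixture-of-experts layer. Illustrative only.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_logits, k=2):
    """Return (expert_index, weight) pairs for the top-k experts."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    # Renormalize so the selected experts' weights sum to 1.
    return [(i, probs[i] / total) for i in top]

# One "token" with gate scores over 4 experts: experts 2 and 0 win.
print(route([1.0, -0.5, 2.0, 0.1], k=2))
```

With 4 experts and k=2, only half the expert parameters are touched per token, which is the sense in which MoE "compresses" the active computation.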
Going back to the idea that this stuff just can't be represented by language, I wonder if someday there could be a more concise representation than transmitting, for example, a LoRA that is millions of bytes.
Maybe if we keep looking at distillation of different models over and over, we might come up with some highly compressed, standardized, hierarchical representation that optimizes subdomain or expert selection and combination. Then the information for a type of domain expertise could be transmitted, maybe not orally between humans, but at least in a very compact and standard way between models.
I guess if you just take something like a 1B 1-bit model, build a LoRA for a very narrow domain, and then compress that, that's something like the idea. Or maybe a quantized NOLA.
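For a rough sense of scale, here is back-of-envelope arithmetic for a LoRA adapter's size: each adapted weight matrix gets two low-rank factors, A (d x r) and B (r x d). The dimensions and layer counts below are illustrative assumptions for a ~1B-parameter model, not measurements of any real checkpoint.

```python
# Back-of-envelope LoRA adapter size. All numbers are assumptions.

def lora_bytes(hidden_dim, rank, n_matrices, bytes_per_param):
    # Two factors (A and B) per adapted matrix, each hidden_dim x rank.
    params = n_matrices * 2 * hidden_dim * rank
    return params * bytes_per_param

# e.g. hidden size 2048, rank 8, 4 adapted matrices in each of 24 layers:
fp16 = lora_bytes(2048, 8, 4 * 24, 2)    # 16-bit weights: ~6 MB
q4 = lora_bytes(2048, 8, 4 * 24, 0.5)    # ~4-bit quantized: ~1.6 MB
print(fp16, q4)
```

So even a narrow-domain adapter is megabytes, which is why a much more compact standard representation would be a real change.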
But I wonder if someday there will be a representation that is more easily interpretable like language but is able to capture high dimensional complex functions in a standard and concise way.
This highlights to me the compounding of knowledge networked AI experts working with human experts will bring.
"But I wonder if someday there will be a representation that is more easily interpretable like language but is able to capture high dimensional complex functions in a standard and concise way"
Perhaps we can train AI experts to show us what parameters they found most useful in contrast to what human experts used. That could be a start at filling in the human knowledge gaps.
They use the latest llama.cpp under the hood, but built for specific AMD GPU hardware.
Lemonade is really just a management plane/proxy. It translates Ollama/Anthropic APIs to the OpenAI format for llama.cpp, runs different backends for STT/TTS and image generation, and lets you manage it all in one place.
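Here is a minimal sketch of the kind of API translation such a proxy performs, mapping an Anthropic-style /v1/messages request body onto an OpenAI-style /v1/chat/completions body for a llama.cpp server. This is my own illustration of the idea, not Lemonade's actual implementation, and it only covers the simplest fields.

```python
# Sketch: translate an Anthropic-style request body to OpenAI chat format.
# Not Lemonade's real code; simplest-case fields only.

def anthropic_to_openai(body):
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # the OpenAI chat format expects it as the first message.
    if "system" in body:
        messages.append({"role": "system", "content": body["system"]})
    messages.extend(body.get("messages", []))
    return {
        "model": body.get("model", "local"),
        "messages": messages,
        "max_tokens": body.get("max_tokens", 512),
    }

req = {
    "model": "some-local-model",
    "system": "Be terse.",
    "messages": [{"role": "user", "content": "hi"}],
    "max_tokens": 64,
}
print(anthropic_to_openai(req))
```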
What will happen first? The Singularity arrives, and hyper-intelligent AI causes such rapid technological change that the world becomes unrecognizable overnight?
Or OpenAI pays off its investors? Lol.
I am not sure if I believe in the Singularity or not. But it's kind of the best story ever to support the game of musical chairs that is Silicon Valley investing.
To the author: ask your AI, "What percentage of websites will this be expected to work well on, or better than just reading the HTML? What portion of websites need DOM, JS, and maybe CSS at this point?"
Maybe the disobedient ones were just a bit smarter and therefore more likely to figure out that they should refuse, but also had stronger inherent instruction-following capabilities.
An executive's job is to increase profit. Reducing headcount is a primary way to do that, and AI is the most promising way to reduce the need for employees.
Executives do not need fully functional systems from AI to help with their own daily work. Nothing falls over if their report is not quite right. So the AI output they see looks complete enough for their own purposes.
But also, AI is good enough to accelerate software engineering. To the degree that there are problems with the output, well, that's why they haven't fired all the engineers yet. And executives never really cared about code quality; that is the engineers' problem.
What I'm trying to build for my small-business client right now is not engineering, but it still requires some remaining employees. He's already automated a lot of it. But I'm trying to make a full version of his little call center that can run on one box like an H200, which we can rent for about $3.59/hr. If I remember correctly, that is approximately the cost of one of his Filipino employees.
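The arithmetic behind that comparison, assuming an 8-hour shift (the shift length is my assumption, not stated above):

```python
# Rough cost comparison: rented H200 vs. an 8-hour employee shift.
gpu_rate = 3.59                 # $/hr for a rented H200 (figure from above)
shift_hours = 8                 # assumed shift length
per_shift = gpu_rate * shift_hours
per_day_24_7 = gpu_rate * 24
print(per_shift, per_day_24_7)  # ~$28.72 per shift vs ~$86.16 per full day
```

So the GPU is in the same ballpark as a ~$30/day employee only if it is billed for roughly working hours; running the box 24/7 is about three times that.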
Where we are headed is that the executives are themselves pretty quickly going to be targeted for replacement. Especially those that do not have firm upper class social status that puts them in the same social group as ownership.
Best comment here. I've never met someone with that status choosing to be an engineer. And I'm unsurprised that an upper-class executive is preoccupied with $30/day in labor margin.
The way I use Telnyx is via SIP, which is an open protocol. No reason we should be relying on proprietary APIs for this stuff.
On GitHub, see my fork runvnc/PySIP. Please let me know if you know of something better for Python that is not copyleft and doesn't rely on copyleft or big external dependencies. I was using baresip, but it was a pain to integrate and configure with Python.
Anyway, after fixing a lot in the original PySIP, my version works with Telnyx. Not tested on other SIP providers.
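Part of why an open Python stack is feasible at all is that SIP is a plain-text protocol (RFC 3261). For illustration, a minimal INVITE request looks roughly like this; the addresses, tag, and branch values are made-up placeholders, not anything from PySIP:

```python
# Minimal SIP INVITE as a raw text message (RFC 3261 style).
# All addresses and identifiers below are placeholder values.
invite = "\r\n".join([
    "INVITE sip:agent@example.com SIP/2.0",
    "Via: SIP/2.0/UDP 192.0.2.1:5060;branch=z9hG4bK776asdhds",
    "From: <sip:caller@example.com>;tag=1928301774",
    "To: <sip:agent@example.com>",
    "Call-ID: a84b4c76e66710@192.0.2.1",
    "CSeq: 314159 INVITE",
    "Contact: <sip:caller@192.0.2.1>",
    "Content-Length: 0",
    "",  # blank line terminates the headers
    "",  # empty body (a real call would carry SDP here)
])
print(invite)
```

A SIP library is mostly managing these request/response exchanges plus registration, authentication, and the media (RTP) side, which is where the integration pain tends to be.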