Do you really think developers are going through the hellish pain of dealing with Google and Apple for no reason? Real world users prefer and expect apps as opposed to web versions for many product categories.
Kimi K2.5 (as an example) is an open model with 1T params. I don't see a reason it has to be local for most use cases; the fact that it's open is what's important.
That is just idealism. Being "open" doesn't get you any advantage in the real world. You're not going to meaningfully compete in the new economy using "lesser" models. The economy does not care about principles or ethics. No one is going to build a long-term business that provides actual value on open models. They can try. They can hype. And they can swindle and grift and scalp some profit before they become irrelevant. But it will not last.
Why? Because whatever was built with an open model can be sneezed into existence by a frontier model run via a first-party API, using the best-practice configurations the providers publish in usage guides that no one seems to know exist.
The difference between the best frontier model (gpt-5.4-xhigh or opus 4.6) and the best open model is vast.
But that is only obvious when your use case is actually pushing the frontier.
If you're building a CRUD app, or the modern equivalent of a TODO app, even a lemon can produce that nowadays, so you will assume open has caught up to closed because your use case never required frontier intelligence.
A model with open weights gives you a huge advantage in the real world.
You can run it on your own hardware, with perfectly predictable costs and predictable quality. You don't have to worry about how many tokens you use, whether your subscription limits will be hit at the most inconvenient moment (forcing you to wait until they reset), whether the token price will go up or your limits will shrink, or whether your AI provider will quietly swap the model for a worse one, and so on.
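To make the cost-predictability point concrete, here is a back-of-the-envelope comparison. Every number below is an illustrative assumption I made up (hypothetical API price, workload, hardware cost, and power bill), not real pricing:

```python
# Back-of-the-envelope: metered API cost vs. amortized local hardware.
# ALL numbers are hypothetical assumptions for illustration only.
api_price_per_mtok = 15.00      # assumed $ per 1M output tokens
tokens_per_month = 200_000_000  # assumed heavy-use monthly workload

hw_cost = 30_000                # assumed GPU server price
months = 36                     # amortization period (3 years)
power_per_month = 250           # assumed electricity cost, $/month

api_monthly = api_price_per_mtok * tokens_per_month / 1_000_000
local_monthly = hw_cost / months + power_per_month

print(f"API:   ${api_monthly:,.0f}/month (scales with token volume)")
print(f"Local: ${local_monthly:,.0f}/month (fixed regardless of usage)")
```

The point isn't the specific numbers; it's that the local line is flat and under your control, while the API line moves with your usage and with whatever the provider decides to charge next quarter.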
Moreover, no matter how good a "frontier model" may be, it can still produce worse results than a lesser model when the programmer who manages it does not also have "frontier intelligence". Freed from the constraints of a paid API, you may be able to use an AI coding assistant in much more efficient ways, exactly as when time-shared access to powerful mainframes was replaced by the unconstrained use of personal computers.
When I was very young, I lived through the transition from using a mainframe remotely to using my own computer. I certainly do not want to return to that straitjacket style of work.
The vision has been that the open and/or small models, while 8-16 months behind, would eventually reach sufficient capabilities. In this vision, not only do we have freedom of compute, we also get less electricity usage. I suspect long-term the frontier mega models will mainly be used for distillation, like we see from Gemini 3 to Gemma 4.
Billions of USD in debt, a business model bleeding cash with no profit in sight, a high-competition environment, a sub-par product, free-to-use offline models taking off, potential regulatory issues, some investors pulling out of their commitments... tricky.
But let's not cry for the founders, they managed to get away with tons of money. The problem is for the fools holding the bag.
How is it a subpar product? I've been very happy with GPT 5.4 and the Codex CLI tooling, as well as ChatGPT web. I'd say product is one of their strengths.
I don't use anywhere near $1000/mo of inference. But yes, the question of what to do when prices go up a lot does concern me. However, with respect to product alone, Codex is still very good.
Yeah, you guys have to pay attention to the state of the overall economy. We are in the credit-crunch phase of a recession. The funny money has run out and infinite loans are no longer available. These companies have to find a way to pay their debt now.
"Notably, increases in codebase size are a major determinant of increases in static analysis warnings and code complexity, and absorb most variance in the two outcome variables. However, even with strong controls for codebase size dynamics, the adoption of Cursor still has a significant effect on code complexity, leading to a 9% baseline increase on average compared to projects in similar dynamics but not using Cursor."
They're measuring development speed through lines of code. To show that's true they'd need to first show that AI and humans use the same number of lines to solve the same problem. That hasn't been my experience at all. AI is incredibly verbose.
Then there's the question of whether LoC is a reliable proxy for velocity at all. The common belief among developers is that it's not.
This is actually one thing I have found LLMs surprisingly useful for.
I give them a code base which has one or two orders of magnitude of bloat, and ask them to strip it away iteratively. What I'm left with usually does the same thing.
At this point the code base becomes small enough to navigate and study. Then I use it for reference and build my own solution.
Uh huh... but the data in Andrej's visualizer shows the software development growth outlook at 15% (much faster than average)
Over the past year (where Opus has supposedly changed the game), we're seeing ~10% more job postings for software developers compared to this time last year [1,2]
A huge amount of our work is not easily verifiable, therefore it's extremely hard to actually train an LLM to be better at it. It doesn't magically get better across the board.
AI HAS WON. SURF OR DROWN. YOU DONT KNOW WHATS COMING!!!?!?!
Stop with this doomer drivel. It's sick. It's not based in reality and all it does is stress innocent people out for no reason.
This is fantasy completely disconnected from reality.
Have you ever tried writing tests for spaghetti code? It's hell compared to testing good code. LLMs require a very strong test harness or they're going to break things.
Have you tried reading and understanding spaghetti code? How do you verify it does what you want, and none of what you don't want?
Many code design techniques were created to make things easy for humans to understand. That understanding needs to be there whether you're modifying it yourself or reviewing the code.
Developers are struggling because they know what happens when you have 100k lines of slop.
If things keep accelerating in this direction, we're going to wake up to a world of pain in three years, and AI isn't going to get us out of it.
I've found much more utility, even pre-AI, in a good suite of integration tests than in unit tests. For instance, if you're building a test harness for an API, it doesn't matter whether you even have access to the code if you're writing tests against the API surface itself.
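A minimal sketch of what I mean by testing the API surface: the test knows only the HTTP contract, not the implementation. Here a throwaway stdlib server stands in for the real service, and the `/health` endpoint and its JSON shape are hypothetical stand-ins, not any particular API:

```python
# Black-box integration test against an API surface: nothing below
# inspects the implementation, only the HTTP contract.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Stand-in for the real service (could be any black box)."""
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind to port 0 so the OS picks a free port.
server = HTTPServer(("127.0.0.1", 0), Handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The actual test: assertions only on the observable contract.
with urllib.request.urlopen(f"http://127.0.0.1:{port}/health") as resp:
    assert resp.status == 200
    assert json.loads(resp.read())["status"] == "ok"

server.shutdown()
print("contract test passed")
```

Swap the in-process server for a real deployed endpoint and the test body doesn't change, which is exactly why this style survives an AI (or anyone else) rewriting the internals.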
I do too, but it comes from a bang-for-your-buck and not a test coverage standpoint. Test coverage goes up in importance as you lean more on AI to do the implementation IMO.