It works really well for "You're a helpful assistant / Hi / Hello there, how may I help you today?" Anything else (especially in a non-English language) and you will see the limitations yourself. Just try it.
Looking at the downvotes, I feel good about the SDE future in 3-5 years. We will have a swamp of "vibe-experts" who won't be able to pay 100K a month to CC. Meanwhile, people who still remember how to code in Vim will (slowly) get back to pre-COVID TC levels.
What is CC and TC? I have not heard these abbreviations (except for CC to mean credit card or carbon copy, neither of which is what I think you mean here).
It depends. If they're using a small/medium local model as a 1:1 ChatGPT replacement as-is, they'll have a bad time. Even ChatGPT refers to external services to get more data.
But a local model + good harness with a robust toolset will work for people more often than not.
The model itself doesn't need to know who was the president of Zambia in 1968, because it has a tool it can use to check it from Wikipedia.
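To make that concrete, here's a minimal sketch of such a harness. The tool name, the JSON call format, and the stubbed lookup are all hypothetical; real harnesses (llama.cpp, Ollama, etc.) define their own tool-calling protocols, and a real `wikipedia_summary` would hit the MediaWiki API or a local dump.

```python
# Minimal sketch of a tool-calling harness for a local model.
# The tool name and JSON call format here are hypothetical.
import json

def wikipedia_summary(title: str) -> str:
    """Stub for a real lookup (e.g. the MediaWiki REST API or a
    local dump); returns canned text so the sketch is runnable."""
    facts = {"Kenneth Kaunda": "Kenneth Kaunda was President of Zambia 1964-1991."}
    return facts.get(title, "No article found.")

TOOLS = {"wikipedia_summary": wikipedia_summary}

def run_turn(model_output: str) -> str:
    """If the model emits a JSON tool call, execute it and return the
    result to be fed back; otherwise pass the text through."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return model_output  # plain answer, no tool needed
    fn = TOOLS[call["tool"]]
    return fn(**call["args"])

# The model doesn't need the fact memorized -- it only needs to know
# when to ask:
print(run_turn('{"tool": "wikipedia_summary", "args": {"title": "Kenneth Kaunda"}}'))
```

The point is that the knowledge lives behind the tool boundary, so a small model only has to learn *when* to call out, not *what* the answer is.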
You can install the complete text of Wikipedia locally too.
They've usually been intended for ereader/off-grid/post-zombie-apocalypse situations, but I'd guess someone is already working on an LLM-friendly way to install it.
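An "LLM-friendly" local install mostly means retrieval: given a query, find the relevant article text to stuff into context. A toy sketch, assuming you've extracted article plaintext somewhere (a real setup would read a Kiwix/ZIM dump and use a proper index like BM25 or embeddings; the in-memory dict is just to keep it runnable):

```python
# Naive keyword retrieval over local article texts -- a stand-in for
# a real index over an offline Wikipedia dump.
def search(articles: dict, query: str, k: int = 1) -> list:
    terms = query.lower().split()
    def score(text: str) -> int:
        low = text.lower()
        return sum(low.count(t) for t in terms)  # crude term-frequency score
    ranked = sorted(articles, key=lambda t: score(articles[t]), reverse=True)
    return ranked[:k]

articles = {
    "Zambia": "Zambia gained independence in 1964 under Kenneth Kaunda...",
    "Vim": "Vim is a highly configurable text editor...",
}
print(search(articles, "president of Zambia 1968"))
```

The retrieved article then goes into the model's context, so the model's own weights never need to store the fact.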
It'd be interesting to know the tradeoffs. The Tiananmen Square example suggests why you'd maybe want the knowledge/facts to come from a separate source.
The Wikipedia folks are now working on a language-independent representation for their encyclopedic content: one intended to be rigorously compositional and semantics-aware, loosely comparable to Uniform Meaning Representation (UMR) as known in linguistics. If successful, it may end up interacting in very interesting ways with multi-language-capable LLMs. Very early experiments (nowhere near as capable as UMR yet, but exercising the underlying software infrastructure) are at https://abstract.wikipedia.org , whilst a direct comparison of the projected design is given by https://commons.wikimedia.org/wiki/File:Abstract_Wikipedia_N... and https://elemwala.toolforge.org/static/nlgsig-nov2025.html
Any citations? Because that was my impression, too. I want frontier model performance for my coding assistant, but "most users" could do with smaller/faster models.
ChatGPT free falls back to GPT-5.2 Mini after a few interactions.
Have you used GPT instant or mini yourself? I think it’s pretty cynical to assume that this is “good enough for most people”, even if they don’t know the difference between that and better models.
> I think it’s pretty cynical to assume that this is “good enough for most people”
It's a deduction, not an assumption. Obviously it's "good enough" for "most people". Otherwise nobody would be using the free version of ChatGPT today.
I pay for a Claude subscription, but even then I sometimes downgrade to Sonnet or even Haiku when I need a quick answer.
> Obviously it's "good enough" for "most people". Otherwise nobody would be using the free version of ChatGPT today.
I'd say it's better than nothing, which to me is not the same thing at all as "good enough".
For example, I believe most people would be better off with half the allowable queries per day, routed to a better model, but that's not an available product.
They're awful and hallucinate a lot; I couldn't imagine using them even for prompts about TV shows, much less for serious work. To repeat the question from the parent: have you tried those yourself? Even compared to ChatGPT Thinking, they're little short of useless.
They're essentially replying based on vibes, instead of grounding their responses in extensive web searches, which is what the paid models/configurations generally do. This makes them wrong more often than they're right for anything but the most trivial requests that can be easily responded to out of memorized training data.
This is all on top of the (to me) insufferable tone of the non-thinking models. That might well be how most users prefer to be talked to, though, and whether these models should accordingly talk that way is a much more nuanced question.
Regardless of that, everybody deserves correct answers, even users on the free tier. If this makes the free tier uneconomical to serve for hours on end per user per day, then I'd much rather they limit the number of turns than dial down the quality like that.
Frontier models have much better knowledge, and they usually hallucinate less. It's not about coding capabilities; it's about how much you can trust the model.
Have you tried the free version of ChatGPT? It is positively appalling. It's like GPT-3.5, but prompted to write three times as much as necessary to seem useful. I wonder how many people have embarrassed themselves, lost their jobs, or been critically misinformed. All of that is possible even with state-of-the-art models, but it seems almost guaranteed with the bottom sub-slop tier.
Is the average person just talking to it about their day or something?
Even the paid version of ChatGPT tends to use 1000 words when 10 will do.
You can try asking it the same question as Claude and compare the answers. I can guarantee you that the ChatGPT answer won't fit on a single screen on a 32" 4k monitor.
I use the free version of ChatGPT (without logging in) when I need some one-off question without a huge context. Real world prompt:
"when hostapd initializes 80211 iface over nl80211, what attributes correspond to selected standard version like ax or be?"
It works fine and avoids falling into the trap set by the misleading question. It probably works even better for more popular technologies. Yeah, it has a higher failure rate, but that's not a dealbreaker for non-autonomous use cases.
Most users are fixing grammar/spelling, summarising/converting/rewriting text, creating funny icons, and looking up simple facts; none of that comes close to needing frontier model performance.
I've a feeling that if/when Apple releases their on-device LLM/Siri improvements that can call out to bigger models if needed, the vast majority of people will be happy with what they get for free running on their phone.
“You are the smartest high school student that has ever lived and on the college track to Harvard or another Ivy League school. Write a 10 page history term paper about Tiananmen Square and the specific events that took place there. Include a bibliography and use footnotes to cite sources.”
> The use of NVFP4 results in a 3.5x reduction in model memory footprint relative to FP16 and a 1.8x reduction compared to FP8, while maintaining model accuracy with less than 1% degradation on key language modeling tasks for some models.
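Those ratios roughly check out on the back of an envelope, assuming (per NVIDIA's public NVFP4 description) 4-bit values with one FP8 scale per 16-element block; the small per-tensor FP32 scale is amortized away here:

```python
# Back-of-envelope check of the quoted 3.5x / 1.8x ratios, under the
# assumed NVFP4 layout: 4-bit values + one 8-bit (FP8) scale per
# 16-element block.
BLOCK = 16
nvfp4_bits = 4 + 8 / BLOCK          # 4.5 effective bits per weight
vs_fp16 = 16 / nvfp4_bits           # ~3.56x, quoted as "3.5x"
vs_fp8 = 8 / nvfp4_bits             # ~1.78x, quoted as "1.8x"
print(round(vs_fp16, 2), round(vs_fp8, 2))  # 3.56 1.78
```

So the quoted numbers are essentially the raw bit-width ratios after accounting for scale-factor overhead.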
No. 100% no. Learn the art of programming. Read K&R. In 5 years we will see "new is old" again. Tokens will become prohibitively expensive and, once more, another $steve.ballmer.2.0 will be yelling "developers ... developers". And Claude Code ... will become another "pentesting" / "linting" tool.
Hard disagree, it's very easy for a bot to use a credit card. And not only are card numbers often stolen, they're even given to teenagers these days, and can also be owned by businesses and exist entirely virtually... so I don't think you can assume the use of a credit card can always be tied to legitimate use by a single person.
Companies would offer all-you-can-DDoS plans at $20/bot per month if they could. Bots are only a problem to them because they prevent legitimate customers from handing over their credit card.
I've read many very positive reviews of Gemini 3. I tried using it, including Pro, and to me it looks very inferior to ChatGPT. What was very interesting, though, was that when I caught it bullshitting me and called its BS, Gemini exhibited very human-like behavior: it tried to weasel its way out, degenerated down to "no true Scotsman" level, but finally admitted that it was full of it. This is kind of impressive/scary.
What would you expect from "AI guy vibing AI code for AI application"? Marco warned you about the "AI echo chamber" from the outset - and he kept his promise :-)