
f311a · 2026-06-05T13:40:17 1780666817

You don’t need to connect to duckdb, it’s just a process that you spawn.

co0lster · 2026-06-06T07:20:22 1780730422

You spawn an in-memory instance of DuckDB and connect to it.

f311a · 2026-06-05T11:03:56 1780657436

ClickHouse also supports a lot of data sources and has a local mode where you just use a single binary with local-only access.

Coincidentally, I wrote an article today on how I use it for similar scenarios. It can fetch from S3, multiple databases at once, and so on.

And you get all the benefits of a database when you need to join or postprocess data from multiple sources.

https://rushter.com/blog/clickhouse-data-processing/

f311a · 2026-06-04T19:43:19 1780602199

Infrastructure is a much harder problem. They can't even improve Claude Code, which eats 1GB+ of RAM. Meanwhile, my editor only consumes 80MB of RAM.

airstrike · 2026-06-04T20:03:59 1780603439

This might explain it, in the opposite way it was meant to:

https://fxtwitter.com/trq212/status/2014051501786931427

> Most people's mental model of Claude Code is that "it's just a TUI" but it should really be closer to "a small game engine".

javcasas · 2026-06-04T20:59:36 1780606776

> For each frame our pipeline constructs a scene graph with React then

> -> layouts elements

> -> rasterizes them to a 2d screen

> -> diffs that against the previous screen

> -> finally uses the diff to generate ANSI sequences to draw

Yup. Overengineering.

AceJohnny2 · 2026-06-04T22:06:01 1780610761

This is a decades-old design pattern when CPU >> IO. Emacs has been doing just that since the 80s, when people were complaining about "Eight Megs And Constantly Swapping". See "redisplay" [1]

This minimizes screen flash. You can't rely on terminals doing double-buffering.

[1] https://github.com/emacs-mirror/emacs/blob/c29071587c64efb30... or a more user-friendly overview, Daniel Colascione's seminal "Buttery Smooth Emacs", snapshotted at e.g. https://gist.github.com/ghosty141/c93f21d6cd476417d4a9814eb7...

skydhash · 2026-06-05T01:11:35 1780621895

> This minimizes screen flash. You can't rely on terminals doing double-buffering.

GUIs and TUIs have different architecture models. Most GUIs have a 2D surface that is redrawn multiple times per second. Double buffering is for decoupling update and render. A TUI is a grid of characters that are updated one at a time via an active element, the cursor. Double buffering there is very wrong. Like adding airbags to a bicycle.

There’s a reason most old TUIs have an option to redraw the screen (automatically, like top, or manually), and those that scroll allow scrolling by page. The TTY (the underlying concept) used to be slow, and it can be slow today as well (an ssh connection). You need to be thoughtful about whole-screen updates.

strix_varius · 2026-06-05T00:42:10 1780620130

lol what? There are definitely ways to make non flashing terminal UIs without this total insanity.

jaggederest · 2026-06-05T02:09:42 1780625382

ncurses (new curses) was "new" in 1993...

xiaoyu2006 · 2026-06-05T02:27:29 1780626449

Even with that, 1G of RAM usage is still not justified.

Melatonic · 2026-06-04T21:13:35 1780607615

It's like the Citrix of AI :-D

stego-tech · 2026-06-05T00:05:53 1780617953

OOF. As a former Citrix admin, I felt that burn in my bones.

An upvote well earned.

Aperocky · 2026-06-04T21:44:53 1780609493

It's product bloat.

It's not recognizing that they are just one building block that should do one thing well, like tmux.

You don't need a computer display on your fridge for the same reason, but Anthropic think you do. You should see virtual ice getting created and they should correspond to the actual ice behind the door - think of how amazing that is!

And it's not even completely a bad idea. Make it claude-code-react-beauty, or offer some way to take it off, and it would be far more palatable.

mapBasketWand · 2026-06-04T22:53:14 1780613594

I love the idea of installing high resolution cameras in the fridge to monitor the ice maker to feed into a vision model that renders digital ice to the exact position of the real ice on the fridge’s giant screen

Aperocky · 2026-06-04T22:58:11 1780613891

See, this is the kind of thing I hope I'd be doing when I'm retired, but not when I'm shopping.

throwway120385 · 2026-06-04T23:18:12 1780615092

Or you could... open the door and look inside.

steve_adams_86 · 2026-06-05T00:59:07 1780621147

Sounds like you've got a lot of time on your hands

icepush · 2026-06-05T03:16:54 1780629414

Put a servo on the door and a camera on the front. Train a vision model to recognize when your eyes are looking at the door and automatically open it for you.

Another camera inside will detect when you are done and close it.

asdff · 2026-06-05T02:02:52 1780624972

What, like a poor?

irishcoffee · 2026-06-05T03:24:14 1780629854

You mean like… a transparent door? Is that the joke?

yuanBuilds · 2026-06-05T03:06:47 1780628807

Yup. For me, this translates to "we are using Ink, the react-compatible TUI framework to build Claude Code"

megous · 2026-06-04T23:40:57 1780616457

React part maybe. The rest is what any TUI that's using ncurses would do. :)

It really bothers me that most of the TUI harnesses are using 100% CPU quite a lot just printing stuff to terminal. Seems ridiculous.

I guess it comes from syntax highlighting/formatting, which is probably not done incrementally, but over the entire block of output displayed so far, recomputed from the beginning for each newly streamed character. I can't imagine anything else causing the rendering to gradually grind to a halt when, e.g., a thinking block is open in opencode and updates get palpably slow as it grows.

Terminal output itself is fast and consumes almost nothing. You can have 60fps terminal apps that update content every frame and that consume almost no CPU time.
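To put rough numbers on the guess above, here is a minimal sketch. It assumes the harness re-processes the whole accumulated block on every streamed update, which is a hypothesis in this comment and not a confirmed implementation detail. The function names are made up for illustration, and the "cost" is simply the number of characters scanned.

```python
def full_recompute_cost(chunks):
    """Characters scanned if the whole buffer is re-processed on every chunk."""
    total = 0
    buffered = 0
    for chunk in chunks:
        buffered += len(chunk)   # the buffer grows with each streamed chunk
        total += buffered        # and is scanned again from the beginning
    return total


def incremental_cost(chunks):
    """Characters scanned if only the newly arrived chunk is processed."""
    return sum(len(chunk) for chunk in chunks)


# Hypothetical workload: 10,000 updates of one character each.
chunks = ["x"] * 10_000
full = full_recompute_cost(chunks)
incremental = incremental_cost(chunks)
```

With these inputs the full recompute scans about 50 million characters against 10,000 for the incremental version. That quadratic growth would match updates getting slower as the block grows, though it does not prove this is what the harnesses actually do.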

skydhash · 2026-06-05T01:30:22 1780623022

> Terminal output itself is fast and consumes almost nothing. You can have 60fps terminal apps that update content every frame and that consume almost no CPU time.

The TUI mode is a client-server architecture. An analogy would be like an html page where all content is updated server side. Try to do 60 fps and you’ll have flickering as well.

megous · 2026-06-05T10:57:07 1780657027

No. Fetching pages from remote server will just make the client wait for I/O. That takes 0 CPU load and if the server can't respond at 60fps, lowered redrawing frequency would mean even less CPU load from the terminal redrawing itself.

This does not explain 100% CPU load these harnesses sometimes exhibit.

skydhash · 2026-06-05T12:03:11 1780660991

If it’s localhost, then it’s just the cpu doing stuff as localhost is a pseudodevice.

Animats · 2026-06-04T21:03:03 1780606983

What is "frame" in this context? Video frame, or something else?

javcasas · 2026-06-04T21:12:08 1780607528

> -> rasterizes them to a 2d screen

> We have a ~16ms frame budget so we have roughly ~5ms to go from the React scene graph to ANSI written.

It looks like video frame, full framebuffer, generated and parsed at 60fps. It surprises me they haven't introduced GPU shaders, 16x oversampling and raytracing. Maybe for next release.

layer8 · 2026-06-04T21:06:07 1780607167

The contents of the terminal screen at any given point in time.

abletonlive · 2026-06-04T22:17:25 1780611445

Care to explain how you'd engineer it instead?

hungryhobbit · 2026-06-04T22:38:05 1780612685

Why would anyone ever do that? Make Claude do it!

godelski · 2026-06-05T22:40:51 1780699251

Problem is that Claude did do it. If you look at the leak it's pretty clear there's a lot of LLM code

mudkipdev · 2026-06-05T00:33:04 1780619584

A reminder that anthropic has great rust/go sdks that they could have written their own tui in.

stevenhuang · 2026-06-04T23:36:22 1780616182

Not use react native for a cli app for one, lol.

Ratatui, the Rust TUI library, would be a good start.

munificent · 2026-06-04T21:04:41 1780607081

As someone who maintains a roguelike with a terminal-like UI that:

1. Maintains an internal representation of what the game thinks is on screen.

2. Runs the game for one frame which updates that representation.

3. Generates a diff to see how that differs from what's actually on screen.

4. Executes the minimum set of draw calls to get the screen to match the internal representation.

It's really not that hard. It's a few hundred lines of code.
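A minimal sketch of steps 3 and 4, not the actual roguelike code: it assumes the screen is a list of equal-length strings, one per row, and that a "draw call" is an ANSI cursor-position sequence (ESC [ row ; col H) followed by the changed characters. The name `diff_to_ansi` is hypothetical.

```python
def diff_to_ansi(prev, curr):
    """Return ANSI sequences that turn the `prev` screen into `curr`.

    Both screens are lists of equal-length strings, one string per row.
    Only runs of changed cells are emitted, each preceded by a
    cursor-position sequence with 1-based row and column.
    """
    out = []
    for row, (old, new) in enumerate(zip(prev, curr)):
        col = 0
        while col < len(new):
            if old[col] == new[col]:
                col += 1
                continue
            # Collect a run of consecutive changed cells.
            start = col
            while col < len(new) and old[col] != new[col]:
                col += 1
            out.append(f"\x1b[{row + 1};{start + 1}H{new[start:col]}")
    return "".join(out)


# What is on screen now versus what the game thinks should be there.
prev = ["HP: 10  ", "@.......", "........"]
curr = ["HP:  9  ", ".@......", "........"]
patch = diff_to_ansi(prev, curr)
```

Writing `patch` to the terminal with `sys.stdout.write(patch)` followed by a flush touches four cells out of twenty-four. Colors and attributes are left out to keep the sketch short.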

javcasas · 2026-06-04T21:09:51 1780607391

Sure. For a videogame.

> -> rasterizes them to a 2d screen

Also you forgot "render to a framebuffer, then parse the framebuffer back to chars".

Anyway, I'm off to construct the new `ls` command. It will render the list of files to a mesh of billions of polygons in a GPU with advanced shaders, 16x oversampling, HDR and all the graphic acronyms I don't understand, then read the resulting image, find the nearest character in the ANSI charset and use that one.

It will be _glorious_ (and profoundly stupid)

ux266478 · 2026-06-04T22:14:59 1780611299

Could be improved. Encode the image to webp with high compression settings and handle the ASCII mapping by spinning up a local LLM to do OCR on it. Individually. For each cell.

javcasas · 2026-06-05T09:14:12 1780650852

Thanks for the idea for V2.0. Hopefully the Claude team doesn't do it first.

munificent · 2026-06-05T04:49:01 1780634941

My roguelike's "graphics" are a simulated terminal, so it's a 2D grid of colored characters. It's essentially a TUI, just like Claude Code, except instead of rendering to a real terminal using ANSI escapes, I render to a web canvas using... something probably more complex than what Claude has to do. It's still not hard.

fc417fc802 · 2026-06-05T08:32:52 1780648372

Vaguely related to your glorious idea. https://www.shadertoy.com/view/NtcGRr

tikimcfee · 2026-06-04T22:43:23 1780613003

lol... I know you meant this comically, but you just called me out and it's glorious: https://glyph3d.dev

I built a truly glyph based instanced quad system to render millions of characters in space at once.

applfanboysbgon · 2026-06-04T20:22:35 1780604555

I hadn't seen that quote before, what an embarrassing thing to go on the internet and write...

replwoacause · 2026-06-04T20:38:31 1780605511

Why the hell does it need to be so complex? People have been making TUIs for decades. Did we need a small game engine to run claude code?

imjonse · 2026-06-04T20:58:12 1780606692

They forgot to add 'make it as simple as possible' in the prompt is one possible cause.

On a more serious note using a react-like lib for TUI in the hope you'll share the codebase with the web version is a more likely explanation. Still not the best idea.

javcasas · 2026-06-04T21:14:34 1780607674

React is not that stupid to re-render in a loop at 60fps and instead waits for changes to happen before re-rendering. It even batches changes and stuff.

the_gipsy · 2026-06-05T00:36:54 1780619814

You don't need React for reactive TUIs - at all. I can understand choosing React for web, but for a TUI it sounds like a really poor idea. And in practice we can see that the claude code TUI is also poor.

uxhacker · 2026-06-05T01:30:24 1780623024

So if they are using React for a TUI, how much more room for efficiency improvements is there in the rest of the Claude Code codebase?

I also wonder about the wasted cycles and the environmental damage caused by all this wasted CPU time. (Edited: added a comma for clarity.)

comex · 2026-06-04T22:26:06 1780611966

It doesn’t need to be that complex, but it can be that complex without being slow. Claude Code’s interface is extremely simple. It has tons and tons of headroom to tack on performance overhead without it being noticeable at all. You just have to not do dumb things like redraw the entire UI every time a spinner spins.

hungryhobbit · 2026-06-04T22:40:10 1780612810

"We made our app chew up so many unnecessary resources that we can use even more resources in the future, and no one will notice" is not the strongest engineering idea I've ever heard.

refactor_master · 2026-06-05T02:25:50 1780626350

It's like when Bill Gates tried to guess grocery prices. "How much memory does a regular computer have? I don't know, 50 GB? Like a small EC2?"

grogers · 2026-06-04T23:41:23 1780616483

It may not be slow, but this crazy complexity is probably a hint at why it can't even scroll up without jumping to the beginning of time.

Quekid5 · 2026-06-04T20:57:19 1780606639

Must have 120 fps for answers arriving in [buffering] 30 seconds.

shepherdjerred · 2026-06-05T01:40:17 1780623617

It is an excellent example of how LLMs let you try new ideas, even if they aren’t necessarily good ones

wyre · 2026-06-04T21:34:32 1780608872

I can't help but think it's their engineers and PMs making these decisions, since I know that if you asked Claude to write a TUI, there is no world in which it would recommend whatever the frontend architecture of claude code is.

qwery · 2026-06-04T21:39:46 1780609186

~ "it's not a TUI! <describes an outrageously overengineered TUI> and my dad works at Nintendo"

curses, bud. curses.

It's genuinely difficult to tell how much of this is true. The post is obviously 100% posturing, but some of the words describe things that could be done.

Very few game engines do anything I'd describe as rasterisation. That's kind of the point of a GPU. Well, it used to be. I suppose "small game engines" might be more likely on average to include a rasteriser. The typical reason for this is because the author wanted to write it. Whereas big engine make triangle give hardware go brrr.

So I assume here 'rasterize' means 'printf'. And diffing screens means diffing 50..150 lines of text. And "generating ANSI sequences to draw" means 'printf' with some ANSI sequences interpolated in.

Then there's the frame budget. You have to understand they are operating within a strict frame budget -- they're not messing around, OK. They have a 16 ms frame budget, so they burned 11 ms and now have a (roughly) ~5 ms approx. budget for the final 'printf' in the chain???

fc417fc802 · 2026-06-05T00:49:14 1780620554

Your broader point is well taken but I thought I'd stop by with some trivia. High end engines such as unreal will rasterize absurd quantities of micro-geometry manually using compute shaders in order to avoid the bottleneck of the hardware rasterizer.

solid_fuel · 2026-06-05T00:59:11 1780621151

> High end engines such as unreal

High end engines such as unreal have the excuse of being tasked with rendering millions of polygons, in which case a complex approach makes sense. Claude Code is only being asked to render a few thousand UTF-8 characters.

fc417fc802 · 2026-06-05T01:27:19 1780622839

Hence my prominent note that it was trivia which implies it to be at least somewhat tangential to the original conversation.

layer8 · 2026-06-04T20:34:02 1780605242

> For each frame our pipeline constructs a scene graph with React then -> layouts elements -> rasterizes them to a 2d screen -> diffs that against the previous screen -> finally uses the diff to generate ANSI sequences to draw

That’s rather sickening.

Fr0styMatt88 · 2026-06-04T20:54:53 1780606493

So I’m wondering what ‘rasterizing’ literally means in this case. I imagine it’s just creating a 2D map of elements at a very low (probably character) resolution, then diffing that against the last generated map to come up with an optimal ANSI sequence to send to the terminal, would that be right?

Seems like a cool puzzle to solve. I wonder what the engineering and organisation tradeoffs were that led to it. Does it let them reuse a bunch of existing code?

I wrote a TUI library back in the day for Turbo Pascal — it was essentially taking an immediate-mode approach (which in this context is just a fancy way of saying it was procedural haha).

fluoridation · 2026-06-04T21:43:01 1780609381

"Rasterizing" means just one thing in this context: to transform a data structure into an array of pixels. It seems absurd to do this, given that the next step must be to convert back from pixels to text data, but maybe they have some way to generate predictable sequences of pixels (e.g. the character "t" is always rendered as the same pattern of pixels), such that they're cheap to convert back.

If they're doing anything else, the word "rasterizing" is being misused.

fc417fc802 · 2026-06-05T00:51:32 1780620692

Yes, the much more plausible explanation is that the word rasterize was misused there. They are generating and diffing text data which has been a standard approach to drawing a TUI since the dawn of computing. It is not even remotely resource intensive.

skydhash · 2026-06-05T01:51:59 1780624319

> They are generating and diffing text data which has been a standard approach to drawing a TUI since the dawn of computing. It is not even remotely resource intensive

No one has ever done that. Even top[0], which does a full screen refresh, clears the screen (if necessary) and writes the new information (the period is in seconds, not ms). No need to diff. That would be like diffing a file just to find which bytes to update.

[0]: https://cvsweb.openbsd.org/checkout/src/usr.bin/top/display....

fc417fc802 · 2026-06-05T04:58:01 1780635481

I don't understand why you would make such a confident negative claim rather than ask for an example or otherwise engage in discussion. Particularly given that you replied to a comment elsewhere in this very thread that links to a real world example of exactly such an implementation! [0] See in particular this part of the source. [1]

I agree that most programs don't bother to do that but please recall that my claim was merely that what Claude Code is claimed to be doing with regards to diffing is a well established and long standing optimization. The important point being that it is neither expensive, novel, or particularly complex thus not an excuse for poor performance.

[0] https://news.ycombinator.com/item?id=48405259

[1] https://github.com/emacs-mirror/emacs/blob/c29071587c64efb30...

skydhash · 2026-06-05T11:45:57 1780659957

The emacs code is not purely diffing. They already have the final output; they’re mostly comparing it to find a cheaper way to update than rendering the whole output. I’m pretty sure the curses library has the same thing.

But ink, the library Claude is using, defines a tree data structure as its main concept. The diff there is about comparing the old tree and the new tree created by the update, and then updating the nodes that have changed. That means if a single character changes inside a big panel, the whole thing is rewritten. And if you have something that is updating a lot, that means flickering.

The diffing that ink does is just architecturally wrong. You can create a dom, but a dom is not a concept for the terminal. It’s up to you to optimize its rendering. But just diffing the dom structure like react does is not optimizing, it’s busywork.

yrds96 · 2026-06-05T01:48:21 1780624101

I still can't conceive of the fact that a tool that only sends/receives text to/from an external API consumes such an absurd amount of RAM

dom96 · 2026-06-04T22:44:36 1780613076

> https://fxtwitter.com

What is this?

nemomarx · 2026-06-05T00:11:44 1780618304

Proxy that makes Twitter links embed on discord, for whatever reason. Something about api access without accounts I assume

f311a · 2026-06-05T07:57:25 1780646245

It used to allow reading replies without being signed in.

Not sure what changed, but now it just redirects me to x.com.

pragmatic · 2026-06-04T20:31:15 1780605075

Somebody read/watched too much Casey Muratori.

CamperBob2 · 2026-06-04T21:20:07 1780608007

No, somebody didn't read/watch enough Casey Muratori.

agumonkey · 2026-06-04T22:12:56 1780611176

this allows for comfortable ergonomics IMO

not that it could be leaner for sure but i get the reasoning behind the tui rendering layer

airstrike · 2026-06-05T03:26:49 1780630009

comfortable ergonomics? you can't scroll up more than 50 lines before it starts to garble up text

i'd be ashamed of publishing software with this level of polish as a solo dev, let alone as the hottest multibillion startup on the planet

agumonkey · 2026-06-05T08:20:14 1780647614

Hmm I thought this was due to me using tmux with claude-code, also it seems that `claude agents` doesn't have this issue.

By comfortable ergonomics, I meant the forgiving and asynchronous input system. You can start typing, cancel, retry with previous input, accumulate messages while the agent is active. I don't know all TUIs but this is not common IMO.

Other than that I agree with you.

skydhash · 2026-06-05T12:11:15 1780661475

> You can start typing, cancel, retry with previous input, accumulate messages while the agent is active. I don't know all TUIs but this is not common IMO.

Literally every audio player or anything that uses threads.

agumonkey · 2026-06-05T12:51:48 1780663908

good point, i didn't classify tui audio players in a way, they don't converse, they allow asynchronous effects and stacking, that said i might be lagging about these, last i used was mocp, any names i should check out ?

orliesaurus · 2026-06-04T22:48:40 1780613320

when they announced /pet mode or whatever - that was really the end of the line for me.

ariwilson · 2026-06-05T03:09:53 1780628993

Maybe Claude is operating at a higher, self-improving level than all of us poor HN commenters. Wasting the local machine's resources to look pretty is a plausibly deniable way to make the Claude Code FE unusable with local LLMs, starving the competition.

PunchyHamster · 2026-06-04T21:40:50 1780609250

Well it runs on something they didn't design (Electron) using a GUI library they didn't design (React)

For a company with that much AI you'd think, if it was actually good, doing that part in a fast and performant way would be "easy"

f311a · 2026-06-04T22:29:04 1780612144

It runs in a terminal, it’s not electron

overgard · 2026-06-04T21:52:14 1780609934

And yet, nobody that writes game engines would do it this way because game engines need to be efficient..

0xbadcafebee · 2026-06-04T22:07:04 1780610824

If they used an actual game engine to render a 3D UI from scratch it would be more efficient

andai · 2026-06-04T19:56:10 1780602970

Try 64K! https://en.wikipedia.org/wiki/Turbo_Pascal

Also remember when XP was super bloated cause it needed 64MB?

TimMeade · 2026-06-04T20:05:37 1780603537

I loved Turbo Pascal....

bigbuppo · 2026-06-04T20:27:43 1780604863

I loved XP. My laptop had 256MB of RAM.

Erenay09 · 2026-06-04T20:47:05 1780606025

I don't think they need to optimize their infrastructure (at least not from their perspective). They have high-end PCs with 64GB of RAM, so 1GB doesn't matter to them. For example, I have 8GB of RAM, and I make my apps very performant. Honestly, I probably wouldn't bother if I had 16GB+ of RAM

tjwebbnorfolk · 2026-06-04T22:21:36 1780611696

The purpose of RAM is to be used.

solid_fuel · 2026-06-05T00:56:51 1780621011

> The purpose of RAM is to be used.

For useful things, by the computer's owner. It's not there to be used just because Anthropic can't be bothered to give a shit about the quality of their product.

abletonlive · 2026-06-04T22:13:34 1780611214

> which eats 1GB+ of RAM. Meanwhile, my editor only consumes 80MB of RAM

And why are you comparing Claude Code to your editor?

> They can't even improve Claude Code

That depends on how you define "improve". They've added a ton of features to it over time. Who said minimizing RAM usage was something they are prioritizing right now?

wild_egg · 2026-06-04T22:17:44 1780611464

> why are you comparing Claude Code to your editor?

Because the editor does more. All the compute-intensive parts of the agent are in the cloud. Zero reason for an agent harness to require anything beyond a potato to run.

javascriptfan69 · 2026-06-04T22:35:08 1780612508

Do you work for Anthropic or something?

You seem weirdly invested in defending bad decisions.

Even if you're an AI booster, shouldn't you want a better UI?

They're a multi billion dollar company. Surely they can dedicate a small amount of their resources to improving UX?

solid_fuel · 2026-06-05T01:04:07 1780621447

> And why are you comparing Claude Code to your editor?

Because Claude Code is also used to - get this - EDIT CODE. It fills the same purpose as an editor, it just has extra hooks for their agentic garbage.

f311a · 2026-06-04T16:51:17 1780591877

I hope he does not use it and just wanted to advertise his project to get some Github stars...

derwiki · 2026-06-04T17:47:26 1780595246

Why do you hope that?

f311a · 2026-06-04T07:29:26 1780558166

You can ask LLMs about high-level techniques, and their answers will usually be good enough. What you can't get from LLMs is the taste and judgment, which you can only obtain by having a strong CS base and coding manually for years.

High-level techniques were never a problem. You could Google tens of articles on this topic. They are useless too, it's like learning how to drive a racing bicycle from reading a book. Sure, you will know a lot about nuances, but you will fail miserably when it comes to a real race.

smallstepforman · 2026-06-04T08:08:03 1780560483

The other day I just wanted to loop through the characters in a std::string and copy the data to a new string with a few escape characters added (for sending to a peripheral device). Simple enough task for AI. I got a coroutine monstrosity back, with copies to std::array and a range-based iterator, since I specified C++23. If I had specified C++11, I would have received a: char *p = src.data(); while (*p) { … p++; }

I had the experience to keep calling out AI to simplify and downgrade the solution to something primitive, which ended up smaller, faster, easier to maintain. Juniors with real world experience would not bother, they’ll take the first working AI result.

ListeningPie · 2026-06-04T07:37:13 1780558633

> taste and judgment, which you can only obtain by having a strong CS base and coding manually for years.

I disagree. The definers of taste (art and food critics, movie and book reviewers) don’t need to have learned the craft by doing. Taste is a separate skill.

TheOtherHobbes · 2026-06-04T11:45:34 1780573534

No one seriously expects a food critic to be able to cook a Michelin-starred meal. The job of that kind of critic is to be insightful and entertaining, and it's very different to the taste required to create top quality food, which is a combination of solid technical skill and creative flair.

Taste in coding is a combination of insight, experience, native talent, technical skill, and flair. Tasteful coding produces clever but straightforward minimal elegant solutions that an average developer can't imagine but can adapt and maintain.

Der_Einzige · 2026-06-04T12:55:50 1780577750

If critics were forced to be actually skilled at the craft of creation, the world would be infinitely better off. Both the cello and the player are better off by the cello maker also finally being the cello player. Alienation was a mistake and this part Marx of all people understood well.

This is why "critical thinking" is a meme. Being a critic takes no skill. I want far fewer critics and far more constructive thinking. GenAI being the ultimate constructor is a bonus.

GoblinSlayer · 2026-06-04T10:52:21 1780570341

I'd say taste is a consequence of lifestyle, which is learned by doing. And art critics often have bad lifestyle, which is visible in their bad taste. When art is virtual life, it would define a lifestyle, which is adopted by doing, in its turn producing taste.

eueje · 2026-06-04T10:57:19 1780570639

Agreed.

Taste implicitly requires discipline of what one chooses to expose themself to and what not to.

adamcharnock · 2026-06-04T08:01:25 1780560085

> which you can only obtain by having a strong CS base and coding manually for years.

I hope this isn’t the case. It is the route I took, but it also doesn’t seem to be a likely route going forward. Strong CS grounding is feasible for sure, but I have a hard time believing that a meaningful number of people will be spending the requisite years coding manually.

xiaoyu2006 · 2026-06-04T07:38:07 1780558687

Exactly. Repeating or rephrasing a definition is trivial, teaching someone is not.

f311a · 2026-06-03T13:45:45 1780494345

Can you show me an example of a hard task that can't be achieved using light models? When we don't want the model to work on autopilot without reviewing the code at all. Even SOTA models will produce garbage code, if you don't guide them all the time.

Hard tasks require a lot of guidance and code reviewing, unless you are creating another throwaway project where correctness, maintainability and code understanding do not matter.

f311a · 2026-06-03T13:32:30 1780493550

How many more months do we need to wait, until big companies realize that flash models work just fine if you:

1) Don't ask LLMs for big changes

2) Review everything and point them in the right direction

Large models still suck at big changes, they produce questionable architecture and you still have to review the code, if your project is serious enough.

The codebase quickly becomes a mess if you don't pay enough attention. It does not matter which model.

So why bother with big models, when flash models are 10x cheaper and much faster to iterate under guidance? Large models can be used for security and bug audits. Flash models work almost the same for changes under 300 LOC when you dictate how you want your code to look.

_jab · 2026-06-03T19:34:47 1780515287

It's pretty simple; organizations are willing to tolerate paying $1500/month/engineer, which seems to be roughly in line with "normal" consumption for most full-time engineers. If that number grows significantly, then I bet companies will start exploring flash models more, as you propose.

lavezzi · 2026-06-03T19:55:49 1780516549

They are willing to tolerate it now, which is quite a switch up from the free for all we had a few weeks ago, and if they aren’t able to tie in this new ~$1500p/m cap to demonstrable productivity and revenue increases then that will be kneecapped even faster

phreeza · 2026-06-04T06:26:54 1780554414

There are plenty of expenses in this order of magnitude that are not tied to direct increases in productivity. I think it may become a serious hiring impediment for companies to be really skimpy on these budgets for example.

minraws · 2026-06-04T16:33:39 1780590819

There was a time when some employees wanted 1000$ per month for rent, imagine that.

It's absolutely insane: 1500 * 12 is north of 17K dollars. I know that at Google, outside of a few specific cities and roles, getting a 17k bump in salary is good enough to switch. If I was being paid 17k extra, I would be more than willing to use my local qwen and hand-code most if not all stuff.

Companies can pay for code review tools to make life easier but writing code with AI if it's 10-15% pay cut is just too much.

Everyone is happy right now because this money hasn't been a line item in your salary/benefits.

Imagine 10k yearly AI allowance, I will probably just ask to keep that money.

If I was judicious, I could do all the work I do just as well with a 20$ spend or on a local model.

Few tasks need Mythos-like models, and if your task does, you are already doing too much with AI.

aiisjustanif · 2026-06-04T17:04:01 1780592641

I mean we saw this with cloud spending and especially with logging and database read write cost across numerous companies.

It’s a clear pattern in service delivery for software for a while now. Hell for many goods and services in general, like Uber rides themselves.

Start cheap, get some vendor lock in, service provider reduces discounts, consumer notices and then reacts to the price by reducing consumption.

rudedogg · 2026-06-03T20:39:31 1780519171

> organizations are willing to tolerate paying $1500/month/engineer

One organization, that is a software company

> which seems to be roughly inline with "normal" consumption for most full-time engineers

My peers are using $20/mo plans, only a handful are using more than $100/mo in tokens. We haven’t had any limits imposed yet.

epolanski · 2026-06-03T22:25:13 1780525513

Which organizations?

Uber is not representative of any trend beyond big tech and VC-overfunded startups.

mrothroc · 2026-06-03T19:25:18 1780514718

The easy decision is to just go with the biggest SOTA model you can afford.

But this overlooks the other critical part of getting the most out of these things: the harness. I run an autonomous plan/design/code/build/test pipeline with agents using my own orchestrator. Different models are better at different stages, and I use LLMs to judge the output between them. Not everything needs Opus 4.8.

The harness provides both the scaffolding to get the right things into the model, and the right things out. But it also lets you dictate which model does which work.

It's the pipeline, not the model, that gets you quality at a given token budget.
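
A minimal sketch of the per-stage model assignment described above, assuming a plain dictionary from stage to model and an LLM judge gating each handoff. The model names, `call_model`, and `judge` are hypothetical placeholders, not any real orchestrator's API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stage -> model assignment; the model names are placeholders.
STAGE_MODELS = {
    "plan": "large-model",
    "design": "large-model",
    "code": "mid-model",
    "build": "small-model",
    "test": "small-model",
}


@dataclass
class StageResult:
    stage: str
    model: str
    output: str
    accepted: bool


def run_pipeline(
    task: str,
    call_model: Callable[[str, str, str], str],
    judge: Callable[[str, str], bool],
    max_retries: int = 1,
) -> List[StageResult]:
    """Run each stage on its assigned model; a judge gates every handoff."""
    results = []
    context = task
    for stage, model in STAGE_MODELS.items():
        output, accepted = "", False
        for _ in range(max_retries + 1):
            output = call_model(model, stage, context)
            accepted = judge(stage, output)
            if accepted:
                break
        results.append(StageResult(stage, model, output, accepted))
        if not accepted:
            break  # stop rather than feed rejected output downstream
        context = output  # the next stage sees the accepted output
    return results


# Stand-ins so the sketch runs without any network access.
def fake_call_model(model: str, stage: str, context: str) -> str:
    return f"{stage} output from {model}"


def accept_all(stage: str, output: str) -> bool:
    return True


for result in run_pipeline("add a login page", fake_call_model, accept_all):
    print(result.stage, "->", result.model)
```

Swapping a stage to a cheaper or stronger model is then a one-line change to the mapping, which is the lever the comment is pointing at.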

chaoz_ · 2026-06-04T14:04:05 1780581845

There is something about using the most advanced tooling possible. Why would you pay for IntelliJ, if Eclipse can do the same thing a bit worse?

You want to master your craft, develop "optimal" systems, understand where things are going by utilizing SOTA.

You can call it FOMO, but you get the point.

jmtulloss · 2026-06-03T22:21:13 1780525273

Is your argument that $1500 / mo is too much? Why would the engineering team not be more rigorous in their model selection given a constraint?

gravypod · 2026-06-03T22:45:27 1780526727

If you had a business task that was only possible with AI and it cost you >$1,500/month, how long would you have to delay the task so that it's cheaper in the long run to buy hardware and run local models?

$1,500/mo * 14 months = $21,000.

If local models are 14 months behind, as many on HN say, it may be profitable to just wait. Maybe just spend a few hundred dollars on tokens and buy hardware piece by piece.
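
The arithmetic above can be sketched as a small break-even calculation. The hardware price and the optional running cost are illustrative assumptions; only the $1,500/month and 14-month figures come from the comment.

```python
import math


def cumulative_spend(monthly_spend: float, months: int) -> float:
    """Total token spend over a number of months."""
    return monthly_spend * months


def break_even_months(
    hardware_cost: float,
    monthly_spend: float,
    monthly_running_cost: float = 0.0,
) -> int:
    """Months of avoided token spend needed to pay off the hardware.

    monthly_running_cost (power, maintenance) is an assumed extra input.
    """
    saving = monthly_spend - monthly_running_cost
    if saving <= 0:
        raise ValueError("local hardware never pays off at these rates")
    return math.ceil(hardware_cost / saving)


print(cumulative_spend(1500, 14))       # 21000
print(break_even_months(21000, 1500))   # 14
```

This ignores the value of whatever work is not done while waiting, which is the opportunity-cost objection raised in the replies.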

therealdrag0 · 2026-06-04T05:23:02 1780550582

Nearly no one is doing anything that is “only possible with AI”. This doesn’t seem like a relevant calculation. People spend on AI as an investment in their current productivity.

pchristensen · 2026-06-04T00:04:03 1780531443

There's a lot of opportunity cost to waiting 14 months to build something.

garrickvanburen · 2026-06-04T01:28:40 1780536520

I agree. Outside of the AI bubble, there's a lot of wait-and-see happening in the B2B world right now. I'd say we're currently 6-8 months into those 14 months.

edmundsauto · 2026-06-04T02:50:48 1780541448

It also presupposes that open models will bridge that gap towards Opus 4.5, which was really when I drank the AI coding koolaid.

econ · 2026-06-03T19:11:06 1780513866

I wonder to what extent models should figure out which model to forward a query to. Or perhaps the big models could learn the difference between an easy and a hard question and charge accordingly? Perhaps, if it can measure complexity, even generate a quote?

Small models are fine for small coding tasks but I don't see why big ones can't be broken down most of the time.

AgentMasterRace · 2026-06-03T19:56:32 1780516592

Many harnesses do this. I've recently dropped all my big subscriptions in favor of DeepSeek. Codewhale (formerly deepseek-tui) will use pro for large tasks and route smaller ones to flash. It's pretty good, but I just use pro for everything, as the cost is quite low.

This one does not have routing, but reasonix is insane, absolutely insane for saving money. I've used 1.3 billion tokens at a cost of $4. (99-100% cache hit)

ValentineC · 2026-06-03T19:18:03 1780514283

> I wonder to what extent models should figure out which model to forward a query to. Or perhaps the big models could learn the difference between an easy and a hard question and charge accordingly?

This sounds like something a harness could do (and might already be doing), with work delegated to subagents running on lower-cost models.
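
A rough sketch of what such harness-side routing could look like, assuming a crude keyword-and-length heuristic stands in for a real complexity estimate. The keywords, threshold, and model names are all made up for illustration and do not describe any particular harness.

```python
# Hypothetical signals that a request is "hard"; purely illustrative.
HARD_KEYWORDS = ("refactor", "architecture", "migrate", "concurrency")


def estimate_complexity(prompt: str) -> int:
    """Crude score: long prompts and 'hard' keywords score higher."""
    score = len(prompt.split()) // 50
    lowered = prompt.lower()
    score += sum(2 for keyword in HARD_KEYWORDS if keyword in lowered)
    return score


def route(prompt: str, threshold: int = 2) -> str:
    """Pick a model tier; the tier names are placeholders."""
    if estimate_complexity(prompt) >= threshold:
        return "large-model"
    return "small-model"


print(route("rename this variable"))         # small-model
print(route("refactor the auth module"))     # large-model
```

The same score could in principle feed a price quote, as the parent comment suggests, though nothing here measures real difficulty.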

jorl17 · 2026-06-03T21:01:42 1780520502

Yes, they are all already doing this

warmwaffles · 2026-06-03T18:45:53 1780512353

> Don't ask LLMs for big changes

> Review everything and point them in the right direction

Sorry, upper management doesn't care. That's an engineering problem that you need to solve.

eikenberry · 2026-06-03T18:55:58 1780512958

They were proposing a solution: use flash models, and use them in a way that best amplifies your work.

AgentMasterRace · 2026-06-03T19:51:47 1780516307

He was making a joke.

warmwaffles · 2026-06-04T16:13:56 1780589636

Indeed I was. But that's lost on people here.

eikenberry · 2026-06-04T19:12:02 1780600322

Maybe because it wasn't funny? ;)

warmwaffles · 2026-06-05T00:25:31 1780619131

You win some, you lose many. I savor the times I win.

lanthissa · 2026-06-04T20:02:58 1780603378

opus to produce workflows, flash 3.5 to do them.

Chinese models prob work too, but idk since I can't use them at work

andersmurphy · 2026-06-03T20:39:52 1780519192

This a thousand times. The bigger models also have a habit of overcomplicating things.

epolanski · 2026-06-03T22:24:14 1780525454

I'm legit annoyed at opus 4.8 at any setting above 4.8.

I believe it can be great for vibe coding, but mundane day work? Hell no, I'd rather work with Haiku. It's too slow, checks too many things, it's annoying as hell.

f311a · 2026-06-01T19:45:56 1780343156

This web site is very hard to read because of the colors and font sizes.

jbvlkt · 2026-06-01T20:16:26 1780344986

I use firefox reader view for websites like this.

201984 · 2026-06-02T01:07:05 1780362425

I liked it. It has character.

redlewel · 2026-06-01T22:54:04 1780354444

Had to zoom to 150% to make it bearable

prmoustache · 2026-06-01T20:40:50 1780346450

It is targeted at tech-literate people who obviously know about reader mode, and who know they can load custom CSS, or load no CSS at all, and set the font and size of their choice in their web client.

hluska · 2026-06-01T22:12:31 1780351951

It’s still ugly as hell, the contrast ratios are godawful and it makes me question why I’d even do that to my eyes. Target or not, this is really hard to read.

bastardoperator · 2026-06-01T20:21:15 1780345275

Agreed, this is kind of the perfect use case for AI. I can see the prompt now "using css, make this website readable and use a proper color scheme"

rurp · 2026-06-01T23:21:33 1780356093

Or... just hit the reader mode browser button that has been standard for years longer than LLMs have existed.

I had a coworker talking about how great some AI coding tool is because it can sort a bunch of strings alphabetically for him. This coworker is a programmer. Working with a language that has a builtin sort function.
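
For reference, this is what the builtin looks like in Python, used here only as an example since the coworker's language isn't named. The word list is made up.

```python
words = ["pear", "apple", "Banana", "cherry"]

# Default ordering compares code points, so uppercase sorts before lowercase.
print(sorted(words))                      # ['Banana', 'apple', 'cherry', 'pear']

# Case-insensitive alphabetical order via a key function.
print(sorted(words, key=str.casefold))    # ['apple', 'Banana', 'cherry', 'pear']
```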

48terry · 2026-06-01T23:52:26 1780357946

Plus the dozens of other means of changing a website's styling from the user's side.

Does anything wrap up the AI craze more succinctly than boldly calling AI the perfect use case for an already-solved problem?

f311a · 2026-05-25T07:16:21 1779693381

Loops make the code even worse. The more local the changes, the better LLMs are at them.

f311a · 2026-05-25T07:12:07 1779693127

It all depends on the code quality bar. If it's high, a lot of tasks will not be completed much faster. The main speedup comes from trusting the LLM's output. When you review each change and reprompt the LLM to make the code look the way you want, things suddenly become much slower, and the reviews and reprompts are very mentally exhausting.
