Hacker News | smokel's comments

For those too lazy to watch someone talk on video for ages to make a point:

The link is to a video by the famous YouTuber PewDiePie, who uses a local LLM to parse his email to save time. He has an autoreply system and gets notified about urgent matters.


Most kids that grew up during the timeline you described had no interest in computer architecture. The small minority that did care is probably the same size now.

The other 99% who were into yoyo-ing back then are now into TikTok, that's all.


> The other 99% who were into yoyo-ing back then are now into TikTok, that's all.

Hey dude, some of us were yo-yoing while waiting for Gentoo to build from stage 0. Compiling an OS on a single-core Athlon takes time.

For the 3 days it takes to build all the way up to KDE, you have no computer. Hope you didn’t forget something.


Don't forget the fan controllers to try to make it silent and the neon lights. I still have that machine at my parents' house, used it a couple of years ago to rip all my teenage CDs to digital formats.

distcc-pump. And, I forget what the toolchain setup is called, but on Gentoo it's literally just `emerge -1av <toolchain-thing> distcc` on the machine with beef and just `emerge -1 distcc` on the Athlon...

I found out how to do it consistently in 2010 and it's like black magic knowing how to target a real OS at BS hardware.


I was doing this in 2003 and my computer was also the internet/network router for our house. When that thing was down, you had no access to external information that you didn’t pre-save somewhere.

One time I forgot to install network drivers and had to download them through my flip phone via GPRS and then awkwardly load onto the computer via a clunky USB connection. Fun times.

Also my English wasn’t this good yet. I’m sure it would’ve been a lot easier had I actually understood all the tutorials and documentation fully.


Some of my least favourite nights and most cherished childhood memories involve troubleshooting broken or missing network drivers on the only functional Linux box I had. Never had to use a flip phone, but sure came close a few times.

Nothing I’d ever willingly re-live if given the chance, but always fun to look back on and grin.


I'd wager that even if you didn't nerd out on computer architecture, just living through the progression of CDs -> MP3s -> iPods -> streaming gives kids a better grounding than the "the iPad is where music comes from" view they have today.

At school we have Disney Plus with a box with a thing with a hole in it! https://www.reddit.com/r/KidsAreFuckingStupid/comments/1tv4f...

I would argue yoyo is way more healthy than TikTok.

The difference is that more of those people can manage local files than the new generation joining office environments.

Do you swap SIM cards all the time? This seems to be the biggest blocking issue for me.

I tried switching phones once a week, which was heavenly. Might try that again, it requires some discipline.


Apparently some phone companies can/will provide you with aux SIMs, which allow two phones to share a number, or so I've been led to believe. I can't find a single provider here in Denmark that will issue me such a SIM. Kinda sad about that, because it would solve most of my issues.

I need a smartphone for a few things every so often, but most of the time a dumb phone is perfectly fine.


I do not, the gym phone has its own SIM. I have it running a cheap data-only eSIM from esimdb.com

If you need synchronized phone / text messages, I suggest Google Voice. When anyone rings your (free) Google Voice number, it will forward the call to multiple phone numbers. It will ring as a regular phone call, not as an app notification. However, text messages appear as app notifications.


The academic paper is here: https://arxiv.org/abs/2606.03811

It's not fully described how things work exactly, but apparently it does not transfer entire LLMs as part of the worm. Now that would be interesting :)


The abstract says:

> The worm parasitically uses compromised machines to run open-weight large language models (LLMs) to sustain its reasoning, or extend its reach for further attacks.


Thanks for pointing that out. I scanned the paper and found that in their main experiments, they use a shared GPU resource and do not copy LLMs to target machines. Apparently they did other experiments in the ablation study where they did copy LLMs.

So it's even worse than I expected. The intended worm can spread through my thermostat, and when it reaches a GPU host, it can spread even harder. Fun times ahead.


I wonder if gamma ray memory corruption will induce a sort of mutation and selection effect on non-ecc-memory hosts which will make the worms effectively evolve.

This reminded me of the geth from Mass Effect. They get smarter as more geth "agents" network together.

What if there is a worm that spreads through thermostats and another that spreads through smart fridges, and they finally infect a laptop with a GPU? They can exchange notes while they are there. Fun times.


You'll just have to starve it with a bunch of thermostats that lead it towards the GPU rich honey pot where you will extract it...

I think an approach could be to use some engineered security issue, or however people build botnets, and give it a small, minimal LLM that comes with instructions to download models from Hugging Face, plus some other minimal prompts and descriptions of tools. Then it could use this to grow on infected computers, try to find more capable and vulnerable computers to run more capable models, and devise some minimal communication between the different points of the botnet. Perhaps set itself a goal to control the largest amount of compute, and have some other goal. Would be curious to see what happens.

When the worm makes someone's machine start to sound like a leaf blower, you are found out.

In the abstract, what does it mean "the attacker's marginal cost per new infection is zero"?

If you infect a machine with enough GPU to run the local LLM needed to compromise another machine, you can let it burn tokens all day for free, because whoever you stole the first one from will pay the electric bill.

We're getting closer to the Matrix's "We do know it was us who blackened the skies"

Well, people used MS-DOS which had basically no security model at all for at least 10 years. Most viruses were benign, but it was almost trivial to simply wipe the entire hard disk. People generally didn't care, and made backups.

Things have become a bit more complicated now that machines are connected all the time, and the risk of infection is no longer limited to physically inserting a floppy disk into a machine.

I suspect that the solution is not so much in trying to make our current systems secure, but to make disconnection more practical.


Looks really nice. Do you plan to support this in the future? Are you planning to foster a community of developers around it, or are you thinking about hosting it as a service?

Asking for a friend, who also enjoys building projects with LLMs, but publishing and supporting them not so much.


Yes, I'm planning to support it indefinitely - my company (~10 users) just started using it so we're very invested in its success.

I'd love for people to write whatever crazy packages they can think of for it so it has a rich ecosystem. It lets admins add a git or NPM reference to the packages list and install packages on the fly, like WordPress does.

As for hosting as a service, maybe someday. I also own tinycld.com so who knows.


AutoCAD is $175 per user per month [1].

[1] https://www.autodesk.com/products/autocad/buy


AutoCAD is still the budget-friendly CAD program it has always been. You don't build big boats in AutoCAD.

Winch Design [0], which have built some of the world's largest superyachts [1], seem to be using AutoCAD [2]. Afaik it's also the same with Lürssen (but don't quote me on that).

[0] https://winchdesign.com/

[1] https://www.superyachts.com/directory/1516/winch-design/flee...

[2] https://www.autodesk.com/design-make/articles/naval-architec...


Likely not the "base model" of AutoCAD.

Those tools are used in ways that make them integral to processes. They have their equivalents of ticket systems that are linked to code repositories with LFSs and a bunch of IDE-type tools and automated and manual test systems and build systems. Their equivalents of PR discussions and Selenium screenshots need to check all the boxes in the right ways for legal and traceability purposes.

Without all that it might be $175/user/month but you're not shipping apps with just vi and bare gcc.


>Without all that it might be $175/user/month but you're not shipping apps with just vi and bare gcc.

You're right, Linus uses Emacs.


As someone completely outside the 3D design world who always thought of AutoCAD as the gold standard - really? What program would be used instead? Please enlighten me.

Except LLMs, even with vision, are still useless at AutoCAD, let alone Revit (please don't quote SCAD LLMs at me, useless). Knowledge-based approaches still win.

I might agree "AutoCAD" is the current level LLMs are at, but wait until your design department discovers "Revit". It's another ballpark (in wasted costs; engineers on site still get "clashes").

Revit costs are high, and the end results are marginally better - but local LLM tokens are cheaper 24/7 at "AutoCAD" level - "Revit" level tokens will make Uber's CTO/COO weep harder than they already do, while producing results no better than "Revit" does (engineers still face "clashes").


Cadence and Ansys have entered the chat. A bunch of other highly-specialized engineering software has entered the chat. Licenses are on the order of 10-100k/seat.

For a pretty funny comment about pricing, see:

https://www.reddit.com/r/chipdesign/comments/1ajrli2/cadence...


Glad to run into this after some time!

I guess we are welcoming the software people to the world of expensive tools. Just sad that the FOSS alternatives to these tools are not as powerful, whereas the software industry still has FOSS tools to fall back on.


Does this analysis factor in potential caching of tokens on the server side? It seems that if they organize things well (as a model provider), they can save quite a lot on that. Looking at my Cursor statistics makes it clear that the token calculations are not at all trivial.

I believe the ccusage tool I used takes cached token pricing into account.
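To make the caching question concrete, here is a rough sketch of how cached input tokens change the cost of a single request. The function name, the token counts, and all three prices are hypothetical placeholders, not any provider's actual rates, and real billing may add details (such as cache-write surcharges) that this ignores.

```python
def request_cost(input_tokens, cached_tokens, output_tokens,
                 input_price, cached_price, output_price):
    """Dollar cost of one request; all prices are dollars per million tokens.

    cached_tokens is the part of input_tokens served from the prompt cache.
    """
    uncached = input_tokens - cached_tokens
    if uncached < 0:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    total = (uncached * input_price
             + cached_tokens * cached_price
             + output_tokens * output_price)
    return total / 1_000_000


# Hypothetical prices: $3 per million input, $0.30 cached, $15 output.
with_cache = request_cost(100_000, 90_000, 2_000, 3.0, 0.3, 15.0)
without_cache = request_cost(100_000, 0, 2_000, 3.0, 0.3, 15.0)
print(f"with cache: ${with_cache:.3f}, without: ${without_cache:.3f}")
```

With a long agentic session where most of the context repeats between turns, the cached share dominates, which is why ignoring it can overstate costs by a large factor.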

For coding assistance, I have tried OpenCode with several large open models through OpenRouter. All were fairly bad compared to Claude Opus. Could you provide some hints on how I should be holding these open models so that I might get more value out of them?

I agree with the common trope that open models lag behind by about a year, but something magical happened just around a year ago when the state of the art models became extremely useful. By this reasoning we're about to see open models perform well, but I'm afraid there is more to it than just waiting for another revolution around the sun.

Note, my application is coding assistance. Open models can be great for other purposes.


I tried almost all OS models on OpenCode; none of them is on the level of Opus 4.7.

In my latest experiment I used Opus for the implementation plan, then used Cursor Composer 2.5 for execution.

I must say that combo is really good. The main drawback of Claude Code is that it is super slow. So when paired with Composer, which is super fast, it flies.


No one is claiming that OS is as good. They are saying it isn't that far behind SOTA commercial products. So why pay exorbitantly just to get something only a few percent better than the free option?

But there have been very good open source office apps for decades and few enterprises use them, so perhaps this is just the nature of B2B purchasing committees and 'nobody getting fired for buying IBM.'


Because failures compound. My productivity has substantially improved since I switched from open models to a Codex subscription, because it doesn't need hand holding, and it doesn't pull stupid tricks occasionally.
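A back-of-the-envelope sketch of what "failures compound" means, under the simplifying assumption that a task is a chain of independent steps that must all succeed. The per-step success rates below are made up for illustration, not measurements of any model.

```python
def task_success_probability(step_success: float, steps: int) -> float:
    """Chance that every step of a multi-step task succeeds,
    assuming steps are independent and share one success rate."""
    if not 0.0 <= step_success <= 1.0:
        raise ValueError("step_success must be between 0 and 1")
    return step_success ** steps


# Hypothetical per-step success rates over a 50-step task.
for rate in (0.99, 0.95):
    print(rate, round(task_success_probability(rate, 50), 3))
```

Under these assumptions, a small per-step difference turns into a large difference in how often the whole task completes without intervention.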

Do more planning yourself, be smart about the context, break down tasks into smaller components, give it more guidance. You can't just lazily prompt it to complete large features autonomously and expect good results.

But if the closed-source models can do this without the additional effort, that's a significant gap, no?

See that's the thing, they can't. Every model needs hand holding and guidance.

some require less hand-holding than others though

No one is trying to argue that OS models are better than Opus 4.7. It's simply that they're good enough and cheaper.

The point is that the price gap is so much larger than the capability gap, that even with the extra compute needed to make up for the lack of capability, you can still come out ahead in terms of amortized $/work done.
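One way to sketch the amortized $/work argument: if failed attempts are simply retried until one succeeds, the expected number of attempts is 1 / success rate, so the expected cost per completed task is the per-attempt cost divided by the success rate. The costs and success rates below are hypothetical, and the model ignores the human time spent reviewing failed attempts.

```python
def cost_per_completed_task(cost_per_attempt: float, success_rate: float) -> float:
    """Expected dollars per successful task when failures are retried.

    Assumes independent attempts, so expected attempts = 1 / success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Hypothetical numbers: a pricey model that usually succeeds
# versus a cheap model that succeeds half the time.
closed_model = cost_per_completed_task(1.00, 0.9)
open_model = cost_per_completed_task(0.05, 0.5)
print(f"closed: ${closed_model:.2f}, open: ${open_model:.2f}")
```

Whether the cheaper option still wins once review time is priced in is exactly what the thread is arguing about.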

Is it really when they are hundreds of times more expensive?

That is the 3-6 month SOTA-open gap people talk about, a time window that continues to move as new models are released on both sides.

Do you know what economic trade offs are?

Both implicit and explicit..?


+1 .. just wanted to reiterate that this is the answer. The open models work great if you just do a little more of the design/architectural work up front and organize your work appropriately.

A good harness is supposed to do what you are describing. Sonnet on pi.dev is pretty terrible but fast. Claude Code has ridiculous amounts of prompt engineering at the system prompt level and sub-session spawning, combined with low temperature, to provide the predictable results people like. CC screws up and you never see it, because the harness auto-corrects, while on OSS you see everything, and it does not come with that level of monitoring by default.

In order to find out how real humans reply:

Please guess a number between 1 and 100.



nice

6*7=42





Sure!

49.5

√67


i+7up


null

