For those too lazy to watch someone talk on video for ages to make a point:
The link is to a famous YouTuber called PewDiePie, who uses a local LLM to parse his email to save time. The setup includes an autoreply system and notifies him about urgent matters.
Most kids that grew up during the timeline you described had no interest in computer architecture. The small minority that did care is probably the same size now.
The other 99% who were into yoyo-ing back then are now into TikTok, that's all.
Don't forget the fan controllers to try to make it silent, and the neon lights. I still have that machine at my parents' house; I used it a couple of years ago to rip all my teenage CDs to digital formats.
distcc-pump
And, I forget what the toolchain setup is called, but on Gentoo it's literally just `emerge -1av <toolchain-thing> distcc` on the machine with beef and just `emerge -1 distcc` on the Athlon...
I found out how to do it consistently in 2010, and it's like black magic knowing how to target a real OS at BS hardware.
I was doing this in 2003 and my computer was also the internet/network router for our house. When that thing was down, you had no access to external information that you didn’t pre-save somewhere.
One time I forgot to install network drivers and had to download them through my flip phone via GPRS and then awkwardly load onto the computer via a clunky USB connection. Fun times.
Also my English wasn’t this good yet. I’m sure it would’ve been a lot easier had I actually understood all the tutorials and documentation fully.
Some of my least favourite nights and most cherished childhood memories involve troubleshooting broken or missing network drivers on the only functional Linux box I had working. Never had to use a flip phone, but sure came close a few times.
Nothing I’d ever willingly re-live if given the chance, but always fun to look back on and grin.
I'd wager that even if you didn't nerd out on computer architecture, just living through the progression of CDs -> mp3s -> iPods -> streaming gives kids a better grounding than the "the iPad is where music comes from" view they have today.
Apparently some phone companies can/will provide you with aux SIMs, which allow two phones to share a number, or so I've been led to believe. I can't find a single provider here in Denmark that will issue me such a SIM. Kinda sad about that, because it would solve most of my issues.
I need a smartphone for a few things every so often, but most of the time a dumb phone is perfectly fine.
I do not, the gym phone has its own SIM. I have it running a cheap data-only eSIM from esimdb.com
If you need synchronized phone / text messages, I suggest Google Voice. When anyone rings your (free) Google Voice number, it will forward the call to multiple phone numbers. It will ring as a regular phone call, not as an app notification. However, text messages appear as app notifications.
It's not fully described how things work exactly, but apparently it does not transfer entire LLMs as part of the worm. Now that would be interesting :)
> The worm parasitically uses compromised machines to run open-weight large language models (LLMs) to sustain its reasoning, or extend its reach for further attacks.
Thanks for pointing that out. I scanned the paper and found that in their main experiments, they use a shared GPU resource and do not copy LLMs to target machines. Apparently they did other experiments in the ablation study where they did copy LLMs.
So it's even worse than I expected. The intended worm can spread through my thermostat, and when it reaches a GPU host, it can spread even harder. Fun times ahead.
I wonder if gamma ray memory corruption will induce a sort of mutation and selection effect on hosts without ECC memory, which would make the worms effectively evolve.
This reminded me of the geth from Mass Effect. They get smarter as more geth "agents" network together.
What if there is a worm that spreads through thermostats and another that spreads through smart fridges, and they finally infect a laptop with a GPU? They can exchange notes while they are there. Fun times.
I think one approach could be to use some engineered security issue, or however people build botnets, and give it a small, minimal LLM that comes with instructions to download models from Hugging Face, plus some other minimal prompts and descriptions of tools. It could then use this to grow on infected computers, try to find more capable and vulnerable computers to run more capable models, and devise some minimal communication between the different points of the botnet. Perhaps it could set itself a goal to dominate the largest amount of compute and have some other goal as well. Would be curious to see what happens.
If you infect a machine with a GPU powerful enough to run the local LLM needed to steal another machine, you can let it burn tokens all day for free, because whoever you stole the first one from will pay the electric bill.
Well, people used MS-DOS which had basically no security model at all for at least 10 years. Most viruses were benign, but it was almost trivial to simply wipe the entire hard disk. People generally didn't care, and made backups.
Things have become a bit more complicated now that machines are connected all the time, and the risk of infection is no longer limited to physically inserting a floppy disk into a machine.
I suspect that the solution is not so much in trying to make our current systems secure, but to make disconnection more practical.
Looks really nice. Do you plan to support this in the future? Are you planning to foster a community of developers around it, or are you thinking about hosting it as a service?
Asking for a friend, who also enjoys building projects with LLMs, but publishing and supporting them not so much.
Yes, I'm planning to support it indefinitely. My company (~10 users) just started using it, so we're very invested in its success.
I'd love for people to write whatever crazy packages they can think of for it so it has a rich ecosystem. Admins can add a git or NPM reference to the packages list and install packages on the fly, the way WordPress does.
As for hosting as a service, maybe someday. I also own tinycld.com, so who knows.
Winch Design [0], which has built some of the world's largest superyachts [1], seems to be using AutoCAD. [2] Afaik it's the same with Lürssen (but don't quote me on that).
Those tools are used in ways that make them integral to processes. They have their equivalents of ticket systems that are linked to code repositories with LFS, and a bunch of IDE-type tools, automated and manual test systems, and build systems. Their equivalents of PR discussions and Selenium screenshots need to check all the boxes in the right ways for legal and traceability purposes.
Without all that, it might be $175/user/month, but you're not shipping apps with just vi and bare gcc.
As someone completely outside the 3D design world who always thought of AutoCAD as the gold standard - really? What program would be used instead? Please enlighten me.
Except LLMs, even with vision, are still useless at AutoCAD, let alone Revit (please don't quote SCAD LLMs at me, useless). Knowledge-based approaches still win.
I might agree "AutoCAD" is the level LLMs are currently at, but wait until your design department discovers "Revit". It's another ballpark (in wasted costs; engineers on site still get "clashes").
Revit costs are high, and the end results are marginally better, but local LLM tokens are cheaper 24/7 at "AutoCAD" level. "Revit"-level tokens will make Uber's CTO/COO weep harder than they already do, while producing results no better than "Revit" does (engineers still face "clashes").
Cadence and Ansys have entered the chat. A bunch of other highly-specialized engineering software has entered the chat. Licenses are on the order of 10-100k/seat.
I guess we are welcoming the software people to the world of expensive tools. It's just sad that the FOSS alternatives to these tools are not as powerful, whereas the software industry still has FOSS tools to fall back on.
Does this analysis factor in potential caching of tokens on the server side? It seems that if they organize things well (as a model provider), they can save quite a lot on that. Looking at my Cursor statistics makes it clear that the token calculations are not at all trivial.
For coding assistance, I have tried OpenCode with several large open models through OpenRouter. All were fairly bad compared to Claude Opus.
Could you provide some hints on how I should be holding these open models so that I might get more value out of them?
I agree with the common trope that open models lag behind by about a year, but something magical happened just around a year ago when the state of the art models became extremely useful. By this reasoning we're about to see open models perform well, but I'm afraid there is more to it than just waiting for another revolution around the sun.
Note, my application is coding assistance. Open models can be great for other purposes.
No one is claiming that OS is as good. They are saying it isn't that far behind SOTA commercial products. So why pay exorbitantly just to get something only a few percent better than the free option?
But there have been very good open source office apps for decades and few enterprises use them, so perhaps this is just the nature of B2B purchasing committees and 'nobody getting fired for buying IBM.'
Because failures compound. My productivity has substantially improved since I switched from open models to a Codex subscription, because it doesn't need hand holding, and it doesn't pull stupid tricks occasionally.
Do more planning yourself, be smart about the context, break down tasks into smaller components, give it more guidance. You can't just lazily prompt it to complete large features autonomously and expect good results.
The point is that the price gap is so much larger than the capability gap, that even with the extra compute needed to make up for the lack of capability, you can still come out ahead in terms of amortized $/work done.
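To make the amortized $/work argument concrete, here is a minimal sketch. Every number in it is hypothetical and chosen only for illustration, the function name `cost_per_completed_task` is made up for this example, and it assumes attempts are independent so the expected number of attempts per success is 1 / success rate.

```python
def cost_per_completed_task(price_per_mtok, tokens_per_attempt, success_rate):
    """Expected dollars spent per successfully completed task.

    Assumes independent attempts, so the expected number of attempts
    until the first success is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    cost_per_attempt = price_per_mtok * tokens_per_attempt / 1_000_000
    return cost_per_attempt / success_rate


# Hypothetical inputs, not measurements of any real model or provider.
frontier = cost_per_completed_task(
    price_per_mtok=15.0, tokens_per_attempt=200_000, success_rate=0.9
)
open_model = cost_per_completed_task(
    price_per_mtok=1.0, tokens_per_attempt=300_000, success_rate=0.5
)

print(f"frontier:   ${frontier:.2f} per completed task")
print(f"open model: ${open_model:.2f} per completed task")
```

Under these made-up inputs the cheaper model still comes out ahead even though it needs more tokens and more retries. The sketch leaves out the human time spent on hand-holding, which is the cost the "failures compound" reply is pointing at.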
+1 .. just wanted to reiterate that this is the answer. The open models work great if you just do a little more of the design/architectural work up front and organize your work appropriately.
a good harness is supposed to do what you are describing.
Sonnet on pi.dev is pretty terrible but fast. Claude Code has ridiculous amounts of prompt engineering at the system prompt level and sub-session spawning, combined with low temperature, to provide the predictable results people like. CC screws up and you never see it, because the harness auto-corrects, while on OSS you see everything, and it does not come with that level of monitoring by default.