For the best experience on desktop, install the Chrome extension to track your reading on news.ycombinator.com
Hacker Newsnew | past | comments | ask | show | jobs | submit | history | weavie's commentsregister

Why?

I think this chap is just joking.

How good are local LLMs at coding these days? Does anyone have any recommendations for how to get this setup? What would the minimum spend be for usable hardware?

I am getting bored of having to plan my weekends around quota limit reset times...


Some claim that some of the recent smaller local models are as good as Sonnet 4.5 of last year and the bigger high-end models can be as almost as good as Claude, Gemini and Codex today, but some say they're benchmaxed and not representative.

To try things out you can use llama.cpp with Vulkan or even CPU and a small model like Gemma 4 26B-A4B or Gemma 4 31B or Qwen 3.5 35-A3B or Qwen3.5 27B. Some of the smaller quants fit within 16GB of GPU memory. The default people usually go with now is Q4_K_XL, a 4-bit quant for decent performance and size.

https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF

https://huggingface.co/unsloth/gemma-4-31B-it-GGUF

https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF

https://huggingface.co/unsloth/Qwen3.5-27B-GGUF

Get a second hand 3090/4090 or buy a new Intel Arc Pro B70. Use MoE models and offload to RAM for best bang for your buck. For speed try to find a model that fits entirely within VRAM. If you want to use multiple GPUs you might want to switch to vLLM or something else.

You can try any of the following models:

High-end: GLM 5.1, MiniMax 2.7

Medium: Gemma 4, Qwen 3.5

https://unsloth.ai/docs/models/minimax-m27

https://unsloth.ai/docs/models/glm-5.1

https://unsloth.ai/docs/models/gemma-4

https://unsloth.ai/docs/models/qwen3.5

https://github.com/ggml-org/llama.cpp


Thank you, I'll look into it. For someone who is used to just working with second hand thinkpads, this stuff gets expensive fast!

The very best open models are maybe 3-12 months behind the frontier and are large enough that you need $10k+ of hardware to run them, and a lot more to run them performantly. ROI here is going to be deeply negative vs just using the same models via API or subscription.

You can run smaller models on much more modest hardware but they aren't yet useful for anything more than trivial coding tasks. Performance also really falls off a cliff the deeper you get into the context window, which is extra painful with thinking models in agentic use cases (lots of tokens generated).


You can also run these models on the cloud with Ollama. You might say what's the difference, but these are models whose performance will stay consistent over time, whether run locally or in the cloud. For $200 a year I'm getting some pretty fantastic results running GLM 5.1 and even Minimax 2.7 and Kimi 2.5 and Gemma 4 on Ollama's cloud instances. And if you don't like Ollama's cloud instance, you can run it on your own cloud instance from the very same providers that Ollama is using. They use NVIDIA cloud providers (NCPs) although not sure which ones specifically and claims that the "cloud does not retain your data to ensure privacy and security." [https://ollama.com/blog/cloud-models]

Interesting. On the pricing page, there are still limits placed on the usage. How restrictive have you found them?

What are the best open models?

This includes open and closed models ranked by popularity and other metrics.

https://openrouter.ai/rankings


This includes open and closed models ranked by popularity and other metrics.

https://openrouter.ai/rankings


You could make it happen in about a week and $50 worth of tokens.

I don't think it's so clear cut. The problem is that his personality defects have allowed him to be influenced by people who are truly malevolent. Those people lurk more in the shadows and so avoid the condemnation that they deserve. Trump is their obvious useful idiot with the target painted on his head.


I've worked at companies before where they have balked at spending $300 to buy me a second hand thinkpad because I really wanted to work on a Linux machine rather than a mac. I don't see them throwing $unlimited at tokens to find vulnerabilities, at least until after it's too late.


I think you’re right that they’re going to skimp as much as regulators & the market let them, but that Thinkpad would cost a lot more than $300: a new platform is an ongoing cost for maintenance, security, and interoperability – not crushing, but those factors quickly outweigh the hardware.


I get the ocassional file named `1` lying around.


Have you tried reading the documentation it generates?


As someone in the UK for whom that link is blocked, I wonder if that meme is doubly apt.


Children successfully protected. Nevermind the child groomers on Roblox.


What is the upper limit of Hertzner? Say you have an AWS bill in the $100s of millions, could Hertzner realistically take on that scale?


An interesting question, so time for some 100% speculation.

It sounds like they probably have revenue in the €500mm range today. And given that the bare metal cost of AWS-equivalent bills tends to be a 90% reduction, we'll say a €10mm+ bare metal cost.

So I would say a cautious and qualified "yes". But I know even for smaller deployments of tens or hundreds of servers, they'll ask you what the purpose is. If you say something like "blockchain," they're going to say, "Actually, we prefer not to have your business."

I get the strong impression that while they naturally do want business, they also aren't going to take a huge amount of risk on board themselves. Their specialism is optimising on cost, which naturally has to involve avoiding or mitigating risk. I'm sure there'd be business terms to discuss, put it that way.


Why would a client who wants to run a Blockchain be risky for Herzner? I'm not a fan, I just don't see the issue. If the client pays their monthly bill, who cares if they're using the machine to mine for Bitcoin?


They are certain to run the machines at 100% continually, which will cost more than a typical customer who doesn't do this, and leave the old machines with less second-hand value for their auction thing afterwards.


I’d bet that main reason would be power. Running machines at 100% doesn’t subtract much extra , but a server running hard for 24 hours would use more power than a bursty workload.

(While we’re all speculating)


Also very subject to wildly unstable market dynamics. If it's profitable to mine, they'll want as much capacity as they can get, leading Hetzner to over provision. Then once it becomes unprofitable, they'll want to stop all mining, leaving a ton of idle, unpaid machines. Better to have stable customers that don't swing 0-100 utilization depending on ability to arbitrage compute costs.

I wouldn't be surprised if mining is also associated with fraud (e.g. using stolen credit cards to buy compute).


Who are you thinking of?

Netflix might be spending as much as $120m (but probably a little less), and I thought they were probably Amazon's biggest customer. Does someone (single-buyer) spend more than that with AWS?

Hertzner's revenue is somewhere around $400m, so probably a little scary taking on an additional 30% revenue from a single customer, and Netflix's shareholders would probably be worried about risk relying on a vendor that is much smaller than them.

Sometimes if the companies are friendly to the idea, they could form a joint venture or maybe Netflix could just acquire Hertzner (and compete with Amazon?), but I think it unlikely Hertzner could take on Netflix-sized for nontechnical reasons.

However increasing pop capacity by 30% within 6mo is pretty realistic, so I think they'd probably be able to physically service Netflix without changing too much if management could get comfortable with the idea


A $120M spend on AWS is equivalent to around a $12M spend on Hetzner Dedicated (likely even less, the factor is 10-20x in my experience), so that would be 3% of their revenue from a single customer.


> A $120M spend on AWS is equivalent to around a $12M spend on Hetzner Dedicated (likely even less, the factor is 10-20x in my experience), so that would be 3% of their revenue from a single customer.

I'm not convinced.

I assume someone at Netflix has thought about this, because if that were true and as simple as you say, Netflix would simply just buy Hetzner.

I think there lots of reasons you could have this experience, and it still wouldn't be Netflix's experience.

For one, big applications tend to get discounts. A decade ago when I (the company I was working for) was paying Amazon a mere $0,2M a month and getting much better prices from my account manager than were posted on the website.

There are other reasons (mostly from my own experiences pricing/costing big applications, but also due to some exotic/unusual Amazon features I'm sure Netflix depends on) but this is probably big enough: Volume gets discounts, and at Netflix-size I would expect spectacular discounts.

I do not think we can estimate the factor better than 1.5-2x without a really good example/case-study of a company someplace in-between: How big are the companies you're thinking about? If they're not spending at least $5m a month I doubt the figures would be indicative of the kind of savings Netflix could expect.


We run our own infrastructure, sometimes with our own fincing (4), sometimes external (3). The cost is in tens of millions per year.

When I used to compare to aws, only egress at list price costs as much as my whole infra hosting. All of it.

I would be very interested to understand why netflix does not go 3/4 route. I would speculate that they get more return from putting money in optimising costs for creating original content, rather than cloud bill.


> I would be very interested to understand why netflix does not go 3/4 route. I would speculate that they get more return from putting money in optimising costs for creating original content, rather than cloud bill.

I invest in Netflix, which means I'm giving them some fast cash to grow that business.

I'm not giving them cash so that they can have cash.

If they share a business plan that involves them having cash to do X, I wonder why they aren't just taking my cash to do X.

They know this. That's why on the investors calls they don't talk about "optimising costs" unless they're in trouble.

I understand self-hosting and self-building saves money in the long-long term, and so I do this in my own business, but I'm also not a public company constantly raising money.

> When I used to compare to aws, only egress at list price costs as much as my whole infra hosting. All of it.

I'm a mere 0,1% of your spend, and I get discounts.

You would not be paying "list price".

Netflix definitely would not be.


Of course netflix is optimising costs, otherwise it would not be a business, I just think they put much more effort elsewhere. They could be using other words, like "financial discipline" :)

My point is that even if I get 20 times discount on egress its still nowhere close, since i have to buy everything else - compute, storage are more expensive, and even with 5-10x discounts from list price its not worth it.

(Our cloud bills are in the millions as well, I am familiar with what discounts we can get)


Figma apparently spends around 300-400k/day on AWS. I think this puts them up there.


How is this reasonable? At what point do they pull a Dropbox and de-AWS? I can’t think of why they would gain with AWS over in house hosting at that point.

I’m not surprised, but you’d think there would be some point where they would decide to build a data center of their own. It’s a mature enough company.


That $120m will become $12m when they're not using AWS.


> Hertzner's revenue is somewhere around $400m, so probably a little scary taking on an additional 30% revenue from a single customer

A little scare for both sides.

Unless we're misunderstanding something I think the $100Ms figure is hard to consider in a vacuum.


I'm largely just thinking $HUGE when throwing out that number, but there are plenty of companies that have cloud costs in that range. A quick search brings up Walmart, Meta, Netflix, Spotify, Snap, JP Morgan.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search:

HN For You