Yes, because outside Starlink and government contracts, there isn't that much demand growth in the sector. There's a limit to how many satellites can be in orbit at a time, and land-based telecom infrastructure means that satellite-based infra isn't necessary unless you're in remote areas.
I’m not clear on it either. Was the Context.ai OAuth application compromised? So the threat actor essentially had the same visibility into every Context.ai customer’s workspace that Context.ai has? And why is a single employee being blamed? Did this Vercel employee authorize Context.ai to read the whole Vercel workspace?
Next.js renders configuration that’s shared by client and server into a JSON blob in the HTML page. These config variables often come from environment variables. It’s a very common mistake for people to not realize this, and accidentally put what should be a server-only secret into this config. I’ve seen API secrets in HTML source code because of this. The client app doesn’t even use it, but it’s part of the next config so it renders into the page.
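As a sketch of that footgun (the key names and env vars here are hypothetical, not from any real incident): anything placed under `publicRuntimeConfig` in `next.config.js` gets serialized into the `__NEXT_DATA__` JSON blob embedded in every HTML page, so it's visible to anyone who views source.

```javascript
// next.config.js — illustrative only; the variable names are made up.
module.exports = {
  serverRuntimeConfig: {
    // Server-only: never serialized into the page.
    DB_PASSWORD: process.env.DB_PASSWORD,
  },
  publicRuntimeConfig: {
    // Fine: meant to be public, ships in the __NEXT_DATA__ blob.
    apiBaseUrl: process.env.API_BASE_URL,
    // The mistake: a server-only secret placed in the shared config.
    // The client code may never read it, but it still renders into
    // every page's HTML source.
    PAYMENT_API_SECRET: process.env.PAYMENT_API_SECRET,
  },
};
```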
IIRC, React (via Create React App) had this issue, so they required env vars visible to the client to be prefixed with REACT_APP_, the hope being that SECRET is not prefixed and so is not exposed. Of course, that requires you to know why they are prefixed and not name it REACT_APP_SECRET.
They don’t serialize process.env, but devs will take config values from environment variables. Obviously you’re not supposed to do this but it’s a footgun.
Yes, they innovated with Apple silicon, but I would say it only shines in a macOS environment. On iOS/iPadOS it's completely untapped, like having a Ferrari with only gravel roads around.
The level of power in the iPad, and the level of underutilization of that power due to it being handicapped by the OS, is mind-boggling to me. Although to some extent it makes sense: with Apple owning the whole supply chain, it probably wouldn't save them much money to make a less powerful chip just to put in it, and they need selling points for the top-end models.
And yet it is the best tablet you can buy on the planet, top to bottom, software and hardware. Is it perfect? No. What is this phantom alternative to an iPad M4 Pro? Note I already have a desktop computer; I don't need two of the same thing. In short, I don't need macOS on two devices.
I find Chutes very intriguing… has anyone used it? I found it when I started wondering what sort of $/performance I could get by simply renting GPU machines by the hour and running my own inference.
Is this going to be an open weights model or not? The post doesn’t make it clear. It seems the weights are not available today, but maybe that’s because it’s in preview?
In principle, yes, that’s the idea! However, I will say we have focused mainly on the grammar, and on using DuckDB for reading from local files, for this alpha release, so I expect there may still be some bugs to iron out around connecting to remote databases!
Dunno, I expect if DuckDB works as advertised it might just work! That's the beauty of how they've separated the frontend syntax parsing from the rest of the engine.
The "Picking delaySeconds" section is quite enlightening.
I feel like this explains about a quarter to half of my token burn. It was never really clear to me whether tool calls in an agent session would keep the context hot or whether I would have to pay the entire context loading penalty after each call; from my perspective it's one request. I have Claude routinely do large numbers of sequential tool calls, or have long running processes with fairly large context windows. Ouch.
> The Anthropic prompt cache has a 5-minute TTL. Sleeping past 300 seconds means the next wake-up reads your full conversation context uncached — slower and more expensive. So the natural breakpoints:
> - *Under 5 minutes (60s–270s)*: cache stays warm. Right for active work — checking a build, polling for state that's about to change, watching a process you just started.
> - *5 minutes to 1 hour (300s–3600s)*: pay the cache miss. Right when there's no point checking sooner — waiting on something that takes minutes to change, or genuinely idle.
> *Don't pick 300s.* It's the worst-of-both: you pay the cache miss without amortizing it. If you're tempted to "wait 5 minutes," either drop to 270s (stay in cache) or commit to 1200s+ (one cache miss buys a much longer wait). Don't think in round-number minutes — think in cache windows.
> For idle ticks with no specific signal to watch, default to *1200s–1800s* (20–30 min). The loop checks back, you don't burn cache 12× per hour for nothing, and the user can always interrupt if they need you sooner.
> Think about what you're actually waiting for, not just "how long should I sleep." If you kicked off an 8-minute build, sleeping 60s burns the cache 8 times before it finishes — sleep ~270s twice instead.
> The runtime clamps to [60, 3600], so you don't need to clamp yourself.
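The quoted heuristic can be sketched as a small function. To be clear, this is my own illustration, not a real runtime API: the function name and the `checkable` flag are invented; only the constants (5-minute TTL, the [60, 3600] clamp, the 270s/1200s breakpoints) come from the text above.

```javascript
// Constants from the quoted guidance.
const CACHE_TTL = 300;                   // Anthropic prompt-cache TTL, seconds
const MIN_DELAY = 60, MAX_DELAY = 3600;  // the runtime's clamp range

// Hypothetical helper: given how long we expect to wait, pick a sleep length.
function pickDelaySeconds(expectedWaitSeconds, { checkable = true } = {}) {
  const clamp = (s) => Math.min(MAX_DELAY, Math.max(MIN_DELAY, s));
  if (checkable) {
    // Something we can poll (a build, a process): chunk the wait into
    // warm-cache sleeps of at most 270s, so every wake-up reads a cached
    // context. An 8-minute build becomes two ~270s sleeps, not eight 60s ones.
    return clamp(Math.min(expectedWaitSeconds, CACHE_TTL - 30));
  }
  // Genuinely idle, no signal to watch: commit to one cache miss and
  // amortize it over a long sleep (the 1200–1800s default from the text).
  return clamp(Math.max(expectedWaitSeconds, 1200));
}
```

Note the function can never return exactly 300: a pollable wait caps at 270 (inside the cache window), and an idle wait floors at 1200 (one amortized miss), which is exactly the "don't pick 300s" rule.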
Definitely not clear, if you're only used to the subscription plan, that every single interaction triggers a full context load. It's all one session to most people. As long as they keep replying quickly, or queue up a long arc of work, there's probably an expectation that you wouldn't incur that much context-loading cost. But this suggests that's not true at all.
They really should have just set the cache window to 5:30 or some other slightly odd number, instead of using all those tokens to tell Claude not to pick one of the most common timeout values.
This is somewhat obvious if you realize that HTTP is a stateless protocol and Anthropic also needs to re-load the entire context every time a new request arrives.
The part that does get cached - attention KVs - is significantly cheaper.
If you read documentation on this, they (and all other LLM providers) make this fairly clear.
For people who spend a significant amount of time understanding how LLMs and the associated harnesses work, sure. For the majority of people who just want to use it, it's not quite so obvious.
The interface strongly suggests that you're having a running conversation. Tool calls are a non-interactive part of that conversation; the agent is still just crunching away to give you an answer. From the user's perspective, the conversation feels less like stateless HTTP where the next paragraph comes from a random server, and more like a stateful websocket where you're still interacting with the original server that retains your conversation in memory as it's working.
Unloading the conversation after 5 minutes of idling makes sense to most users, which is why the current complaints in HN threads tend to focus on the change from a 1-hour to a 5-minute timeout. But I suspect a significant amount of what's going on is with people who:
* don't realize that tool calls really add up, especially when context windows are larger.
* had things take more than 5 minutes within a single conversation, such as a large context spinning up subagents that each do things and then return a response after 5+ minutes. With the more recent Claude Code changes, you're conditioned to think of it as 5 minutes of human idle time for the session. They don't warn you that the same 5-minute rule applies to tool calls, and, I'd suspect, to longer-running delegations to subagents.
Unless I'm parsing your reply very badly, I see no world in which anything dealing with HTTP would be more expensive than dealing with kv cache (loading from "cold" storage, deciding which compute unit to load it into, doing the actual computations for the next call, etc).
No, that’s not the issue. What people fail to understand is that every request (e.g. every message you send, but also every tool-call response) requires the entire conversation history to be sent, and the LLM providers need to reprocess it.
The attention part of LLMs (that is, for every token, how much their attention is to all other tokens) is cached in a KV cache.
You can imagine that with large context windows the overhead becomes enormous (attention has quadratic complexity in the context length).
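A toy, scalar, single-head version of that cache, purely to illustrate what gets stored (real attention uses vectors, learned projections, and many heads; every name here is my own):

```javascript
// Softmax over an array of scores, shifted by the max for stability.
function softmax(xs) {
  const m = Math.max(...xs);
  const es = xs.map((x) => Math.exp(x - m));
  const sum = es.reduce((a, b) => a + b, 0);
  return es.map((e) => e / sum);
}

// Without a cache, each new token would recompute keys/values for the whole
// prefix; with the cache, old tokens' k/v are stored once and only reused.
class KVCache {
  constructor() { this.keys = []; this.values = []; }
  // Append the new token's key/value, then attend its query over everything.
  step(q, k, v) {
    this.keys.push(k);
    this.values.push(v);
    const weights = softmax(this.keys.map((ki) => q * ki));
    return weights.reduce((acc, w, i) => acc + w * this.values[i], 0);
  }
}
```

When the provider evicts this cache (e.g. after the 5-minute TTL), the next request has to rebuild all of `keys`/`values` from the raw conversation, which is the expensive uncached prefill everyone is noticing.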
Is it in vogue with enterprise devs?