This line seems to sum up the Musk Inc. situation perfectly:
> Musk’s detractors have been correct about Tesla’s terrible fundamentals, its Full Self-Driving lies, its robotaxi fantasies, its shaky accounting. But when they have imagined these things might affect the stock price, they have been wrong.
> Someday, someone, somewhere will make a lot of money shorting Tesla or SpaceX. But it’s unlikely to be you.
> For now, Tesla remains better understood as a religion than a financial investment, and we can now add SpaceX to that category.
I can't speak for Teams (that is just an Electron app), but all of the legacy Mac Office apps are still a subset of the capabilities of their Windows counterparts.
If you can't tell the difference between Google Sheets and Excel, you probably won't notice the difference between Mac and Windows Excel. But if you are in some role like finance where you spend a ton of time in Excel, the gaps become obnoxiously noticeable. Especially because VBA is completely non-existent on Mac.
Models this small and this capable bode really well for the usefulness of a PC like the RTX Spark that Nvidia/Microsoft announced this week. 128GB of unified memory will likely be more than sufficient for effective local agentic coding, even if SOTA cloud models will still be even better.
Up until this point, I've found the cost/value to unequivocally favor using a cloud subscription, but I would be lying if I didn't worry that one day OpenAI is going to increase the price for my subscription by 5-10x. I rely on these tools enough that if there is a real viable local option, I'm going to take it.
Not really. There's a reason the announcement didn't include ANY benchmark (!) and didn't mention EXACTLY what the memory bandwidth is. It's going to be dog-slow and unusable for large models, as tok/sec is basically memory bandwidth divided by the size of the active weights. The rumoured 300GB/s divided by 30GB of active weights (a decent model) gives 10 tokens per second, which is really slow.
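To make that back-of-the-envelope estimate concrete, here is a minimal sketch. It assumes decoding is purely memory-bandwidth bound, so every generated token has to read all active weights once. The function name is made up for illustration, and the 300GB/s figure is only the rumour quoted above, not a published spec.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Rough upper bound on decode speed for a bandwidth-bound model.

    Assumption: each generated token requires reading every active weight
    once, so throughput is bandwidth divided by active weight size.
    Real numbers will be lower (KV cache reads, overhead, imperfect
    bandwidth utilisation).
    """
    if bandwidth_gb_s <= 0 or active_weights_gb <= 0:
        raise ValueError("bandwidth and active weight size must be positive")
    return bandwidth_gb_s / active_weights_gb


# Rumoured 300 GB/s with 30 GB of active weights, as in the comment above.
print(decode_tokens_per_second(300, 30))  # 10.0
```

The same formula is why MoE models with a small active parameter count feel so much faster than dense models of similar total size: only the active weights count in the denominator.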
Yep, I have a Strix Halo and while it can run models bigger than Qwen 3.6 27b, it's not usable interactively when you do. ds4 patched for ROCm works, but at such a slow speed, it's not usable for coding agents.
The Nvidia boxes have only slightly more memory bandwidth, so I wouldn't expect them to be notably faster. At least not enough to make it useful interactively at that scale.
Why does everyone expect interactivity from local AI? It's not the best use of the hardware, especially not miniPC hardware. Long-term batched inference with larger and more capable models is much more feasible AIUI.
I can't speak for others but IMO the only reason to run models locally right now is privacy - i.e. you don't trust any of the cloud providers to not look at your prompts. Price-wise the market is extremely competitive and cheap model serving favors large scale so anything that can be run locally can be run cheaper in the cloud. But if privacy is important, then it's important for everything, including traditional chatbot applications, which kinda do require interactivity.
Even batched it's uncomfortably slow. I started to benchmark ds4 with my security vulnerability benchmark (after Qwen 3.6 dense and MoE and a bunch of cloud models), but it was going to tie up the Strix Halo for more than a day, so I decided not to run it as it would prevent me from doing other stuff with it during that time.
Even batched usage needs to be fast enough to deliver results in a reasonable time. Overnight runs are useful, 24 hour runs are...less so.
Anyway, most of the time people are talking about interactive use, and there's currently an upper bound on how large a model can be for local hosting on a reasonable budget (i.e. not a crazy amount more expensive than what a high end developer desktop or laptop costs). The sweet spot is probably currently the big Qwen 3.6 or Gemma 4 models, which are in the ~60GB range for 8-bit quantization plus a large context.
The 6-bit versions + 8-bit KV cache seem to save a good bit of memory without a significant loss of quality. The Qwen 35B is pretty fast in my testing, but MiniMax M2.7 230B is in some ways faster (way fewer tokens to arrive at an answer) even though it is much larger.
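For a rough sense of how much memory the lower-bit versions save, here is a small sketch of the weights-only arithmetic. It assumes a flat number of bits per weight and ignores the KV cache, activations, and any per-block quantization overhead, so treat the results as a lower bound. The function name is hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB.

    Assumption: every parameter is stored at exactly `bits_per_weight` bits.
    Real quantized formats add scales and metadata, and the KV cache for a
    large context comes on top of this.
    """
    return params_billion * bits_per_weight / 8


# Weights-only footprint of a 35B-parameter model at a few quantization levels.
for bits in (8, 6, 4):
    print(f"35B at {bits}-bit: ~{weight_memory_gb(35, bits):.1f} GB of weights")
```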
Qwen 3.6 35B-A3B with MTP at 8 bits is blazing fast, something like 50-60 tokens per second. That's plenty fast for interactive use, so I haven't tried lower bits. Unfortunately the MoE is notably dumber than the dense model (for the case I have data about...I've been benchmarking models for security vulnerability scanning, and 27B is notably better on hard bugs).
The dense model is almost usable, but feels really sluggish, even with MTP. I think it's about 12-15 tokens/second on the Strix Halo. Slow enough to where I'd rather pay to use a cloud model.
I might try the 6-bit version of the dense model to see how it behaves, though. Maybe it'll retain its bug hunting abilities while making it fast enough to use interactively and not take all day for benchmark runs.
Same chip, with a 6 bit 35B and 8 bit KV cache I see about 500 prefill and 55 decode at 30k into the context window. MiniMax seemed a bit lower token rate but much, much less prone to 40k tokens of monologue before generating an answer. A pattern I like is to use a smaller model to do most execution and then a larger model to review transcripts and output and do any fixups and tooling improvements (this is all batch jobs so all I care about is overall throughput).
Ryzen 395 is what I'm using, anything with 128GB+ of RAM accessible to the GPU should work fine for a 4 bit version of the model (so Spark or Mac Studio should be ok too).
The RTX/DGX Spark and Mac Ultras with 128GB of unified RAM are all ~$5k. It's still an expensive toy for rich people; it might as well be an H100 for 99.9% of the population (not devs with high paying jobs, of course).
The value of local models is allowing normal people to access AI without needing to subscribe to cloud services. This is especially important for the rest of the world, where even a 12GB GPU is extremely expensive.
There is no real viable local option that will come even close to Sonnet/Gemini Flash or the cheaper Chinese models. Even if your PC costs <$2k you are never going to recoup the hardware costs, and the results will be far worse.
My Framework Desktop with 128GB was about half that. I did luck out by buying before RAM prices went crazy, though.
I'm looking forward to the fallout when the data center bubble bursts. There's a good possibility we'll see a glut of hardware, either on the used market or from manufacturers that no longer have massive orders from OpenAI and the like.
RTX Spark is pretty much the DGX Spark in a laptop form factor, plus some lower-performing chips in the same series to be released later according to rumors. We know quite well how the top-of-the-line chip performs: it's very interesting for some application areas, less so for others.
This simultaneously seems like: 1) such an obvious attack vector that it is extreme negligence not to have planned appropriate security protections against it, and 2) the most obvious outcome for Meta to be this lax and stupid about security. If it doesn't hurt their ad sales, it doesn't matter to Meta.
They implied it might have options with < 128G of memory. That could significantly reduce the price of components. There's also the very real possibility that the whole venture is being subsidized by Microsoft - or even NVIDIA itself - as a bid to get into a different space. Even with that, though, I doubt it will be cheap.
Same old story of too big to fail. The government will "inject cash", that is borrowed, so that retirees' 401k accounts don't go down. But who pays back the borrowed funds? The non-retirees. Everything is optimized for the boomer generation to be fine, who cares about anyone else?
Up until this point, the potential for an AI bust blast radius was limited to corporate investors, but this is going to cause regular retail/401k investors to get exposure, which could have far bigger impacts on a downturn.
Not to mention the insane wake-up call it is going to be for these AI stocks when 3 months after they launch they have to start making earnings calls and showing their financials. That quarter-by-quarter pressure and scrutiny is no joke, and probably the biggest downside of going public.
I'm bullish on AI, but kind of bearish on any specific AI company. None of the initial big dotcom companies like AOL or Yahoo survived at the scale they briefly had.
If we're doing historical comparisons, there was so much hype for AOL and Yahoo that drove valuations far beyond the economics. In time, the hypesters were proved wrong.
In contrast, there was overwhelming doom and gloom for Google's IPO, in spite of their incredible growth and margin economics. In time, the doomers were proved wrong.
There's so much doom and gloom about Anthropic that directly contradicts their astounding growth and margins. For a long-term investor, Anthropic is looking a lot more like Google than AOL.
I can only hope the doomer narrative dominates until I can get a few shares at a reasonable valuation.
Vibes are almost always wrong. Ignore the vibes and focus on revenue growth rates and inference margins.
Google is an excellent example of the companies that followed after the initial batch of big dotcom companies. They ate Yahoo's lunch. The dotcom bust was in 2000, and Google went public in 2004.
I'm betting more on the successors to this initial group of AI companies. The ones that have to build actual profitable businesses.
Google was easily 10x better than any of their competition. It was effectively alone in the market.
Most of us were using 56k modems to access the internet back then, and Google's search returned results within a couple of seconds while Yahoo, Lycos, Excite, and AltaVista were still loading. Then the search results themselves were so good you could often just pick the first result. They eventually added a button which just took you directly to the first result. Which I used.
Your memory is faulty. AltaVista was always super fast--it never had the advertising bloat that the other ones had until the very end.
The problem AltaVista had was that it didn't scale when the Internet went exponential--so AltaVista would give you good search results until you asked current, topical questions. AltaVista relied on running a single, super-expensive stonking huge Alpha machine while Google ran on lots of commodity servers that spidered constantly.
This is inaccurate. When I was running AV operations around 2000, we were running on a couple dozen huge Alpha machines for the index layer and queries. We had a bunch of smaller machines for Web serving, and a high memory set of Alphas as a caching layer.
We also spidered constantly. A couple of those huge backend Alphas were dedicated to holding the constant spider index. AV had a well earned reputation for quick discovery, although I think Google wound up faster. We suffered a bit from maintaining separate indexes for the main corpus and recent pages, and I imagine Google handled that better.
But the period of time when our main index went to hell was the period of time when we failed to do a new main index crawl for several months. I won’t get into why that happened politically because my memory isn’t perfect and I don’t want to criticize anyone who won’t see this to stand up for themselves, but it’s absolutely the case that we let the index get stale.
And I will say that I think the execs were distracted by the idea of challenging Yahoo by buying a shopping site and a local news site of sorts and, unlike the Google of the time, they lacked the wisdom to focus on our primary product.
And now I fade back into the hedges, until the next time AV comes up… I suspect a high percentage of my HN comments are on this exact topic. It makes me sad.
And I still miss the AltaVista illustrated diagram (Java Applet) that would allow you to drill down and specialize the search results. No modern search has ever matched that, again.
That feature sounds amazing! I tried searching wayback etc but wasn't able to find any more details. Do you happen to know of any screenshots / deeper descriptions for it?
Perhaps we could nerd snipe Marginalia Search to add it :)
I was not a software engineer but yeah, I think so. Every now and then we’d have to go get Mike Burrows to do some consulting work to rewrite a bit more of the code in assembly.
Thanks for replying, now I remember why I used to spend all day on this site :-) A lot of the political changes and my exposure to some of the VC people outside of public view have soured tech for me. That and the current cult-like behaviour and clear fraud from the crypto and now AI waves... anyhow, I digress.
We were big AV users initially, I think for 2 years? This was 94-97 so my memory of the time periods is fuzzy. When Google came along I have a very vivid memory of it providing not only better search results but also faster load times.
I wonder if Google was already geo-distributed at the time? Latency was real then, it wasn't uncommon to hit 350ms (compared to 20-50ms to Europe) and the difference would have been felt back then. It was a killer for Counter Strike.
I was a relatively early investor (2008), but I was very hesitant early on because Microsoft was building an integrated search function, which became Windows Live Search, which became Bing. I definitely remember it took me to the beginning of the financial crisis to finally decide that it was going nowhere. I suspect it was the development of Google Maps that changed my mind.
"Google" of today is really AdSense ($102M, 2003) -> Android ($50M+?, 2005) -> YouTube ($1.6B, 2006) -> Google Docs ($50M+?, 2006)
Without those prescient and lucky acquisitions, we'd be talking about a "Google" that looked much more like Yahoo.
It wasn't search proficiency that built the empire, it was leveraging a transient search quality advantage into cash flow, then plowing that cash into acquisitions to construct a durable moat.
Was on a team that was trying to sell AltaVista a social media presence (before Facebook/MySpace/etc). Our people were mostly using Google, but we still wanted the client. One of the "moderation experts" on our side (i.e. not tech or business) who evidently didn't understand what AltaVista was about asked them "Why don't you just use Google? It's better".
There were many search engines around during that time. Yahoo, Excite, Microsoft Live Search, Lycos... I don't recall any of them improving enough to rival early 2000's Google.
AdSense wasn't a thing until 2003. Google didn't have much revenue before that. However, they still surpassed their competition in quality of search results long before...
Yes. My point is that Google had a temporary search quality advantage… then AdSense-fueled revenue allowed them to convert that to a durable moat by outspending their competitors.
That didn't happen because they were magically amazing at search forever.
It happened because Google had a good business plan and could afford to throw gobs of money at engineers and infrastructure, in quantities that even Microsoft was unwilling to match.
Erm 2008 isn't early, I had been using it for almost a decade by then. They had won by 2001. No-one who knew Microsoft thought that they had a chance with Bing. This was post-Gates and Microsoft were already a laughing stock in 2008 with respect to the web.
I couldn't even tell you when I used the Google search page. It's been years at least. I wouldn't be surprised if many other people also don't go there to search. I assume most search straight in the url bar.
100x is staggering. These companies are priced as though we are already chewing through the solar system to create AI computronium. I'll pass--I expect I'll get a UBI when that scenario happens anyway.
I don't think it's really doom and gloom, that's mostly on here.
The normies are all still excited/scared and the valuation based on secondary trading is going up and up.
Maybe not quite as crazy as the dot com boom but I'd say the current environment for AI and related equities is a lot closer to the mid/late 90s than 2004
I think both have a similar number of people who know about it. But it might just be my circle, which is mostly in finance and some in engineering/medicine. These are also the type of people who actually invest. There are few doom vibes among those who are older; if anything, they're the ones who think we'll get to some AGI-type situation.
Exactly. Many people will choose the cheapest possible model that's smart enough for their use case. "Frontier" is a transient property; open models tend to catch up in 6 months or so.
The issue is that the way the rules have been changed, risky stocks have been added to a product that is meant to be stable.
A 401k, any retirement focused product, is not serving its purpose when it tags on risk.
It leaves people in the later part of their lives finding they are broke because, despite their doing everything right, a loophole was created to extract their savings.
> focus on revenue growth rates and inference margins
And ignore debt you can't pay back? Fine during ZIRP era because there was always another $50M around the corner. There is no extra $50B around the corner.
They've all over-invested in AI, same as the railroads, and it will collapse the same way.
I'm not against a fundamentals-based argument. The revenue growth is wild and their margins are reported to be great. But the existential concern remains: what happens if models start plateauing?
I could be wrong, but the margins are so good because there isn't a "substitute" for the frontier models. The performance difference between the latest Opus and a more open model provider is large enough to justify the extra cost. If that difference shrinks, I think the cost people are willing to pay will go way down.
They've changed the rules in a way that will force these companies into every ETF commonly held in people's 401ks. The doomer narrative doesn't matter; they're forcing the common man to be the exit liquidity for the elite before the bubble pops.
>I can only hope the doomer narrative dominates until I can get a few shares at a reasonable valuation.
I conjecture that some amount of the "doomer posting" is a consequence of other people realizing what you realized here and attempting to sway public sentiment for personal gain.
I doomer post because I see three basic possibilities:
* It's a bubble, it crashes (no moat etc.)
* It's not a bubble, we get superintelligence, it's not nice, it squishes us all like bugs
* It's not a bubble, we get superintelligence, it's nice, we all get UBI
From the perspective of your personal financial security, the range of scenarios where you want to invest in Anthropic seems rather narrow. And I don't like to fund the creation of something which might squish me like a bug.
I am really surprised that people are comparing the dot-com era with AI. At least the dot-com era was deterministic; comparatively, AI is just probabilistic, unreliable slop.
The growth of the Internet will slow drastically, as the flaw in “Metcalfe’s law”—which states that the number of potential connections in a network is proportional to the square of the number of participants—becomes apparent: most people have nothing to say to each other! By 2005 or so, it will become clear that the Internet’s impact on the economy has been no greater than the fax machine’s.
(Why do people try to criticise AI as "probabilistic" like this matters? Unreliable I get, but early Wikipedia and Geocities were as deterministically unreliable as the amateur and fiction sections respectively of a bookstore)
An economist extrapolating tech trends is already a hard sell. And deterministic unreliability is at least deterministic, meaning you can choose to ignore it; this is still better than probabilistic AI hubris.
Anthropic is selling a commodity item that was just invented. It’s like investing in someone who is blowing lightbulb glass by hand.
We’ve already seen a startup make a chip which generates a hundred pages of text in milliseconds. When companies start bringing out hardware like that for cutting edge models, the entire business is dead. AWS will just eat the market.
History does not repeat itself, but it rhymes. Drawing these comparisons to the dot-com bubble is only of limited utility. I think there's good reason to believe that recursive self-improvement is a bust, and LLMs will become a commodity. The real value lies in multi-modal integration and good harnesses. The current frontier labs are theoretically in a good position to capitalize on this, but it is far from obvious that they will succeed. I think Google and some of the Chinese giants are in a far better position to actually go the last mile.
It's the indices we need to be concerned about and it's especially the bloated carcass of xAI hanging onto SpaceX.
SpaceX was a profitable company, it was heavily invested into R&D and had managed to build a tidily profitable connectivity business in Starlink. Now the company is being burdened with all the worthless debt of X and xAI with a likely merger with Tesla following launch just to hand Musk a big check when he hits the valuation targets.
IPO inclusion on indices should be illegal, the price discovery simply hasn't happened yet and it's a direct grab at the most vulnerable retail investors - the passive index huggers that were told that if they just buy an index it'll never be spectacular and it might dip but it'll steadily go up.
I would not be surprised if the US Government ends up bailing out retirees over this and cements the country's descent into debt. Pretty much everyone can see it coming, but we have to act as if Elon is valuing his companies in good faith and not just trying to rob a payday.
Reminds me of this...
During Apple's 1980 Initial Public Offering (IPO), Massachusetts regulators banned residents from purchasing the stock. The state's securities regulators deemed the offering "too risky" and "over-valued," enforcing a state rule that prohibited IPOs with a price exceeding 25 times earnings.
> The investors who wanted to take more risk could do that.
What? How? By moving out of Massachusetts? I could understand banning such a speculative stock for e.g. pension funds or whatever, but blocking private individuals from buying with their own money seems insane.
Well, we are far more corrupt and in the last stages of late stage capitalism. 15 day waiting period for Nasdaq 100. Your 401k is now the exit liquidity for the country's 5k richest people.
I really don't see how America doesn't collapse under the weight of its own corruption. But maybe that was the plan all along....
Just to put some numbers on it, under current rules and valuations, if all 3 of SpaceX, OpenAI and Anthropic were to go public with their current valuations and be part of the nasdaq 100, and you were 100% allocated to that, you’d have ~8% portfolio exposure to them. I suspect if you are the type of person who is 100% allocated to QQQ I’d guess you’d want _more_ exposure to those symbols than that.
For someone holding VTI it's closer to 3%, and for a 2050 fund it's more like 1.5%. Indexing is how most people are investing in their retirement accounts because controversies like these just don’t matter much. Hedging this off is going to cost more than it's worth.
The stock market (and in particular the US stock market) has been an incredibly positive influence on the average American's 401k... Great driver of allowing Americans and beyond to share in the upside of successful companies... big reason why Americans can retire.
This is not a useful response to "how long does this last? I've been hearing it for a decade."
The phrase "late capitalism" itself has been used for just over a century now[0]; while I believe the USA is destroying what made the nation successful and will therefore probably[1] go into decline before 2035, there's no particular reason to tie the weird aspects of the economic system to the USA's political nonsense.
Also, the USSR would be the opposite example of your first sentence, as the collapse was a big surprise to everyone (both inside and outside the bloc) even a few years before it happened.
The Roman Empire would be the example for long-term decline, as that took one or more centuries depending on where you count the zenith (180 CE to 376 CE as the start, the Western Roman Empire was definitely dead by 476 CE).
But you seem to want to present it as a "in our lifetime" kind of thing; for that, I would suggest the British Empire, which peaked just before WW1, yet was obviously a broken power by the time of the Suez Crisis of 1956 just 42 years later though full decolonisation took longer (and if you ask Sinn Féin, Plaid Cymru, and the SNP, isn't really finished).
[1] There's a very narrow path where enough anti-corruption votes combine for a government which does something that, ironically, Trump promised in his first term: "drain the swamp". It can't just be "Dems win", it has to be broad consensus that not only takes this seriously but also is seen by the world to do so.
Let's get it in perspective though. The S&P500 market cap is currently $70T.
Assume that Anthropic, OpenAI and SpaceX all IPO and get included in SPY with the new fast listing rules. They are likely to be worth $3-4T combined, which means 'retail' investors are going to have perhaps 5% of their portfolio in it.
_Arguably_ that's a pretty fair allocation for retail investors to have to these "moonshot" style companies.
Also - if any one of these IPOs doesn't go well, I suspect the other(s) will have to postpone, further reducing exposure.
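As a sanity check on the "perhaps 5%" figure, here is a quick sketch of the arithmetic. It makes two simplifying assumptions that are not in the index rules: the new listings are added on top of the existing $70T at their full market cap (no float adjustment), and nothing else is removed from the index. The function name is made up for illustration.

```python
def index_share(new_listings_cap_t: float, existing_index_cap_t: float) -> float:
    """Fraction of a cap-weighted index made up by newly added companies.

    Assumption: the newcomers are simply added to the existing total at
    full market cap. Float adjustments or caps would lower the result.
    """
    return new_listings_cap_t / (existing_index_cap_t + new_listings_cap_t)


# $70T index today, $3T to $4T of combined new listings (figures from above).
for combined in (3.0, 4.0):
    print(f"${combined:.0f}T combined: {index_share(combined, 70.0):.1%} of the index")
```

Under those assumptions the range works out to roughly 4% to 5.5%, which is consistent with the 5% ballpark.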
> which means 'retail' investors are going to have perhaps 5% of their portfolio in it
If I'm not reading it wrong though NASDAQ introduced a 3x multiplier for low-float stocks like SpaceX is most likely going to be (and maybe OpenAI and Anthropic too if they see that it works). A 15% exposure is then going to be pretty big.
Don't think so - the 3x is a separate cap. It actually reduces it down from market cap.
E.g., say SpaceX has $50bn of float at a $1.5T valuation. If there wasn't _any_ cap at all, the full $1.5T would be used as the market cap. With the (new) 3x cap, it means only $150bn of the $1.5T valuation is taken into account in the index weighting.
Before this change, SpaceX wouldn't clear the 10% requirement to be listed in QQQ at all. So the 3x basically allows them to be included but _does not_ increase their market cap from $1.5T to $4.5T.
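The cap as described here can be sketched in a couple of lines. This is only my reading of the rule as stated in this comment (weighting value is the smaller of full market cap and 3x free float), not something taken from the index methodology document, and the function name is hypothetical.

```python
def capped_index_value(market_cap_bn: float, free_float_bn: float,
                       multiplier: float = 3.0) -> float:
    """Value used for index weighting under a float-multiple cap.

    Assumption: the index counts the smaller of the full market cap and
    `multiplier` times the free float, as described in the comment above.
    """
    return min(market_cap_bn, multiplier * free_float_bn)


# $50bn of float at a $1.5T valuation: only $150bn counts toward the weighting.
print(capped_index_value(1500, 50))  # 150.0
```

Note that the cap can only lower the value used for weighting: a company with a large float (say $800bn of float on a $1.5T valuation) is simply counted at its full $1.5T.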
Btw, for clarity, I'm not saying there isn't questionable behaviour going on here. My main point is that even if SpaceX, OpenAI and Anthropic all went to 0 (unlikely IMO), it's not going to have a material impact on people's retirements, which is what OP was proposing.
Everyone I know who invests in an index fund is doing so to mitigate the risks of things like "moonshots" which are typically much riskier investments.
> Everyone I know who invests in an index fund is doing so to mitigate the risks of things like "moonshots" which are typically much riskier investments.
The whole point of an index fund is to capture the growth of the whole market. If you wanted low risk you'd be buying bonds.
Is it? I thought the idea was diversity of risk, not "mitigating risk". You clearly don't want 100% of your 401k in OpenAI or Anthropic. But you probably do want 1 or 2% of it in, to give you the long term growth potential?
Regardless SPY is actually a pretty "risky" index fund on some measures - it pays a (very) low dividend compared to many other intl/ETF funds and is weighted very heavily towards tech stocks (atm).
If you genuinely wanted to mitigate risk you would probably not choose SPY.
But the US has never had $1T+ IPOs before. And there are also a huge number of enormous private companies that don't want to go public for various reasons.
Also, the rules have changed before. It's not the first time these rules have changed.
I see both sides of the argument (it's definitely _not_ good for 401k investors if Anthropic/OpenAI/SpaceX make huge leaps in technology that allow for far higher earnings that they aren't able to access, for example).
But my main point is that these investors would "only" have 5% exposure to these regardless. That surely cannot be considered the systemic risk that the OP is implying.
> 'retail' investors are going to have perhaps 5% of their portfolio in it.
If they are the only moonshot style companies in their portfolio, and if they crater that's the physical equivalent of a 160lb person carrying a gallon of milk around with them wherever they go. At least until they've drunk it I guess.
Lots of "ifs" in that sentence now I read it back though.
Most (all?) 401k plans limit you to a pre-picked list of ETFs and mutual funds you can invest in. Not to mention the standard advice for decades has been 'broad market index fund'.
Afaik this is the first time that an IPO is so big that it immediately gets a significant share of a broad market index fund. The rules among the providers are actually quite diverse, so it's complicated. The Rational Reminder podcast discussed it in April: https://rationalreminder.ca/podcast/406
Their conclusion: It might be bad, but so be it. No need to change strategy.
Good thing is that index funds don't hold stocks at market capitalization but only at free float value. So a company whose shares are mostly held by founders, employees, and strategic investors gets a weight well below its headline valuation.
I believe it's the opposite :)
All major indices (S&P500, MSCI, FTSE...) use free-float adjustments. And recently NASDAQ does too - they've changed to a cap of 3x the value of free-floating shares.
If your plan uses Fidelity you can move your 401k into Brokeragelink and that lets you pick individual stocks. Schwab, TIAA, Alight and some others also have something similar.
I started as being very skeptical circa 2024, became more open minded towards the end of 2025, and am becoming skeptical again now. Reason being, I interact with entrepreneurs now and I see what they hope for in AI. The universal desire seems to be "people will just talk to AI instead of me while paying me the same as before or more". This is typically covered with coping mechanisms (e.g. "I am not building a chat bot, I am building..." after which they describe a chat bot).
I think the crash is getting more likely because there is a disconnect between what the technology can be used for and what people want it to do.
imho Anthropic publicly posting accurate information about their revenue and operations would be a step in a healthy direction for the economy/markets if there's an "AI bust blast" coming. This filing is movement towards that
Agreed. Ben Felix has a video about this, I think he focused on SpaceX in it. The problem with the standard total market funds is they gobble it up right away. There are funds that do wait some period of time to purchase new IPOs to let them smooth out, but I'm not sure those are typically available in 401k plans.
Hedge funds already know broad based mutuals will have to purchase these so can sneak in before them and then sell to them for a marginal gain. Mayhaps the newest strategy for exiting is generating so much hype that you're guaranteed an exit by retail retirement funds?
I don’t think your first sentence is true. The hyperscalers have spent north of 1 trillion in the capex boom as a direct response to AI demand. If you’re a retail investor, you’re already quite exposed.
And who would have thought it was the online bookstore that would be the big survivor of the dotcom era? They were a comparatively small player relative to AOL/Yahoo/etc at the time of the dotcom bust. Which company is the 1994 Amazon of AI now?
The narrative is that inference on existing models is profitable. All of the profits and many billions of additional capital invested go into training the next model, which is some multiple more expensive to train than the last. Each new model generation also leads to more revenue growth. Newer models are more compute-efficient when distilled (so could possibly be higher margin) but also they work on longer time-horizon tasks and can make greater use of test-time compute which increases token counts. So the inference ROI on each model can pay back the cost of training it, but future growth demands put all that money and more into training the next model. The numbers we’d need to prove whether this is true are not public, but it makes sense and fits what info we do have.
Theoretically, if training more expensive models stops resulting in better capabilities or isn’t economically viable, the labs can shift gears into making profit on old models. A lot of future growth is priced in so this would lead to a collapse in share price if it happens anytime soon.
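To make the shape of that argument concrete, here is a toy sketch in Python. Every number and name in it is a made-up placeholder (the real figures are not public): `cost_multiple` is how much more each generation costs to train than the last, and `inference_roi` is lifetime inference profit per unit of training cost. It only illustrates how a lab can earn back each model's training cost and still burn cash every period, and what happens if it stops training.

```python
def cash_flows(first_training_cost, cost_multiple, inference_roi,
               generations, stop_training_after=None):
    """Per-period net cash for a hypothetical lab (toy model, not real data).

    Each period the lab earns inference profit on its current model
    (inference_roi * that model's training cost) and, while it keeps
    training, pays for the next model (cost_multiple * current cost).
    """
    flows = []
    cost = first_training_cost
    for period in range(generations):
        inference_profit = inference_roi * cost
        keep_training = stop_training_after is None or period < stop_training_after
        next_cost = cost * cost_multiple if keep_training else 0.0
        flows.append(inference_profit - next_cost)
        if keep_training:
            cost = next_cost  # the newly trained model becomes the current one
    return flows


# Placeholder inputs: each model earns 2x its training cost,
# but the next model costs 3x as much to train.
always_training = cash_flows(1.0, 3.0, 2.0, generations=4)
# Same lab, but it stops training new models after two generations.
shift_gears = cash_flows(1.0, 3.0, 2.0, generations=4, stop_training_after=2)

print(always_training)  # [-1.0, -3.0, -9.0, -27.0]
print(shift_gears)      # [-1.0, -3.0, 18.0, 18.0]
```

With these placeholder inputs, every model pays for itself, yet net cash is negative for as long as training cost grows faster than the ROI. Once training stops, the same lab turns profitable immediately, which is the "shift gears" scenario. The toy assumes the old model keeps earning at the same rate, which is exactly the part that commoditization would threaten.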
There’s a story out that Anthropic might be profitable this quarter. This is in one sense bad news - it means that the company wasn’t aggressive enough about acquiring capacity last year, because they didn’t foresee how fast their inference business would grow. Anthropic is now forced to make suboptimal choices about serving existing users vs. training the next model (need to scrounge for capacity by paying other players like SpaceX). And as a Claude Code user I feel like I’ve been affected by that, what with the random outages and performance degradations.
I don’t believe similar scores on small bounded tasks mean models are interchangeable. I’ve found that heavy token-burning workflows are good for my productivity (letting multiple sessions run async working on different stuff). Claude ultracode is an easy example to point to, but there are tons of harnesses out there doing similar things. I find using a higher-quality model matters because it affects how far it can get unattended before heading in the wrong direction. I’ve tried using the cheaper/faster models and it’s a real downgrade (or completely useless). A model that’s even smarter with a longer time horizon would be even better for my productivity. I don’t think we are at the ceiling for model quality or price. My employer pays a lot for my tokens but it’s still a lot less than they pay me.
I agree Anthropic faces some risk they could get commoditized, but on the other hand if things go well they could end up leading adoption into more industries. There are upside and downside scenarios. Recursive self-improvement is obviously an important unknown and could lead to winner-take-all.
There's the "how much of my company exists in a black box controlled by some asshole" angle as well, but in my mind the biggest issue is that current models are already capable of saturating a dev in like four hours.
Yes - IIRC, Amazon was profitable on books by 1996, with other sectors following as they expanded and it was clear that they could post profits any time they wanted by slowing expansion. It was surreal through the bubble years to see “analysts” equating them with companies which were losing money on every sale with no clear way to change that.
Exactly right. Even though the ride-sharing industry lost money on a subsidy arms race and side bets, it was likewise fundamentally sound in major metros from early on. Popular "analyses" kept equating Uber/Lyft with firms losing money on every sale with no path to fix it, but the demand was always there, as riders had already left taxis and transit on reliability and convenience grounds.
There is a big difference between "Every customer is a loss" and "We are profitable and re-investing all of our money". Amazon continued to grow, and reinvested its revenue with solid business fundamentals.
This may finally be the chip family ARM on Windows has always needed. Qualcomm's chips have always been dogs with slow off-the-shelf ARM CPU cores that have pathetic single-threaded performance compared to x86 AMD/Intel or ARM Apple Silicon designs.
For reference, this is just a single benchmark, but as an idea of each vendor's top mobile CPU single-threaded performance:
Geekbench Single Thread Score:
- DGX Spark (same CPU as RTX Spark): 3125
- Snapdragon X1 Elite: 2950
- Snapdragon X2 Elite Extreme: 4050
- AMD Ryzen 9 9955HX: 3225
- Intel Core Ultra 9 290HX Plus: 3175
- Apple M5 Max: 4350
I'm happy to be wrong about Qualcomm's latest X2 chip performance, even if it is shipping in only a single product so far. Their previous best was the lowest in this list.
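For anyone who wants the relative gaps rather than raw scores, here is a quick sketch in Python using only the numbers listed above. It is still a single benchmark, so treat the percentages as a rough guide; the helper name `relative_to` is just something made up for this example.

```python
# Geekbench single-thread scores copied from the list above.
scores = {
    "DGX Spark (same CPU as RTX Spark)": 3125,
    "Snapdragon X1 Elite": 2950,
    "Snapdragon X2 Elite Extreme": 4050,
    "AMD Ryzen 9 9955HX": 3225,
    "Intel Core Ultra 9 290HX Plus": 3175,
    "Apple M5 Max": 4350,
}


def relative_to(baseline, table):
    """Return each score as a fraction of the baseline chip's score."""
    base = table[baseline]
    return {name: score / base for name, score in table.items()}


# Rank fastest to slowest and show each chip relative to the top score.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
rel = relative_to("Apple M5 Max", scores)
for name, score in ranked:
    print(f"{name:36s} {score:5d}  {rel[name]:.0%} of M5 Max")
```

On these numbers the X2 Elite Extreme lands at about 93% of the M5 Max, and the DGX Spark at about 97% of the Ryzen 9 9955HX.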
Qualcomm's Snapdragon X1 and upcoming X2 use their Oryon core and have much faster single-thread performance than Intel/AMD and this Nvidia SoC, which uses off-the-shelf ARM cores.
That wasn't true of the X1, but apparently the X2 (which is only in a single device so far) does appear to finally be fast. The first Windows ARM CPU to be faster than any of its x86 rivals. Competitive with Apple Silicon single-thread performance even.
I was disappointed to see that the RTX Spark has the ARM cores from the DGX Spark. I was hoping it had their new in-house developed cores that Nvidia is starting to use on their latest gen server parts. They look really fast. That said, if RTX Spark has CPU performance like the DGX Spark, it will be almost as fast as the top AMD/Intel parts.
It is funny how macOS is a draw for some, when it is the main reason I don't use a Mac. Their hardware is excellent, but when I've tried using a Mac as my main machine, my productivity suffered. The only part of the Apple ecosystem I wish I could get on Windows is iMessage, and maybe FaceTime.
> The only part of the Apple ecosystem I wish I could get on Windows is iMessage, and maybe FaceTime.
It annoys me that these are such a draw. There are a dozen other viable messaging and video call apps, but there's always someone who feels like spending two minutes to install and activate one is a major imposition.
It is fascinating to me to see a new product category that improves so vastly year after year, where people commonly state that it has already peaked.
I couldn’t even imagine having to go back to a model from 12 months ago, much less 24 months ago. GPT-5.5 is so much better than GPT-4o that it sure seems like they keep finding new juice to squeeze.
This is like going from dial-up internet to DSL and acting like it has peaked before gigabit cable and fiber come along. We are at the beginning of hardware truly made for AI.
> I couldn’t even imagine having to go back to a model from 12 months ago, much less 24 months ago. GPT-5.5 is so much better than GPT-4o that it sure seems like they keep finding new juice to squeeze
The difference in progress in smaller models is far more impressive.
Compare Gemini 3.5 Flash to a ~16B parameter model from 24 months ago.
Compare GPT-5.5 to a frontier model 24 months ago.
Yes, GPT-5.5 got better. At orders of magnitude smaller parameter sizes (when factoring in ACTIVE parameters) the increase is far more pronounced.
Totally agree on smaller models making even more impressive gains. Gemini 3.5 Flash is better than the biggest SOTA model from 24 months ago, not just a 16B parameter one. GPT-4o came out 24 months ago, and there is no way I'd choose that over Gemini 3.5 Flash today.
GPT-5.3-Codex came out in February, and GPT-5.5 came out in April. How much better do you expect in two months' time? What other products can you think of that get meaningfully better in that short a time frame?
And as good as 5.3 Codex is at writing code, 5.5 is easily just as good, if not better. But 5.5 is more than a one-trick pony, and it is much better at planning, writing copy, documentation, etc. I can choose to run 5.3-Codex instead of 5.5, but I never ever do.