
Wouldn’t this be way more expensive?

Example 2TB:

Google $10/mo vs S3 ~$45/mo?

You could get cheaper than Google Drive with Glacier tiers, but that’s a different level of restrictions and still has retrieval fees.


If you use S3 Standard - Infrequent Access then it is $25/mo.
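
For reference, a back-of-envelope in Python, assuming us-east-1 list prices at the time of writing (storage only; Standard-IA also bills per-GB retrieval, so check current regional pricing):

  # Rough monthly storage cost for 2 TB at assumed list prices.
  gb = 2 * 1024
  prices_per_gb = {
      "Google One":     10 / gb,   # flat $10/mo for the 2 TB plan
      "S3 Standard":    0.023,     # $/GB-month, first 50 TB tier
      "S3 Standard-IA": 0.0125,    # $/GB-month, storage only
  }
  for tier, per_gb in prices_per_gb.items():
      print(f"{tier}: ${per_gb * gb:.2f}/mo")
  # -> Google One: $10.00/mo, S3 Standard: $47.10/mo, S3 Standard-IA: $25.60/mo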

Bullshit. This does not represent what real people are listening to; there are ways to game the system.

The idea is explained by Rick Beato here: https://youtu.be/rGremoYVMPc


The features here don’t seem game changing. The most compelling parts are mostly already available in Claude or Codex or their related apps and services.

The biggest concern is that if you want to use SOTA models I don’t see how they can match what you get with the subscription plans of Anthropic and OpenAI, whether you’re spending $20 or $200 a month.

Even if they could match what you get in terms of token quantity, they are giving their tools away for free for the foreseeable future and Cursor is not.


Saying his name like “girdle” is the closest English pronunciation I’ve seen.

The actual German ö is hard for me to figure out without having a native speaker around to practice with.


The “l” is just as hard to pronounce correctly. English has a very nasal “l” compared to German.

Try saying English “hell” but dragging out the “l” (“helllllllll”); very nasal. Compare audio here: https://en.wiktionary.org/wiki/hell#German


I try to explain it as shaping your mouth as if you were saying "e" but saying "o" instead.

I'd say it's like the 'u' in 'hurt'.

StepFun is an interesting model.

If you haven’t heard of it yet there’s some good discussion here: https://news.ycombinator.com/item?id=47069179


Since that discussion, they released the base model and a midtrain checkpoint:

- https://huggingface.co/stepfun-ai/Step-3.5-Flash-Base

- https://huggingface.co/stepfun-ai/Step-3.5-Flash-Base-Midtra...

I'm not aware of other AI labs that have released base checkpoints for models in this size class. Qwen released some base models for 3.5, but the biggest one is the 35B checkpoint.

They also released the entire training pipeline:

- https://huggingface.co/datasets/stepfun-ai/Step-3.5-Flash-SF...

- https://github.com/stepfun-ai/SteptronOss
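
If you want to poke at the base checkpoint, a minimal sketch assuming it loads through the standard Hugging Face transformers AutoModel path (untested; a custom architecture may need trust_remote_code):

  # Hypothetical quick-start; the repo id comes from the links above,
  # everything else assumes stock transformers behavior.
  from transformers import AutoModelForCausalLM, AutoTokenizer

  repo = "stepfun-ai/Step-3.5-Flash-Base"
  tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
  model = AutoModelForCausalLM.from_pretrained(
      repo, torch_dtype="auto", device_map="auto", trust_remote_code=True
  )

  inputs = tok("The capital of France is", return_tensors="pt").to(model.device)
  out = model.generate(**inputs, max_new_tokens=16)
  print(tok.decode(out[0], skip_special_tokens=True))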


Tuned Qwen 3.5 27B beats Step 3.5 on almost all benchmarks, so the point about the size class is moot.

Benchmarks aren't what decides the "size class". Bigger size means more knowledge. Also, Qwen 3.5 27B is a dense model with all 27B parameters active; StepFun 3.5 Flash has only 11B active parameters.

> Bigger size means more knowledge.

Qwen 3.5 27B beats StepFun 3.5 Flash on GPQA Diamond too, so probably no.


Benchmarks don't tell the whole story. For one-shot coding tasks, I found Step 3.5 Flash to be stronger even than Qwen 3.5 397B.

Benchmarks don't tell the whole story... for that you need anecdotes from random HN posters :)

Thanks for the info. Before running the bench I only tried it on arena.ai-type tasks and it was not impressive. I didn't expect it to be that good at agentic tasks.

A bit misleading to say they take 14x less memory; no one is doing inference with 16-bit models.
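
To put numbers on that: the headline multiplier depends entirely on the baseline you compare against. A quick sketch (the ~1.1 bits/weight figure is a hypothetical assumption for the arithmetic, not the spec of any particular model):

  # How the "Nx less memory" claim moves with the baseline,
  # assuming weight storage dominates inference memory.
  compressed_bits = 1.1
  for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
      print(f"vs {name}: {bits / compressed_bits:.1f}x less memory")
  # -> vs fp16: 14.5x, vs int8: 7.3x, vs int4: 3.6x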

Call me a killjoy, but I hate April Fools jokes.

Alright then: killjoy

Great quote from Hilbert; I think it’s also a useful thought for software development.

“The edifice of science is not raised like a dwelling, in which the foundations are first firmly laid and only then one proceeds to construct and to enlarge the rooms,” the great mathematician David Hilbert wrote in 1905. Rather, scientists should first find “comfortable spaces to wander around and only subsequently, when signs appear here and there that the loose foundations are not able to sustain the expansion of the rooms, [should they] support and fortify them.”


Yeah, I see a lot of people (especially on HN) bemoaning any science that isn't a controlled, double-blind experiment with a large sample size. But exploratory science is just as important as the science that proves things. Otherwise we wouldn't know which hypotheses are useful/interesting to test.

The problem is more about how it is reported to the public. Science is ugly, but when a discovery is announced to the public, a high level of confidence is expected, and journalists certainly act like there is. Kind of like you are not supposed to ship untested development versions of software to customers.

But sometimes some of the ugly science gets out of the lab a bit too soon, and it usually doesn't end well. People get their hopes up, and when the result doesn't live up to the hype, they get confused.

It really stood out during the covid pandemic. We didn't have time to wait for the long trials we normally expect, waiting could mean thousands of deaths, and we had to make do with uncertainty. That's how we got all sorts of conflicting information and policies that changed all the time. The virus spread by contact, no, it is airborne, masks, no masks, hydroxychloroquine, no, that's bullshit, etc... that sort of thing. That's the kind of thing that usually doesn't get publicized outside of scientific papers, but the circumstances made it so that everyone got to see that, including science deniers unfortunately.

Edit: Still, I really enjoyed the LK99 saga (the supposed room-temperature superconductor). It was overhyped and came to its expected conclusion (it isn't one), but it sparked widespread interest in superconductors and plenty of replication attempts.


> The problem is more about how it is reported to the public.

Yes and no.

From scientific communicators there's a lot of slop, and it's getting worse. Even places like Nature and Scientific American are making unacceptable mistakes (a famous one being the quantum machine learning black hole BS that Quanta published).

But I frequently see those HN comments on ArXiv links. That's not a science communication issue. Those are papers. That's researcher to researcher communication. It's open, but not written for the public. People will argue it should be, but then where does researcher to researcher communication happen? You really want that behind closed doors?

There is a certain arrogance that plays a role. Small sample size? There's a good chance it's a paper arguing for the community to study at a larger scale. You're not going to start out by recruiting a million people to figure out if an effect might even exist. Yet I see those papers routinely scoffed at. They're scientifically sound but laughing at them is as big of an error as treating them like absolute truth, just erring in the opposite direction.

People really do not understand how science works, and they get extremely upset if you suggest otherwise. As if not understanding something that they haven't spent decades studying implies they're dumb. Scientists don't expect non-scientists to understand how science works. There's a reason you're only a junior scientist after getting an entire PhD. You can be smart and not understand tons of stuff. I got a PhD and I'll happily say I look like a bumbling idiot even outside my niche, in my own domain! I think we've just got to stop trying to prove how smart we are before we're all dumb as shit. We're just kinda not dumb at some things, and that's perfectly okay. Learning is the interesting part. And it's extra ironic the Less Wrong crowd doesn't take those words to heart, because that's what it's all about. We're all wrong. It's not about being right, it's about being less wrong.


Are they bemoaning that science is being done, or are they bemoaning that the experimental results have not yet reached high enough confidence to justify the conclusions being suggested?

> Are they bemoaning that science is being done

The reflexive "in mice" comments seem to be bemoaning how science is done.


As someone who has made several comments consisting entirely of “…in mice.”, let me assure you that the reflex only kicks in after reading the paper and noticing that the experimental subjects were exclusively mice.

The problem is not mice experiments on arxiv; the problem is posting those papers for broader dissemination to the public, with titles suggesting to the public that cancer has been cured, without prominently pointing out that the experiments were not about cancer in humans.


> problem is posting those papers for broader dissemination to the public, with titles suggesting to the public that cancer has been cured

Fair enough. I'm thinking of cases where a good study that isn't turned into PR slop is dismissed because it was done in mice. Which is fine for most people. But not great if we're treating real science that way.


Dismissing good science is entirely the correct decision when the good science isn't ready for broad dissemination to the audience which it is being presented to.

I disagree. I think people understand studies have to begin in mice. It’s what the GP said. You can’t release those studies because there’s not a high enough confidence rate in what most people are interested in, i.e. how it affects humans.

> You can’t release those studies because there’s not a high enough confidence rate in what most people are interested in, i.e. how it affects humans

This is science by ignoramus. It isn't how science works, at least not when it works at its best. Someone advocating for censoring science because it might be misread is not on the side of science.


I’m not advocating for censoring them. I’m advocating for less hype in science media reporting around mice studies because, let’s be frank, the vast majority of the population are ignoramuses who cannot make the distinctions themselves, and that has real political consequences through lack of trust in scientific organizations.

More Doctors Smoke Camels!™

It depends, especially coming from fields like psychology. You can prove anything with a small enough group. A lot of those studies just end up adding noise and reducing the reliability of the entire field. People end up getting conflicting information every other week and then they just tune out.

Like anything else, it's easier to complain about the legitimacy of something and nitpick it to death than it is to do the actual thing.

Most people on HN aren't scientists, even if they fancy themselves as such.


Hilbert's quote is entirely out of context:

1) While many formalists in his day were stress-testing definitions for unexpected gotchas, a vocal minority were doing formalization as an eccentric art form.

2) Commoditized computers running verification software were not available in his day and age.

As long as the weakest link was reliance on human brains faithfully attempting to maintain consistency, it was more productive and fruitful for the economy to focus on translating observations into the language of mathematics.

Once commoditized hardware and minimalistic verification software become available, it makes sense to step back and start a machine-readable formalization program to translate or verify the main body of mathematics.

Quoting mathematicians of Hilbert's caliber in 2026 doesn't mean it's great guidance in the face of questions Hilbert was never confronted with: with cheap, abundant compute and an enormously expanded number of mathematicians, perhaps it's time to formalize the bulk of mathematics.

And it could happen quickly.

A government can mandate that a certain fraction of student scores be assessed on formalization tasks, basically turning the job of formalizing mathematics into homework exercises for students. There are students at all levels: undergraduate, graduate, ... If a result isn't proven yet, turn it into a temporary axiom, which goes on the collective TODO list.

In a few years all of mathematics that is regularly touched on in academia could be formalized.

Nation states that enforce this will have a large number of mathematicians capable of formalizing systems into machine-readable form, and will benefit tremendously compared to nation states that don't (even if the resulting formalizations are public domain: having a sword available is not the same as having workers experienced in smithing such a sword).


My only complaint with the article is that it doesn't seem to mention that digitized proofs can contain gaps, but that those gaps must be explicit, like Lean's `sorry` placeholder or declared axioms.
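
For instance, in Lean an unfinished proof still compiles, but the gap stays machine-visible: `sorry` triggers a compiler warning, and a missing result can instead be stated as a named axiom that downstream proofs visibly depend on. A minimal sketch (illustrative names, not from the article):

  -- An explicit gap: Lean accepts this but warns that the
  -- declaration uses `sorry`.
  theorem double_eq_add (n : Nat) : 2 * n = n + n := by
    sorry

  -- The alternative: take the missing result as a temporary axiom;
  -- anything built on it shows the dependency under `#print axioms`.
  axiom double_eq_add_ax : ∀ n : Nat, 2 * n = n + n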

That’s similar to Neurath’s boat: “We are like sailors who on the open sea must reconstruct their ship but are never able to start afresh from the bottom. Where a beam is taken away a new one must at once be put there, and for this the rest of the ship is used as support. In this way, by using the old beams and driftwood the ship can be shaped entirely anew, but only by gradual reconstruction.”

“The output is no longer 1-bit”

I thought that constraint was the whole idea?


Agentic AI CPU? No.

It’s a CPU designed for an AI cluster. Their last CPU, Grace, was the same thing, and no one called it agentic.

Vera now just has more performance/more bandwidth. It’s cool, I’d like to have one of these clusters, but this is not new.

It’s marketed as agentic AI because that’s fashionable in 2026.


They significantly lowered latency compared to EPYC/Xeon, which is critical for streaming agents (e.g. text/audio/video agents).


What latency? How much is it compared to LLM inference speed?


See the Redpanda comment/link here.

