
zobzu · 2026-05-25T15:46:37 1779723997

dimples are used for stability and lift, not for friction reduction / low cx

zobzu · 2026-05-25T15:43:47 1779723827

as usual these things are presented as new and revolutionary but aren't actually.

the specific process and implementation however are usually newer or slightly different from before.

this is our sensationalism-based society - any iterative progress, or sometimes even a copy, is presented as a revolution.

now show me a 737 using 40% less fuel - guess what - that won't happen - however, perhaps we'll get a slightly better process to create aircraft skins. keep in mind you can't re-sand a fuselage every week, it needs to work reliably with no maintenance.

phonon · 2026-05-26T11:47:34 1779796054

"... this flying wing will burn 50% less fuel than today's jets..."[1]

[1]https://time.com/7292452/jetzero-low-carbon-air-travel/

zobzu · 2026-05-19T14:19:59 1779200399

i really don't mind the GIMP UI but then again I used it for a very long time, so perhaps that's why (same for PS, I'm a 1.0 user).

On the flip side, I'd love a darktable that is closer to Lightroom's UI, for similar reasons. Somehow, I find it more difficult to get the same speed and flow with darktable.

zobzu · 2026-05-18T01:03:27 1779066207

with AI the "they could so they never wondered if they should" will be a very frequent thing.

stephantul · 2026-05-18T04:19:25 1779077965

This is a bit rude.

We didn't generate this project, we wrote it, a lot of it manually, and trained custom models. We'd been working in the real-time retrieval space for a while, and we thought coding was a good fit for this specific technology.

esperent · 2026-05-18T05:00:37 1779080437

My comment above wasn't meant to be rude. And you do have extensive benchmarks against grep etc so it's clear you understand the importance of that.

But I still think you're missing the harder but more important proof which is agent evals. Have you done any of that?

I would personally love to find tools in this space which can make agents more efficient and I do believe there's a scope for massive improvements compared to default workflows. But my evals with RTK and Headroom have made me wary that a tool can look like it should work, conceptually make sense, pass non-agentic benchmarks, and still make an actual agentic workflow worse.

stephantul · 2026-05-18T06:11:48 1779084708

It was directed at the parent who implied that we didn’t think about this.

I agree with your point about the evals and how you can get discontinuities: good search can be worse than bad search when agents can do many searches. We’re working on it

esperent · 2026-05-18T12:18:32 1779106712

When you share them, please also share the setup for people to easily rerun them. Nearly every eval I've seen shares the llm session transcript but not the actual harness setup etc. that they used.

jack_pp · 2026-05-18T02:01:06 1779069666

yeah I think I'm prone to do the same, it is so easy to create and we get too excited by it instead of first doing the research necessary which is much more boring than actually producing something.

zobzu · 2026-05-16T12:36:12 1778934972

additionally: Google reports on their own jail breaks (who is project zero?!! lol). apple does not.

in fact apple fixed several high criticality bugs like these not that long ago - they just don't talk about it other than "you must update now".

same problems, different comms, and the more people do this, the less transparent google will be.

zobzu · 2026-05-16T12:18:03 1778933883

extrapolating that line of thinking: "why does computer run malware, i asked it to not run malware ever!"

another fun parallel: "run this [...] and make no mistake".

human context is just as bad as llms, i swear

zobzu · 2026-05-16T12:15:21 1778933721

dude google is the one reporting on themselves here.

zobzu · 2026-05-16T12:05:23 1778933123

it feels like someone prompted ai on what to build and kept going with that same process. the other products are just as odd lol. fun site though!

Fnoord · 2026-05-16T14:44:24 1778942664

I'm not sure they're odd? They're using a 3D printer. If the CAD design is open, you could even assemble your own repair parts.

I can highly recommend the Azeron Cyro [1] (cannot comment on their other products, but they look interesting). It is partly 3D printed, but also repairable and mod-able. It is the only vertical mouse I am aware of, with a modest amount of keys (15 + scroll wheel + analog thumbstick). I say keys since, well, in software they're recognized as such. You can also make it a Bluetooth mouse (I use USB2BT+), though obviously you suffer a bit from latency.

[1] https://azeron.com/collections/cyro

willwade · 2026-05-16T17:45:33 1778953533

I can promise you it isn’t. This totally predates llms. Been using them for a long time

zobzu · 2026-05-05T20:29:28 1778012968

gemma is also just way faster. i dont wanna wait 10min to get a 5-10% better answer (and sometimes, actually worse answer).

best is to use your own model router atm, depending on the task
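A minimal sketch of what such a router could look like, assuming two locally served models. The model labels, task kinds, and length threshold below are hypothetical placeholders, not anything named in this thread:

```python
from dataclasses import dataclass

# Hypothetical model labels; swap in whatever you actually run.
FAST_MODEL = "fast-local-model"
SLOW_MODEL = "slow-local-model"

# Task kinds assumed to justify the slower model; purely illustrative.
HEAVY_KINDS = {"refactor", "debug", "design"}


@dataclass
class Task:
    kind: str    # e.g. "autocomplete", "chat", "refactor"
    prompt: str  # the text that would be sent to the model


def route(task: Task, long_prompt_chars: int = 4000) -> str:
    """Return a model label: slow model for heavy or long tasks, fast otherwise."""
    if task.kind in HEAVY_KINDS or len(task.prompt) > long_prompt_chars:
        return SLOW_MODEL
    return FAST_MODEL
```

e.g. `route(Task("autocomplete", "def foo("))` picks the fast label, while a "refactor" task or a very long prompt goes to the slow one. the rules are deliberately dumb; the point is only that the choice is made per task instead of globally.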

SwellJoe · 2026-05-05T20:41:14 1778013674

I'm pretty sure Qwen is faster? The MoE version of Qwen is 3B active, while Gemma 4 is 4B active. Similarly, the dense Qwen is 27B while Gemma is 31B. All else being equal (though I know all else isn't equal), Qwen should be faster in both cases. I haven't actually measured with any precision, but on my AMD hardware (Strix Halo or dual Radeon Pro V620) they seem quite similar in both cases...both MoE models are fast enough for interactive use, both dense models are notably smarter but much slower, long time to first response and single-digit tokens per second once it starts talking.

vparseval · 2026-05-06T01:29:21 1778030961

qwen-3.6 is really interesting. The dense 27B model is pretty slow for me whereas the sparse 31B is blazingly fast but it also needs to be since it's so chatty. It produces pages and pages of stream of consciousness stuff. 27B does this to a lesser extent but slow enough that I can actually read it whereas 31B just blasts by.

I haven't yet compared either to Gemma 4. I tried that out the day after it came out with the patched llama.cpp that added support for it but I couldn't make tool calling work and so it was kind of useless. I should try again to see if things have changed but judging by what people say, qwen-3.6 seems stronger for coding anyway.

ctbellmar · 2026-05-06T14:36:20 1778078180

I had the same experience with 31B. Runs well on 4090 too!

Craighead · 2026-05-06T00:39:04 1778027944

I'm using both incessantly and having a great time.

ActorNightly · 2026-05-07T03:41:42 1778125302

Qwen without thinking is just as fast. I have 4 parameter settings based on recommendation. If you want a good coding problem, the thinking coding mode works well, but takes a while to arrive at an answer. If you want faster turnaround time, instruction mode works without thinking.

zobzu · 2026-05-05T20:27:42 1778012862

flash is the fast (duh) model though. it's not always beneficial to use pro. in practice: 1/ set to flash 3.1 ; 2/ force to pro...sometimes. mainly when the cli fails to predict what model to use.

note that it will sometimes fall back to flash 2, which sucks

mapontosevenths · 2026-05-05T21:35:34 1778016934

Flash will absolutely destroy a complex codebase. It's like a drunk junior programmer. Don't trust it with anything more complex than autocomplete.

Pro is expensive, but good. However they've decreased the pitiful stipend they used to include in even the Ultra plan to the point where it's barely usable. I pivoted back to ChatGPT Pro after the recent downgrade they gave Ultra users. Google's Ultra plan costs 2.5x as much and delivers about half the usage.

chrisweekly · 2026-05-06T03:18:44 1778037524

Tangent: this is one of those situations where slang is harmful to understanding. When I saw "will absolutely destroy" my first interpretation was a positive connotation. Of course further context made it clear you were being straightforward, and this isn't aimed at you. Along these lines, "drop" has become a problematic term: "Acme co dropped support for Foo" means it's EOL, but "Foo dropped today" implies it just landed. Idioms are hard enough when they don't serve as borderline autoantonyms. To wrap up this extended digression, if anyone else finds this sort of thing interesting, and could use a good laugh, check out Ismo (a standup comic from Finland who makes truly hilarious observations about English as a second language).

https://youtu.be/oGmzfjuicE0?si=nL_W75s8UDp1g-zI

https://youtu.be/jXcMoHeWaYQ?si=QMi7nEwVWvCZyzbl

kridsdale1 · 2026-05-06T12:37:27 1778071047

I had the same experience.

sureMan6 · 2026-05-05T22:27:56 1778020076

Yeah I don't get the user who said Gemini is generous with the quota, I get more use out of codex with the 5 hour limits than Gemini gives me in a week

psychoslave · 2026-05-06T05:34:49 1778045689

> It's like a drunk junior programmer.

Thanks for the laugh. :)
