tomjakubowski · 2026-06-05T17:19:30 1780679970

I'm quite fond of vimscript legend Tim Pope's guidance on writing commit messages.

https://tbaggery.com/2008/04/19/a-note-about-git-commit-mess...

tomjakubowski · 2026-06-05T14:28:19 1780669699

You don't need a theory of mind to effectively manage or collaborate with a chatbot. You do for other humans.

tomjakubowski · 2026-06-05T14:03:56 1780668236

generally an editor writes the headline, not the reporter

tomjakubowski · 2026-06-04T19:48:38 1780602518

CVS Pharmacy has started rolling out an "AI assistant" phone tree with no apparent way to get to a human.

cucumber3732842 · 2026-06-04T20:10:26 1780603826

Their old school phone tree didn't either. You had to pretend to be irate.

shawn_w · 2026-06-04T20:32:10 1780605130

Pretend? Oh, there's no pretending involved.

joe_mamba · 2026-06-04T20:12:39 1780603959

Maybe if you use a lot of profanities and threaten to cancel your subscription?

tomjakubowski · 2026-06-04T03:06:36 1780542396

From now on I will use the gigacalorie for this kind of thing.

tomjakubowski · 2026-06-03T16:27:30 1780504050

There are a number of brick and mortar retailers I frequent who swing the other way and don't accept cash, only credit or debit. Presumably, they prefer paying the cost of credit card fees to the costs of handling cash. What's driving that difference?

tomjakubowski · 2026-06-03T15:29:09 1780500549

> Sure we aren't evolutionary predisposed to work, but Europeans, North Africans, Asians are genetically

What?

tomjakubowski · 2026-06-03T07:20:53 1780471253

The leader of the study, Julian Nyarko, is Associate Director and Senior Fellow at HAI. I can't say whether that means the study was conducted by HAI, but there is at least a connection to it. https://hai.stanford.edu/people/julian-nyarko

tomjakubowski · 2026-06-03T06:50:02 1780469402

This is often called tacit knowledge. https://en.wikipedia.org/wiki/Tacit_knowledge

My favorite example of this is knowing how to untangle a big pile of cables. There are robots now which can untie a single knotted cable, but I don't think any can do a pile of cables yet. https://www.youtube.com/watch?v=vp-94rsherE

tomjakubowski · 2026-06-03T05:54:22 1780466062

The images you can't see in the chats are the question sheet from here, which was the first fourth grade math homework assignment I tried. https://www.k5learning.com/worksheets/math/data-graphing/gra...

Fourth graders typically don't have access to Python for their homework assignments. To be fair to the kids, I tried it first without Python: Opus 4.6 (Feb 2026) with default Medium effort. https://claude.ai/share/1533a3e4-6757-4614-b95d-0743350a6598

pastebin of the reasoning section (no Python): https://pastebin.com/zZeG5ZnJ

It got questions 2 (Shop D) and 5 (280) wrong. It got question 3 right but the work it showed has the numbers for each shop wrong. My fourth grade teacher would have taken off points for that (shout out Mrs. Van Bladel).

Here it is again with a prompted nudge to use Python: https://claude.ai/share/e1265efb-0988-40ac-90ac-c76225b67e98

pastebin of the reasoning section (with Python): https://pastebin.com/KsP0xxZL

This time it used Python to "check its work", and answered the same questions incorrectly (2 and 5). To the model's credit, it did show the correct work on answer 3 this time.

simonw · 2026-06-03T11:09:07 1780484947

That's more of a test of vision LLM ability to correctly identify and count things in an image than it is of mathematical reasoning.

If you look at the working of your non-Python example it gets most of the counts wrong - identifying shop A as two full notebooks plus one half notebook when it's actually three full notebooks, for example. The numeric answers it then gives would be correct if it hadn't made those vision mistakes.

I've been testing vision LLMs on counting the number of pelicans in a photo for a while, and they're very unreliable at that.

The best I've seen is Google Gemini 2.5 if you have it output image segmentation masks (a feature they have not included in the Gemini 3 series yet): https://simonwillison.net/2025/Apr/18/gemini-image-segmentat... - but that requires additional harness engineering, you need to explicitly cause it to use its image segmentation mechanism.

tomjakubowski · 2026-06-04T00:28:36 1780532916

Fourth grade math's† students are learning geometry and how to draw simple plots. Vision ability (or tactile ability, for visually impaired students) is pretty important to understanding and solving those homework problems.

†: think "bo's'n"
