MakeAJiraTicket's comments

MakeAJiraTicket · 2026-02-27T04:50:21 1772167821

Thank you! Gemini has consistently been the best performer that I've tried, but they always require the connection to be made explicit. The point of the test is that it is very low complexity and is very targeted toward what can be considered reasoning and these models can't produce the connection without prodding.

In the ideal case of reasoning you would simply present the methods and they'd bridge the gap independently when it is brought to the forefront of their context together, but it doesn't happen.

mapontosevenths · 2026-02-27T05:11:02 1772169062

ChatGPT got it with less prodding, but I had to set it to "Pro" thinking mode (ChatGPT's version of Deep Think, I suspect). I'm sure Deep Think could get it with even less prompting.

I think your conclusion that they aren't really thinking doesn't hold. They're already there, it just costs more and time to get good results.

https://chatgpt.com/share/69a12666-64b0-8009-8dfe-59546ac400...

EDIT - Updated the link to include the full conversation. Note that I didn't change it to pro mode until the end, and eventually got tired of waiting and just told it "answer now."

MakeAJiraTicket · 2026-02-27T19:05:47 1772219147

This is the expected result. "Do you see the connection?" is where it failed to actually bridge the connection. I don't know if pro mode is relevant, but they require someone prodding from the perspective of already knowing the invention to reach it themselves.

They capture the gestalt of reasoning, they can reason in patterns that we encoded with language, but they can't do genuine reasoning.

mapontosevenths · 2026-02-27T22:10:23 1772230223

I'm not sure a lack of intuition implies a lack of reasoning. They clearly didn't make that jump until they were told to look for something, but did pretty handily once asked. Clearly they used some version of reasoning to do that, but just as clearly had no interest in it at all until directed to look for it.

I wonder if we phrased it differently we could get them to make the leap without so much hinting? I'll tinker with it a bit later.

Either way, very cool experiment! Thanks for posting it. I'd upvote you again if I could.

MakeAJiraTicket · 2025-10-11T19:36:42 1760211402

The author explicitly states that there are other types of thought, this is about structured reasoning.

MakeAJiraTicket · 2025-10-10T04:04:06 1760069046

The Telexistence demo isn't so bad, but I have no idea why we're trying to make human robots generally. The human shape sucks at a most things, and we already have people treating roombas and GPT like their boyfriends or pets...

tomcam · 2025-10-10T04:11:23 1760069483

What form factor would be better at going up and down stairs? Reaching to a high shelf? Getting between the refrigerator and counter to grab a key?

jjaksic · 2025-10-10T07:25:50 1760081150

Because human work is designed for humans. If you want a drop-in replacement for human workers, humanoid robots are your best bet.

seanhunter · 2025-10-10T12:14:08 1760098448

That doesn’t even remotely follow. Human work is designed for humans so if you want human work done you need a human to do it.

If you want to replace the human the best bet is to redesign the work so that it can be done with machine assistance, which is what we’ve been doing since the industrial revolution.

There’s a reason the motor car (which is the successful mass market personal transportation machine) doesn’t look anything like the horse that it replaced.

jkestner · 2025-10-10T14:55:24 1760108124

<HN voice>Technically, the motor car replaced the coach. A more accurate and enjoyable analogy would that car engines don’t look like horses.

batch12 · 2025-10-10T14:58:57 1760108337

hence horseless CARriage

batch12 · 2025-10-10T15:00:16 1760108416

I don't think that it's human work this would target, but instead work in shared human spaces.

seanhunter · 2025-10-10T18:35:31 1760121331

We already have robots that work in shared human spaces, and our experience in that domain has shown that you need to put a lot of thought into how to do this safely and specifically how to prevent the robot from accidentally harming the humans. Ask anyone with a robotic cnc machine how they would feel about running the machine without its protective housing for example. I expect they will start to throw up just a little bit. Flexibility is exactly the opposite of what you need until we have a CV and controller combination that can really master its environment. I could forsee a lot of terrible accidents if you brought a humanoid robot into a domestic environment without a lot of care and preparation for example.

blktiger · 2025-10-10T14:14:13 1760105653

Sure but robots don’t join unions or ask for a pay raise or benefits.

MakeAJiraTicket · 2025-10-10T03:52:38 1760068358

I have a function that compares letters to numbers for the Major System and it's like 40 lines of code and copilot starts trying to add "guard rails" for "future proofing" as if we're adding more numbers or letters in the future.

It's so annoying.

MakeAJiraTicket · 2025-10-10T03:49:39 1760068179

Defensive programming is considered "correct" by the people doing the reinforcing, and is a huge part of the corpus that LLM's are trained on. For example, most python code doesn't do manual index management, so when it sees manual index management it is much more likely to freak out and hallucinate a bug. It will randomly promote "silent failure" even when a "silent failure" results in things like infinite loops, because it was trained on a lot of tutorial python code and "industry standard" gets more reinforcement during training.

These aren't operating on reward functions because there's no internal model to reward. It's word prediction, there's no intelligence.

LeifCarrotson · 2025-10-10T13:46:47 1760104007

LLMs do use simple "word prediction" in the pretraining step, just ingesting huge quantities of existing data. But that's not what LLM companies are shipping to end users.

Subsequently, ChatGPT/Claude/Gemini/etc will go through additional training with supervised fine-tuning, reinforcement learning with reward functions whether human-supervised feedback (RLHF) or reward functions (RLVR, 'verified rewards').

Whether that fine-tuning and reward function generation give them real "intelligence" is open to interpretation, but it's not 100% plagarism.

aoeusnth1 · 2025-10-10T19:37:34 1760125054

You used the word reinforcing, and then asserted there's no reward function. Can you explain how it's possible to perform RL without a reward function, and how the LLM training process maps to that?

MakeAJiraTicket · 2025-10-10T23:38:47 1760139527

LLM actions are divorced from that reward function, it's not something they consult or consider. Reward function in that context doesn't make sense.

comex · 2025-10-10T06:28:56 1760077736

Reinforcement learning by definition operates on reward functions.

HN For You