Hacker News | at2005's comments

I didn't compare against the harness (I focused on distillation), but the original ToT paper has a section on it: https://arxiv.org/abs/2305.10601


Ah, I meant that MCTS uses more inference-time compute than GRPO to produce a training sample.


Btw, the whole motivation for this was algorithms like Grover's, which need "oracles" to be specified. You can only imagine trying to code adders and greater-than circuits with QASM...
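
To make the "oracle" point concrete, here is a rough classical sketch of Grover's algorithm in plain Python. It is an illustration only, not the project's code and not QASM: the names `grover_search` and `greater_than_13` are made up for this example, and the oracle is just a Python predicate. On real hardware that same predicate would have to be hand-built as a reversible comparator circuit, which is the painful part.

```python
import math

def grover_search(n_qubits, oracle):
    """Classically simulate Grover's algorithm; returns measurement probabilities.

    `oracle` is a hypothetical predicate on integers in [0, 2**n_qubits).
    """
    size = 2 ** n_qubits
    marked = [x for x in range(size) if oracle(x)]
    if not marked:
        raise ValueError("oracle marks no states")

    # Start in the uniform superposition over all basis states.
    amps = [1 / math.sqrt(size)] * size

    # Common choice of iteration count: floor(pi/4 * sqrt(N / M)).
    iterations = int(math.pi / 4 * math.sqrt(size / len(marked)))

    for _ in range(iterations):
        # Oracle step: flip the phase of every marked state.
        amps = [-a if oracle(x) else a for x, a in enumerate(amps)]
        # Diffusion step: reflect every amplitude about the mean.
        mean = sum(amps) / size
        amps = [2 * mean - a for a in amps]

    return [a * a for a in amps]

def greater_than_13(x):
    # Stand-in for a greater-than circuit: marks 14 and 15 on 4 qubits.
    return x > 13

probs = grover_search(4, greater_than_13)
success = sum(p for x, p in enumerate(probs) if greater_than_13(x))
```

Here the whole oracle is one line (`x > 13`); the amplitude bookkeeping around it never changes, which is why a higher-level way to specify oracles is attractive.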

