Hacker News | SknCode's comments

How?


Same way you distill any model. Training-data efficiency matters only while you train the source model/ensemble. Once you have that, you are purely compute-bound during distillation.
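To make the point concrete, here is a minimal sketch of the standard soft-label distillation objective (Hinton-style KL between temperature-scaled teacher and student distributions). The function names and the temperature value are illustrative, not from any specific implementation:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax over the last axis.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) at temperature T, scaled by T^2 so gradient
    # magnitudes stay comparable across temperatures.
    p = softmax(teacher_logits, T)               # soft teacher targets
    log_q = np.log(softmax(student_logits, T))   # student log-probs
    return (T * T) * np.mean(np.sum(p * (np.log(p) - log_q), axis=-1))
```

Note the loss only consumes the teacher's logits: once the teacher exists, generating soft targets and fitting the student to them is a pure compute problem, with no further dependence on the original training data's quality.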


I am not sure I understand this correctly.

I came to believe that LLMs work with token embeddings. Is REFRAG then only "something" in front of the LLM, where the RL policy decides which chunk embeddings get expanded back into token embeddings the LLM can consume? Or does REFRAG require you to fine-tune the LLM so it can work with both token embeddings and chunk embeddings?
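The mechanism being asked about can be sketched roughly like this: a compressor collapses each chunk of retrieved-context token embeddings into a single vector in the decoder's embedding space, and a policy picks which chunks are worth expanding back to full token embeddings. This is only an illustration of the general idea, not the actual REFRAG architecture; the random projections and the score-based "policy" are stand-ins for trained components:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, chunk_size, n_chunks = 64, 16, 8

# Hypothetical frozen projection, standing in for a trained chunk encoder
# that maps a chunk of token embeddings into the LLM's embedding space.
W_enc = rng.standard_normal((chunk_size * d_model, d_model)) * 0.01

def compress_chunk(token_embs):
    # (chunk_size, d_model) block of token embeddings -> one d_model vector.
    return token_embs.reshape(-1) @ W_enc

tokens = rng.standard_normal((n_chunks, chunk_size, d_model))
chunk_embs = np.stack([compress_chunk(c) for c in tokens])

# Stubbed "policy": score each chunk, expand only the top 2 back into
# full token embeddings; the rest enter the LLM as single chunk vectors.
scores = chunk_embs @ rng.standard_normal(d_model)
expand = set(np.argsort(scores)[-2:])

inputs = []
for i in range(n_chunks):
    if i in expand:
        inputs.extend(tokens[i])       # chunk_size token embeddings
    else:
        inputs.append(chunk_embs[i])   # one compressed chunk embedding
inputs = np.stack(inputs)

# Effective sequence length drops from 8*16 = 128 rows to 2*16 + 6 = 38.
print(inputs.shape)  # (38, 64)
```

Because the decoder now sees a mix of ordinary token embeddings and compressed chunk embeddings, some tuning of the LLM to accept that mixed input seems implied; how much is exactly the question above.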



