Hacker News: frannyg's comments

I'm still figuring out "inference time", but what puzzled me at first was that there is, to humans at least, an effectively infinite set of tokens that might come next: technical jargon, synonyms, different lexical registers. So in my mind there was an RNG built into the function that, after "filtering" the weights based on the user request (where many different tokens, even ones meaning the same or almost the same thing, end up with similar weights), simply rolled the dice to produce the return string.
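That dice-rolling intuition is roughly how sampling works. Here's a toy sketch (made-up vocabulary and logits, not a real model) of softmax sampling with temperature, where near-synonyms with similar scores get similar chances:

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Pick one token id from raw scores (logits) via softmax sampling."""
    # Softmax with temperature: higher T flattens the distribution,
    # so near-synonymous tokens get more equal chances.
    scaled = [score / temperature for score in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # "Roll the dice": draw one token id proportional to its probability.
    return random.choices(range(len(probs)), weights=probs, k=1)[0]

# Toy vocabulary where several tokens mean almost the same thing.
vocab = ["big", "large", "huge", "banana"]
logits = [2.0, 1.9, 1.8, -3.0]  # synonyms score similarly; "banana" does not
token_id = sample_next_token(logits, temperature=0.8)
```

With temperature 0 (greedy decoding) the dice disappear and the top-scoring token always wins; raising the temperature is what lets the synonyms trade places between runs.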

I thought the LLM was "getting to know the user", but that it had a short memory span (the context) and thus "forgot" already-calculated weights that it would otherwise use to (re)generate new weights.

Further down I learned that it freaking forgets all the previous weights in general (I think that's what I learned; I'm getting there).
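The "short memory span" part is simpler than it sounds: nothing is forgotten or recalculated, old input just stops fitting in the window. A minimal sketch, assuming a hypothetical tokenizer that splits on spaces and a tiny made-up window size:

```python
def build_prompt(history, new_message, max_tokens=8):
    """Keep only the most recent tokens that fit in the context window."""
    tokens = (history + " " + new_message).split()
    # Anything older than the window is simply not part of the input:
    # the model never "remembers" it, it just doesn't see it.
    return " ".join(tokens[-max_tokens:])

prompt = build_prompt(
    "my name is Fran and I like hiking",  # earlier conversation
    "what is my name",                    # new user message
    max_tokens=6,
)
# The earliest words ("my name is Fran ...") fall outside the window,
# so the model literally cannot answer from them.
```

Real models count subword tokens rather than words and windows are thousands of tokens long, but the mechanism is the same: truncation, not forgetting.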


That was indeed part of what I was wondering about.

Larger and smaller, in my beginner's mind, was a difference in how much recursiveness the design of the model allowed.

- User request implies knowledge about X.
- Pulling in weights for X.
- Probability of the user knowing about Xm and Xz is low (because the training data says Xm and Xz are PhD-level knowledge or something).
- Pulling in weights for an ELI5-level explanation of Xm and Xz ...

I thought an LLM would do this recursive pulling of weights based on the semantics of the user request. It does, in a sense, but not "dynamically" based on "recalculated" weights and regenerated combos of tokens; that could only happen if the training data weren't "frozen" and were still accessible, which, as I learned further down in the comments, it isn't.
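The correction above can be made concrete: generation is a loop that calls the same fixed function over and over, and the only thing that changes between steps is the token sequence. A toy stand-in (a hypothetical bigram lookup table playing the role of frozen weights, nothing like a real transformer):

```python
# "Frozen weights": fixed at training time, read-only at inference time.
FROZEN_WEIGHTS = {
    "explain": "gravity",
    "gravity": "simply",
    "simply": "please",
}

def forward(token, weights):
    """One 'forward pass': a pure lookup. The weights are never modified."""
    return weights.get(token, "<end>")

def generate(prompt_token, steps=3):
    out = [prompt_token]
    for _ in range(steps):
        # Identical weights on every call; only the sequence grows.
        out.append(forward(out[-1], FROZEN_WEIGHTS))
    return out

sequence = generate("explain")
# sequence == ["explain", "gravity", "simply", "please"]
```

What looks "recursive" from the outside is just this loop feeding its own output back in as input; no weight is ever recalculated along the way.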

That's why I wondered whether more processing power and/or time would benefit this recursive generation and pulling.


> You can't spend compute to get more detail [...]

Upscaling, technically, is a thing without limits, no?


Right on. A total misconception on my part. And your answer was a nice primer before diving into the rest of the comments. Thanks!


I wasn't able to elaborate on what I meant by "better" when I asked the question, but the idea can indeed be summarized as "will an LLM increase the quantity and quality of its parameters if you give it more processing power and time". Now I know that language models don't do that at all: the weights, "frozen" at training time, are what score the user request and assemble the return by generating possible output strings, which are steered by pre-prompts like asking for chain of thought, reasoning paths and so on, and those in the end are nothing more than more tokens pulling in more specific context. (I'm just thinking out loud here)
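That last point, that chain-of-thought and similar "reasoning" setups are just more tokens in the input, is easy to sketch. A hypothetical helper (the instruction string is a common pattern, not any particular vendor's API):

```python
def with_chain_of_thought(user_request):
    """'Reasoning' prompting is just extra tokens prepended to the input.

    The extra tokens steer which continuations the frozen weights score
    highly; no new parameters are created and nothing is relearned.
    """
    return "Let's think step by step.\n" + user_request

prompt = with_chain_of_thought("Why is the sky blue?")
```

More compute at inference time can buy more of these intermediate tokens (longer "thinking"), but the parameter count and values stay exactly the same.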

