User @thin_signal developed a tool for mixed-precision quantization on MLX. They performed a sensitivity analysis across the model layers and applied less aggressive quantization to the more sensitive layers and more aggressive quantization to the more robust ones.
The tool, which is documented here (https://mlx-optiq.pages.dev/), also implements the recently announced TurboQuant KV-cache optimization, so in total this should greatly improve the quality of locally run LLMs.
Looking forward to an OptiQ release of the Gemma 4 family.
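For anyone curious what "sensitivity-based bit allocation" can look like, here is a minimal sketch in plain Python. It is not OptiQ's actual method or API: the function names, the layer names, the 4/8-bit choice, and the 25% cutoff are all made up for illustration, and it uses weight reconstruction error as a crude stand-in for sensitivity, whereas a real analysis would more likely measure the change in model output.

```python
import random

def quantize(weights, bits):
    """Uniform symmetric round-to-nearest quantization, then dequantize."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [round(w / scale) * scale for w in weights]

def sensitivity(weights, bits):
    """Crude sensitivity proxy (assumption): MSE introduced by quantizing."""
    deq = quantize(weights, bits)
    return sum((w - q) ** 2 for w, q in zip(weights, deq)) / len(weights)

def assign_bits(layers, low_bits=4, high_bits=8, sensitive_fraction=0.25):
    """Give the most sensitive layers high_bits and all others low_bits."""
    scores = {name: sensitivity(w, low_bits) for name, w in layers.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_high = max(1, round(len(ranked) * sensitive_fraction))
    return {
        name: (high_bits if i < n_high else low_bits)
        for i, name in enumerate(ranked)
    }

# Hypothetical toy "model": four layers with different weight spreads.
rng = random.Random(0)
layers = {
    "embed": [rng.gauss(0, 1.0) for _ in range(256)],
    "attn": [rng.gauss(0, 0.1) for _ in range(256)],
    "mlp": [rng.gauss(0, 0.05) for _ in range(256)],
    "head": [rng.gauss(0, 0.2) for _ in range(256)],
}
plan = assign_bits(layers)
print(plan)
```

The idea is just: score every layer at the cheap bit width, then spend the extra bits only where the score is worst.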
Along the same lines, you can watch any hour-long video without interruption, unless it is music, in which case you get interrupted every couple of minutes with an "are you there?" dialog.
I already use MkDocs, and I don't use Node.js. I looked for something in the Python ecosystem, but everything is JS. So I wanted to see if I could create it myself :)
I wonder whether it would be feasible to vibe-code a bot that engages, invests the small sums that yield those bait gains, and harvests them.
Eject at the right time.