Agreed, this is a proper nice tinkering writeup, the kind we get far too rarely now.
Lovely project! I'm a software guy who in recent years does lots of CAD for hobby projects (mainly robotics) and orders custom machined parts (lots of sheet metal construction, occasionally milled parts) along with 3D printing.
I find parametric modelling very zen. Stacking operations is very Lego-like, like stringing up pure functions. Plus I can listen to podcasts while I model, but not while I write code - it engages the brain differently.
Now that LLMs are sapping some of the joy out of programming (I use the tools, they're productive, achieving goals and delivering user value is still satisfying, etc. - but the act of writing code is just more enjoyable than prompting, so it's a tad dispiriting that it's getting harder to justify) I also find that I get a lot of satisfaction from doing something with my hands. In some ways it's a safer space for technical creativity.
Paragraphica is a lensless, sensorless camera that, when you press the shutter, compiles a bunch of data (GPS location, time of day, etc.) and feeds it to an image generator to create the image.
As large language models (LLMs) transition from research prototypes to production systems, practitioners often need reliable methods to verify that model outputs satisfy required constraints. While sampling-based estimates provide an intuition of model behavior, they offer no sound guarantees. We present BEAVER, the first practical framework for computing deterministic, sound probability bounds on LLM constraint satisfaction. Given any prefix-closed semantic constraint, BEAVER systematically explores the generation space using novel token trie and frontier data structures, maintaining provably sound bounds at every iteration. We formalize the verification problem, prove soundness of our approach, and evaluate BEAVER on correctness verification, privacy verification and secure code generation tasks across multiple state-of-the-art LLMs. BEAVER achieves 6 to 8 times tighter probability bounds and identifies 3 to 4 times more high-risk instances compared to baseline methods under identical computational budgets, enabling precise characterization and risk assessment that loose bounds or empirical evaluation cannot provide.
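For anyone wondering what "sound bounds on a prefix-closed constraint" can look like in practice, here is a toy sketch. It is not BEAVER's actual algorithm or data structures, just a plain best-first search over prefixes under my own assumptions: `toy_model`, `no_b`, and `bound_satisfaction` are hypothetical names, and the "model" is a fixed three-token distribution rather than a real LLM. The idea is that finished sequences that satisfy the constraint add to a lower bound, while unexpanded prefixes that still satisfy it are the only remaining mass that could count, so lower bound plus frontier mass is an upper bound.

```python
import heapq
from typing import Callable, Dict, Tuple

EOS = "<eos>"  # hypothetical end-of-sequence marker for the toy model
Prefix = Tuple[str, ...]


def bound_satisfaction(
    next_probs: Callable[[Prefix], Dict[str, float]],
    ok: Callable[[Prefix], bool],
    max_expansions: int,
) -> Tuple[float, float]:
    """Return (lower, upper) bounds on P(output satisfies `ok`).

    Assumes `ok` is prefix-closed: once a prefix violates it,
    every extension violates it too, so that branch can be dropped.
    """
    lower = 0.0
    # Max-heap (via negated mass) of incomplete prefixes that still satisfy `ok`.
    frontier = [(-1.0, ())]
    for _ in range(max_expansions):
        if not frontier:
            break
        neg_mass, prefix = heapq.heappop(frontier)
        mass = -neg_mass
        for token, q in next_probs(prefix).items():
            child_mass = mass * q
            if token == EOS:
                # The prefix already satisfied `ok`, so this finished
                # sequence counts toward the lower bound.
                lower += child_mass
            elif ok(prefix + (token,)):
                heapq.heappush(frontier, (-child_mass, prefix + (token,)))
            # Otherwise the branch violates the constraint and is pruned.
    # Everything still on the frontier might or might not end up satisfying.
    upper = lower + sum(-neg for neg, _ in frontier)
    return lower, upper


def toy_model(prefix: Prefix) -> Dict[str, float]:
    # Stand-in for an LLM: the same next-token distribution at every step.
    return {"a": 0.5, "b": 0.3, EOS: 0.2}


def no_b(prefix: Prefix) -> bool:
    # Prefix-closed constraint: the output never contains the token "b".
    return "b" not in prefix


if __name__ == "__main__":
    for budget in (1, 2, 5, 20):
        print(budget, bound_satisfaction(toy_model, no_b, budget))
```

For this toy model the true answer is a geometric series, 0.2 / (1 - 0.5) = 0.4, and the interval closes in on it as the expansion budget grows. Expanding the highest-mass prefix first is just one reasonable heuristic for tightening the bounds quickly; I'm not claiming it is what the paper does.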
Wondering how literate programming would work with the actual code being written by an LLM? I have been searching for tools that allow such a setup... maybe my next weekend project, in a long list of projects.
Nice analogy comparing it to BBC Radio 3. Does anyone know which neighbourhood would be like BBC Radio 4? I find R3 too highbrow for me; Radio 4 seems more accessible :)
First place that jumps to mind is Richmond! Radio 4 is certainly more "chatty" while still being posh and Richmond is both posh and full of that street/café life.
If we're going to fill out the roster, let's say Radio 1 is Camden, 1Xtra is Brixton, Radio 2 is Bromley, Radio 5 is Dagenham, and 6 Music is eh.. I dunno, Shoreditch?
Maybe! I am not really an R4 listener and when I am, it's for the twee comedy stuff rather than the news and politics. If LBC is as you describe, though, I dread to imagine where Talk Radio is :-D