Submissions from gilesthomas.com

		Jax Back Ends and Devices (gilesthomas.com)
		2 points by gpjt 3 hours ago \| past \| discuss
		Using Safetensors with Flax (gilesthomas.com)
		1 point by ibobev 9 hours ago \| past \| discuss
		Using Safetensors with Flax (gilesthomas.com)
		2 points by gpjt 23 hours ago \| past \| discuss
		First Looking into Jax (gilesthomas.com)
		2 points by ibobev 4 days ago \| past \| discuss
		First Looking into Jax (gilesthomas.com)
		3 points by gpjt 6 days ago \| past \| discuss
		10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module (gilesthomas.com)
		1 point by ibobev 17 days ago \| past
		10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module (gilesthomas.com)
		3 points by gpjt 18 days ago \| past
		10Gb/s Ethernet: what I did to get it working in my home (gilesthomas.com)
		232 points by gpjt 37 days ago \| past \| 177 comments
		10Gb Ethernet: what I had to (re)learn (gilesthomas.com)
		2 points by ibobev 37 days ago \| past
		10Gb Ethernet: what I had to (re)learn (gilesthomas.com)
		1 point by gpjt 38 days ago \| past \| 1 comment
		LLM from scratch, part 33 – what I learned from the appendices (gilesthomas.com)
		5 points by gpjt 44 days ago \| past
		LLM from scratch (32l) – Interventions: updated instruction fine-tuning results (gilesthomas.com)
		1 point by gpjt 45 days ago \| past
		An LLM becomes more coherent as we train it (gilesthomas.com)
		1 point by ibobev 48 days ago \| past
		How an LLM becomes more coherent as we train it (gilesthomas.com)
		3 points by gpjt 48 days ago \| past
		LLM from scratch, part 32k – Interventions: gradient accumulation (gilesthomas.com)
		2 points by gpjt 51 days ago \| past
		Interventions: Trying to train a better model in the cloud (gilesthomas.com)
		1 point by ibobev 56 days ago \| past
		LLM from scratch, part 32j – trying to train a better model in the cloud (gilesthomas.com)
		2 points by gpjt 57 days ago \| past
		Writing an LLM from scratch, part 32i – Interventions: what is in the noise? (gilesthomas.com)
		2 points by ibobev 58 days ago \| past
		Writing an LLM from scratch, part 32i – Interventions: what is in the noise? (gilesthomas.com)
		1 point by gpjt 59 days ago \| past
		Writing an LLM from scratch, part 32h – Interventions: full fat float32 (gilesthomas.com)
		2 points by ibobev 59 days ago \| past
		Writing an LLM from scratch, part 32h – Interventions: full fat float32 (gilesthomas.com)
		7 points by gpjt 62 days ago \| past
		Automating starting Lambda Labs instances (gilesthomas.com)
		2 points by ibobev 63 days ago \| past
		Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com)
		2 points by ibobev 72 days ago \| past
		Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com)
		2 points by gpjt 73 days ago \| past
		Writing an LLM from scratch, part 32f – Interventions: weight decay (gilesthomas.com)
		6 points by gpjt 73 days ago \| past
		Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com)
		3 points by ibobev 81 days ago \| past
		Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com)
		3 points by gpjt 87 days ago \| past
		Writing an LLM from scratch, part 32a – Interventions: training a baseline model (gilesthomas.com)
		3 points by ibobev 3 months ago \| past
		Writing an LLM from scratch, part 32B – Interventions: gradient clipping (gilesthomas.com)
		1 point by ibobev 3 months ago \| past
		Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com)
		1 point by ibobev 3 months ago \| past
