Writing an LLM from scratch, part 32h – Interventions: full fat float32 (gilesthomas.com)
7 points by gpjt 18 hours ago | past | discuss
Automating starting Lambda Labs instances (gilesthomas.com)
2 points by ibobev 1 day ago | past | discuss
Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com)
2 points by ibobev 10 days ago | past | discuss
Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com)
2 points by gpjt 10 days ago | past | discuss
Writing an LLM from scratch, part 32f – Interventions: weight decay (gilesthomas.com)
6 points by gpjt 11 days ago | past | discuss
Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com)
3 points by ibobev 19 days ago | past
Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com)
3 points by gpjt 25 days ago | past
Writing an LLM from scratch, part 32a – Interventions: training a baseline model (gilesthomas.com)
3 points by ibobev 54 days ago | past
Writing an LLM from scratch, part 32B – Interventions: gradient clipping (gilesthomas.com)
1 point by ibobev 54 days ago | past
Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com)
1 point by ibobev 54 days ago | past
Writing an LLM from scratch, part 32d – Interventions: adding attention bias (gilesthomas.com)
1 point by ibobev 54 days ago | past
Writing an LLM from scratch, part 32d – Interventions: adding attention bias (gilesthomas.com)
6 points by gpjt 56 days ago | past
Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com)
1 point by gpjt 57 days ago | past
Writing an LLM from scratch, part 32B – Interventions: gradient clipping (gilesthomas.com)
2 points by gpjt 58 days ago | past
Writing an LLM from scratch, part 32a – Interventions: training a baseline model (gilesthomas.com)
1 point by gpjt 59 days ago | past
Getting a Custom PyTorch LLM onto the Hugging Face Hub (gilesthomas.com)
1 point by ibobev 65 days ago | past
Getting a Custom PyTorch LLM onto the Hugging Face Hub (gilesthomas.com)
1 point by gpjt 65 days ago | past
Writing an LLM from scratch, part 31 – the models are now on Hugging Face (gilesthomas.com)
1 point by ibobev 75 days ago | past
Writing an LLM from scratch, part 31 – the models are now on Hugging Face (gilesthomas.com)
2 points by gpjt 76 days ago | past
Digging into the LLM-as-a-Judge Results (gilesthomas.com)
1 point by ibobev 84 days ago | past
Digging into the LLM-as-a-Judge Results (gilesthomas.com)
1 point by ibobev 85 days ago | past
Writing an LLM from scratch, part 30 – digging into the LLM-as-a-judge results (gilesthomas.com)
1 point by gpjt 85 days ago | past
Using DistributedDataParallel to train a base model from scratch in the cloud (gilesthomas.com)
10 points by ibobev 86 days ago | past
LLM from scratch, part 29 – using DDP to train a base model in the cloud (gilesthomas.com)
2 points by gpjt 86 days ago | past
LLM from scratch, part 28 – training a base model from scratch on an RTX 3090 (gilesthomas.com)
540 points by gpjt 4 months ago | past | 121 comments
Why smart instruction-following makes prompt injection easier (gilesthomas.com)
2 points by ibobev 4 months ago | past
Writing an LLM from scratch, part 27 – what's left, and what's next? (gilesthomas.com)
1 point by gpjt 5 months ago | past
Writing an LLM from scratch, part 26 – evaluating the fine-tuned model (gilesthomas.com)
4 points by gpjt 5 months ago | past
Writing an LLM from scratch, part 25 – instruction fine-tuning (gilesthomas.com)
2 points by gpjt 5 months ago | past
Writing an LLM from scratch, part 24 – the transcript hack (gilesthomas.com)
1 point by gpjt 5 months ago | past
