Hacker News | NicoConstant's submissions
1. Real-time LLM Inference on Standard GPUs: 3k tokens/s per request (kog.ai)
219 points by NicoConstant 15 days ago | 97 comments
2. Kog AI – Building a Real-Time Inference Stack on AMD Instinct GPUs [video] (youtube.com)
8 points by NicoConstant 29 days ago
