mantapoint's comments | Hacker News

If you're more inclined to theory, I would suggest "Learning Theory from First Principles" by F. Bach: https://www.di.ens.fr/~fbach/ltfp_book.pdf

The book assumes limited background (similar to what Pattern Recognition requires, I would say) and builds good intuition for foundational principles of machine learning (e.g. the bias/variance tradeoff) before delving into more recent research problems. Part I is great if you simply want to know what the core tenets of learning theory are!
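To make the tradeoff concrete, here is a minimal numerical sketch (my own toy setup, not from the book): polynomial degree serves as the capacity knob, and bias² and variance are estimated by refitting on many resampled training sets.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    return np.sin(2 * np.pi * x)

x_test = np.linspace(0, 1, 50)
n_trials, n_train, noise = 200, 20, 0.3

results = {}
for degree in (1, 3, 12):
    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        x = rng.uniform(0, 1, n_train)
        y = true_f(x) + rng.normal(0, noise, n_train)
        coefs = np.polyfit(x, y, degree)
        preds[t] = np.polyval(coefs, x_test)
    # bias^2: squared gap between the average fit and the true function
    bias2 = np.mean((preds.mean(axis=0) - true_f(x_test)) ** 2)
    # variance: spread of the fits across resampled training sets
    variance = preds.var(axis=0).mean()
    results[degree] = (bias2, variance)
    print(f"degree={degree:2d}  bias^2={bias2:.3f}  variance={variance:.3f}")
```

The low-degree fit is rigid (high bias, low variance) while the high-degree fit chases noise (low bias, high variance), which is the classical picture the book develops.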


Ironic, since the relatively recently discovered double descent phenomenon makes it clear that the bias-variance tradeoff, as we know it from statistical learning theory, simply doesn't apply to "overparameterized" deep models.
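For the curious, the phenomenon is easy to reproduce in a toy setting. The sketch below (my own construction, not taken from the linked papers) uses random ReLU features with minimum-norm least squares; the test error typically peaks near the interpolation threshold p ≈ n and drops again in the overparameterized regime.

```python
import numpy as np

rng = np.random.default_rng(0)

n_train, n_test, d, noise = 40, 200, 5, 0.1
w_true = rng.normal(size=d)

def make_data(n):
    X = rng.normal(size=(n, d))
    return X, X @ w_true + noise * rng.normal(size=n)

def relu_features(X, W):
    return np.maximum(X @ W, 0.0)

mses = {}
for p in (10, 40, 400):  # under-, critically-, and over-parameterized
    trial_mse = []
    for _ in range(20):
        W = rng.normal(size=(d, p)) / np.sqrt(d)
        Xtr, ytr = make_data(n_train)
        Xte, yte = make_data(n_test)
        Ftr, Fte = relu_features(Xtr, W), relu_features(Xte, W)
        # lstsq returns the minimum-norm solution when p > n_train
        beta, *_ = np.linalg.lstsq(Ftr, ytr, rcond=None)
        trial_mse.append(np.mean((Fte @ beta - yte) ** 2))
    mses[p] = np.mean(trial_mse)
    print(f"p={p:4d}  test MSE={mses[p]:.3f}")
```

Exact numbers depend on the seed, but the spike at p = n_train is the "second descent" boundary: beyond it, the min-norm solution gets smoother again and test error falls.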

Much of the old theory is barely applicable, and people are, understandably, bewildered and in denial.

If someone were inclined to theory, I'd just recommend reading papers that don't try to oversimplify the domain:

https://arxiv.org/abs/2006.15191

https://arxiv.org/abs/2210.10749

https://arxiv.org/abs/2205.10343

https://arxiv.org/abs/2105.04026


I don't believe it's oversimplifying the domain. In fact, the reference I pointed to has a section dedicated to double descent (sec. 11.2). You may also be surprised that such a phenomenon can be observed in toy convex examples from the "old theory" (sec. 11.2.3), as you call it.

Anyway, I still believe that learning foundational material such as the bias-variance tradeoff is useful before diving into more advanced topics. I even think that tackling recent research questions with old tools is insightful too. But that's only my opinion, and perhaps I'm in denial :)

