The book assumes limited background (similar to what is required for Pattern Recognition, I would say) and gives good intuition on foundational principles of machine learning (the bias/variance tradeoff) before delving into more recent research problems. Part I is great if you simply want to know what the core tenets of learning theory are!
Ironic, since the relatively recently discovered double descent phenomenon makes it clear that the bias-variance tradeoff as we know it from statistical learning theory simply doesn't apply to "overparameterized" deep models.
Much of the old theory is barely applicable, and people are, understandably, bewildered and in denial.
If someone were inclined toward theory, I'd just recommend reading papers that don't try to oversimplify the domain:
I don't believe it's oversimplifying the domain. In fact, the reference I pointed to has a section dedicated to double descent (sec 11.2). You may also be surprised that this phenomenon can be observed on toy convex examples from the "old theory" (sec 11.2.3), as you call it.
Anyways, I still believe that learning foundational stuff such as the bias-variance tradeoff is useful before diving into more advanced material. I even think that tackling recent research questions with old tools is insightful too. But that's only my opinion, and perhaps I'm in denial :)