The book assumes limited background (similar to what is required for Pattern Recognition, I would say) and gives good intuition on foundational principles of machine learning (the bias/variance tradeoff) before delving into more recent research problems. Part I is great if you simply want to know what the core tenets of learning theory are!
Ironic, since the relatively recently discovered double descent phenomenon makes it clear that the bias-variance tradeoff as we know it from statistical learning theory simply doesn't apply to "overparameterized" deep models.
Much of the old theory is barely applicable, and people are, understandably, bewildered and in denial.
If someone were inclined toward theory, I'd just recommend reading papers that don't try to oversimplify the domain:
I don't believe it's oversimplifying the domain. In fact, the reference I pointed to has a section dedicated to double descent (sec 11.2). You may also be surprised that this phenomenon can be observed on toy convex examples from the "old theory" (sec 11.2.3), as you call it.
Anyways, I still believe that learning foundational stuff such as the bias-variance tradeoff is useful before diving into more advanced material. I even think that tackling recent research questions with old tools is insightful too. But that's only my opinion, and perhaps I'm in denial :)