This course is an introduction to machine learning from a mathematical point of view. We will start with an introduction to optimization and learning, survey some of the characteristic phenomena of high-dimensional geometry, and then delve into the mathematical structure of neural networks. We shall be interested in what such networks can represent in principle (universality), the geometry of the loss landscape (the proliferation of saddle points; the manifold hypothesis), and certain specific techniques used in modern machine learning (attention, batch normalization, dropout, etc.): what they actually do, from a mathematical perspective, and why they work.
Notes for this class will be posted online here and updated as we go along.