ENSAE Paris - Engineering school for economics, data science, finance, and actuarial science

Introduction to Applied Statistical Learning - CI/MS

Objective

This course is a comprehensive introduction to machine learning methods. It introduces the typical problems of data description and modeling, with the goal of better predicting the response for a new observation. We will describe the algorithms and quantify how well they perform and, in parallel, through R-based practical sessions, see how to use these methods in practice.

At the end of this course, students should be able to:

  • Implement classification or regression methods
  • Understand the theory behind the methods presented
  • Read and interpret the numerical outputs of these methods
     

Course outline

Introduction.

  • Difference between estimation (statistics) and prediction (machine learning); definitions of loss functions, risk, and empirical risk.
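
As a reminder of the standard definitions (the notation below is an assumption and is not taken from the course material): for a predictor f, a loss function ℓ, and a sample (X_1, Y_1), ..., (X_n, Y_n) drawn i.i.d. from the same distribution as (X, Y), the risk and the empirical risk are usually written as follows.

```latex
R(f) = \mathbb{E}\big[\ell\big(f(X), Y\big)\big],
\qquad
\widehat{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(X_i), Y_i\big).
```

The risk depends on the unknown distribution of (X, Y), while the empirical risk can be computed from the data.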

Classification algorithms.

  • Methods from statistics, linear discrimination. Nearest neighbor method and other universally consistent methods. Decision trees and random forests.

Regression algorithms.

  • Least squares method. Penalization methods: ridge, lasso, and elastic net estimators.
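
For orientation, the three penalized estimators are commonly defined as minimizers of a penalized least squares criterion. The notation is an assumption made here for illustration: y is the response vector in R^n, X the n × p design matrix, λ ≥ 0 a tuning parameter, and α in [0, 1] the elastic net mixing parameter. Normalizing constants (such as 1/n or 1/2) vary between textbooks and software packages.

```latex
\begin{align*}
\hat\beta^{\mathrm{ridge}} &\in \arg\min_{\beta \in \mathbb{R}^p}
  \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2, \\
\hat\beta^{\mathrm{lasso}} &\in \arg\min_{\beta \in \mathbb{R}^p}
  \|y - X\beta\|_2^2 + \lambda \|\beta\|_1, \\
\hat\beta^{\mathrm{EN}} &\in \arg\min_{\beta \in \mathbb{R}^p}
  \|y - X\beta\|_2^2 + \lambda \big( \alpha \|\beta\|_1 + (1 - \alpha) \|\beta\|_2^2 \big).
\end{align*}
```

With α = 1 the elastic net criterion reduces to the lasso, and with α = 0 to ridge.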

Selection of estimators.

  • Empirical risk minimization methods. Training and test data. Cross-validation.
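
To make the idea of cross-validation concrete, here is a minimal sketch of K-fold cross-validation used to choose between two estimators under squared-error loss. It is written in plain Python for illustration only (the course sessions themselves use R), and every name in it (`k_fold_indices`, `fit_mean`, `fit_line`, `cv_risk`) as well as the synthetic data are hypothetical, not part of the course material.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_mean(xs, ys):
    """Constant predictor: always returns the training mean of y."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """One-dimensional least squares fit of y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def cv_risk(fit, xs, ys, k=5, seed=0):
    """Average, over the k folds, of the empirical squared-error risk
    measured on the held-out fold after fitting on the remaining folds."""
    folds = k_fold_indices(len(xs), k, seed)
    fold_risks = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predictor = fit(train_x, train_y)
        errors = [(predictor(xs[i]) - ys[i]) ** 2 for i in test_idx]
        fold_risks.append(sum(errors) / len(errors))
    return sum(fold_risks) / k

# Synthetic data (an assumption for this sketch): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

# Select the estimator with the smaller cross-validated risk.
risk_mean = cv_risk(fit_mean, xs, ys)
risk_line = cv_risk(fit_line, xs, ys)
best = "line" if risk_line < risk_mean else "mean"
```

Both estimators are evaluated on the same folds (same seed), so the comparison is not affected by differences in the random split.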

References

  • C. Bishop. Pattern Recognition and Machine Learning. Springer, 2006. This is an excellent introduction to machine learning. Contains lots of exercises, some with sample solutions.
  • R. Duda, P. Hart, and D. Stork. Pattern Classification. John Wiley & Sons, second edition, 2001. The classic introduction to the field. 
  • I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016.
  • M. Mohri, A. Rostamizadeh, and A. Talwalkar. Foundations of Machine Learning. MIT Press, 2018.
  • L. Wasserman. All of Statistics: A Concise Course in Statistical Inference. Springer, 2004. This book is a compact treatment of statistics that facilitates a deeper understanding of machine learning methods. 
  • K. Murphy. Machine Learning: A Probabilistic Perspective. MIT Press, 2012. Unified probabilistic introduction to machine learning.
  • S. Shalev-Shwartz and S. Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014. This recent book covers the mathematical foundations of machine learning. Available online for personal use.