ENSAE Paris - Engineering school for economics, data science, finance and actuarial science

High-dimensional statistics

Objective

This course develops tools for analyzing statistical problems in high-dimensional settings, where the number of variables may exceed the sample size. This contrasts with classical statistical theory, which studies the asymptotic behavior of estimators as the sample size grows while the number of variables stays fixed. We will show that, in high-dimensional problems, powerful statistical methods can be constructed under structural assumptions such as sparsity or low rank. The emphasis is on the non-asymptotic theory underlying these developments.

The grade is determined by a final exam. 

Course outline

•    Sparsity and thresholding in the Gaussian sequence model.
•    High-dimensional linear regression: Lasso, BIC, Dantzig selector, Square Root Lasso. Oracle inequalities and variable selection properties.
•    Estimation of high-dimensional low-rank matrices. Recommendation systems.
•    Inhomogeneous random graph model. Community detection and estimation in the stochastic block model.
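
As an illustration of the first topic above, here is a minimal sketch of hard and soft thresholding in a Gaussian sequence model, where each observation is a mean plus independent Gaussian noise and only a few means are nonzero. It is not taken from the lecture notes: the function names, the dimension d, the sparsity s, the signal level, the noise level sigma, and the choice of the threshold sigma * sqrt(2 log d) are all assumptions made for this example.

```python
import math
import random


def hard_threshold(y, tau):
    """Keep coordinates whose magnitude exceeds tau; set the rest to zero."""
    return [v if abs(v) > tau else 0.0 for v in y]


def soft_threshold(y, tau):
    """Shrink every coordinate toward zero by tau; small ones become zero."""
    return [math.copysign(max(abs(v) - tau, 0.0), v) for v in y]


def squared_error(estimate, theta):
    """Squared Euclidean distance between an estimate and the true vector."""
    return sum((a - b) ** 2 for a, b in zip(estimate, theta))


def simulate(d=1000, s=10, signal=5.0, sigma=1.0, seed=0):
    """Draw one sparse sequence-model sample and compare three estimators.

    All default values are illustrative assumptions, not course material.
    """
    rng = random.Random(seed)
    # Sparse mean vector: s nonzero entries, d - s zeros.
    theta = [signal] * s + [0.0] * (d - s)
    # Observations: y_j = theta_j + sigma * xi_j with xi_j standard normal.
    y = [t + sigma * rng.gauss(0.0, 1.0) for t in theta]
    # Threshold of order sigma * sqrt(2 log d), assumed here for illustration.
    tau = sigma * math.sqrt(2.0 * math.log(d))
    return {
        "raw": squared_error(y, theta),
        "hard": squared_error(hard_threshold(y, tau), theta),
        "soft": squared_error(soft_threshold(y, tau), theta),
    }


if __name__ == "__main__":
    for name, err in simulate().items():
        print(f"{name:>4}: squared error = {err:.1f}")
```

Running the script prints the squared error of the raw observations next to that of the two thresholding estimators. With a sparse mean vector, thresholding removes most of the noise accumulated on the zero coordinates, which is the kind of gain that sparsity assumptions are meant to capture.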
 

References

Alexandre Tsybakov. High-dimensional Statistics. Lecture Notes. (Detailed Lecture Notes are available.)