ENSAE Paris - Engineering school for economics, data science, finance and actuarial science

High-dimensional statistics

Objective

This course develops tools for analyzing statistical problems in high-dimensional settings,
where the number of variables may exceed the sample size. This contrasts with classical
statistical theory, which studies the asymptotic behavior of estimators as the sample size
grows while the number of variables stays fixed. We will show that, in high-dimensional
problems, powerful statistical methods can be constructed under structural assumptions
such as sparsity or low rank. The emphasis will be on the non-asymptotic theory underlying
these developments.

The grade is determined by a final exam. Extra points can be earned through
optional homework.

Outline

  • Sparsity and thresholding in the Gaussian sequence model.
  • High-dimensional linear regression: Lasso, BIC, Dantzig selector, Square Root Lasso. Oracle inequalities and variable selection properties.
  • Estimation of high-dimensional low-rank matrices. Recommendation systems.
  • Inhomogeneous random graph model. Community detection and estimation in the stochastic block model.
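
As a rough illustration of the first topic, the sketch below simulates a sparse Gaussian sequence model, y_j = theta_j + sigma * xi_j with i.i.d. standard Gaussian noise xi_j, and compares the naive estimator (the observations themselves) with hard thresholding. It is not taken from the lecture notes: the dimension, sparsity level, signal strength, and the threshold sigma * sqrt(2 log d) are assumptions chosen for the example, and the function names are hypothetical.

```python
import math
import random


def hard_threshold(y, tau):
    """Keep coordinates whose magnitude exceeds tau; set the rest to zero."""
    return [v if abs(v) > tau else 0.0 for v in y]


def squared_error(estimate, truth):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(estimate, truth))


def simulate(d=10_000, s=20, signal=10.0, sigma=1.0, seed=0):
    """Draw one sparse Gaussian sequence and compare two estimators.

    All default values are illustrative assumptions, not course material.
    """
    rng = random.Random(seed)
    # s-sparse mean vector: the first s coordinates carry the signal.
    theta = [signal] * s + [0.0] * (d - s)
    # Observations y_j = theta_j + sigma * xi_j with xi_j i.i.d. N(0, 1).
    y = [t + sigma * rng.gauss(0.0, 1.0) for t in theta]
    # Threshold of order sigma * sqrt(2 log d), assumed here for illustration.
    tau = sigma * math.sqrt(2.0 * math.log(d))
    theta_hat = hard_threshold(y, tau)
    # Return the squared errors of the naive and thresholded estimators.
    return squared_error(y, theta), squared_error(theta_hat, theta)


naive_err, thresholded_err = simulate()
print(f"naive: {naive_err:.1f}, thresholded: {thresholded_err:.1f}")
```

With these settings the naive squared error is of the order of d * sigma^2, while the thresholded error is driven mainly by the s nonzero coordinates. This is the kind of gain from sparsity that the course analyzes non-asymptotically.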

References

Alexandre Tsybakov. High-dimensional Statistics. Lecture Notes. (Detailed Lecture Notes are available.)