T. Tony Cai

Lectures on Ultra High Dimensional Regression
Tony Cai
Presented at Department of Biostatistics, Harvard University, April 2010

Abstract: The analysis of high-dimensional data, which now arise routinely in scientific investigations, poses many statistical challenges not present in smaller-scale studies. In these lectures I will discuss high-dimensional linear regression in which the number of variables p is large and the number of observations n is small. This problem has attracted much recent interest in a number of fields, including applied mathematics, electrical engineering, and statistics.

To provide a proper background and foundation for the main topics, we shall begin with a discussion of the high-dimensional Gaussian sequence model. We then consider the linear model y = Xβ + z, where the dimension of the signal β is much larger than the number of observations. It is now well understood that l₁ minimization methods are effective for high-dimensional sparse regression. I will present an elementary and unified analysis of l₁ minimization methods, including the Lasso and the Dantzig Selector, in three settings: noiseless, bounded error, and Gaussian noise. Time permitting, I will also discuss l₁ minimization approaches to sparse precision matrix estimation.
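As a rough illustration of the setting described above, the following Python sketch simulates a sparse linear model y = Xβ + z with p much larger than n and computes a Lasso estimate. It is not taken from the lectures: the solver (proximal gradient, often called ISTA), the function names `soft_threshold` and `lasso_ista`, the problem sizes, the noise level `sigma`, and the tuning parameter `lam` are all assumptions chosen for illustration only.

```python
import numpy as np

def soft_threshold(v, t):
    """Componentwise soft thresholding: shrink each entry of v toward 0 by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(X, y, b, lam):
    """Lasso objective 0.5 * ||y - X b||_2^2 + lam * ||b||_1."""
    return 0.5 * np.sum((y - X @ b) ** 2) + lam * np.sum(np.abs(b))

def lasso_ista(X, y, lam, n_iter=2000):
    """Approximately minimize the Lasso objective by proximal gradient (ISTA)."""
    p = X.shape[1]
    # Step size 1/L, where L is the squared largest singular value of X.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)  # gradient of the squared-error term
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Simulated example (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n, p, k, sigma = 50, 200, 5, 0.1               # n observations, p variables, k nonzeros
X = rng.standard_normal((n, p)) / np.sqrt(n)   # columns have roughly unit norm
beta = np.zeros(p)
beta[:k] = 3.0                                 # sparse signal
y = X @ beta + sigma * rng.standard_normal(n)  # Gaussian noise setting

lam = 2 * sigma * np.sqrt(2 * np.log(p))       # one common scaling; an assumption here
beta_hat = lasso_ista(X, y, lam)
print("estimation error:", np.linalg.norm(beta_hat - beta))
print("nonzero estimates:", np.count_nonzero(beta_hat))
```

Setting `sigma = 0` gives a noiseless version of the same experiment. The Dantzig Selector would replace the penalized objective with a linear program and is not shown here.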

References:

Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous analysis of Lasso and Dantzig Selector. The Annals of Statistics 37, 1705-1732.
Cai, T., Liu, W. and Luo, X. (2011). A constrained l₁ minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association 106, 594-607.
Cai, T., Wang, L. and Xu, G. (2010). Shifting inequality and recovery of sparse signals. IEEE Transactions on Signal Processing 58, 1300-1308.
Cai, T., Wang, L. and Xu, G. (2010). Stable recovery of sparse signals and an oracle inequality. IEEE Transactions on Information Theory 56, 3516-3522.
Cai, T., Zhang, C.-H. and Zhou, H. (2010). Optimal rates of convergence for covariance matrix estimation. The Annals of Statistics 38, 2118-2144.
Candès, E. J. and Tao, T. (2007). The Dantzig Selector: Statistical estimation when p is much larger than n (with discussion). The Annals of Statistics 35, 2313-2351.
