STAT991: Regularization Methods in Learning
Spring 2009
Time & Location: MW 1:30–3:00, F88 JMHH
Course Description
In this course, we survey regularization (penalization) methods in
machine learning and statistics. As we go back and forth between
statistical estimation and optimization, we address both statistical
convergence properties and computational issues that arise from
minimizing a regularized objective. We closely study one apparent
connection between statistics and optimization through the lens of
minimax duality. One of the goals of the course is to gain some
understanding of the tradeoff between computational and statistical
costs, a largely unexplored area.
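As a toy illustration of the kind of regularized objective the course studies, here is a minimal sketch of Tikhonov (ridge) regularization solved in closed form; the data, penalty level, and function name are made up for illustration and are not part of the course materials:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Minimize ||X w - y||^2 + lam * ||w||^2; closed-form normal equations."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy data: the response depends only on the first feature, plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=50)

w = ridge_fit(X, y, lam=1.0)  # coefficients shrunk toward zero by the penalty
```

Increasing `lam` trades statistical bias for variance reduction and also changes the conditioning of the linear system being solved, a small instance of the statistics-versus-computation tradeoff mentioned above.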
We will touch upon the following topics: online convex optimization,
Fenchel duality, Tikhonov regularization, SVMs for classification and
regression, limited feedback (bandit) problems, random perturbation
and random projection methods, aggregation methods, multitask learning
and matrix regularization, Lasso and L1-penalization methods, model
selection and sieves, Rademacher complexity, regularization via early
stopping, information-based complexity of convex programming, and
more. We will aim to develop a general framework and tools for many of
the above methods. Open questions and potential topics for research
will be given in most lectures.
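Several of the topics above (entropy regularization, exponential weights, regret minimization) come together in one short algorithm. The following is an illustrative sketch of the exponential weights (Hedge) method, assuming losses in [0, 1]; the interface is made up for this example:

```python
import numpy as np

def exponential_weights(losses, eta):
    """Hedge: play weights proportional to exp(-eta * cumulative loss).

    losses: (T, K) array, loss of each of K experts in each of T rounds.
    Returns the algorithm's total expected loss and the best expert's loss.
    """
    T, K = losses.shape
    cum = np.zeros(K)          # cumulative loss of each expert
    total = 0.0
    for t in range(T):
        w = np.exp(-eta * cum)
        p = w / w.sum()        # entropy-regularized distribution over experts
        total += p @ losses[t]
        cum += losses[t]
    return total, cum.min()

# Deterministic toy run: expert 0 is perfect, expert 1 always loses.
losses = np.column_stack([np.zeros(20), np.ones(20)])
total, best = exponential_weights(losses, eta=1.0)
```

Standard regret analysis (covered in the Feb 16 lecture) bounds `total - best` by roughly log(K)/eta + eta*T/8 for losses in [0, 1]; on the toy run above the regret is in fact bounded by a constant.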
Tentative Schedule (this will change with probability 1)
 Jan 14. Intro to regularization/penalization methods.
 Jan 21. Basics of convex optimization.
 Jan 26. Basics of convex analysis. von Neumann Minimax Theorem, Lagrange dual.
 Jan 28. Sequential optimization and Legendre–Fenchel duality. Application to linear models.
 Feb 02. From sequential optimization to regret minimization. Gradient descent methods. Mirror descent.
 Feb 04. Repeated games. Regret minimization. Connection to stochastic minimization.
 Feb 16. Regret analysis. Entropy regularization, exponential weights.
 Feb 18. Relation between regret-minimization algorithms. Strong convexity and Be-the-Leader analysis.
 Feb 23. Curved losses, exp-concavity, fast rates. Applications to aggregation methods.
 Feb 25. Aggregation methods.
 Mar 02. Aggregation, averaging, and von Neumann's Minimax Theorem.
 Mar 16. Dual of the sequential optimization problem.
 Mar 18. Regret as Bregman divergences on distributions.
 Mar 23. Minimum expected risk functional and its geometry.
 Mar 25. Curvature and fast rates for minimax regret.
 Mar 30. Rademacher complexity: applications to uniform deviations (classical) and minimax regret (novel).
 Apr 01. Lower bounds on supremum of empirical process. Application to minimax regret.
 Apr 06. Examples: linear primaldual ball game, experts. Upper bounds on Rademacher complexity for linear games.
 Apr 08. Stochastic Optimization.
 Apr 13. Multi-armed bandits and optimization with limited feedback.
 Apr 15. Interior point methods; self-concordant functions and applications to bandit optimization.
 Apr 20. Stability of regularization methods; relation to generalization and consistency.
 Apr 22. Online algorithms in Reproducing Kernel Hilbert Spaces.
 Apr 27. Model selection via regularization. Fast rates.
 Apr 29.
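The gradient descent methods in the Feb 02 lecture can be sketched in a few lines. Below is an illustrative online (projected) gradient descent on the Euclidean unit ball; the quadratic toy losses and step size are assumptions chosen for the example, not course material:

```python
import numpy as np

def ogd(grad, x0, eta, T):
    """Online gradient descent with projection onto the L2 unit ball.

    grad: function (t, x) -> gradient of the round-t loss at x.
    Returns the array of iterates x_0, ..., x_{T-1}.
    """
    x = np.asarray(x0, dtype=float)
    iterates = []
    for t in range(T):
        iterates.append(x.copy())
        x = x - eta * grad(t, x)
        norm = np.linalg.norm(x)
        if norm > 1.0:          # project back onto the unit ball
            x = x / norm
    return np.array(iterates)

# Toy run: every round plays f_t(x) = ||x - z||^2 with a fixed target z
# inside the ball, so the iterates (and their average) approach z.
z = np.array([0.3, -0.4])
xs = ogd(lambda t, x: 2 * (x - z), x0=np.zeros(2), eta=0.1, T=500)
```

The same template with the gradient step replaced by a mirror (Bregman) step gives the mirror descent algorithm from the same lecture.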
Suggested Articles (this list will be greatly expanded)
Online Learning
Optimization
Interaction of optimization and estimation errors
Sparsity
Early Stopping
Bandit Problems
Aggregation
Methods in Statistics
Model Selection
Useful Books

Fundamentals of Convex Analysis.
Hiriart-Urruty and Lemaréchal

Prediction, Learning, and Games.
N. Cesa-Bianchi and G. Lugosi