# Methods for Testing Multiple Hypotheses

One feature of hypothesis testing that is perhaps too often ignored is that one seldom tests just one hypothesis. Even in the most classical situations, one almost always makes multiple tests. In more modern investigations, such as those that are common in machine learning, bioinformatics, or even the econometrics of financial time series, one might test several thousand hypotheses.

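Testing on this scale calls for some rethinking of the traditional p = 0.05 standard: each true null hypothesis tested at that level has a 5% chance of being falsely rejected, so at least one false rejection becomes very likely as the number of tests grows.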
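A rough calculation shows the scale of the problem. The sketch below assumes that the tests are independent and that every null hypothesis is true, which is a simplification. The function name `familywise_error_rate` is hypothetical and introduced only for illustration.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false rejection among m independent
    tests of true null hypotheses, each run at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    # Each test run at the usual 0.05 level.
    for m in (1, 10, 100, 1000):
        print(m, round(familywise_error_rate(0.05, m), 4))

    # Bonferroni-style correction: run each of the m tests at alpha / m.
    m = 100
    print(round(familywise_error_rate(0.05 / m, m), 4))
```

Under these assumptions, ten tests already carry roughly a 40% chance of at least one false rejection, while testing each hypothesis at level alpha / m keeps that chance below alpha.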

In fact, a large (and rambling) literature addresses this important issue. As our course continues, I will introduce what seem to me to be some of the main messages that emerge.

A great place to start is with Holm (1979).
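For orientation, here is a minimal sketch of the sequentially rejective (step-down) procedure usually associated with Holm (1979), as it is commonly stated: sort the p-values, compare the i-th smallest with alpha / (m - i + 1), and stop at the first one that fails. The function name `holm_reject` and the example p-values are assumptions made for illustration, and the paper itself should be consulted for the exact statement and conditions.

```python
from typing import List, Sequence


def holm_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return one reject/retain decision per hypothesis, in input order."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):  # rank = 0, 1, ..., m - 1
        # The (rank + 1)-th smallest p-value is compared with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: the first failure stops all further rejections
    return reject


if __name__ == "__main__":
    p = [0.013, 0.001, 0.30, 0.02]  # made-up p-values
    print(holm_reject(p))
    # Plain Bonferroni compares every p-value with alpha / m.
    print([pv <= 0.05 / len(p) for pv in p])
```

With these made-up p-values the step-down rule rejects three hypotheses, whereas plain Bonferroni rejects only one.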

It would be nice to have a simple survey or textbook presentation that gives one the current "big picture."

The best that I have found on the web is the tutorial by Lee and Whitmore. Unfortunately, these slides don't come with a sound track, so you will need to fill in a few blanks. Let me know if you find a more suitable introduction.

## Related Topics: FDR and Data Snooping

The false discovery rate (FDR) is a notion that is closely tied to the methodology for testing multiple hypotheses. Eventually I expect to develop a page that specifically addresses FDR. Another related topic is "data snooping," an issue that is an eternal part of the conversation when one considers trading strategies. Of the papers noted below, only Romano and Wolf (2005) explicitly address econometric concerns, but I am sure that there are further relevant resources.
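To give a flavor of how FDR control differs from the procedures in the papers listed on this page, here is a minimal sketch of the Benjamini-Hochberg step-up rule, which is the standard textbook procedure for FDR and is not one of the papers cited here. The function name `benjamini_hochberg_reject`, the target level `q`, and the example p-values are assumptions made for illustration.

```python
from typing import List, Sequence


def benjamini_hochberg_reject(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Return one reject/retain decision per hypothesis, in input order."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) whose p-value is at most k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Step-up: reject every hypothesis ranked at or below k_max,
    # even if some of them missed their own threshold.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


if __name__ == "__main__":
    p = [0.001, 0.025, 0.028, 0.035, 0.6]  # made-up p-values
    print(benjamini_hochberg_reject(p))
```

In this example the second-smallest p-value misses its own threshold but is still rejected, because a larger p-value further down the list meets its threshold. That step-up behavior is what separates this rule from a step-down procedure.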

## Original Articles

Hommel (1988) seems like a sensible place to start if you get interested in these issues. Wright (1992) is also well written and gives a slightly shifted perspective. Marcus, Peritz, and Gabriel (1976), which introduces the notion of a "closed set" of hypotheses, is warmly recommended by Bob Stine.

Hochberg, Y. (1988) A Sharper Bonferroni Procedure for Multiple Tests of Significance, *Biometrika*, 75, 800--802.

**Holm, S. (1979) A Simple Sequentially Rejective Multiple Test Procedure,** *Scandinavian Journal of Statistics*, 6, 65--70.

Holm, S. (1999) Multiple Confidence Sets Based on Stagewise Tests, *J. Amer. Statist. Assoc.*, 94, 489--495.

Hommel, G. (1988) A Stagewise Rejective Multiple Test Procedure Based on a Modified Bonferroni Test, *Biometrika*, 75(2), 383--386.

Marcus, R., Peritz, E. and Gabriel, K. R. (1976) On Closed Testing Procedures with Special Reference to Ordered Analysis of Variance, *Biometrika*, 63(3), 655--660.

Romano, J. and Wolf, M. (2005) Stepwise Multiple Testing as Formalized Data Snooping, Technical Report, Department of Statistics, Stanford University.

Simes, R. J. (1986) An Improved Bonferroni Procedure for Multiple Tests of Significance, *Biometrika*, 73, 751--754.

Troendle, J. F. (1995) A Stepwise Resampling Method of Multiple Hypothesis Testing, *J. Amer. Statist. Assoc.*, 90(492), 370--378.

Wright, S. P. (1992) Adjusted P-Values for Simultaneous Inference, *Biometrics*, 48, 1005--1013.