One feature of hypothesis testing that is perhaps too often ignored is that one seldom tests just one hypothesis. Even in the most classical situations, one almost always makes multiple tests. In more modern investigations, such as those that are common in machine learning, bioinformatics, or even the econometrics of financial time series, one might test several thousand hypotheses. Obviously such activities call for some rethinking of the traditional p=0.05 standards.
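To get a feel for why the usual standard breaks down, here is a small sketch of the arithmetic. It assumes, purely for illustration, that all of the null hypotheses are true and that the tests are independent; the function name `familywise_error_rate` is my own invention, not anything standard.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false rejection when every null is true.

    Assumes the tests are independent and each is run at level alpha
    (an idealization; real tests are usually dependent).
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Show how quickly the chance of a spurious "discovery" grows.
    for m in (1, 10, 100, 1000):
        print(m, round(familywise_error_rate(m), 4))
```

Under these assumptions, even ten tests at the 0.05 level give roughly a 40% chance of at least one false rejection.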
In fact, a large (and rambling) literature addresses this important issue. As our course continues, I will introduce what seem to me to be some of the main messages that emerge. Here are some of the resources that I may mention. A natural place to start is with Holm (1979), but, unfortunately, that article is one for which I have no link.
It would be nice to have a simple survey or textbook presentation to get one started. The best that I have found on the web is the tutorial by Lee and Whitmore. Unfortunately, these slides don't come with a sound track, so you will need to fill in a few blanks. Let me know if you find a more suitable introduction.
The false discovery rate (FDR) is a notion that is closely tied up with the methodology for testing multiple hypotheses. Eventually I expect to develop a page that specifically addresses FDR. Another related topic is "data snooping," an issue that is an eternal part of the conversation when one considers trading strategies. Of the papers noted below, only Romano and Wolf (2005) explicitly address econometric concerns, but I am sure that there are further relevant resources.
Hommel (1988) seems like a sensible place to start if you get interested in these issues. Wright (1992) is also well written and gives a slightly shifted perspective. Marcus, Peritz, and Gabriel (1976), which introduces the notion of a "closed set" of hypotheses, is warmly recommended by Bob Stine.
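For those who would like something concrete before reading the papers, here is a minimal sketch of a step-down procedure of the kind usually attributed to Holm (1979): sort the p-values, compare the smallest to alpha/m, the next to alpha/(m-1), and so on, stopping at the first failure. The function name `holm_reject` and the sample p-values are mine and purely illustrative; check the details against the article itself.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Sketch of a sequentially rejective (step-down) rule: the k-th
    smallest p-value (k = 0, 1, ...) is compared with alpha / (m - k),
    and testing stops at the first p-value that fails its threshold.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every remaining (larger) p-value is retained
    return reject


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    print(holm_reject([0.01, 0.04, 0.03, 0.005]))
```

With these made-up p-values, the two smallest are rejected and the procedure stops at 0.03, which exceeds its threshold of 0.05/2.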
Holm, S. (1979) A Simple Sequentially Rejective Multiple Test Procedure, Scandinavian Journal of Statistics, 6, 65--70.
Wright, S. P. (1992) Adjusted P-Values for Simultaneous Inference, Biometrics, 48, 1005--1013.