Two-Sample Covariance Matrix Testing and Support Recovery in High-Dimensional and Sparse Settings
Tony Cai, Weidong Liu, and Yin Xia
Abstract:
This paper considers three inter-related problems in the high-dimensional setting: (a) testing the equality of two covariance matrices Σ_{1} and Σ_{2}; (b) recovering the support of Σ_{1} - Σ_{2}; and (c) testing the equality of Σ_{1} and Σ_{2} row by row. We propose a new test of the hypothesis H_{0}: Σ_{1} = Σ_{2} and investigate its theoretical and numerical properties. The limiting null distribution of the test statistic is derived and the power of the test is studied. The test is shown to enjoy certain optimality and to be especially powerful against sparse alternatives. The simulation results show that the test significantly outperforms the existing methods both in terms of size and power. Analysis of a prostate cancer dataset is carried out to demonstrate the application of the testing procedures.
When the null hypothesis of equal covariance matrices is rejected, it is often of significant interest to further investigate how they differ from each other. Motivated by applications in genomics, we also consider recovering the support of Σ_{1} - Σ_{2} and testing the equality of the two covariance matrices row by row. New procedures are introduced and their properties are studied. Applications to gene selection are also discussed.