Statistics and Its Interface

Volume 11 (2018)

Number 2

Interaction screening by partial correlation

Pages: 317 – 325

DOI: http://dx.doi.org/10.4310/SII.2018.v11.n2.a9

Authors

Yue Selena Niu (Department of Mathematics, University of Arizona, Tucson, Az., U.S.A.)

Ning Hao (Department of Mathematics, University of Arizona, Tucson, Az., U.S.A.)

Hao Helen Zhang (Department of Mathematics, University of Arizona, Tucson, Az., U.S.A.)

Abstract

Interaction effects between predictors can play an important role in improving prediction and model interpretation for regression models. However, it is both statistically and computationally challenging to discover informative interactions for high dimensional data. Variable screening based on marginal information is popular for identifying important predictors, but it is mainly used for main-effect-only models. In this paper, we study interaction screening for high dimensional quadratic regression models. First, we show that the direct generalization of main-effect screening to interaction screening can be incorrect or inefficient, as it overlooks the intrinsic relationship between main effects and interactions. Next, we propose a main-effect-adjusted interaction screening procedure to select interactions while taking into account main effects. This new unified framework can be employed with multiple types of correlation measures, such as Pearson correlation coefficients and nonparametric rank-based measures, including Spearman’s and Kendall’s correlation coefficients. Efficient algorithms are developed for each correlation measure to make the screening procedure scalable to high dimensional data. Finally, we illustrate the performance of the new screening procedure by simulation studies and an application to a retinopathy study.
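The idea of main-effect-adjusted screening can be illustrated with a small sketch. The abstract does not give the exact statistic, so the code below rests on an assumption: each candidate pair (j, k) is scored by the Pearson partial correlation between the response and the product X_j X_k, given the two main effects X_j and X_k. The function names (`pearson`, `partial1`, `interaction_partial_corr`, `screen_interactions`) and the simulated data are hypothetical, and the brute-force loop over all pairs is not the efficient algorithm developed in the paper.

```python
import math
import random
from itertools import combinations


def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def partial1(r_ab, r_ac, r_bc):
    """Partial correlation of a and b given c, from pairwise correlations."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def interaction_partial_corr(y, xj, xk):
    """Partial correlation of y and xj*xk given xj and xk (assumed statistic)."""
    prod = [u * v for u, v in zip(xj, xk)]
    r_yp, r_yj, r_yk = pearson(y, prod), pearson(y, xj), pearson(y, xk)
    r_pj, r_pk, r_jk = pearson(prod, xj), pearson(prod, xk), pearson(xj, xk)
    # First condition every pair on xj ...
    r_yp_j = partial1(r_yp, r_yj, r_pj)
    r_yk_j = partial1(r_yk, r_yj, r_jk)
    r_pk_j = partial1(r_pk, r_pj, r_jk)
    # ... then condition on xk using the standard recursion.
    return partial1(r_yp_j, r_yk_j, r_pk_j)


def screen_interactions(columns, y):
    """Rank all pairs (j, k), j < k, by absolute partial correlation."""
    scores = {
        (j, k): abs(interaction_partial_corr(y, columns[j], columns[k]))
        for j, k in combinations(range(len(columns)), 2)
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked, scores


# Simulated quadratic model (illustrative data only):
# y = x0 + x1 + 2 * x0 * x1 + noise, with 8 independent Gaussian predictors.
rng = random.Random(0)
n, p = 400, 8
columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [
    columns[0][i] + columns[1][i] + 2.0 * columns[0][i] * columns[1][i]
    + 0.5 * rng.gauss(0, 1)
    for i in range(n)
]
ranked, scores = screen_interactions(columns, y)
```

In this simulation the pair (0, 1) carries the only true interaction, so it should appear at the top of `ranked`. A rank-based variant would replace `pearson` with a Spearman or Kendall correlation, which is the kind of substitution the unified framework is described as allowing.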

Keywords

high dimensional data, interaction effects, marginal statistic, quadratic regression, rank correlation, variable screening

2010 Mathematics Subject Classification

Primary 62F07, 62H20. Secondary 62J05.

This research is supported in part by National Science Foundation grants DMS-1309507, DMS-1418172, and DMS-1722691, and by NSFC-11571009. The authors thank the Editor, AE, and reviewers for their helpful comments and suggestions.

Received 9 February 2017

Published 7 March 2018