Statistics and Its Interface

Volume 2 (2009)

Number 4

Regularized (bridge) logistic regression for variable selection based on ROC criterion

Pages: 493 – 502

DOI: http://dx.doi.org/10.4310/SII.2009.v2.n4.a10

Authors

Hong-Bin Fang (Division of Biostatistics, University of Maryland Greenebaum Cancer Center, Baltimore, Maryland, U.S.A.)

Zhenqiu Liu (Division of Biostatistics, University of Maryland Greenebaum Cancer Center, Baltimore, Maryland, U.S.A.)

Ming T. Tan (Division of Biostatistics, University of Maryland Greenebaum Cancer Center, Baltimore, Maryland, U.S.A.)

Guo-Liang Tian (Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong)

Abstract

It is well known that bridge regression (with tuning parameter less than or equal to 1) gives asymptotically unbiased estimates of the nonzero regression parameters while shrinking smaller regression parameters to zero to achieve variable selection. Despite advances over the last several decades in developing such regularized regression models, issues regarding the choice of penalty parameter and the computational methods for model fitting with parameter constraints remain unresolved, even for bridge linear regression. In this article, we first propose a new criterion based on the area under the receiver operating characteristic (ROC) curve (AUC) to choose the appropriate penalty parameter, as opposed to the conventional generalized cross-validation criterion. The model selected by the AUC criterion is shown to have better predictive accuracy while achieving sparsity simultaneously. We then approach the problem as a constrained parameter model and develop a fast minorization-maximization (MM) algorithm for non-linear optimization with positivity constraints for model fitting. This algorithm is further applied to bridge regression for binary responses, where the regression coefficients are constrained by an $\ell_p$-norm with the level of $p$ selected from the data. Examples of prognostic factors and gene selection are presented to illustrate the proposed method.
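As a rough illustration of choosing a penalty parameter by AUC rather than by a cross-validation error criterion, the sketch below may help. It is not the authors' method: it assumes the $p = 1$ (lasso) special case of the bridge penalty, fits by proximal gradient descent instead of the MM algorithm, omits the intercept, and scores each candidate on a single held-out validation split. All function names (`auc`, `fit_l1_logistic`, `select_lambda_by_auc`) are hypothetical.

```python
import numpy as np

def auc(scores, labels):
    """Empirical AUC: fraction of (positive, negative) pairs ranked
    correctly, counting ties as one half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

def fit_l1_logistic(X, y, lam, n_iter=2000):
    """Assumed stand-in fitter: proximal gradient descent on
    mean logistic loss + lam * ||beta||_1 (no intercept)."""
    n, d = X.shape
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the gradient's
    # Lipschitz constant for the mean logistic loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(d)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (prob - y) / n
        z = beta - step * grad
        # Soft-thresholding: proximal operator of the L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

def select_lambda_by_auc(X_tr, y_tr, X_va, y_va, lambdas):
    """Fit one model per candidate penalty and return the
    (validation AUC, lambda, beta) triple with the highest AUC.
    Ties are broken in favor of the larger penalty."""
    results = []
    for lam in lambdas:
        beta = fit_l1_logistic(X_tr, y_tr, lam)
        results.append((auc(X_va @ beta, y_va), lam, beta))
    return max(results, key=lambda r: (r[0], r[1]))
```

Calling `select_lambda_by_auc(X_tr, y_tr, X_va, y_va, [0.001, 0.01, 0.05, 0.1])` returns the best validation AUC, the selected penalty, and the fitted coefficients, whose zero entries indicate the dropped variables. Extending the sketch to $p < 1$ would require a different fitting routine, since the penalty is then non-convex.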

Keywords

MM algorithm, ROC, variable (feature) selection
