Statistics and Its Interface

Volume 14 (2021)

Number 2

Grouped variable selection with prior information via the prior group bridge method

Pages: 211 – 227

DOI: https://dx.doi.org/10.4310/20-SII629

Authors

Kai Li (Karyopharm Therapeutics Inc., Newton, Massachusetts, U.S.A.)

Meng Mei (Department of Statistics, Oregon State University, Corvallis, Or., U.S.A.)

Yuan Jiang (Department of Statistics, Oregon State University, Corvallis, Or., U.S.A.)

Abstract

In a multiple regression with grouped predictors, it is often desirable to select important groups and, simultaneously, important variables within each group. To achieve this so-called “bi-level selection,” group bridge has been developed as a combination of group-level bridge and variable-level lasso penalties. However, in many scientific areas, prior knowledge is available about the importance of certain groups of predictors, which calls for methods that incorporate such valuable information. For a prior-informative group, we propose a new penalty called “group ridge” as a combination of group-level ridge and variable-level lasso penalties, which always preserves this group while selecting important variables within it. Then, we propose a composite group penalization named “prior group bridge” by applying group ridge to prior-informative groups and group bridge to groups with no prior information. We prove that prior group bridge achieves estimation and group selection consistency, provided that the prior information is correct. In addition, we demonstrate through simulation studies the empirical advantage of prior group bridge over group bridge in terms of estimation, group and variable selection, and prediction. Finally, we apply prior group bridge to a genetic association study of bipolar disorder to illustrate its applicability and efficacy in real applications.
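
As a rough illustration of the composite penalty described in the abstract, the sketch below evaluates a penalty value for a given coefficient vector. It is not taken from the article: it assumes that the group bridge term is the group's L1 norm raised to a power gamma with 0 < gamma < 1, and that the group ridge term is the group's L1 norm squared, which is one natural reading of "group-level ridge combined with variable-level lasso." The function name, the arguments `lam` and `gamma`, and the omission of any group-specific weights are all hypothetical choices made for this example.

```python
import numpy as np


def prior_group_bridge_penalty(beta, groups, prior_groups, lam=1.0, gamma=0.5):
    """Evaluate an illustrative composite group penalty (assumed form).

    beta         : 1-D array of regression coefficients
    groups       : dict mapping a group name to a list of coefficient indices
    prior_groups : set of group names flagged as prior-informative
    lam          : overall tuning parameter (hypothetical name)
    gamma        : bridge exponent, assumed to lie in (0, 1)
    """
    beta = np.asarray(beta, dtype=float)
    total = 0.0
    for name, idx in groups.items():
        # Variable-level lasso part: L1 norm of the coefficients in the group.
        l1_norm = np.abs(beta[idx]).sum()
        # Group-level part: exponent 2 (ridge-like) for prior-informative
        # groups, exponent gamma (bridge-like) for all other groups.
        exponent = 2.0 if name in prior_groups else gamma
        total += l1_norm ** exponent
    return lam * total


# Small example with three groups; group "a" is treated as prior-informative.
example_beta = [1.0, -1.0, 0.0, 0.0, 3.0]
example_groups = {"a": [0, 1], "b": [2, 3], "c": [4]}
example_value = prior_group_bridge_penalty(
    example_beta, example_groups, prior_groups={"a"}
)
```

Under these assumptions, the exponent controls group-level behavior: with an exponent below 1 the penalty is not differentiable at a zero group norm, so entire groups can be removed, whereas the squared term is smooth at zero at the group level and so does not by itself drop a prior-informative group. In both cases the inner L1 norm still permits individual coefficients within a group to be set to zero.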

Keywords

composite penalization, group ridge, selection consistency, solution path

Received 5 August 2019

Accepted 25 July 2020

Published 22 December 2020