Statistics and Its Interface

Volume 6 (2013)

Number 1

Accounting for linkage disequilibrium in genome-wide association studies: a penalized regression method

Pages: 99 – 115

DOI: http://dx.doi.org/10.4310/SII.2013.v6.n1.a10

Authors

Jian Huang (Department of Biostatistics, University of Iowa, Iowa City, Ia., U.S.A.)

Jin Liu (School of Public Health, Yale University, New Haven, Conn., U.S.A.)

Shuangge Ma (School of Public Health, Yale University, New Haven, Conn., U.S.A.)

Kai Wang (Department of Biostatistics, University of Iowa, Iowa City, Ia., U.S.A.)

Abstract

Penalized regression methods are becoming increasingly popular in genome-wide association studies (GWAS) for identifying genetic markers associated with disease. However, standard penalized methods such as LASSO do not take into account the possible linkage disequilibrium between adjacent markers. We propose a novel penalized approach for GWAS using a dense set of single nucleotide polymorphisms (SNPs). The proposed method uses the minimax concave penalty (MCP) for marker selection and incorporates linkage disequilibrium (LD) information by penalizing the difference of the genetic effects at adjacent SNPs with high correlation. A coordinate descent algorithm is derived to implement the proposed method. This algorithm is efficient in dealing with a large number of SNPs. A multi-split method is used to calculate the $p$-values of the selected SNPs for assessing their significance. We refer to the proposed penalty function as the smoothed MCP and the proposed approach as the SMCP method. Performance of the proposed SMCP method and its comparison with LASSO and MCP approaches are evaluated through simulation studies, which demonstrate that the proposed method is more accurate in selecting associated SNPs. Its applicability to real data is illustrated using heterogeneous stock mice data and a rheumatoid arthritis data set.
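As a rough illustration of the penalty described in the abstract, the following Python sketch evaluates the standard MCP together with a quadratic term on differences of adjacent effect sizes. The abstract does not give the exact formula, so several details here are assumptions: the function names, the use of absolute sample correlation between neighbouring SNP columns as the LD weight, the penalization of differences in absolute effect size, and the tuning parameters `lam1`, `lam2`, and `gamma`. It is not the authors' implementation, and it does not include the coordinate descent or multi-split steps.

```python
import numpy as np

def mcp(t, lam, gamma):
    """Standard minimax concave penalty, applied elementwise to |t|."""
    a = np.abs(np.asarray(t, dtype=float))
    # Quadratic taper up to gamma*lam, constant beyond it.
    return np.where(a <= gamma * lam,
                    lam * a - a ** 2 / (2.0 * gamma),
                    0.5 * gamma * lam ** 2)

def adjacent_ld_weights(X):
    """Absolute sample correlation between each pair of adjacent columns.

    Assumption: this stands in for the LD measure between neighbouring SNPs.
    """
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for monomorphic markers
    Z = Xc / sd
    return np.abs((Z[:, :-1] * Z[:, 1:]).mean(axis=0))

def smoothed_mcp_penalty(beta, X, lam1, lam2, gamma):
    """MCP on each effect plus an LD-weighted smoothness term (assumed form)."""
    beta = np.asarray(beta, dtype=float)
    w = adjacent_ld_weights(X)
    # Assumption: penalize differences in absolute effect size between neighbours.
    diff = np.abs(beta[:-1]) - np.abs(beta[1:])
    return mcp(beta, lam1, gamma).sum() + 0.5 * lam2 * np.sum(w * diff ** 2)
```

Under these assumptions, two perfectly correlated adjacent SNPs with equal effects incur no smoothness cost, while unequal effects are penalized in proportion to their LD weight.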

Keywords

genetic association, feature selection, linkage disequilibrium, penalized regression, single nucleotide polymorphism
