Statistics and Its Interface

Volume 4 (2011)

Number 3

Controlling population structure in human genetic association studies with samples of unrelated individuals

Pages: 317 – 326

DOI: http://dx.doi.org/10.4310/SII.2011.v4.n3.a6

Authors

David B. Allison (Section on Statistical Genetics, Department of Biostatistics, University of Alabama at Birmingham, U.S.A.)

Nita A. Limdi (Department of Neurology, University of Alabama at Birmingham, U.S.A.)

Nianjun Liu (Department of Biostatistics, University of Alabama at Birmingham, U.S.A.)

Amit Patki (Department of Biostatistics, University of Alabama at Birmingham, U.S.A.)

Hongyu Zhao (Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, Conn., U.S.A.)

Abstract

In genetic studies, associations between genotypes and phenotypes may be confounded by unrecognized population structure and/or admixture. Studies have shown that even in European populations, which are thought to be relatively homogeneous, population stratification exists and can affect the validity of association studies. A number of methods have been proposed to address this issue in recent years. Among them, the mixed-model-based approach and the principal component-based approach have several advantages over other methods. However, these approaches have not been thoroughly evaluated on large human datasets. The objectives of this study are to (1) evaluate and compare the performance of the mixed-model approach and the principal component-based approach for genetic association mapping using human data consisting of unrelated individuals, and (2) understand the relationship between these two approaches. To achieve these goals, we simulate datasets based on the HapMap data under various scenarios. Our results indicate that the mixed-model approach performs well in controlling for population structure/admixture. Its performance is similar to that of the principal component-based approach. However, the approach that combines the mixed model with principal component analysis does not perform as well as either method alone.
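
As an illustration of the principal component-based adjustment mentioned above, the following is a minimal sketch, not the authors' procedure: the study simulates from HapMap data, whereas this toy example draws genotypes for two hypothetical subpopulations with different allele frequencies. The function names (`top_pcs`, `snp_t_stat`), sample sizes, allele frequencies, and effect sizes are all assumptions chosen for illustration. The phenotype depends only on subpopulation membership, so any association with the test SNP is spurious.

```python
import numpy as np

def top_pcs(G, k):
    """Top-k principal component scores of a genotype matrix G (individuals x SNPs)."""
    Z = G - G.mean(axis=0)
    sd = Z.std(axis=0)
    Z = Z[:, sd > 0] / sd[sd > 0]          # drop monomorphic SNPs, then standardize
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * S[:k]

def snp_t_stat(y, g, covariates=None):
    """t-statistic for the SNP coefficient in an ordinary least squares fit."""
    n = len(y)
    cols = [np.ones(n), g]
    if covariates is not None:
        cols.extend(covariates.T)          # one column per covariate (e.g. per PC)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

# Hypothetical toy data: two subpopulations of 200 individuals, 500 SNPs.
rng = np.random.default_rng(0)
n_per, m = 200, 500
p1 = rng.uniform(0.1, 0.9, m)
p2 = np.clip(p1 + rng.normal(0, 0.2, m), 0.05, 0.95)
p1[0], p2[0] = 0.2, 0.8                    # test SNP differs strongly between groups

G = np.vstack([
    rng.binomial(2, p1, (n_per, m)),
    rng.binomial(2, p2, (n_per, m)),
]).astype(float)
pop = np.repeat([0.0, 1.0], n_per)

# Phenotype depends on subpopulation only; the test SNP has no causal effect.
y = 2.0 * pop + rng.normal(0, 1, 2 * n_per)

t_naive = snp_t_stat(y, G[:, 0])                 # no structure correction
t_adj = snp_t_stat(y, G[:, 0], top_pcs(G, 2))    # adjusted for the top 2 PCs

print(f"unadjusted t = {t_naive:.2f}, PC-adjusted t = {t_adj:.2f}")
```

In this sketch the unadjusted statistic is inflated by the confounding, while including the leading principal components as covariates should pull it back toward the null. The mixed-model approach addresses the same confounding through a random effect with a genotype-derived covariance matrix and is not shown here.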

Keywords

mixed-effects model, principal component analysis, population structure/admixture, genetic association analysis

2010 Mathematics Subject Classification

Primary 62P10, 92-08. Secondary 62-07.
