Statistics and Its Interface

Volume 7 (2014)

Number 2

Family based association study with complex survey data

Pages: 167 – 176

DOI: https://dx.doi.org/10.4310/SII.2014.v7.n2.a2

Authors

Dewei She (Department of Statistics, George Washington University, Washington, D.C., U.S.A.)

Hong Zhang (Institute of Biostatistics, School of Life Science, Fudan University, Shanghai, China; Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, U.S.A.)

Yan Li (Department of Mathematics, University of Texas, Arlington, Texas, U.S.A.; Joint Program in Survey Methodology, University of Maryland, College Park, Maryland, U.S.A.)

Barry I. Graubard (Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, U.S.A.)

Zhaohai Li (Department of Statistics, George Washington University, Washington, D.C., U.S.A.; Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, U.S.A.)

Abstract

Genetic data collected in the Third National Health and Nutrition Examination Survey (NHANES III) provide an opportunity to investigate associations between genetic variations and health-related phenotypes in the US population. Household surveys such as NHANES III sample families using complex designs that involve stratified multistage cluster sampling and sample weighting. We modified the conditional likelihood score and trend tests, which test the null hypothesis of no association between a candidate gene and a phenotype in simple random samples of nuclear families, so that they are applicable to data from complex sample designs. The finite-sample properties of the modified test procedures are evaluated via Monte Carlo simulation studies. We recommend using an F-version of the trend test instead of a score test because the F-test shows greater power. The test statistics are applied to NHANES III data to test for three locus-phenotype associations: ADRB2 (rs1042713) and obesity, VDR (rs2239185) and high blood lead level, and TGFB1 (rs1982073) and asthma.

Keywords

complex sampling, conditional likelihood score test, nuclear family, survey data, trend test

Published 17 April 2014