Statistics and Its Interface

Volume 7 (2014)

Number 2

Modern sample size determination for unordered categorical data

Pages: 219 – 233

DOI: http://dx.doi.org/10.4310/SII.2014.v7.n2.a7

Authors

Junheng Ma (Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, U.S.A.)

Jiayang Sun (Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, U.S.A.)

Joe Sedransk (Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, U.S.A.; Joint Program in Survey Methodology, University of Maryland, College Park, Md., U.S.A.)

Abstract

Sample size determination is one of the most important practical tasks for statisticians. In this paper, we study sample size determination for unordered categorical data, with or without a pilot sample. With a pilot sample, we provide a minimal difference method, a first order correction, and bootstrap methods for sample size determination in the comparison of two multinomial distributions using the usual chi-squared test. We also propose a Bayesian approach that uses an extension of a posterior predictive p-value. The performance of these methods is investigated via both a simulation study and a real application to leukoplakia lesion data. We advocate a performance measure better suited than mean squared error (MSE) to cases where the sampling distribution is highly skewed. Practical recommendations are given. Some asymptotic results are also provided.
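For orientation, the setting in the abstract (comparing two multinomial distributions with the usual chi-squared test) can be illustrated with a textbook plug-in power calculation. The sketch below is not any of the paper's methods. It assumes equal group sizes, treats the category probabilities (for example, estimates from a pilot sample) as if they were the true values, and uses the standard noncentral chi-squared approximation with noncentrality n * sum_k (p1k - p2k)^2 / (p1k + p2k). The function names and the example probabilities are hypothetical.

```python
import math


def chi2_cdf(x, df):
    """Central chi-squared CDF via the power series for the
    regularized lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return min(1.0, total * math.exp(a * math.log(z) - z - math.lgamma(a)))


def chi2_quantile(p, df):
    """Quantile of the central chi-squared distribution, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return hi


def effect_size(p1, p2):
    """Per-observation noncentrality for two equally sized groups."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p1, p2) if a + b > 0)


def chi2_power(n, p1, p2, alpha=0.05, crit=None):
    """Approximate power of the chi-squared homogeneity test with n
    observations per group, assuming p1 and p2 are the true probabilities."""
    df = len(p1) - 1
    if crit is None:
        crit = chi2_quantile(1.0 - alpha, df)
    h = 0.5 * n * effect_size(p1, p2)  # half the noncentrality parameter
    if h == 0.0:
        return 1.0 - chi2_cdf(crit, df)
    # Noncentral CDF as a Poisson(h) mixture of central chi-squared CDFs.
    cdf = 0.0
    for j in range(int(h + 10.0 * math.sqrt(h)) + 50):
        weight = math.exp(-h + j * math.log(h) - math.lgamma(j + 1))
        cdf += weight * chi2_cdf(crit, df + 2 * j)
    return 1.0 - cdf


def sample_size(p1, p2, alpha=0.05, target=0.80, n_max=100000):
    """Smallest per-group n whose approximate power reaches the target,
    or None if no n up to n_max does."""
    crit = chi2_quantile(1.0 - alpha, len(p1) - 1)
    for n in range(2, n_max + 1):
        if chi2_power(n, p1, p2, alpha, crit) >= target:
            return n
    return None


# Hypothetical category probabilities, e.g. pilot-sample estimates.
pilot_p1 = [0.5, 0.3, 0.2]
pilot_p2 = [0.3, 0.3, 0.4]
print(sample_size(pilot_p1, pilot_p2))
```

Because the pilot estimates are plugged in as if known, this calculation ignores their sampling variability, which is the kind of issue the pilot-sample methods listed in the abstract are concerned with.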

Keywords

bootstrap, calibrated posterior predictive p-value, multinomial distribution, pilot data, power calculation, practical recommendations

2010 Mathematics Subject Classification

Primary 62F10, 62F15. Secondary 62F40.
