Statistics and Its Interface

Volume 7 (2014)

Number 1

We dedicate this special issue to Dr. Gang Zheng, a great colleague and dear friend to many of us.

Inherent difficulties in nonparametric estimation of the cumulative distribution function using observations measured with error: Application to high-dimensional microarray data

Pages: 69 – 73



George W. Wright (Biometric Research Branch, National Cancer Institute, Bethesda, Maryland, U.S.A.)

Lori E. Dodd (National Institute of Allergy and Infectious Diseases, Bethesda, Maryland, U.S.A.)

Edward L. Korn (Biometric Research Branch, National Cancer Institute, Bethesda, Maryland, U.S.A.)


Distribution function estimation is important in many biological applications. A very simple example is given to show that with the addition of normal errors, data from very different underlying distributions can generate nearly identical distributions of observations. Therefore, in some situations it can be essentially impossible to accurately estimate an underlying cumulative distribution function from a reasonable number of observations measured with error. An application is given involving estimating the distribution function of differential gene expression based on more than fifty thousand genes.
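The phenomenon described in the abstract can be illustrated numerically. The sketch below is not the example from the paper; it is a hypothetical pair of distributions chosen for this illustration. One underlying distribution places mass 1/2 at each of -1 and +1, the other is standard normal, and the measurement error is assumed to be independent N(0, 1). The code computes the largest gap between the two underlying distribution functions and the largest gap between the two distribution functions of the error-contaminated observations, both evaluated on a grid.

```python
import math

def norm_cdf(x, mean=0.0, sd=1.0):
    """Normal CDF written with math.erf so that no third-party package is needed."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def two_point_cdf(x):
    """CDF of the hypothetical underlying distribution with mass 1/2 at -1 and at +1."""
    return 0.5 * (x >= -1.0) + 0.5 * (x >= 1.0)

def observed_cdf_two_point(x, error_sd=1.0):
    """Two-point signal plus N(0, error_sd^2) error: an equal mixture of two normals."""
    return 0.5 * norm_cdf(x, -1.0, error_sd) + 0.5 * norm_cdf(x, 1.0, error_sd)

def observed_cdf_normal(x, error_sd=1.0):
    """N(0, 1) signal plus N(0, error_sd^2) error: a normal with variance 1 + error_sd^2."""
    return norm_cdf(x, 0.0, math.sqrt(1.0 + error_sd ** 2))

# Evaluation grid from -6 to 6 in steps of 0.01 (an assumption of this sketch).
grid = [i / 100.0 for i in range(-600, 601)]

# Largest vertical distance between the two underlying CDFs.
underlying_gap = max(abs(two_point_cdf(x) - norm_cdf(x)) for x in grid)

# Largest vertical distance between the two CDFs of the noisy observations.
observed_gap = max(
    abs(observed_cdf_two_point(x) - observed_cdf_normal(x)) for x in grid
)

print(f"max CDF gap, underlying distributions: {underlying_gap:.3f}")
print(f"max CDF gap, observed distributions:   {observed_gap:.3f}")
```

In this toy setting the underlying distribution functions differ by more than 0.3 at their widest point, while the observed distribution functions differ by less than 0.03. Separating the two from noisy observations would therefore need a sample large enough to resolve a gap of that size, and the gap shrinks further as the assumed error standard deviation grows.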


Keywords

empirical Bayes, stability, microarray data, mixture models, measurement error, shrinkage

2010 Mathematics Subject Classification

Primary 62C12. Secondary 62G07, 62P10.
