Statistics and Its Interface

Volume 10 (2017)

Number 4

Nonparametric verification bias-corrected inference for the area under the ROC curve of a continuous-scale diagnostic test

Pages: 629 – 641

DOI: http://dx.doi.org/10.4310/SII.2017.v10.n4.a8

Authors

Gianfranco Adimari (Department of Statistical Sciences, University of Padova, Italy)

Monica Chiogna (Department of Statistical Sciences, University of Padova, Italy)

Abstract

For a continuous-scale diagnostic test, the area under the receiver operating characteristic curve (AUC) is a popular summary measure of the test’s ability to discriminate between healthy and diseased subjects. In some studies, the true disease status is verified only for a subset of subjects, possibly selected on the basis of the test result and of other subject characteristics. Estimators of the AUC based only on this subset are typically biased; this is known as verification bias. Some methods have been proposed to correct verification bias, but they require parametric models for the (conditional) probability of disease and/or the (conditional) probability of verification. Misspecification of these parametric models can affect the behaviour of the estimators, which can be inconsistent. To avoid misspecification problems, in this paper we propose a fully nonparametric method for estimating the AUC of a continuous test under verification bias. The method is based on nearest-neighbor imputation and adopts generic smooth regression models for both the probability that a subject is diseased and the probability that the subject is verified. The new AUC estimator is consistent under the assumption that the true disease status, if missing, is missing at random (MAR). A simple extension that deals with stratified samples is also provided. Simulation experiments are used to investigate the finite-sample behaviour of the proposed methods. An illustrative example is presented.
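
To make the idea of nearest-neighbor imputation concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes Euclidean distance on the supplied features, a fixed number of neighbors `k`, and a weighted Mann-Whitney form for the AUC in which each subject's disease status is replaced by the mean status of its `k` nearest verified subjects. The function names `knn_disease_prob` and `imputed_auc` are hypothetical, and the exact estimator, distance, and choice of `k` in the paper may differ.

```python
import numpy as np


def knn_disease_prob(features, disease, verified, k=3):
    """Estimate each subject's disease probability as the mean disease
    status of its k nearest verified subjects (Euclidean distance).

    Assumption: a verified subject counts as its own nearest neighbor.
    """
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        features = features[:, None]
    verified = np.asarray(verified, dtype=bool)
    # Disease status is only read for verified subjects; unverified
    # entries may be NaN.
    d_ver = np.asarray(disease, dtype=float)[verified]
    x_ver = features[verified]
    # Pairwise distances from every subject to every verified subject.
    dists = np.linalg.norm(features[:, None, :] - x_ver[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return d_ver[nearest].mean(axis=1)


def imputed_auc(test, rho):
    """Weighted Mann-Whitney AUC using imputed disease probabilities rho.

    The pair (i, j) gets weight rho[i] * (1 - rho[j]), i.e. the estimated
    probability that i is diseased and j is healthy; ties in the test
    result count one half.
    """
    test = np.asarray(test, dtype=float)
    rho = np.asarray(rho, dtype=float)
    kernel = (test[:, None] > test[None, :]) + 0.5 * (test[:, None] == test[None, :])
    weights = rho[:, None] * (1.0 - rho[None, :])
    np.fill_diagonal(weights, 0.0)  # exclude pairs with i == j
    return float((kernel * weights).sum() / weights.sum())


# Toy usage: four subjects, only the first and last are verified.
test_result = np.array([1.0, 2.0, 3.0, 4.0])
verified = np.array([True, False, False, True])
disease = np.array([0.0, np.nan, np.nan, 1.0])
rho_hat = knn_disease_prob(test_result, disease, verified, k=1)
auc_hat = imputed_auc(test_result, rho_hat)
```

Here the test result itself serves as the only matching feature; in practice other covariates would typically be included, and the MAR assumption is what justifies borrowing disease status from verified neighbors.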

Keywords

missing data imputation, nearest-neighbor imputation, ROC analysis

2010 Mathematics Subject Classification

Primary 62G05, 62G20. Secondary 62P10.

Published 30 May 2017