Statistics and Its Interface

Volume 6 (2013)

Number 3

Estimation and imputation in linear regression with missing values in both response and covariate

Pages: 361 – 368



Jun Shao (School of Finance and Statistics, East China Normal University, Shanghai, China; Department of Statistics, University of Wisconsin, Madison, Wisc., U.S.A.)


We consider linear regression with missing responses as well as missing covariate data. When the missing data mechanism is ignorable, we show that regression parameters and the response mean can be estimated using standard methods, treating imputed values as observed data. We also show that the same procedure results in biased and inconsistent estimators when the missing response mechanism depends on covariates that also have missing values and is thus nonignorable. Efficient estimation and imputation under nonignorable missingness is a challenging problem. Under some conditions, we derive asymptotically unbiased and consistent estimators via direct estimation or imputation. Simulation results are presented to examine the finite-sample performance of the various estimators.
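As a rough illustration of the ignorable case only, the sketch below simulates a simple linear model in which the response is missing with a probability that depends on a fully observed covariate, imputes the missing responses from a complete-case least-squares fit, and then treats the imputed values as observed when estimating the response mean. The model, the coefficients (1 and 2), the logistic missingness rule, and the function names are all assumptions made for this example; it does not handle missing covariates and is not the estimators derived in the paper.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Simulate y = 1 + 2x + e with responses missing at random given x (assumed setup)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)
    # Probability of observing y depends only on the fully observed x (ignorable).
    p_obs = 1.0 / (1.0 + np.exp(-(0.5 + x)))
    observed = rng.random(n) < p_obs
    y_obs = np.where(observed, y, np.nan)  # missing responses stored as NaN
    return x, y_obs, observed


def regression_impute(x, y_obs, observed):
    """Fit OLS on complete cases, then fill missing responses with fitted values."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X[observed], y_obs[observed], rcond=None)
    y_completed = np.where(observed, y_obs, X @ beta)
    return beta, y_completed


x, y_obs, observed = simulate()
beta, y_completed = regression_impute(x, y_obs, observed)

complete_case_mean = y_obs[observed].mean()  # ignores the missingness pattern
imputed_mean = y_completed.mean()            # treats imputed values as observed

print("estimated (intercept, slope):", beta)
print("complete-case mean of y:", complete_case_mean)
print("mean of y after imputation:", imputed_mean)
```

In this assumed setup the true response mean is 1. The complete-case mean is pulled upward because units with larger covariate values are more likely to be observed, while the mean of the completed data stays close to 1.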


Keywords

asymptotic unbiasedness and consistency, imputation, linear regression, missing covariate data, missing response data, nonignorable missingness

2010 Mathematics Subject Classification

Primary 62J05. Secondary 62G20.
