Statistics and Its Interface

Volume 14 (2021)

Number 4

Regularized multiple mediation analysis

Pages: 449 – 458



Bin Li (Department of Experimental Statistics, Louisiana State University, Baton Rouge, La., U.S.A.)

Qingzhao Yu (School of Public Health, Louisiana State University Health Sciences Center, New Orleans, La., U.S.A.)

Lu Zhang (Department of Public Health Sciences, Clemson University, Clemson, South Carolina, U.S.A.)

Meichin Hsieh (Louisiana Tumor Registry, New Orleans, La., U.S.A.)


Mediation analysis is used to explore how an established exposure-outcome relationship is influenced by a third variable (mediator). Multiple mediation analysis refers to mediation analysis with multiple mediators. We propose to use elastic net regularized linear regression in multiple mediation analysis when the number of potential mediators is large. In exploring the exposure-mediator-outcome relationship, we regularize the coefficients of the mediators in predicting the outcome. The penalization on each coefficient is inversely proportional to the association between the exposure variable and the corresponding mediator. Therefore, in estimating the effect of a mediator, the exposure-mediator and the mediator-outcome associations are jointly considered. An R package, mmabig, is compiled for the proposed method. We perform a series of sensitivity and specificity analyses to examine factors that can influence the power of identifying important mediators. Further, we illustrate how to consider potential nonlinear associations among variables in the mediation analysis. Simulation studies show that the proposed mediation analysis method consistently obtains larger power when compared with its main competitors. The method is used with a real data set to explore factors that contribute to the racial disparity in survival rates among breast-cancer patients.
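The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the mmabig implementation or its API: the code is written in Python with NumPy rather than R, and the function names (`soft_threshold`, `weighted_elastic_net`), the simulated data, the weight definition `1 / |a_j|` (with `a_j` the least-squares slope of mediator j on the exposure), and the values of `lam` and `alpha` are all assumptions made for illustration. The exposure enters as an unpenalized column (weight 0), and each mediator's elastic net penalty is scaled by its weight, so mediators weakly associated with the exposure are shrunk more heavily.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso part of the update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def weighted_elastic_net(X, y, weights, lam=0.1, alpha=0.5, n_iter=200):
    """Coordinate descent for the (hypothetical) objective
    (1/2n)||y - X b||^2 + lam * sum_j w_j * (alpha*|b_j| + (1-alpha)/2 * b_j^2).
    A weight of 0 leaves the corresponding coefficient unpenalized."""
    n, p = X.shape
    b = np.zeros(p)
    resid = y.astype(float).copy()          # residual y - X b, with b = 0
    col_sq = (X ** 2).sum(axis=0) / n       # per-column mean square
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (b_j removed).
            rho = X[:, j] @ resid / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam * alpha * weights[j]) / (
                col_sq[j] + lam * (1.0 - alpha) * weights[j]
            )
            resid += X[:, j] * (b[j] - new)  # keep the residual in sync
            b[j] = new
    return b


# Simulated data (illustrative only): 2 true mediators, 8 noise variables.
rng = np.random.default_rng(0)
n, p = 500, 10
x = rng.normal(size=n)                      # exposure
M = rng.normal(size=(n, p))                 # candidate mediators
M[:, 0] += 1.0 * x                          # mediator 0 depends on exposure
M[:, 1] += 0.8 * x                          # mediator 1 depends on exposure
y = 0.5 * x + 1.0 * M[:, 0] + 1.0 * M[:, 1] + rng.normal(size=n)

# Exposure-mediator association: slope of each mediator on the exposure.
a = (x @ M) / (x @ x)
# Penalty weight inversely proportional to |a_j| (floored to avoid 1/0).
w_mediators = 1.0 / np.maximum(np.abs(a), 1e-8)

X = np.column_stack([x, M])                 # exposure first, then mediators
weights = np.concatenate([[0.0], w_mediators])  # exposure is unpenalized
coef = weighted_elastic_net(X, y, weights)

selected = np.flatnonzero(coef[1:])         # indices of retained mediators
print("retained mediators:", selected)
```

In this sketch a mediator survives only if it is both associated with the exposure (small penalty weight) and predictive of the outcome (large partial correlation), which mirrors the joint consideration of the two associations described above.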


elastic net, high-dimensional data set, multiple mediation analysis, penalized likelihood, R package mmabig

This study was partially funded by the NIH/NIMHD award #1R15MD012387.

Portions of this research were conducted with high performance computational resources provided by the Louisiana Optical Network Infrastructure (

Received 23 June 2020

Accepted 2 February 2021

Published 8 July 2021