Statistics and Its Interface

Volume 16 (2023)

Number 4

Estimating individualized treatment rules for multicategory type 2 diabetes treatments using electronic health records

Pages: 505 – 515



Jitong Lou (University of North Carolina, Chapel Hill, N.C., U.S.A.)

Yuanjia Wang (Columbia University, New York, N.Y., U.S.A.)

Lang Li (Ohio State University, Columbus, Oh., U.S.A.)

Donglin Zeng (University of North Carolina, Chapel Hill, N.C., U.S.A.)


In this article, we propose a general framework for learning optimal treatment rules for patients with type 2 diabetes (T2D) using electronic health records (EHRs). We first propose a joint modeling approach to characterize patients’ pretreatment conditions using longitudinal markers from EHRs. The estimation accounts for informative measurement times using inverse-intensity weighting. The predicted latent processes from the joint model are used to divide patients into a finite number of subgroups such that patients within each subgroup share similar health profiles in the EHRs. Within each subgroup, we estimate optimal individualized treatment rules by extending a matched learning method to multicategory treatments through a one-versus-one approach. Each pairwise matched learning problem is solved by a weighted support vector machine fitted to matched pairs of patients. We apply our method to estimate optimal treatment rules for T2D patients in a large sample of EHRs from the Ohio State University Wexner Medical Center. We demonstrate the utility of our method for selecting the optimal treatment from four classes of drugs and show that it achieves better control of glycated hemoglobin than any one-size-fits-all rule.
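To make the one-versus-one step concrete, the sketch below shows one common way to combine pairwise decisions into a single recommendation, namely majority voting. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how pairwise results are aggregated, the treatment labels `drug_1` to `drug_4` are placeholders, and the pairwise rules here are toy score differences standing in for the weighted support vector machines fitted to matched pairs.

```python
from collections import Counter
from itertools import combinations


def one_versus_one_rule(treatments, pairwise_rules, x):
    """Recommend one treatment by majority vote over pairwise rules.

    pairwise_rules maps each pair (a, b) to a function of the covariates x
    that returns a positive score when a is preferred to b, and a
    non-positive score when b is preferred.
    """
    votes = Counter({t: 0 for t in treatments})
    for a, b in combinations(treatments, 2):
        score = pairwise_rules[(a, b)](x)
        votes[a if score > 0 else b] += 1
    # max() returns the first maximal element, so ties are broken by the
    # order of `treatments`; this tie-breaking choice is an assumption.
    return max(treatments, key=lambda t: votes[t])


# Hypothetical labels for four drug classes (placeholders only).
treatments = ["drug_1", "drug_2", "drug_3", "drug_4"]

# Toy per-treatment scores of a single covariate x. In the method described
# above, each pairwise rule would instead come from a weighted SVM.
toy_score = {
    "drug_1": lambda x: x,
    "drug_2": lambda x: -x,
    "drug_3": lambda x: 0.5,
    "drug_4": lambda x: 0.0,
}

# One rule per unordered pair: positive when the first treatment scores higher.
pairwise_rules = {
    (a, b): (lambda x, a=a, b=b: toy_score[a](x) - toy_score[b](x))
    for a, b in combinations(treatments, 2)
}

for x in (2.0, -2.0, 0.1):
    print(x, one_versus_one_rule(treatments, pairwise_rules, x))
```

With four treatments there are six pairwise rules, so each recommendation is the result of six pairwise comparisons. Different covariate values can lead to different recommended treatments, which is what distinguishes an individualized rule from a one-size-fits-all rule.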


electronic health records, individualized treatment rules, latent process, machine learning, multicategory treatments, type 2 diabetes

This research work was supported by the National Institutes of Health grants GM124104, NS073671, and MH117458.

Received 10 February 2021

Accepted 4 May 2022

Published 14 April 2023