Statistics and Its Interface

Volume 8 (2015)

Number 3

Distance-weighted Support Vector Machine

Pages: 331 – 345

DOI: https://dx.doi.org/10.4310/SII.2015.v8.n3.a7

Authors

Xingye Qiao (Department of Mathematical Sciences, Binghamton University, State University of New York, Binghamton, N.Y., U.S.A.)

Lingsong Zhang (Department of Statistics, Purdue University, West Lafayette, Indiana, U.S.A.)

Abstract

A novel linear classification method that possesses the merits of both the Support Vector Machine (SVM) and the Distance-weighted Discrimination (DWD) is proposed in this article. The proposed Distance-weighted Support Vector Machine method can be viewed as a hybrid of SVM and DWD that finds the classification direction by minimizing mainly the DWD loss, and determines the intercept term in the SVM manner. We show that our method inherits the merits of DWD, and hence overcomes the data-piling and overfitting issues of SVM. On the other hand, the new method is not subject to the imbalanced data issue; avoiding this issue was a main advantage of SVM over DWD. It uses a novel loss that combines the hinge loss (of SVM) and the DWD loss through a trick involving an auxiliary hyperplane. Several theoretical properties, including Fisher consistency and asymptotic normality of the DWSVM solution, are developed. We use some simulated examples to show that the new method can compete with DWD and SVM in both classification performance and interpretability. A real data application further establishes the usefulness of our approach.
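The two-stage idea sketched in the abstract, a direction driven mainly by the DWD loss followed by an intercept chosen in the SVM (hinge-loss) manner, can be illustrated with a toy implementation. This is only a minimal sketch, not the authors' DWSVM algorithm: the plain gradient descent, the ridge penalty, the normalized form of the DWD loss, and the grid search for the intercept are all simplifying assumptions made here for illustration.

```python
import numpy as np

def dwd_loss_grad(u):
    """Normalized DWD loss V(u) = 1 - u for u <= 1/2, 1/(4u) otherwise,
    together with its derivative (an assumed standard form of the DWD loss)."""
    u_safe = np.maximum(u, 0.5)  # guard the 1/(4u) branch against small/negative u
    loss = np.where(u <= 0.5, 1.0 - u, 1.0 / (4.0 * u_safe))
    grad = np.where(u <= 0.5, -1.0, -1.0 / (4.0 * u_safe ** 2))
    return loss, grad

def fit_dwsvm_sketch(X, y, lam=1.0, lr=0.1, n_iter=500):
    """Toy two-stage fit: direction w mainly via the DWD loss,
    then the intercept b re-chosen by minimizing the hinge loss."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    # Stage 1: gradient descent on the DWD loss plus a ridge penalty on w.
    for _ in range(n_iter):
        u = y * (X @ w + b)                     # functional margins
        _, g = dwd_loss_grad(u)
        w -= lr * ((g * y) @ X / n + lam * w)
        b -= lr * np.mean(g * y)
    # Stage 2: with w fixed, pick b minimizing the mean hinge loss
    # (a crude grid search stands in for the SVM intercept step).
    scores = X @ w
    half_width = np.abs(scores).max() + 1.0
    grid = np.linspace(-half_width, half_width, 401)
    hinge = [np.mean(np.maximum(0.0, 1.0 - y * (scores + c))) for c in grid]
    b = grid[int(np.argmin(hinge))]
    return w, b
```

On well-separated simulated data, `np.sign(X @ w + b)` recovers the class labels; the sketch is meant only to make the "DWD direction, SVM intercept" decomposition concrete.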

Keywords

discriminant analysis, Fisher consistency, imbalanced data, high-dimensional, low sample size data, Support Vector Machine

2010 Mathematics Subject Classification

62H30

Published 17 April 2015