# Statistics and Its Interface

## Volume 10 (2017)

### Number 4

### Modeling the upper tail of the distribution of facial recognition non-match scores

Pages: 711 – 725

DOI: http://dx.doi.org/10.4310/SII.2017.v10.n4.a13

#### Authors

#### Abstract

In facial recognition applications, the upper tail of the distribution of non-match scores is of interest because existing algorithms classify a pair of images as a match if their score exceeds some high quantile of the non-match distribution. We develop a general model for the non-match distribution above $u_{\tau}$, the $(1-\tau)$th quantile, borrowing ideas from extreme value theory. We call this model the $\mathrm{GPD}_{\tau}$, as it can be viewed as a reparameterized generalized Pareto distribution (GPD). This novel model treats $\tau$ as fixed and allows us to estimate $u_{\tau}$ in addition to parameters describing the tail. Inference for both $u_{\tau}$ and the $\mathrm{GPD}_{\tau}$ scale and shape parameters is performed via M-estimation, where our objective function is a combination of the quantile regression loss function and the $\mathrm{GPD}_{\tau}$ density. By parameterizing $u_{\tau}$ and the $\mathrm{GPD}_{\tau}$ parameters in terms of available covariates, we gain understanding of these covariates’ influence on the tail of the distribution of non-match scores. A simulation study shows that our method is able to estimate both the set of parameters describing the covariates’ influence and high quantiles of the non-match distribution. We apply our method to a data set of non-match scores and find that covariates such as gender, use of glasses, and age difference have a strong influence on the tail of the non-match distribution.
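The abstract does not state the exact form of the objective function, so the following is only a rough sketch of how a quantile regression (check) loss and a generalized Pareto density might be combined. It assumes the standard GPD parameterization for exceedances above $u_{\tau}$ and simply adds the two terms with equal weight; the paper's actual $\mathrm{GPD}_{\tau}$ objective may differ. The function names (`check_loss`, `gpd_neg_log_density`, `objective`, `tail_quantile`) are hypothetical, and the threshold, scale, and shape are held constant here rather than modeled through covariates.

```python
import math


def check_loss(r, q):
    """Quantile regression (pinball) loss of residual r at quantile level q."""
    return r * (q - (1.0 if r < 0 else 0.0))


def gpd_neg_log_density(y, sigma, xi):
    """Negative log density of a standard GPD exceedance y > 0.

    Returns infinity when the parameters put y outside the support.
    """
    if sigma <= 0:
        return math.inf
    z = 1.0 + xi * y / sigma
    if z <= 0:
        return math.inf
    if abs(xi) < 1e-12:  # exponential limit as xi -> 0
        return math.log(sigma) + y / sigma
    return math.log(sigma) + (1.0 / xi + 1.0) * math.log(z)


def objective(u, sigma, xi, scores, tau):
    """Assumed combined objective: check loss at level 1 - tau for every
    score, plus the GPD negative log density for scores above u.
    The equal weighting of the two terms is an assumption of this sketch."""
    total = 0.0
    for x in scores:
        total += check_loss(x - u, 1.0 - tau)
        if x > u:
            total += gpd_neg_log_density(x - u, sigma, xi)
    return total


def tail_quantile(u, sigma, xi, tau, p):
    """(1 - p)th quantile for p <= tau, using
    P(X > x) = tau * (1 + xi * (x - u) / sigma) ** (-1 / xi) for x > u."""
    if abs(xi) < 1e-12:
        return u + sigma * math.log(tau / p)
    return u + sigma / xi * ((tau / p) ** xi - 1.0)
```

Minimizing `objective` over `u`, `sigma`, and `xi` with a general-purpose optimizer would give M-estimates under these assumptions, and `tail_quantile` then converts them into estimates of quantiles beyond $u_{\tau}$. Replacing the three constants with functions of covariates would be the natural next step toward the covariate-dependent model the abstract describes.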

#### Keywords

generalized Pareto, M-estimation, quantile regression

Published 30 May 2017