Statistics and Its Interface
Volume 10 (2017)
Spectral methods for learning discrete latent tree models
Pages: 677 – 698
We consider the problems of structure learning and parameter estimation for discrete latent tree models. For structure learning, we introduce a concept of generalized information distance between variables based on singular values of probability matrices, and use it to build a bottom-up algorithm for structure recovery. The algorithm is proved to be consistent. Moreover, a finite sample bound is given for exact structure recovery. For parameter estimation, we suggest a novel matrix decomposition algorithm for the case when every latent variable has two states. Unlike the expectation-maximization (EM) algorithm, our algorithm can avoid trapping into a local optima. Moreover, it is proved to be consistent and a finite sample bound is also given for parameter estimation.
In both structural learning and parameter estimation, empirical results were provided to support our theoretical results. In applications to real data, we analyzed the Changchun mayor hotline data, where the underlying structures were detected for Chinese words. We demonstrated that the proposed method is efficient for discovering hierarchical structures and latent information.
latent variables, parameter estimation, spectral distance, structural learning
2010 Mathematics Subject Classification
Primary 62H05. Secondary 62H12.