Statistics and Its Interface

Volume 9 (2016)

Number 4

Special Issue on Statistical and Computational Theory and Methodology for Big Data

Guest Editors: Ming-Hui Chen (University of Connecticut); Radu V. Craiu (University of Toronto); Faming Liang (University of Florida); and Chuanhai Liu (Purdue University)

Generative modeling of convolutional neural networks

Pages: 485 – 496



Jifeng Dai (Microsoft Research Asia, Beijing, China)

Yang Lu (Department of Statistics, University of California at Los Angeles)

Ying Nian Wu (Department of Statistics, University of California at Los Angeles)


Convolutional neural networks (ConvNets) have proven to be a powerful tool for discriminative learning. Recently, researchers have also started to show interest in the generative aspects of ConvNets, in order to gain a deeper understanding of what ConvNets have learned and how to improve them further. This paper investigates generative modeling of ConvNets. The main contributions are: (1) We construct a generative model for the ConvNet in the form of exponential tilting of a reference distribution. (2) We propose a generative gradient for pre-training ConvNets via a non-parametric importance sampling scheme. It is fundamentally different from the commonly used discriminative gradient, yet it shares the same computational architecture and cost. (3) We propose a generative visualization method for ConvNets that samples from an explicit parametric image distribution. The proposed visualization method can directly draw synthetic samples for any given node in a trained ConvNet using the Hamiltonian Monte Carlo algorithm, without resorting to any extra hold-out images. Experiments on the challenging ImageNet benchmark show that the proposed generative gradient pre-training helps improve the performance of ConvNets in both supervised and semi-supervised settings, and that the proposed generative visualization method generates meaningful and varied synthetic images from a large and deep ConvNet.
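To make the terms "exponential tilting" and "Hamiltonian Monte Carlo" concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. An exponentially tilted density has the form p(x) ∝ exp(f(x)) q(x), where q is the reference distribution and f is a scoring function. Here q is assumed to be a Gaussian N(0, σ²I), and f is a toy linear score f(x) = w·x standing in for a ConvNet node's response; both choices, and all function names (`score`, `energy`, `hmc_sample`), are hypothetical and chosen only for illustration. With this toy score the tilted distribution is exactly N(σ²w, σ²I), so the sampler's output can be checked against a known answer.

```python
import numpy as np

def score(x, w):
    """Toy stand-in for a ConvNet score f(x); assumed linear for illustration."""
    return w @ x

def score_grad(x, w):
    """Gradient of the toy linear score with respect to x."""
    return w

def energy(x, w, sigma=1.0):
    """U(x) = -log p(x) up to a constant, for p(x) ∝ exp(f(x)) N(x; 0, sigma^2 I)."""
    return 0.5 * (x @ x) / sigma**2 - score(x, w)

def energy_grad(x, w, sigma=1.0):
    """Gradient of the energy U with respect to x."""
    return x / sigma**2 - score_grad(x, w)

def hmc_sample(w, n_samples=2000, step=0.2, n_leapfrog=10, sigma=1.0, seed=0):
    """Draw samples from the tilted density with basic Hamiltonian Monte Carlo."""
    rng = np.random.default_rng(seed)
    x = np.zeros(w.size)
    samples = np.empty((n_samples, w.size))
    for i in range(n_samples):
        p = rng.standard_normal(w.size)          # resample momentum
        x_new, p_new = x.copy(), p.copy()
        # Leapfrog integration: half step in momentum, full steps in position.
        p_new -= 0.5 * step * energy_grad(x_new, w, sigma)
        for l in range(n_leapfrog):
            x_new += step * p_new
            if l < n_leapfrog - 1:
                p_new -= step * energy_grad(x_new, w, sigma)
        p_new -= 0.5 * step * energy_grad(x_new, w, sigma)
        # Metropolis accept/reject on the change in total energy.
        h_old = energy(x, w, sigma) + 0.5 * (p @ p)
        h_new = energy(x_new, w, sigma) + 0.5 * (p_new @ p_new)
        if np.log(rng.uniform()) < h_old - h_new:
            x = x_new
        samples[i] = x
    return samples
```

Calling `hmc_sample(np.array([1.0, -2.0]))` and averaging the samples after a short burn-in should give a mean close to `w`, since the tilted Gaussian is centered at σ²w. In a real ConvNet setting the linear score would be replaced by the network's response at the chosen node, and its gradient would come from back-propagation; that substitution is beyond this sketch.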


Keywords: big data, deep learning
