Statistics and Its Interface
Volume 2 (2009)
Alignment of protein mass spectrometry data by integrated Markov chain shifting method
Pages: 329 – 340
Mass spectrometers such as SELDI-TOF (surface-enhanced laser desorption/ionization time-of-flight) and MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) measure the relative abundance of different protein ions or protein fragments (peptides) indexed by the mass-to-charge ratio (m/z). A distinctive characteristic of MS spectra is their variability in both m/z values and intensity magnitudes. We propose modelling the log-intensities by a semiparametric model and the m/z values by the integrated Markov chain shifting (IMS) model, in which the second-order differences of the random effects are assumed to follow a second-order Markov chain. Spectra are aligned by averaging over the random shifts conditional on the observed intensity information. The unknown parameters are estimated by an iterative nonparametric maximum profile likelihood method and a Gaussian kernel approximation. The bandwidths in the kernel approximation are taken to be 0.04%–0.08% of the m/z values. Simulation results show that the proposed approach achieves satisfactory alignment, reducing the intensity variations of the misaligned spectra by around 75%. Most alignment algorithms align spectra by clustering neighboring peaks and do not incorporate peak height information. Our semiparametric random shifting method builds a model that takes into account both the random shift effects of neighboring m/z values and the similarity of the intensity magnitudes of common peaks within a range of about 50% of the intensity values.
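The bandwidth rule quoted above can be illustrated with a small sketch. It is not the authors' estimation procedure: the abstract does not say how the Gaussian kernel enters the profile likelihood, so the example below only assumes a generic Nadaraya-Watson smoother of log-intensities in which the bandwidth at each point is a fixed fraction of its m/z value. The function names and the default fraction of 0.06% (a value inside the stated 0.04%–0.08% range) are hypothetical choices made for illustration.

```python
import numpy as np


def proportional_bandwidths(mz, fraction=0.0006):
    """Return a bandwidth for each m/z value as a fixed fraction of it.

    The default 0.0006 (0.06%) is an illustrative choice inside the
    0.04%-0.08% range quoted in the abstract, not a value from the paper.
    """
    mz = np.asarray(mz, dtype=float)
    return fraction * mz


def gaussian_kernel_smooth(mz, log_intensity, fraction=0.0006):
    """Nadaraya-Watson smoother with a Gaussian kernel and local bandwidths.

    Assumption: this generic smoother stands in for the unspecified
    kernel approximation; it only demonstrates the bandwidth rule.
    """
    mz = np.asarray(mz, dtype=float)
    y = np.asarray(log_intensity, dtype=float)
    h = proportional_bandwidths(mz, fraction)
    # z[i, j]: distance from target mz[i] to observation mz[j],
    # scaled by the bandwidth attached to the target point.
    z = (mz[None, :] - mz[:, None]) / h[:, None]
    weights = np.exp(-0.5 * z ** 2)
    # Weighted average of the observations for each target point.
    return (weights @ y) / weights.sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mz = np.linspace(5000.0, 5050.0, 501)
    # Synthetic log-intensity: one peak plus noise (made-up data).
    signal = np.exp(-0.5 * ((mz - 5025.0) / 4.0) ** 2)
    noisy = signal + rng.normal(scale=0.05, size=mz.size)
    smooth = gaussian_kernel_smooth(mz, noisy)
    print(smooth.shape)
```

Because the bandwidth grows with m/z, peaks at higher masses are smoothed over a wider window, which matches the usual widening of time-of-flight peaks at larger m/z.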
Keywords
MS spectra, semiparametric model, Markov chain, integrated Markov chain shifting, profile likelihood
2010 Mathematics Subject Classification
Primary 62P10. Secondary 62Gxx, 92F05.
Published 1 January 2009