Statistics and Its Interface
Volume 8 (2015)
Special Issue on Modern Bayesian Statistics (Part II)
Guest Editor: Ming-Hui Chen (University of Connecticut)
A Bayes testing approach to metagenomic profiling in bacteria
Pages: 173 – 185
We model next generation sequencing (NGS) read counts with a multinomial likelihood and a Dirichlet prior, and detect the presence of bacteria in a metagenomic sample via marginal Bayes testing for each bacterial strain. The NGS reads per strain are counted fractionally, with each read contributing an equal amount to each strain it might represent. The detection threshold is strain-dependent, and we correct for the dependence among the NGS reads by finding the knee in a curve representing the tradeoff between detecting too many strains and detecting too few. As a check, we evaluate the joint posterior probabilities for the presence of two strains of bacteria and find relatively little dependence. We apply our techniques to two data sets and compare our results with those of the Human Microbiome Project. We conclude with a discussion of the issues surrounding multiple corrections in a Bayes context.
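The fractional counting rule and the Dirichlet-multinomial update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy reads, and the symmetric prior parameter `alpha` are all assumptions made for illustration, and only posterior means are shown, not the strain-dependent thresholds or the knee-based correction.

```python
from collections import defaultdict


def fractional_counts(read_hits):
    """Count reads per strain fractionally.

    read_hits is a list with one entry per read; each entry is the set of
    strain names that read might represent. A read compatible with k strains
    contributes 1/k to each of them (hypothetical helper, not from the paper).
    """
    counts = defaultdict(float)
    for strains in read_hits:
        strains = set(strains)
        if not strains:
            continue  # a read matching no strain contributes nothing
        share = 1.0 / len(strains)
        for s in strains:
            counts[s] += share
    return dict(counts)


def posterior_means(counts, strains, alpha=1.0):
    """Posterior mean abundance per strain under a symmetric Dirichlet prior.

    With a Dirichlet(alpha, ..., alpha) prior and multinomial counts n, the
    posterior is Dirichlet(alpha + n), whose mean for strain i is
    (alpha + n_i) / sum_j (alpha + n_j). The value of alpha is an assumption.
    """
    total = sum(counts.get(s, 0.0) for s in strains) + alpha * len(strains)
    return {s: (counts.get(s, 0.0) + alpha) / total for s in strains}


# Toy data: four reads, each listing the strains it could have come from.
reads = [{"A"}, {"A", "B"}, {"B", "C"}, {"A", "B", "C"}]
counts = fractional_counts(reads)
means = posterior_means(counts, ["A", "B", "C"])
```

Because each read's contributions sum to one, the fractional counts add up to the number of mapped reads, so they can stand in for the multinomial counts in the update.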
Keywords
metagenomics, Bayes testing, bacteria, dependence
2010 Mathematics Subject Classification
Primary 62F15, 62P10. Secondary 62-07, 62F03.