Bayesian community-wide culture-independent microbial source tracking

Abstract
SourceTracker estimates the proportion and origin of contaminants in a given sample. Its database of known contaminants should prove useful for screening metagenomic datasets.

Contamination is a critical issue in high-throughput metagenomic studies, yet progress toward a comprehensive solution has been limited. We present SourceTracker, a Bayesian approach that estimates the proportion of a given community that comes from each of a set of possible source environments. We applied SourceTracker to microbial surveys from neonatal intensive care units (NICUs), offices and molecular biology laboratories, and we provide a database of known contaminants for future testing.
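
To make the idea of estimating source proportions concrete, the following is a minimal sketch and not the SourceTracker implementation. It assumes each source environment is summarized by a fixed taxon probability profile and treats a sample as a mixture of those profiles. It swaps in a plain maximum-likelihood expectation-maximization (EM) update in place of a full Bayesian treatment. The function name `estimate_source_proportions`, the source names, and all numbers are hypothetical and chosen only for illustration.

```python
def estimate_source_proportions(sink_counts, source_profiles, n_iter=200):
    """Estimate mixing proportions of sources in one sample via EM.

    sink_counts: list of observed read counts per taxon in the sample.
    source_profiles: dict mapping a source name to a list of taxon
        probabilities (same taxon order as sink_counts, summing to 1).
    Returns a dict mapping each source name to its estimated proportion.

    Assumption: source profiles are known and fixed. This is a simplified
    stand-in, not the method described in the paper.
    """
    if sum(sink_counts) == 0:
        raise ValueError("sample has no reads")

    names = list(source_profiles)
    k = len(names)
    props = [1.0 / k] * k  # start from equal proportions

    for _ in range(n_iter):
        expected = [0.0] * k
        for t, count in enumerate(sink_counts):
            if count == 0:
                continue
            # E-step: responsibility of each source for reads of taxon t
            weights = [props[s] * source_profiles[names[s]][t] for s in range(k)]
            z = sum(weights)
            if z == 0:
                continue  # taxon unexplained by any source; ignored here
            for s in range(k):
                expected[s] += count * weights[s] / z
        # M-step: proportions are the normalized expected read assignments
        norm = sum(expected)
        if norm == 0:
            raise ValueError("no reads can be explained by the given sources")
        props = [v / norm for v in expected]

    return dict(zip(names, props))


# Hypothetical example: two sources over four taxa, and a sample built as
# an exact 75% / 25% mixture of them (1000 reads in total).
sources = {
    "skin": [0.7, 0.2, 0.1, 0.0],
    "soil": [0.0, 0.1, 0.2, 0.7],
}
sample = [525, 175, 125, 175]
estimated = estimate_source_proportions(sample, sources)
```

In this toy setting the estimate should approach the proportions used to build the sample. A Bayesian version would place priors on the proportions and report uncertainty instead of a single point estimate.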