Poster doi: 10.7490/f1000research.1112744.1
Huaiying Lin, Stefan Green, Pinal Kanabar, Neil Bahroos, and Mark Maienschein-Cline, University of Illinois at Chicago
Abstract
Numerous taxonomy assignment tools are available for metagenomics studies, but their performance has rarely been evaluated on the same set of samples by a third party. Our goal is to measure the discrepancies and consistencies among several popular taxonomy classifiers run on the same samples, and to compare the results obtained with different sequencing technologies. In this study, 8 stool samples were collected and sequenced with both whole genome shotgun and 16S sequencing methods. We compared the consistency of taxonomy profiling using (1) five popular off-the-shelf taxonomy profiling tools: Mothur and Qiime for 16S amplicons, and Gottcha, Metaphlan2 and Metaphyler for shotgun reads; and (2) MEGAN’s Lowest Common Ancestor (LCA) taxonomy profiling algorithm applied to NCBI Megablast-based output against three databases: nt (non-redundant nucleotide), nr (non-redundant amino acid) and 16S microbial nucleotide. From approach (1), we found that the taxonomic composition obtained from 16S amplicons is a useful estimate of the composition obtained from whole genome shotgun reads. Mothur, Qiime, Metaphyler and Metaphlan2 showed similar clustering patterns, whereas Gottcha returned distinct results. From approach (2), we found that when comparing across sequencing/processing methods, shotgun reads are the most stable regardless of database type, while 16S reads show different beta diversity among the three databases. When comparing across databases, shotgun reads showed higher beta diversity than 16S amplicons, and the nt database gave the most divergent taxonomic profiles.
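The LCA idea used in approach (2) can be illustrated with a minimal sketch: a read is assigned to the deepest taxon shared by the lineages of all its database hits. The toy taxonomy, the `PARENT` table and the function names below are hypothetical and chosen only for illustration; this is not MEGAN's implementation, which also applies hit-filtering settings that are not modelled here.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent); the root has no parent.
# A real analysis would use the full NCBI taxonomy instead.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Firmicutes": "Bacteria",
    "Bacteroidetes": "Bacteria",
    "Lactobacillus": "Firmicutes",
    "Clostridium": "Firmicutes",
    "Bacteroides": "Bacteroidetes",
}


def lineage(taxon: str) -> List[str]:
    """Return the path from the root down to `taxon`."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def lowest_common_ancestor(hit_taxa: List[str]) -> str:
    """Return the deepest taxon shared by the lineages of all hits."""
    if not hit_taxa:
        raise ValueError("need at least one hit")
    lineages = [lineage(t) for t in hit_taxa]
    lca = lineages[0][0]  # every lineage starts at the root
    # Walk down level by level until the lineages disagree.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca
```

For example, a read whose hits map to Lactobacillus and Clostridium would be placed at Firmicutes, while a read hitting Lactobacillus and Bacteroides would only be resolved to Bacteria. This shows why the choice of reference database can shift profiles: different databases return different hit sets, and therefore different common ancestors.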