- Frederic ESCUDIE, INRA Toulouse
- Lucas AUER, INRA Toulouse
- Maria BERNARD, INRA Jouy-en-Josas
- Laurent CAUQUIL, INRA Toulouse
- Katia VIDAL, INRA Toulouse
- Sarah MAMAN, INRA Toulouse
- Mahendra MARIADASSOU, INRA Jouy-en-Josas
- Guillermina HERNANDEZ-RAQUET, INRA Toulouse
- Geraldine PASCAL, INRA Toulouse
High-throughput sequencing of 16S/18S/23S RNA amplicons has opened new horizons in the study of microbial communities. At high sequencing depth, current processing pipelines struggle to run quickly, and the most effective solutions are often designed for specialists. These tools are designed to produce both the abundance table of operational taxonomic units (OTUs) and their taxonomic affiliation. In this context we developed the FROGS pipeline ("Find Rapidly OTU with Galaxy Solution"), designed for biologists and run on the Galaxy platform.
A preprocessing tool merges paired sequences into contigs with FLASH, cleans the data with cutadapt, removes chimeras with VSEARCH combined with a cross-validation method, and dereplicates sequences with an in-house Python script. The clustering tool runs SWARM, which uses a local clustering threshold rather than the global clustering threshold used by other software. The affiliation tool returns a taxonomic affiliation for each OTU using both RDP Classifier and NCBI BLAST+ on different databases (Silva, Greengenes). Finally, the post-processing tool lets users filter the abundance table with user-specified criteria and provides statistical results and numerous graphical illustrations of the data.
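As an illustration of the dereplication step mentioned above, the following is a minimal Python sketch of the general idea: identical sequences are collapsed into a single representative with an abundance count. It is not the FROGS script itself; the function name `dereplicate`, the case-insensitive comparison, and the toy reads are assumptions made for this example.

```python
from collections import Counter


def dereplicate(sequences):
    """Collapse identical sequences into (sequence, count) pairs.

    Hypothetical helper for illustration only, not the FROGS implementation.
    Sequences are compared case-insensitively (an assumption of this sketch).
    """
    counts = Counter(seq.upper() for seq in sequences)
    # Sort by decreasing abundance; ties are broken alphabetically so the
    # output order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy input: six reads, three distinct sequences.
reads = ["ACGT", "acgt", "TTGA", "ACGT", "TTGA", "GGCC"]
for seq, count in dereplicate(reads):
    print(f"{seq}\t{count}")
```

Working on unique sequences with their counts, rather than on every read, is one common way to reduce the amount of data passed to the downstream clustering step.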
FROGS was designed to be very fast even on large amounts of 454/HiSeq/MiSeq data, thanks to cutting-edge tools and an optimized design, and it is portable to all Galaxy platforms. FROGS was tested on numerous simulated datasets. In these tests it proved extremely rapid, robust, and highly sensitive for OTU detection, with very few false positives compared to other pipelines widely used by the community.