Poster doi:10.7490/f1000research.1112511.1
Aaron Petkau (1), Franklin Bristow (1), Thomas Matthews (1), Josh Adam (1), Philip Mabon (1), Cameron Sieffert (1), Eric Enns (1), Jennifer Cabral (2), Joel Thiessen (2), Natalie Knox (1), Damion Dooley (3), Aleisha Reimer (1), Eduardo Taboada (6), Alex Keddy (7), Robert G. Beiko (7), William Hsiao (3,4), Morag Graham (1,2), Gary Van Domselaar (1,2), The IRIDA Consortium and Fiona Brinkman (5)
(1) National Microbiology Laboratory, Winnipeg, Canada
(2) University of Manitoba, Winnipeg, Canada
(3) BC Public Health Microbiology and Reference Laboratory, Vancouver, Canada
(4) University of British Columbia, Vancouver, Canada
(5) Simon Fraser University, Burnaby, Canada
(6) National Microbiology Laboratory, Lethbridge, Canada
(7) Dalhousie University, Halifax, Canada
Abstract
Modern epidemiological investigations of infectious disease outbreaks are transitioning to routinely incorporate Whole Genome Sequencing (WGS) data for microbial pathogens. WGS provides a wealth of information previously unavailable, enabling fine-level resolution of isolates using data from the entire genome, down to Single Nucleotide Variants (SNVs). However, the application of WGS for genomic epidemiology continues to be hindered by the complexities of data management and analysis, which often require considerable expertise as data progress from the sequencer into a final report.
Here, we present IRIDA (Integrated Rapid Infectious Disease Analysis), our platform for genomic epidemiology, and SNVPhyl (SNV Phylogenomics), our pipeline for SNV-based phylogenies. IRIDA stores and manages WGS data and associated epidemiological metadata, executes analysis pipelines via an internal Galaxy instance, and provides visualization and evaluation of results. IRIDA-managed data can also be incorporated into external tools, such as independent Galaxy installations, through a REST-like API. SNVPhyl enables the classification and clustering of bacterial isolates by identifying phylogenetically informative SNVs from sequence reads. SNVPhyl is distributed as a Galaxy workflow and suite of tools, enabling incorporation within independent Galaxy instances, batch execution via a provided command-line controller script, or execution as part of the larger IRIDA package.
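To make the idea of clustering isolates by SNVs concrete, the following is a minimal toy sketch in Python. It is not SNVPhyl's implementation: SNVPhyl works from sequence reads, whereas this sketch assumes the isolates have already been reduced to equal-length aligned sequences. The function names `variable_sites` and `snv_distances` and the isolate names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def variable_sites(alignment):
    """Return 0-based column indices where the aligned sequences differ.

    Assumption for this sketch: columns containing anything other than
    A, C, G or T (gaps, ambiguous bases) are skipped entirely.
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    sites = []
    for i in range(length):
        column = {s[i] for s in seqs}
        if column <= set("ACGT") and len(column) > 1:
            sites.append(i)
    return sites


def snv_distances(alignment):
    """Count pairwise differences over the variable sites only."""
    sites = variable_sites(alignment)
    distances = {}
    for a, b in combinations(sorted(alignment), 2):
        distances[(a, b)] = sum(
            alignment[a][i] != alignment[b][i] for i in sites
        )
    return distances


# Hypothetical toy alignment; real data would come from a variant-calling step.
toy = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "ACCTACGTAT",
    "isolate_D": "ACCTAC-TAT",
}

if __name__ == "__main__":
    print("variable sites:", variable_sites(toy))
    for pair, d in snv_distances(toy).items():
        print(pair, d)
```

A pairwise distance table of this kind is one simple input for grouping closely related isolates; a phylogeny built from the same variable sites carries more information about how the isolates are related.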
IRIDA and SNVPhyl have shown considerable success within Canada as we transition towards routine sequencing for surveillance and outbreak investigations. With the help of the Galaxy community, we have made significant improvements over previous years, and IRIDA and SNVPhyl are now freely available at https://github.com/phac-nml/irida and http://snvphyl.readthedocs.org/.