That's why we chose Galaxy. ChemFlow is being implemented with our own functions. It now includes most of the processing tools: importing and converting our data, and running chemometrics methods such as calibrations and classifications.
We are very satisfied with the performance of ChemFlow running on a server. Some issues have been fixed; others are still pending.
In summary, Galaxy is being used in a new domain, chemometrics, aimed at a new user community, and will be a central platform for a new e-learning module delivered as a MOOC.
→ Slides doi: 10.7490/f1000research.1112750.1
→ Video
Author
Abdulrahman Azab
Björn Grüning
Abstract
This talk is relevant mainly for advanced developers and sysadmins who wish to support Docker on their systems but are skeptical about its security. It is also relevant for running Galaxy in production on top of an HPC system.
It shows how to configure the system to run Docker containers as the local user in a simple and quick way, without having to worry about, e.g., connecting to LDAP from within containers.
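As a rough illustration only, and not necessarily the approach presented in the talk, one common way to run a container as the calling user is Docker's `--user` option with a numeric UID and GID. Numeric IDs need no name lookup inside the container. The helper name `docker_run_command` and the example image are assumptions made for this sketch.

```python
import os


def docker_run_command(image, command, uid=None, gid=None):
    """Build a `docker run` argument list that runs as the given user.

    Hypothetical helper for illustration. Defaults to the UID/GID of the
    calling process (Unix only).
    """
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    # Numeric uid:gid avoids any user-name resolution inside the container.
    return ["docker", "run", "--rm", "--user", f"{uid}:{gid}", image, *command]


# Example: build (but do not execute) the command for an assumed image.
cmd = docker_run_command("ubuntu", ["id"], uid=1000, gid=1000)
```

The list can be passed to `subprocess.run` on a host where Docker is installed; files the container writes to mounted volumes are then owned by the local user rather than root.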
→ Slides doi: 10.7490/f1000research.1112751.1
→ Video
As use of Galaxy increases and computational resources are continuously busy, it becomes important to optimize resource usage. To address this issue, we have developed Dynamic Tool Destination (DTD), a dynamic job destination that works with all tools and destinations. In DTD, an administrator sets up rules for each tool in a YAML file. These rules define which destination a tool's job should go to when particular parameters are present, when input data is large or small, etc. DTD is open source under the Apache License and is available on GitHub at https://github.com/phac-nml/dynamic-tool-destination
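The idea of per-tool rules can be sketched as follows. This is a minimal illustration under stated assumptions, not DTD's actual rule schema or code: the rule fields, tool names, and destination names below are all hypothetical, and the rules are written as a Python dictionary rather than loaded from YAML so that the example stays self-contained.

```python
MB = 1024 ** 2

# Hypothetical rule table; NOT the real DTD YAML format.
RULES = {
    "bwa": [
        {"type": "file_size", "lower": 0, "upper": 100 * MB, "destination": "local"},
        {"type": "file_size", "lower": 100 * MB, "upper": None, "destination": "cluster_big"},
    ],
    "blast": [
        {"type": "parameter", "name": "db", "value": "nt", "destination": "cluster_big"},
    ],
}
DEFAULT_DESTINATION = "cluster_default"


def choose_destination(tool_id, input_bytes, params):
    """Return the destination of the first matching rule, else the default."""
    for rule in RULES.get(tool_id, []):
        if rule["type"] == "file_size":
            upper = rule["upper"]
            # Match when lower <= size < upper (no upper bound if None).
            if rule["lower"] <= input_bytes and (upper is None or input_bytes < upper):
                return rule["destination"]
        elif rule["type"] == "parameter":
            if params.get(rule["name"]) == rule["value"]:
                return rule["destination"]
    return DEFAULT_DESTINATION
```

With this table, a small `bwa` input stays on the `local` destination, a large one goes to `cluster_big`, and any tool without a matching rule falls back to the default.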
→ Slides doi: 10.7490/f1000research.1112752.1
→ Video
Classic bioinformatics curricula are limited by relatively rigid course compartmentalization, reliance on expensive proprietary IT/bioinformatics tools, and a limited grading system as the outcome of completing the course. Here we present a curriculum infused with real-life research-based projects such as whole genome analysis, gene expression arrays and molecular dynamics, applied to aging, cancer and pharmacogenomics. These projects serve as pivotal points for integrating biomedical science, computer science and statistics into one coherent interdisciplinary subject known as bioinformatics. Each project has scientific objectives that serve as an underlying platform for educational goals. Students join the projects after completing a basic course that familiarizes them with the technical and scientific aspects of the projects. The curriculum is based on 100% open source, cutting-edge, evolving technology. This allows teaching students to use the most current technology at a fraction of the price of proprietary software. The use of real-life projects brings the excitement of involvement in pertinent discoveries and facilitates learning and open sharing of ideas. As the outcome of completing the projects, students will develop the skills, knowledge, and hands-on experience that will make them competitive in today's intensive and rapidly changing field of computational biology.
→ Slides doi: 10.7490/f1000research.1112753.1
→ Video
Metaproteomics characterizes proteins expressed by microorganism communities (microbiome) present in environmental samples or a host organism. Mass spectrometry (MS)-based metaproteomics has catalyzed new discoveries into the functional dynamics of microbiomes (Wilmes et al 2015, doi: 10.1002/pmic.201500183). Metaproteomic informatics is distinctly challenging due to the large databases and complex processing steps involved. This challenge limits widespread use of metaproteomics. Through modular workflows, we demonstrate the use of the Galaxy bioinformatics framework as a metaproteomic informatics solution (Jagtap et al 2015; doi: 10.1002/pmic.201500074). The workflow output results are compatible with tools for taxonomic and functional characterization (Unipept and MEGAN5). MEGAN5 was used to generate functional characterization of the metaproteome using Inter2Pro pathway analysis. These workflows enable new discoveries from diverse communities such as dental plaques (Rudney et al 2015, doi: 10.1186/s40168-015-0136-z), bronchoalveolar lavage fluid (BALF), lung tissue, and cervical-vaginal fluid (CVF). Our results demonstrate the power of discovery metaproteomics to add functional understanding to microbiomes, beyond what is possible using traditional metagenomic approaches.
→ Slides doi: 10.7490/f1000research.1112754.1
→ Video
GenAP is a Canadian platform that provides Galaxy instances across different Canadian HPC centers. With more than 7 TB of reference genomes, replicating this data in all HPC centers becomes expensive and hard to keep in sync. The CernVM File System (CVMFS) allows us to centralize the provisioning, replicate the data and distribute genome references on demand. With CVMFS, the local machine only imports the genomes necessary for the job being run, allowing the HPC centers to use minimal storage.
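The on-demand behaviour described above can be pictured with a small analogy. The sketch below is not how CVMFS is implemented or configured; it only mimics the idea with plain file copies, and the function name and directory layout are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def fetch_genome(name, central_repo, local_cache):
    """Copy a genome into the local cache only when a job first asks for it.

    Hypothetical helper: stands in for a file system that fetches on demand.
    """
    cached = Path(local_cache) / name
    if not cached.exists():
        cached.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(Path(central_repo) / name, cached)
    return cached
```

A job that needs only one reference genome triggers a single copy, so the local cache holds just the data that jobs have actually requested while the full collection stays in the central repository.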
→ Slides doi: 10.7490/f1000research.1112755.1
→ Video
Circos is a favourite tool among biologists for production-quality plots; however, building the initial plots requires an extremely large activation energy due to Circos' steep learning curve. We have been developing a generic and easily configurable Galaxy tool that generates Circos plots and also provides the generated configuration files, allowing further tweaking and customization after the fact. We have made the tool publicly available during development and have already received contributions during the GCC2016 Hackathon.
→ Slides doi: 10.7490/f1000research.1112756.1
→ Video
Integrating workflow support for GenomeSpace into Galaxy.
The GenomeSpace importer/exporter itself has been rewritten as a standalone pip-installable tool, available here: https://github.com/gvlproject/python-genomespaceclient. We hope to transfer that code back into GenomeSpace or Galaxy as a set of Python bindings plus a command-line client for GenomeSpace.
There's a 3-minute video of how things work here:
https://www.youtube.com/watch?v=5QPtWS_ab0I
See https://github.com/galaxyproject/galaxy/pull/1814
→ Slides doi: 10.7490/f1000research.1112757.1
→ Video
Monarch (https://monarchinitiative.org) integrates a variety of genomic, phenotypic, and disease data by leveraging ontologies to create relationships across multiple organisms.
We have (quickly, using Planemo!) created a Galaxy tool to wrap the web services exposed by Monarch, including the phenopacket implementation.
Please let us know how we can improve on this first cut. We look forward to getting some feedback from you.
→ Slides doi: 10.7490/f1000research.1112758.1
→ Video
A report on the GCC2016 Datathon.