Poster doi:10.7490/f1000research.1112472.1
Anil S. Thanki, Nicola Soranzo, Robert P. Davey
The Genome Analysis Centre, Norwich, UK
The Ensembl GeneTrees pipeline [1] infers the evolutionary history of gene families, represented as gene trees. Each gene tree is analysed alongside the corresponding species tree to detect duplication and speciation events. The pipeline is a large, complex suite of interconnected tools and scripts with many dependencies, which makes it difficult to port and replicate on a different platform.
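To make the idea of detecting duplication and speciation events concrete, the sketch below labels the internal nodes of a toy gene tree with the simple species-overlap rule: a node is called a duplication if its two subtrees share at least one species, and a speciation otherwise. This is a simplified stand-in for illustration, not the reconciliation against a species tree that the GeneTrees pipeline performs; the nested-tuple tree, the "gene|species" leaf format and the function names are all assumptions.

```python
# Toy gene tree as nested 2-tuples; leaves are "gene|species" strings.
# Both the tree encoding and the leaf naming scheme are hypothetical.

def species_of(leaf):
    """Extract the species name from a 'gene|species' leaf label."""
    return leaf.split("|")[1]


def label_events(node, events=None):
    """Walk the tree in post-order and label each internal node.

    Returns (set of species under the node, list of (event, species) pairs).
    Species-overlap rule: shared species between the two children means
    duplication; disjoint species sets mean speciation.
    """
    if events is None:
        events = []
    if isinstance(node, str):  # leaf
        return {species_of(node)}, events
    left, right = node
    left_species, _ = label_events(left, events)
    right_species, _ = label_events(right, events)
    kind = "duplication" if left_species & right_species else "speciation"
    events.append((kind, sorted(left_species | right_species)))
    return left_species | right_species, events


# Two paralogous human/mouse gene pairs, with a zebrafish outgroup.
tree = (
    (("geneA1|human", "geneA1|mouse"), ("geneA2|human", "geneA2|mouse")),
    "geneA|zebrafish",
)
_, events = label_events(tree)
for kind, species in events:
    print(kind, species)
```

In this toy tree the node joining the two human/mouse clades is labelled a duplication, while the remaining internal nodes are labelled speciations.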
We have simplified this process by converting the command-line GeneTrees pipeline into an open-source Galaxy workflow called GeneSeqToFamily. The workflow consists of more than 20 steps and combines tools already available in the Galaxy ToolShed with new tools that we developed, including wrappers for TreeBeST and hcluster_sg, data format converters and output parsers. We have also developed tools that retrieve sequences, features and gene trees from Ensembl through its REST API; their output can be used as input for the workflow.
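As a rough illustration of the kind of request such retrieval tools make, the sketch below builds URLs for three public Ensembl REST endpoints (`sequence/id`, `lookup/id` and `genetree/id`) and fetches the JSON response with the Python standard library. It is not the code of the Galaxy tools themselves; the helper names are hypothetical, and the identifiers in the commented usage lines are only examples.

```python
import json
import urllib.request

SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def sequence_url(stable_id, seq_type="protein"):
    """URL for the sequence of an Ensembl stable ID (e.g. a protein ID)."""
    return f"{SERVER}/sequence/id/{stable_id}?type={seq_type}"


def feature_url(stable_id):
    """URL for a gene's features; expand=1 also returns its transcripts."""
    return f"{SERVER}/lookup/id/{stable_id}?expand=1"


def genetree_url(genetree_id):
    """URL for a gene tree identified by its Ensembl gene tree stable ID."""
    return f"{SERVER}/genetree/id/{genetree_id}"


def fetch_json(url, timeout=30):
    """Request a URL as JSON and return the decoded response."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires network access; identifiers are illustrative):
# gene = fetch_json(feature_url("ENSG00000157764"))
# tree = fetch_json(genetree_url("ENSGT00390000003602"))
```

Keeping URL construction separate from the network call makes the helpers easy to check offline and to reuse with a different HTTP client.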
1. Vilella AJ, Severin J, Ureta-Vidal A, Heng L, Durbin R, Birney E: EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res. 2009, 19(2):327–335.