→ Slides doi:10.7490/f1000research.1112467.1
Indiana University
Abstract
Precision genomics is essential in medicine, environmental health, sustainable agriculture, and biological research. Yet popular genome informatics methods lag behind the high levels of accuracy and completeness in gene construction that are attainable with current RNA-seq data. EvidentialGene is a genome informatics pipeline for gene construction with measurably high accuracy and completeness for animals and plants, from insects, ticks and crustaceans to crop plants and trees, to fishes and other vertebrates. It uses big data from gene sequencers to generate larger gene sets than alternative methods, then reduces those sets, using biological criteria of protein coding and orthology, to accurate species gene sets. EvidentialGene is in production use at compute centers in the USA, Sweden, Australia and elsewhere.
The software pair of MAKER and Trinity is now a common recipe in gene discovery publications, but greater accuracy is possible and easy to obtain. Recent examples with the disease vector mosquitoes Aedes (yellow fever, Zika virus) and Anopheles (malaria) show that EvidentialGene surpasses the accuracy of published genes from MAKER, Trinity and VectorBase. For fishes, EvidentialGene surpasses gene sets recently published from the MAKER, Trinity and NCBI Eukaryote genome annotation pipelines.
Galaxy installations that provide genome and transcriptome services will benefit from adding EvidentialGene. This author challenges Galaxy centers running MAKER, Trinity or other gene construction pipelines to reach accuracy and completeness comparable to that of EvidentialGene, and will collaborate toward that goal with select genomics projects.