AGILE: an assembled genome mining pipeline.

Hughes, Graham M; Teeling, Emma C.
Bioinformatics; 35(7): 1252-1254, 2019 04 01.
Article in English | MEDLINE | ID: mdl-30184049


A number of limiting factors mean that traditional genome annotation tools either fail or perform sub-optimally when trying to detect coding sequences in poor quality genome assemblies/genome reports. This means that potentially useful data is accessible only to those with specific skills and expertise in assembly and annotation. We present an Assembled-Genome mIning pipeLinE (AGILE) written in Perl that combines bioinformatics tools with a number of steps to overcome the limitations imposed by such assemblies when applied to highly fragmented genomes. Our methodology uses user-specified query genes from a closely related species to mine and annotate coding sequences that would traditionally be missed by standard annotation packages. Despite a focus on mammalian genomes, the generalized implementation means that it may be applied to any genome assembly, providing a means for non-specialists to gather gene sequences for downstream analyses.

AVAILABILITY AND IMPLEMENTATION: Source code and associated files are available at https// and https// Singularity and Virtual Box images available at https//

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
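To make the query-driven mining idea in the abstract more concrete, the following is a minimal sketch in Python. It is not AGILE's code (AGILE itself is written in Perl), and the abstract does not describe the pipeline's internal steps. It assumes, purely for illustration, that each exon of a query gene has already been aligned against the assembly and that the alignments are available as simple records. The `Hit` type, the `best_hits_per_exon` function, and the gene and scaffold names are all hypothetical. The sketch shows one reason fragmented assemblies are awkward: the exons of a single gene may sit on different scaffolds, so hits have to be collected per exon and not per contiguous locus.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One alignment of a query exon to the assembly (hypothetical record)."""
    gene: str       # name of the user-specified query gene
    exon: int       # 1-based exon index within the query gene
    scaffold: str   # scaffold or contig carrying the hit
    start: int      # hit start coordinate on the scaffold
    end: int        # hit end coordinate on the scaffold
    score: float    # alignment score; higher is better


def best_hits_per_exon(hits: Iterable[Hit]) -> Dict[str, List[Hit]]:
    """Keep the highest-scoring hit for each (gene, exon) pair.

    Returns a mapping from gene name to its retained hits, ordered by
    exon index, regardless of which scaffold each hit lies on.
    """
    best: Dict[tuple, Hit] = {}
    for hit in hits:
        key = (hit.gene, hit.exon)
        if key not in best or hit.score > best[key].score:
            best[key] = hit

    per_gene: Dict[str, List[Hit]] = defaultdict(list)
    # Keys are unique (gene, exon) tuples, so sorting orders exons per gene.
    for (gene, _exon), hit in sorted(best.items()):
        per_gene[gene].append(hit)
    return dict(per_gene)


# Illustrative input: exon 2 of a made-up gene hits two scaffolds.
example_hits = [
    Hit("GENE_A", 1, "scaffold_12", 100, 250, 180.0),
    Hit("GENE_A", 2, "scaffold_98", 40, 160, 150.0),
    Hit("GENE_A", 2, "scaffold_12", 900, 1000, 60.0),
]
gene_models = best_hits_per_exon(example_hits)
```

Here `gene_models["GENE_A"]` holds one hit per exon, drawn from two different scaffolds. A real pipeline would also need to validate reading frames, splice boundaries, and strand consistency before reporting a coding sequence; none of that is attempted in this sketch.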
