Rainbow: an integrated tool for efficient clustering and assembling RAD-seq reads.

Chong, Zechen; Ruan, Jue; Wu, Chung-I.
Bioinformatics; 28(21): 2732-7, 2012 Nov 01.
Article in English | MEDLINE | ID: mdl-22942077
MOTIVATION: The restriction-site associated DNA sequencing (RAD-seq) method takes full advantage of next-generation sequencing technology. By clustering paired-end short reads into groups that each share a unique tag, the RAD-seq assembly problem is divided into subproblems. Quickly and accurately clustering and assembling millions of RAD-seq reads in the presence of sequencing errors, varying levels of heterozygosity and repetitive sequences remains a challenging problem.


RESULTS: Rainbow was developed to provide an ultra-fast and memory-efficient solution for clustering and assembling short reads produced by RAD-seq. First, Rainbow clusters reads using a spaced seed method. Then, it applies a strategy similar to heterozygote calling to divide potential groups into haplotypes in a top-down manner. Next, along a guided tree, it iteratively merges sibling leaves in a bottom-up manner if they are similar enough, where similarity is defined by comparing the second reads of a RAD segment. This approach tries to collapse heterozygotes while discriminating repetitive sequences. Finally, Rainbow uses a greedy algorithm to locally assemble the merged reads into contigs. Rainbow outputs not only the optimal assembly but also suboptimal assembly results. Based on simulated data and a real guppy RAD-seq dataset, we show that Rainbow is more competent than other tools at handling RAD-seq data.

AVAILABILITY: Source code in C; Rainbow is freely available at
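
The spaced seed clustering step named in the abstract can be illustrated with a minimal Python sketch. It is not Rainbow's implementation (which is written in C): the seed mask, the function names `seed_key` and `cluster_by_spaced_seed`, and the toy reads are all hypothetical, chosen only to show how reads that differ at an ignored position still fall into the same cluster.

```python
from collections import defaultdict

# Hypothetical seed mask (an assumption, not taken from Rainbow):
# '1' positions must match exactly, '0' positions are ignored.
SEED_MASK = "110110110110"


def seed_key(read, mask=SEED_MASK):
    """Return the bases of `read` found at the '1' positions of `mask`."""
    return "".join(base for base, bit in zip(read, mask) if bit == "1")


def cluster_by_spaced_seed(reads, mask=SEED_MASK):
    """Group reads whose masked prefixes are identical."""
    clusters = defaultdict(list)
    for read in reads:
        if len(read) < len(mask):
            continue  # read is too short to evaluate the seed
        clusters[seed_key(read, mask)].append(read)
    # Dicts preserve insertion order, so clusters appear in first-seen order.
    return list(clusters.values())


# Toy reads, invented for illustration only.
reads = [
    "ACGTACGTACGTAA",
    "ACCTACGTACGTAA",  # differs from the first read at index 2, a '0' position
    "TTGTACGTACGTAA",  # differs from the first read at indices 0 and 1, both '1' positions
]
clusters = cluster_by_spaced_seed(reads)
```

In this sketch the first two reads share a cluster because their only difference falls on an ignored position, while the third read forms its own cluster. A single mask misses reads whose mismatch lands on a '1' position; using several complementary masks is one common way to address that, though the abstract does not say how Rainbow handles it.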