Why de-novo genome assembly?
With recent advances in DNA sequencing technologies, researchers can now carry out high-throughput, cost-efficient sequencing of many more genomes than was ever previously possible. De-novo genome assembly, one of the most prominent areas of genomic sequencing and assembly, now makes it possible for individual researchers to sequence the genome of an organism of their choosing.
What is de-novo genome assembly?
In bioinformatics, sequence assembly refers to merging fragments of a much longer DNA sequence in order to reconstruct the original sequence. De-novo sequencing refers to sequencing a novel genome for which no reference sequence is available for alignment. In de-novo assembly, individual sequence reads are merged into long contiguous sequences, called contigs, that share the same nucleotide sequence as the template DNA from which the reads were derived. The quality of de-novo sequence data depends on the size and continuity of the contigs (i.e., the number of gaps in the data).
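The idea of merging overlapping reads into a contig can be illustrated with a toy example. The following is only a minimal sketch of a greedy overlap merge, not the method used by any particular assembler: the function names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical, the reads are assumed to be error-free and all from the same strand, and reads fully contained in other reads are not handled.

```python
def overlap(a, b, min_len=3):
    """Return the length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Stops when no pair overlaps by at least `min_len` bases and returns
    the remaining sequences as contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs
        # Join the two sequences, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up reads taken from the template ATGGCGTGCAATCC
reads = ["ATGGCGTG", "GCGTGCAA", "TGCAATCC"]
print(greedy_assemble(reads))  # prints ['ATGGCGTGCAATCC']
```

Real sequencing data contain errors, repeats, and reads from both strands, which is why practical assemblers rely on more elaborate graph-based approaches rather than a simple pairwise merge like this one.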
How to perform de-novo genome assembly?
First, obtain the sequence read files from the sequencing machine and inspect the reads to see what was obtained and what their quality is like. After quality evaluation, clean up the raw data (a step called quality trimming) if necessary. Then choose an appropriate set of assembly parameters and assemble the data into contigs/scaffolds. Lastly, examine the output of the assembly and assess its quality.
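Two of the steps above, quality trimming and assembly assessment, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a replacement for dedicated tools: the function names are hypothetical, quality strings are assumed to use Phred+33 encoding, the threshold of 20 is an arbitrary example value, and N50 is used here as one common summary of contig size (the text does not prescribe a specific metric).

```python
def trim_low_quality_tail(seq, qual, min_q=20, offset=33):
    """Remove bases from the 3' end while their Phred score is below `min_q`.

    `qual` is the FASTQ quality string for `seq`; `offset=33` assumes
    Phred+33 encoding.
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


# 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33
print(trim_low_quality_tail("ACGTACGT", "IIIII###"))  # prints ('ACGTA', 'IIIII')

# Made-up contig lengths: the total is 280, and 100 + 80 covers at least half
print(n50([100, 80, 50, 30, 20]))  # prints 80
```

A larger N50 generally indicates a more contiguous assembly, but it says nothing about correctness, so it is best read alongside other checks on the assembly output.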
Advantages: The main benefit of de-novo assembly is that it can generate accurate reference sequences, even for complex or polyploid genomes, and provides useful information for mapping genomes of novel organisms or finishing genomes of known organisms. It can also help resolve highly similar or repetitive regions, and it is able to identify structural variants and complex rearrangements, such as deletions, inversions, or translocations.