Importance of Next-Generation DNA Sequencing Biology Essay



Over the last fifty years, traditional DNA sequencing techniques have evolved to determine the DNA sequences of communities of organisms. A number of different methods are in use, and they commonly include the amplification of DNA strands (Dale, 2012). An early milestone was the complete genome sequence of a bacteriophage (about 5 kb). More recently, several significant developments in sequencing techniques have made it possible to better understand populations and communities of plants, animals and microbes (Dale et al, 2010). These sequencing techniques have provided a large body of evidence about the identity of living organisms; as a result, the door to what became next-generation DNA sequencing was opened in 1995 by the first genome sequencing of microorganisms such as Shigella sonnei, Haemophilus influenzae and Mycoplasma genitalium (Dale, 2012). Six years later, two organisations, the Human Genome Project (HGP) and the private company Celera, published the human DNA sequence of more than 3 billion base pairs. Advances in biological and chemical technology have played a predominant role in making such large genome sequencing projects possible (Primrose et al, 2011). These modern platforms can shift the study of populations and communities of organisms from genes to genomes.

This project defines sequencing and metagenomics and introduces the role and importance of both conventional platforms and next-generation DNA sequencing in the field. It also sets out the advantages and disadvantages of both kinds of platform.


DNA sequencing can be used to determine and analyse the sequence of a single gene, a whole chromosome or a complete genome. Whatever approach is used, DNA sequencing is the principal method of determining the exact order of the four nucleotides in the DNA of microbes, plants and animals (Dale, 2012). There are a number of strategies and technologies for determining the order of adenine, guanine, thymine and cytosine. As a result, databanks now hold considerable amounts of sequence data (Gibson, 2008). Researchers are therefore concerned with how newly discovered DNA sequences will be handled in the future: a new sequence may turn out to be nearly identical to one that has already been found. Computer programs are well suited to comparing DNA molecules (Dale, 2012). They can distinguish two DNA sequences clearly when the two strands are of the same length. One of the most important programs is BLAST (Basic Local Alignment Search Tool) (Li et al, 2008). A BLAST search is a powerful way to compare a query sequence with the sequences held in databanks or in a genomic library and to identify the positions at which two DNA strands match. For instance, when two DNA sequences such as those of Shigella sonnei and Escherichia coli are compared, a BLAST comparison displays regions that match in the same orientation and regions that match in the opposite orientation in different colours (Dale, 2012). Moreover, BLAST can search nucleotide databases, protein databases and translated nucleotide databases (Li et al, 2008). In addition, to compare and match the genomes of two microbial strains or two species, researchers can use open reading frames (ORFs), the stretches of sequence that run from a start codon to a stop codon and may encode a protein. Software that predicts ORFs lets researchers compare coding regions and set aside non-coding sequence such as introns.
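The idea of finding regions that match in the same or in the opposite orientation can be illustrated with a very small sketch. It is not BLAST itself: it only looks for exact shared words (k-mers) between a query and a subject sequence, on the forward strand and on the reverse complement. The function names and the word length `k` are assumptions made for this illustration.

```python
# Toy illustration of seed matching in both orientations (not the BLAST algorithm).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def shared_kmers(query: str, subject: str, k: int = 4):
    """List (query_pos, subject_pos, strand) for every exact k-mer match.

    strand is "+" for a match in the same orientation and "-" for a match
    between the reverse complement of the query and the subject.
    """
    # Index every k-mer of the subject by its start position.
    index = {}
    for j in range(len(subject) - k + 1):
        index.setdefault(subject[j:j + k], []).append(j)

    hits = []
    for strand, q in (("+", query), ("-", reverse_complement(query))):
        for i in range(len(q) - k + 1):
            for j in index.get(q[i:i + k], []):
                hits.append((i, j, strand))
    return hits
```

Real alignment tools extend such seed matches into scored local alignments; this sketch stops at the seeds.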


Metagenomics is the study of the genetic material of the different communities of organisms present in one environment, such as soil or the marine environment. It allows us to investigate communities of organisms about which we previously knew nothing (Dale et al, 2010). Conventional and modern platforms show how far genome sequencing has evolved towards identifying and understanding microbial DNA sequences (Dale, 2012). One of the pivotal benefits of metagenomics for microbial communities is that it reaches the substantial number of bacterial and other prokaryotic species that have not yet been isolated or cultured in scientific laboratories. For these species, genome extraction and genome sequencing have not yet been carried out (Kan et al, 2004). Thus, with the help of next-generation DNA techniques, researchers can provide useful information and reliable data for most living eukaryotic and prokaryotic species. They can also construct an appropriate taxonomy and library if they can accurately compare the genomic sequences of unknown microbial species with the available DNA sequences of known microbial species (Dale et al, 2010).

Classic Approaches

Between 1980 and 1995 several different methods were in use. These methods for determining DNA sequences evolved once the Sanger method (the dideoxy method), invented by Frederick Sanger, became widely accessible (Gibson, 2008).

The conventional strategy, also called dideoxy sequencing, synthesises new DNA strands in several steps. Firstly, to begin synthesising new DNA fragments by the dideoxy technique, a single DNA strand, called the template, is required. The reaction also needs an oligonucleotide primer, DNA polymerase, the normal substrates dATP, dCTP, dGTP and dTTP, and fluorescent dyes (Kan et al, 2004). The new strand is synthesised by adding nucleotides complementary to the opposing template. Secondly, the elongation step extends the new strand by the continuous addition of the four nucleotides to the annealed primer (Dale, 2012). A covalent phosphodiester bond is formed between the 5'-phosphate of the incoming nucleotide and the 3'-OH group of the growing strand. Each round of extension is ended by the incorporation of a fluorescently labelled dideoxynucleotide (ddNTP). The elongation reaction is repeated by annealing primer to another template strand and then adding the four nucleotides (Kan et al, 2004). Finally, the repeated steps increase the number of single DNA strands. The extension and repeat reactions are performed many times under controlled temperatures and other conditions. However, the new strands terminated in the extension reactions do not have a 3'-OH group at the end. Therefore, no further bases can be added (Dale, 2012).
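The chain-termination logic described above can be sketched as a small simulation. It assumes, for illustration only, that each extension step has a fixed chance `p_ddntp` of incorporating a labelled ddNTP, that the template is written 3' to 5' so the new strand reads 5' to 3' at the same indices, and that fragments are then sorted by length (as electrophoresis would do). All names and parameter values are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesise_fragments(template, n_molecules=2000, p_ddntp=0.1, seed=0):
    """Simulate many extension reactions on copies of one template.

    Returns (length, terminal_label) pairs, one per terminated molecule.
    """
    rng = random.Random(seed)
    new_strand = "".join(COMPLEMENT[base] for base in template)
    fragments = []
    for _ in range(n_molecules):
        for position, base in enumerate(new_strand, start=1):
            # A ddNTP lacks the 3'-OH group, so extension stops here.
            if rng.random() < p_ddntp:
                fragments.append((position, base))
                break
        # Molecules that never take up a ddNTP are simply not recorded.
    return fragments


def read_sequence(fragments):
    """Order fragments by length and read the dye label at each length."""
    label_by_length = {}
    for length, label in fragments:
        label_by_length[length] = label
    # Assumes every length is represented by at least one fragment.
    return "".join(label_by_length[n] for n in sorted(label_by_length))
```

With enough molecules, every position of the new strand is represented by at least one terminated fragment, which is why reading the terminal labels in order of fragment length recovers the sequence.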

Polymerase Chain Reaction (PCR):

PCR was invented by Kary B. Mullis in 1983, and he received the Nobel Prize in Chemistry for it in 1993. It has provided an important service to genome sequencing and also plays a part in some cycles of some sequencing techniques (Sjoblom et al, 2006). In essence, the PCR reaction has three main steps, and purified genomic DNA of the organism is required to start. PCR can even be used to amplify small DNA fragments (Bartlett et al, 2003). First of all, the double-stranded template is separated into single strands at a high temperature (melting ≈ 92°C). Secondly, in the annealing step, two oligonucleotide primers, each approximately 25-50 bp long, bind to the ends of the opposing template strands as the temperature is lowered (annealing = 45-60°C) (Dale, 2012). Thirdly, the extension step starts when the temperature rises again (extension ≈ 72°C). In this step, DNA polymerase extends the primers along the opposing templates. The first extension step synthesises a pair of new complementary strands, and each elongation step takes about 30 seconds (Dale, 2012). Finally, these steps are normally repeated about 25-40 times to amplify the DNA sequences. Although this technique has played a key role in genome sequencing, it has some drawbacks, such as cost, incomplete termination and contaminating sequences, which are major barriers for those who need the technique (Sjoblom et al, 2006).
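Because each cycle can at most double the number of template molecules, the yield of the reaction can be estimated with simple arithmetic. The sketch below assumes an idealised per-cycle `efficiency` between 0 and 1 (a parameter not given in the essay) and lists the three temperature steps using the figures quoted above; the annealing temperature of 55°C is just one value picked from the 45-60°C range.

```python
# One thermal cycle, using the temperatures quoted in the text.
# 55 is an arbitrary choice within the stated 45-60 degree annealing range.
CYCLE_STEPS = [
    ("melting", 92),
    ("annealing", 55),
    ("extension", 72),
]


def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Estimate the copy number after a given number of PCR cycles.

    efficiency=1.0 means perfect doubling in every cycle; real reactions
    fall short of this, especially in late cycles.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (25, 30, 40):
        print(n, "cycles:", pcr_copies(1, n))
```

Under perfect doubling, a single template molecule gives 2 to the power n copies after n cycles, which is why 25-40 cycles are enough to turn a trace of DNA into a measurable amount.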

Shotgun sequencing:

The guiding principle of this approach is to start from a long DNA sequence and fragment it into a number of shorter pieces. Each small piece has to be of a suitable size for processing (Sjoblom et al, 2006). A vector, which is an extrachromosomal DNA molecule, is needed for this purpose. The bacterial vector is prepared so that the small fragments can be inserted into it to produce random clones. The recombinant vectors are then stored in a small library (Dale, 2012). Each suitable recombinant vector can be called a shotgun clone. It should be remembered that every small strand comes from the original long DNA sequence. At this point there is an unknown number of fragments. A computer algorithm can be used to compare each fragment with all the others and to distinguish overlapping sequences from the rest. When two sequences overlap, they can be joined together to form a contig. Finally, all of the overlapping sequences are assembled into contigs (Anderson, 1981).
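The overlap-and-join step can be sketched with a simple greedy procedure: repeatedly find the pair of fragments with the longest suffix-to-prefix overlap and merge them. This is only an illustrative toy, assuming error-free reads from one strand and a hypothetical minimum overlap `min_len`; production assemblers use far more elaborate methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads into contigs by always joining the best-overlapping pair."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # no remaining overlaps: these are the contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]
```

For example, the three fragments `ATGCGT`, `GCGTAC` and `TACGGA` overlap end to end and are joined into the single contig `ATGCGTACGGA`.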

Next-generation sequencing

Until recently, advances in DNA sequencing were achieved using fundamentally the same technique (the Sanger approach to sequencing) (Dale, 2012). In 2000, the first second-generation method was a signature sequencing reaction. In 2008, the 454 platform was used for the first time to sequence a complete human genome with second-generation technology (Wheeler et al., 2008). Indeed, what distinguished the Human Genome Project from the sequencing run in an ordinary research laboratory was only the scale of the process. This involved the development of high-throughput sample production, loading of the samples into the machines, organisation of the sequencing reactions and analysis of the results. Together these made the sequencing of human DNA a success, even though it took approximately ten years and cost about ten billion dollars (Dale, 2012). Since then, high-throughput sequencing has steadily improved and has significantly reduced both the cost and the duration of sequencing (Sjoblom et al, 2006).

At present, the three main second-generation sequencers are produced by three different commercial companies.

454 Pyrosequencing platform:

Pyrosequencing is one of the next-generation techniques and an alternative to the classic techniques; it proceeds in a series of cycles (Wood et al, 2007). In each cycle only one type of nucleotide is supplied, and detection depends on the release of pyrophosphate when a base (nucleotide) is incorporated into the growing DNA strand (Korshunova, Y., 2008). The released pyrophosphate drives two enzymatic reactions: ATP sulphurylase uses it to convert adenosine 5'-phosphosulphate into adenosine 5'-triphosphate (ATP), which activates a second enzyme, luciferase (Korshunova, Y., 2008). Adenosine 5'-phosphosulphate and luciferin are therefore supplied so that visible light is produced in the wells, where it is measured by a detector. Pyrophosphate is released only if the DNA synthesis reaction is provided with the correct dNTP (Margulies et al, 2005).

The pyrosequencing workflow begins when genomic DNA (such as Shigella sonnei DNA) is isolated and sheared into a large number of small fragments, each of which is then amplified from a single strand (Dale, 2012). Each small fragment is attached to a microbead, and together these form a sequencing library. Each bead with its DNA strand is enclosed in a droplet of an oil emulsion. The droplet around the microbead consists of water containing the reagents needed for the PCR reaction (Margulies et al, 2005). These allow the small DNA fragment to be amplified: PCR synthesises about a million copies of it, which remain attached to the same microbead. The emulsion is then broken to recover the beads carrying amplified DNA. More than one million wells are available to receive the microbeads (Korshunova, 2008). Each prepared bead, with its attached DNA fragment, is loaded into a well of the PicoTiterPlate device; each well is sized to admit only one microbead, because one bead corresponds to one read in the final output. The PicoTiterPlate is then placed in the sequencing instrument (Margulies et al, 2005). A special camera records the chemiluminescent light signal produced each time a base complementary to the template is incorporated. Each DNA capture bead carries more than one million copies of a single-stranded DNA fragment, for example from Shigella sonnei. The signal intensity of incorporation at each well is measured by the 454 pyrosequencing software. Finally, the measured data are analysed for applications such as amplicon variant analysis and de novo sequence assembly (Dale, 2012).
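The one-nucleotide-per-cycle chemistry can be illustrated with a small sketch of a flowgram, the series of light signals recorded for one well. It assumes an idealised signal equal to the number of bases incorporated in each flow, and a repeating flow order of T, A, C, G chosen for this illustration; both are simplifying assumptions, not figures from the essay.

```python
def flowgram(new_strand, flow_order="TACG", max_flows=400):
    """Simulate the light signal for each nucleotide flow in one well.

    new_strand is the strand being synthesised (5' to 3'). Each flow
    supplies one nucleotide type; the signal is the number of bases
    incorporated, so a run of identical bases gives a stronger signal.
    """
    signals = []
    position = 0
    flow = 0
    while position < len(new_strand) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        incorporated = 0
        while position < len(new_strand) and new_strand[position] == base:
            incorporated += 1  # pyrophosphate released, light emitted
            position += 1
        signals.append((base, incorporated))
        flow += 1
    return signals


def decode(signals):
    """Rebuild the synthesised sequence from (base, signal) pairs."""
    return "".join(base * count for base, count in signals)
```

A flow that adds the wrong nucleotide gives a signal of zero, and a run of identical bases gives a proportionally stronger signal; in real instruments that proportionality becomes harder to measure for long runs.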

SOLiD platform:

This strategy is quite similar to the previous platform in that the first step is to produce a library of fragmented DNA strands. The sequencing process itself, however, relies on ligation (Kan et al, 2004). After adapters have been attached to both ends of the DNA fragments, each fragment is linked to a single microbead. As in the platform above, the fragments are amplified by emulsion PCR, in which each bead sits in a small drop of water emulsified in oil (Dale, 2012). All the prepared microbeads are attached to a slide surface, and primers are then annealed to the ends of the short DNA sequences. The main differences between SOLiD sequencing and the other next-generation DNA platforms are that bases are joined by ligation, dNTP reagents are not used, oligonucleotide probes are used, and two nucleotides are read at a time, rather than a polymerase reading the fragment base by base (Kan et al, 2004). Every prepared oligonucleotide has a pair of defined nucleotides at its end; therefore, only the oligonucleotide with the exactly matching base pair can attach next to the annealed primer (Tang et al, 2009). The most important part of the SOLiD technique is that the flow cells are washed with a mixture of labelled oligonucleotides (known as labelled probes) and ligase enzyme (Wood et al, 2007). For example, a probe ending in the two bases guanine and thymine (GT) is ligated only where those two bases pair with the template; probes that do not end in the matching dinucleotide are not ligated and are washed away (Valouev et al, 2008). Correspondingly, the template at that position carries the complementary bases (cytosine and adenine). Finally, the fluorescent dye on the ligated labelled probes emits a characteristic signal, so that a computer can register the attached probes across the slide.
In each cycle of SOLiD sequencing, the mixture of labelled probes and ligase enzyme is added again, so that the sequencing is repeated several times. Although a substantial number of DNA sequences are synthesised and measured to build a library, the cost of each sample is relatively high and the method can measure only short DNA sequences (Kan et al, 2004).
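Reading two nucleotides at a time is usually described in terms of a two-base colour code, in which each of four dye colours stands for a group of dinucleotides. The essay does not spell out the dye scheme, so the sketch below is an assumption: it uses the commonly described encoding in which the colour equals the bitwise XOR of the two base codes (A=0, C=1, G=2, T=3), and it shows why one known starting base is enough to decode a whole read.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def encode_colours(sequence):
    """Translate a base sequence into one colour (0-3) per adjacent pair."""
    return [BASE_CODE[a] ^ BASE_CODE[b]
            for a, b in zip(sequence, sequence[1:])]


def decode_colours(first_base, colours):
    """Recover the bases from the colours, given the known first base."""
    decoded = [first_base]
    for colour in colours:
        # XOR-ing the previous base code with the colour gives the next base.
        decoded.append(BASES[BASE_CODE[decoded[-1]] ^ colour])
    return "".join(decoded)
```

Because every base contributes to two neighbouring colours, a single miscalled colour changes all downstream bases when decoded naively, which is one reason such reads are often analysed in colour space.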

Illumina or Solexa platform (bridge amplification sequencing):

This modern platform involves several precise steps. First of all, extracted DNA, such as Shigella sonnei DNA, is needed to start the process, and this DNA is randomly sheared into small fragments. Adapters are then added to the ends of the small DNA fragments so that they can attach to the flow cell surface (Valouev et al, 2008). Next, the polymerase chain reaction amplifies the DNA fragments and leads to the formation of sequence clusters on the flow cell surface; as a result, the surface of the flow cell is covered with clusters. Approximately 50 million short DNA strands are denatured (Dale, 2012). Each single DNA strand now consists of two adapters and a single fragment. The sequencing cycles, which involve several steps, then begin to synthesise a new strand (Dale et al, 2010). In the first cycle, a free adapter is added to the flow cell surface (slide), after which dye-labelled nucleotides are added to it one at a time. To begin with, two types of single-stranded oligonucleotide, complementary to the adapters, are attached to the slide surface. Neighbouring DNA strands then undergo exponential amplification in situ by a method called bridge amplification (Kan et al, 2004). The free end of each DNA strand bends over and hybridises to a complementary oligonucleotide on the surface, which serves as a primer for a round of DNA synthesis. This cycle repeats automatically many times, so that each original fragment grows into a cluster of several thousand identical DNA fragments. Unattached DNA sequences are then washed away. To continue the sequencing, the mixture of substrates is added repeatedly to extend the DNA strands (Wood et al, 2007).
Finally, the machine reads all the sequences labelled with fluorescent dye. This takes place in each of the fifty million clusters on the slide (Dale, 2012). The important point in this method is that nucleotides are added to the growing DNA strands one at a time and unattached bases are washed away. After each addition the result is immediately read and recorded, the fluorescent label is removed, and the cycle is repeated to record the next nucleotide of the DNA strands. The machine then assembles the read for each cluster from the series of images (Wood et al, 2007).
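The cycle-by-cycle reading of one cluster can be sketched as follows. The example assumes, purely for illustration, that each imaging cycle yields four fluorescence intensities per cluster (one per base, in the order A, C, G, T) and that the base is called from the brightest channel; real base callers also correct for crosstalk and signal decay.

```python
CHANNELS = "ACGT"


def call_bases(intensities):
    """Call one base per cycle for a single cluster.

    intensities is a list with one (A, C, G, T) tuple of fluorescence
    values per sequencing cycle; the brightest channel gives the base.
    """
    read = []
    for cycle in intensities:
        brightest = max(range(len(CHANNELS)), key=lambda k: cycle[k])
        read.append(CHANNELS[brightest])
    return "".join(read)


if __name__ == "__main__":
    cluster = [
        (0.9, 0.1, 0.0, 0.1),  # cycle 1
        (0.1, 0.1, 0.8, 0.2),  # cycle 2
        (0.0, 0.2, 0.1, 0.7),  # cycle 3
    ]
    print(call_bases(cluster))
```

Applying the same call to every cluster in every image is what turns the stack of pictures into millions of short reads.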

The advantages and disadvantages of some classic and modern methods of genome sequencing:

Compared with classic sequencing, second-generation DNA sequencing differs noticeably in cost and in practice. One advantage of the Sanger method is that it needs only short DNA templates (about 500 bp), and a run is relatively inexpensive ($1-$20 for Sanger sequencing). Another advantage is that it is very accurate (Dale, 2012). However, its disadvantages are that it depends on ddNTPs and can read only short DNA sequences. Moreover, contaminating DNA fragments sometimes appear during sequencing (Valouev et al, 2008). In contrast, running a single sample on the modern platforms is relatively costly (nearly $7k for SOLiD, $14k for 454 sequencing and $28k for Illumina) (Sjoblom et al, 2006). One advantage of 454 pyrosequencing is that multiple samples can be run per day and reads of roughly 250-1000 bp can be obtained, but sample preparation is extremely complicated. By comparison with pyrosequencing and the SOLiD method, the Illumina method can measure about 1.6 billion bp, but a single run takes three to nine days (Wood et al, 2007). The practical accuracy of 454 pyrosequencing and Illumina sequencing is similar (approximately 99.99%). Compared with 454 pyrosequencing, the SOLiD method reads only about 50 bases at a time, which means that considerably more computation is needed to reconstruct longer DNA fragments accurately. On the other hand, its accuracy of 99.94% is a leading benefit of the technique, and each run may yield approximately 60 Gb (Dale, 2012).
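The accuracy figures quoted above are easier to compare when converted into expected errors per read. The sketch below does that arithmetic and also expresses accuracy on the Phred scale, Q = -10 log10(error rate), a standard convention that the essay itself does not use; treating the quoted accuracies as uniform per-base values is an assumption made for the example.

```python
import math


def phred(accuracy):
    """Convert a per-base accuracy (e.g. 0.9999) to a Phred quality score."""
    return -10 * math.log10(1 - accuracy)


def expected_errors(read_length, accuracy):
    """Expected number of wrong bases in a read of the given length."""
    return read_length * (1 - accuracy)


if __name__ == "__main__":
    # Read lengths and accuracies taken from the comparison in the text.
    for name, length, accuracy in [
        ("454 (250 bp read)", 250, 0.9999),
        ("454 (1000 bp read)", 1000, 0.9999),
        ("SOLiD (50 bp read)", 50, 0.9994),
    ]:
        print(name, round(phred(accuracy), 1),
              round(expected_errors(length, accuracy), 3))
```

On these assumptions, an accuracy of 99.99% corresponds to about one error in 10,000 bases and 99.94% to about six errors in 10,000 bases.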


This project has described some conventional and modern platforms that have played a predominant role in genome sequencing and have provided further reliable evidence about gene function and gene structure. With second-generation platforms, DNA fragments are amplified into many millions of copies and assembled into sequencing libraries, and geneticists and other researchers can use these techniques to understand the large differences between species and strains of eukaryotes as well as prokaryotes. Although time, contaminating sequences and cost are the biggest obstacles to the development of genomic technologies, these technologies can supply reliable data for comparative analysis. Researchers and experts continue to develop modern sequencing techniques that build on the amplification of DNA sequences.