Mechanisms For Achieving Gene Duplication Biology Essay

Molecular evolution in eukaryotic organisms is strongly influenced by gene duplication, which opens the way for new evolutionary pathways. It alters not only the genomic material, but also the phenotype and the complexity of the organism. Ohno was the first to propose that adaptation is promoted by this mechanism: the ancestral gene keeps its original role, whereas the new copy acquires a completely different function through evolution. Alternatively, the extra copy may retain the same function or acquire a slightly modified one, or it may be lost altogether. Duplication of genes is therefore a significant accelerator of evolution that is worthy of investigation.

The significance of gene duplication to the evolutionary process has been recognised since the 1930s (Bridges 1936, Zhang 2003). Duplicated genes are not restricted to a few organisms; genome sequencing has revealed that every sequenced genome carries a number of duplicated genes, demonstrating the importance and prevalence of this mechanism. In addition, this analysis has enabled researchers to identify the mechanisms by which gene duplication confers novel functions on the organism and plays a role in evolution.

None of the three domains of life (Bacteria, Archaea and Eukarya) has escaped gene duplication. Helicobacter pylori (Tomb et al. 1997), Archaeoglobus fulgidus (Klenk et al. 1997) and human (Li et al. 2001) are representative examples for each domain, respectively. The new genes that arise do not have a fixed fate. Some of them acquire mutations, become pseudogenes and are removed from the genome, while others are maintained and stably inherited.

The rate of gene duplication and the half-life of the resulting genes are two very important factors in the evolution of duplicated genes (Long & Thornton 2001). The rate of evolution is taken into account in many studies, since it is an indicator of functional constraint (Gayral et al. 2007). Interestingly, the daughter genes are observed to evolve faster than the parental gene, which is constrained by its position and function.

Mechanisms for achieving Gene Duplication

Gene duplication can arise during mitosis or meiosis. During mitosis, genetic material may be exchanged between the two chromatids of the same chromosome. Ideally, the two chromatids are identical. However, in the case of unequal crossing-over, a duplicated cistron is formed in one chromatid, while the other chromatid lacks that cistron because of a deletion. As a consequence, one daughter cell will be hemizygous for that locus, while the other will be heterozygous for a gene duplication (figure 1a). If this event takes place in germ cells, the rearrangement will be heritable. After many generations, the duplicated cistron may become homozygous and confer a new characteristic on that species that distinguishes it from others (Ohno, 1970). Unequal crossing-over may also take place between two homologous chromosomes during prophase I of meiosis. In both cases, what is duplicated depends on the position of the chromosomal exchange. The result might therefore be a duplicated portion of a gene, a whole gene or many genes, including both exons and introns. Regional duplication may disrupt a gene's activity if the duplicated gene is part of a larger group responsible for a metabolic system. Tandem duplication may also occur, in which two or more copies of a chromosomal region lie in a row on the same chromosome.

The entire genome can also be duplicated at once, a phenomenon called polyploidization, in which all gene loci are duplicated. This is observed more frequently in plants than in animals.

Viral transduction is another mechanism that leads to gene duplication. It occurs when a bacteriophage is released after lysogeny in a bacterium, taking with it a small part of the host genome. On infecting a new host cell, the phage incorporates its own genome, together with the portion of chromosome taken from the previous bacterium, into the genome of the new host. If the phage inserts at the same position as in the previous host because of a site preference, tandem duplication of that locus will be observed (fig. 1b).

A different mechanism, called retroposition, involves the incorporation into the genome of cDNA reverse-transcribed from mRNA (fig. 1c). As a result, what is inserted is the transcribed and processed part of the sequence, which holds the exons and the poly(A) tail but not the introns. In this model, the duplicated gene is incorporated at a random position and not next to the original gene (Zhang 2003). This mechanism accounts for approximately 10,000 duplication events in the human genome.
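The unequal crossing-over described in this section can be illustrated with a minimal Python sketch. It treats each chromatid as a list of gene names and swaps the tails at two breakpoints. The function name, the gene labels A, B and C, and the breakpoint positions are all hypothetical and chosen only for illustration.

```python
def unequal_crossover(chromatid_a, chromatid_b, break_a, break_b):
    """Exchange the tails of two chromatids at (possibly different) breakpoints.

    Each chromatid is a list of gene names. break_a and break_b are the
    positions at which each chromatid is cut. Equal breakpoints model a
    normal exchange; unequal breakpoints model a misaligned one.
    """
    recombinant_1 = chromatid_a[:break_a] + chromatid_b[break_b:]
    recombinant_2 = chromatid_b[:break_b] + chromatid_a[break_a:]
    return recombinant_1, recombinant_2


# Two identical sister chromatids carrying hypothetical genes A, B and C.
sister_1 = ["A", "B", "C"]
sister_2 = ["A", "B", "C"]

# Misaligned exchange: sister_1 is cut after B, sister_2 is cut before B.
duplicated, deleted = unequal_crossover(sister_1, sister_2, break_a=2, break_b=1)
print(duplicated)  # ['A', 'B', 'B', 'C']
print(deleted)     # ['A', 'C']
```

One product carries gene B twice in tandem and the other has lost it, which corresponds to the duplication and deletion products described for figure 1a.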

Evolution of Gene Duplications

Great evolutionary changes do not arise from allelic changes, which are small and take place in already existing genes. Great evolutionary steps can be observed only if forbidden mutations accumulate at the active site of an existing gene, giving rise to a new gene locus with a new central role. Natural selection does not favour this kind of genomic alteration. Gene duplication, however, offers an escape from this rule. The newly created redundant copy is free to accumulate forbidden mutations, and a new cistron is born that produces a polypeptide with an entirely new role. For this reason, this mechanism emerges as the major driving force of evolution (Moore & Purugganan 2003, Ohno 1970, Gu et al.).

What happens to genes after duplication? Not every duplicated gene shares the same fate. Through evolutionary processes, the function of each novel region is determined, while other cistrons are deleted. It has been argued that the initial stages of duplicate gene evolution are the most critical for the destiny of these genes (Moore & Purugganan 2003).


Pseudogenization

Duplicated genes are exposed to mutations, like any other gene locus. Through evolution, the extra gene may become a pseudogene, which is not transcribed, not translated or has no function, and therefore does not confer a new phenotype on the organism. Pseudogenization is the term for this process, by which a pseudogene is born from a functional gene. It happens within the first few million years after the rearrangement, and only if the additional gene is not favoured by selection. Since pseudogenes are not essential to the organism, they are gradually eliminated from the genome, or they accumulate so many changes that they diverge greatly from the parental gene and can no longer be recognised. There are also cases where a duplicated gene that was once functional becomes a pseudogene. One example is a number of olfactory receptor genes that have lost their function in hominoids, probably because better vision compensated for the reduced sense of smell (Rouquier et al. 2000). This model provides valuable information about the evolution of organisms. It appears that even when a functional gene is lost, the loss can be compatible with, or even benefit, the survival of the organism and its adaptation to the environment.

Gene function is retained or slightly modified

Sometimes the amount of protein produced from a specific gene is not enough for the requirements of the organism. The gene loci formed by duplication solve this problem, since they provide redundancy: the extra copies supply the organism with more of the same gene product. Histone and ribosomal RNA genes are examples of this beneficial mechanism. For the ancestral function to be maintained in both copies, however, the duplicated genes must have an essential role in the proper function and survival of the individual. If this is not the case, the derived gene is more likely to diverge slightly in function from the parental gene. This mode is known as subfunctionalisation and is a very frequent outcome of duplication. Two transcription factor genes, engrailed-1 and engrailed-1b, are an example of this mechanism. These two genes arose by gene duplication and diverged from each other. In zebrafish, their expression is tissue specific: engrailed-1 is expressed in the pectoral appendage bud and engrailed-1b in the hindbrain/spinal cord (Force et al. 1999). Engrailed-1 in the mouse genome is similar to the two zebrafish genes, since all three descend from the same ancestral gene (orthologous genes). In this mammal, however, the single gene is expressed in both tissues, which represents the ancestral state. After the duplication, changes affected the regulatory elements that control where each copy is expressed. As a result, both copies are now required to cover both tissues, since both activities are necessary (fig. 2).
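The logic of subfunctionalisation in the engrailed example can be sketched in Python by treating each gene's expression domain as a set of tissues. This is only an illustration under simplifying assumptions: the helper names are hypothetical, and real expression patterns are not all-or-nothing.

```python
# Tissues in which the single ancestral gene is expressed (as in mouse).
ancestral_expression = {"pectoral appendage bud", "hindbrain/spinal cord"}

# After duplication each copy has lost one regulatory element (as in zebrafish).
copy_expression = {
    "engrailed-1": {"pectoral appendage bud"},
    "engrailed-1b": {"hindbrain/spinal cord"},
}


def covered_tissues(copies):
    """Return the union of the expression domains of the given copies."""
    tissues = set()
    for domain in copies.values():
        tissues |= domain
    return tissues


def is_dispensable(name, copies, required):
    """A copy is dispensable if the remaining copies still cover every required tissue."""
    remaining = {k: v for k, v in copies.items() if k != name}
    return covered_tissues(remaining) >= required


print(covered_tissues(copy_expression) == ancestral_expression)              # True
print(is_dispensable("engrailed-1", copy_expression, ancestral_expression))  # False
```

Together the two copies reproduce the ancestral expression pattern, but neither can be lost without leaving one tissue uncovered, which is why both are retained.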

Emergence of new function

The resulting duplicated genes might take on a totally new role, conferring new properties on the organism (Long & Langley 1993, Moore & Purugganan 2003). Although it may sound unlikely, changes in the encoded peptide have produced different gene products in several cases. Frame-shift mutations can lead to the emergence of a new gene with a completely novel function. These deletion or insertion events disrupt the reading frame and produce different protein products. In this way, a duplicated gene can have a different role from the original one. Amino acid substitutions can have the same effect. Several arginine substitutions that took place shortly after the duplication are the cause of the novel antibacterial function of ECP (Eosinophil Cationic Protein), a human gene that arose by duplication; this activity does not depend on the ribonuclease activity of the protein (Rosenberg 1995). It would be of great interest to find out what other functions this protein acquired and lost during its evolution, given that several genetic modifications have taken place since its parental state.
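The effect of a frame-shift mutation on the reading frame can be shown with a short Python sketch. The 15-base sequence and the position of the deleted base are hypothetical; the point is only that every codon downstream of the deletion changes.

```python
def codons(sequence):
    """Split a coding sequence into complete triplets, dropping any trailing bases."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


original = "ATGGCTGAAACCGGT"          # hypothetical 15-base coding sequence
mutant = original[:4] + original[5:]  # delete the single base at index 4

print(codons(original))  # ['ATG', 'GCT', 'GAA', 'ACC', 'GGT']
print(codons(mutant))    # ['ATG', 'GTG', 'AAA', 'CCG']
```

Only the first codon survives the single-base deletion; all later triplets are read in a new frame and would specify a different product.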

Most duplicated genes were deleted within a few million years, whereas the genes that were maintained in the genome through neofunctionalization or subfunctionalization are only a minority.

Internal gene Duplication

Genome architecture can be altered not only by complete duplications of protein-coding genes but also by duplications that take place within genes (Gao & Lynch 2009). Eukaryotic genomes, including the human genome, are affected by this mechanism. This evolutionary force can change the structure of the coding region, leading to the acquisition of a new function, the creation of more introns or the activation of other splice sites. For this reason, internal gene duplication is responsible not only for DNA sequence evolution, but also for the emergence of serious diseases, such as breast cancer and Duchenne muscular dystrophy (Weiss et al. 2007, Hogervorst et al. 2003).

Evolutionary Driving Forces

It is believed that neutral genetic drift, in which random changes confer neither an advantage nor a disadvantage on the organism, is the mechanism that drives pseudogenization and the fixation of redundant unlinked cistrons (Lynch & Force 2000, Lynch et al. 2001). The changes are entirely neutral. Subfunctionalization is also driven by genetic drift (Lynch & Force 2000). In contrast, natural selection, in which the genetic change confers a beneficial attribute on the organism, is responsible for the new function that a duplicated gene acquires (Lynch et al. 2001). This positive selection is found to characterise the early evolutionary steps after gene duplication (Moore & Purugganan 2003).
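The contrast between drift and positive selection can be made concrete with a standard population-genetics approximation that is not taken from the sources cited here. The sketch below uses Kimura's diffusion formula for the fixation probability of a single new mutant copy, assuming a diploid population whose effective size equals its census size. The population size and selection coefficient are arbitrary illustrative values.

```python
import math


def fixation_probability(population_size, selection_coefficient):
    """Probability that a single new mutant copy reaches fixation.

    Uses Kimura's diffusion approximation for a diploid population in which
    the effective size equals the census size N. With s = 0 the result
    reduces to the neutral value 1 / (2N).
    """
    n, s = population_size, selection_coefficient
    if s == 0:
        return 1.0 / (2 * n)
    return (1 - math.exp(-2 * s)) / (1 - math.exp(-4 * n * s))


neutral = fixation_probability(1000, 0.0)      # pure drift
beneficial = fixation_probability(1000, 0.01)  # weak positive selection
print(neutral)               # 0.0005
print(beneficial > neutral)  # True
```

Under these assumptions a neutral duplicate is fixed only rarely and purely by chance, whereas even a weakly beneficial one is fixed far more often.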

Duplication of genes is not always beneficial

As already mentioned, natural selection does not permit the accumulation of mutations at the active site of a gene. However, if there are multiple copies of a gene locus, natural selection is unable to protect all of those loci. Forbidden mutations will therefore accumulate and more of the copies will become useless. According to Callan (1967), genes follow the "master-slave theory" (Ohno, 1970). In a tandemly duplicated gene sequence, the gene positioned at the end is the master and the rest are the slaves. The master gene is the only copy that serves as the template during the S phase of the cell cycle, when the genetic material is duplicated. Consequently, if forbidden mutations were already present in the master gene, the same defect would be passed on to the slaves in the daughter cells.

Unequal exchange in prophase I of meiosis, which results in deletions and further duplications, is one more difficulty faced by organisms with many duplicated genes. An example is a nucleolar organiser carrying 450 copies of a ribosomal gene. After unequal crossing-over involving ribosomal gene 1 and ribosomal gene 250, the resulting deletion and duplication leave the two products with different numbers of gene copies (200 and 700 copies, respectively).
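The copy numbers in this example follow from simple arithmetic, sketched below in Python. The function is hypothetical, and the misalignment of 250 repeat units is an assumption chosen because it reproduces the 200 and 700 copies quoted above from two arrays of 450 copies.

```python
def unequal_exchange_products(copies, offset):
    """Copy numbers after an exchange between two arrays misaligned by `offset` repeats.

    Both parental arrays carry `copies` tandem repeats. One product loses
    `offset` repeats and the other gains them, so the total is conserved.
    """
    if not 0 <= offset <= copies:
        raise ValueError("offset must lie between 0 and the number of copies")
    return copies - offset, copies + offset


shorter, longer = unequal_exchange_products(copies=450, offset=250)
print(shorter, longer)   # 200 700
print(shorter + longer)  # 900
```

The total of 900 copies is unchanged; the exchange only redistributes them unequally between the two products.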

Patterns of Gene Duplication

C. elegans, yeast and Drosophila

Duplicated genes account for up to 20% of the genome in these organisms (Lynch & Conery 2000) and, more specifically, block duplications are observed very frequently in all three of them. In Drosophila, the gene duplication rate is much lower than in yeast (S. cerevisiae) and the nematode (C. elegans), while yeast holds the greatest number of tandem duplicates (Gu et al. 2002). It has been shown that the pattern of duplicated genes in the yeast genome is the consequence of enormous gene loss and reciprocal translocations following duplication of the entire genome (Cavalcanti et al. 2003). The pattern of block duplication differs between yeast and the worm: in yeast the duplications occurred between chromosomes (interchromosomal duplication), whereas in the nematode they occurred within chromosomes (intrachromosomal duplication).

In Drosophila pseudoobscura, retroposition is the major mechanism that generates duplicated genes. Moreover, inversion events that follow unequal non-allelic crossing-over during the repair of a double-strand break can also give rise to an extra gene (Meisel 2009a, Meisel 2009b). Polyploidization is not observed as a source of duplicated genes in the Drosophila genome. Repeated sequences flank both the breakpoints of inversions and the duplicated regions, which supports the idea that the initial steps of the mechanisms giving rise to this kind of rearrangement and to duplicated genes are related (Meisel 2009b).

Duplicated genes in Arabidopsis thaliana

Arabidopsis thaliana is one of the most useful model organisms in plant biology, mostly because of its small genome (157 × 10^6 bp), which was sequenced by the year 2000 (Bomblies & Weigel 2007b). Genetic incompatibilities between paralogous genes (duplicated genes at different chromosomal regions) have been observed to arise through divergent evolution (Bikard et al. 2009). As a result, hybrids from crosses between different Arabidopsis strains can show reduced fitness and viability (Bomblies et al. 2007). This is evidenced by crossing two Arabidopsis accessions, Col and Cvi. A particular homozygous combination of alleles at two loci (on chromosomes 1 and 5) resulted in abortion of the embryo because of incomplete development. It was found that two paralogs in Col encode HPA, an enzyme that catalyses a step in the synthesis of histidine, an amino acid vital for plant growth. The two genes, which arose from a duplication event, differ at two synonymous bases (SNPs). One of them is found to be responsible for the weak development of the organism, because of mutations that reduce the amount of histidine. In Cvi, the HPA gene at one of the two loci is thought to have been deleted or never to have existed (Bikard et al. 2009). Therefore, offspring that are homozygous for the silenced allele at both loci are unable to synthesise the essential amino acid and cannot grow properly. This example supports the Bateson-Dobzhansky-Muller model, which explains incompatibilities of this kind as a result of the diverged function of duplicated genes (Bomblies & Weigel 2007a). It is also important to consider the possibility of speciation, which could result (Orr et al. 2004) if the progeny of a cross inherit several incompatibilities from the parental strains.
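How often such an incompatible combination is expected among the offspring can be estimated with a small Python enumeration. The sketch assumes two unlinked loci with ordinary Mendelian segregation in the F2 generation of a Col × Cvi cross. The rule for which homozygous combination is lethal is a hypothetical placeholder, since the text above does not specify which parental allele is silenced at which locus.

```python
from fractions import Fraction
from itertools import product

# F1 hybrids are heterozygous at both loci: one Col allele and one Cvi allele.
GAMETE_ALLELES = ("Col", "Cvi")


def f2_genotypes():
    """Return the 16 equally likely (locus_1, locus_5) genotypes of F2 offspring.

    Assumes two unlinked loci with Mendelian segregation.
    """
    one_locus = [tuple(sorted(pair)) for pair in product(GAMETE_ALLELES, repeat=2)]
    return list(product(one_locus, repeat=2))


def is_lethal(genotype):
    """Hypothetical rule: lethal when locus 1 is Cvi/Cvi and locus 5 is Col/Col."""
    locus_1, locus_5 = genotype
    return locus_1 == ("Cvi", "Cvi") and locus_5 == ("Col", "Col")


offspring = f2_genotypes()
lethal_fraction = Fraction(sum(is_lethal(g) for g in offspring), len(offspring))
print(lethal_fraction)  # 1/16
```

Under these assumptions one F2 embryo in sixteen carries the double-homozygous combination, so the incompatibility removes only a small share of the progeny in each generation.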

Three gene pairs examined in A. thaliana, GD1, GD2 and GD3, resulted from duplication events; two of them support the theory of positive selection, whereas the third supports neutral genetic drift. These pairs also support the view that the early stages of duplicate gene evolution are the most crucial for the fate of the genes. In the GD1 and GD2 gene pairs, positive selection acts on the daughter genes and causes divergence from the progenitor genes, depending on the functional characteristics of the redundant loci. In contrast, gene loss early in the evolution of the duplicated locus resulted in the pseudogenization of GD3 (Moore & Purugganan 2003).

Duplicated Retrogenes in mouse

Neofunctionalization and non-synonymous evolution appear to affect duplicated genes in mice that resulted from retrotransposition and have an effect on biochemical processes (Gayral et al. 2007). Contrary to other data, this analysis revealed that most of these genes, including pseudogenes, are affected by selection and not by a neutral evolutionary process. Purifying selection prevents the duplicated genes from acquiring harmful properties before they are silenced or eliminated from the genome. Furthermore, subfunctionalisation is not a proposed evolutionary model for retroduplicates. Retroduplications that took place recently (after the rat and mouse split) gave rise to open reading frames that hold only exons. By analysing closely related mouse species, Gayral et al. (2007) were able to dissect the early stages in the evolution of different duplicated genes. In every case the evolutionary rate of the parental gene was unchanged, whereas the retroposed gene was free to experience an extreme acceleration. New functions appeared without the effect of positive selection, since the changes affected regulatory elements. This had an impact on where the retrogene was expressed, resulting in tissue-specific patterns of expression. This is characterised as an asymmetric evolutionary pathway, in which the duplicate diverges in sequence while the primary gene remains expressed in the same location. Retroduplication is a common phenomenon that takes place mostly in mammalian genomes. Functional retrogenes that arose during the evolution of primates are observed in the human genome as well (Marques et al. 2005).

If mutations accumulate in the retrogene, the redundant copy might produce a product that is unwanted by the organism. In this case, pseudogenization can arise through evolutionary steps (frame-shift mutations may take place, a stop codon may appear earlier in the coding sequence or transcription of the DNA may be arrested) and is observed preferentially in autosomal chromosomes rather than in the germ line.

Adaptive evolutionary processes in macaque, human, mouse and rat

Adaptive natural selection after a duplication event can lead to neofunctionalization. It is observed to be one of the major forces driving the evolution of young duplicated genes recently identified in the macaque, human, mouse and rat genomes, and it acts at the amino acid level of the duplicates (Han et al. 2009). Positive selection is also found to be more common in paralogs where the new gene has acquired a new position in the genome. This adaptive force allows duplicates to be maintained in the genome and is responsible for the subsequent expansion of gene families. Genes implicated in neuronal functions, stress and the immune system are some of the targets of positive selection. Divergence in the regulation and role of the serum amyloid A genes (SAA genes) is another example: glucocorticoids act on SAA1 and enhance its activity, whereas SAA2 does not respond to them.

Duplicated androgen receptor in fish

During the evolution of actinopterygian fish, polyploidization occurred. This was an early event in teleost fish, which carry a great number of paralogous genes. It was followed by loss of the extra gene in some lineages. These processes aided phenotypic diversity and evolutionary radiation, opening the way for subfunctionalization and neofunctionalization. After the duplication of the entire genome, the androgen receptor (AR), which is important for sex determination, gave rise to two receptors, AR-A and AR-B. The teleost lineage that resulted possesses both genes. However, many basal species (like zebrafish) lack AR-B. On the other hand, AR-A and AR-B remain similar to each other in other teleost fish, such as the Osteoglossiformes and Anguilliformes. Like other nuclear receptors, they hold a DNA-binding domain and a ligand-binding domain. These regions are found to be susceptible to the accumulation of mutations, a phenomenon observed in AR-B of the Percomorpha. These differences between the two receptors, specifically in AR-B, are thought to reflect differences in function related to the plasticity of sex determination in fish (Douard et al. 2008).

Duplication in Bacteria

The known genome sequences of many bacterial organisms have enabled researchers to look at the behaviour of duplicated genes in different species. Biochemical pathways seem to have evolved through the appearance of novel enzymes translated from duplicated genes that acquired changes. A duplicated gene can also differ from its precursor by a single domain. One such example is observed in Escherichia coli, where two genes, the ribose repressor and the periplasmic protein responsible for ribose transport, share both similarities and differences. The ability to bind ribose is characteristic of both proteins, whereas the ability to act as a transcription factor is a feature of the ribose repressor only. The periplasmic protein, in turn, gained the ability to travel to the periplasmic space and interact with the ABC transporter system (Serres et al. 2009).

Poly(A) polymerases

Poly(A) polymerase is essential in eukaryotes for the addition of a poly(A) tail to the 3' end of the mRNA. This is a post-transcriptional modification that confers stability on the mRNA. Meeks and her group (Meeks et al. 2009), analysing the coding regions of this protein in different plant species, revealed that the poly(A) polymerase genes are derived from a single ancestral gene. A sequence of duplication events gave rise to multiple copies of the gene during the evolution of higher plants and, because of their important developmental role, functional specialization has also occurred. In rice, some of the genes are expressed in all four tissues examined (leaves, stems, flowers and roots), whereas other poly(A) polymerase genes are not expressed in roots. In mammals such as the mouse, apart from the basic role of the protein, polymerases that act specifically in the testis have also been described (Kashiwabara et al. 2002, Lee et al. 2000).

Gene duplication in mitochondrial genome

Gene duplication is observed not only in the nuclear genome but in the mitochondrial genome as well. Mitochondria are the major energy factories of the cell, utilizing the energy that is stored in food. They arose from endosymbiotic prokaryotes and are characterised as autonomous, since they have their own DNA (circular and double-stranded), which they replicate, and they synthesise their own proteins and RNAs. Duplicated genes have been observed and contribute importantly to mitochondrial evolution. Plant mitochondrial genomes can be up to 100 times larger than mammalian ones. As a result, rearrangements take place frequently between their repeated motifs, including recombination events that lead to deletions or further duplications (Xiong et al. 2008). In rapeseed, two copies of the mitochondrial cox2 gene, which encodes subunit 2 of cytochrome c oxidase, are observed as a result of duplication (cox2-1 and cox2-2). However, these two cistrons diverge 55 nucleotides upstream of the stop codon. While cox2-1 shows homology to the cox2 genes of other plants, cox2-2 has an extension that shows no homology to any other DNA sequence analysed so far (Handa 2003).

What is more, duplicated genes in yeast and in mammalian genomes, including the human genome, are observed to be maintained by subcellular relocalization (Wang et al. 2009). Mitochondrial proteins may relocate to the cytosol, cytosolic proteins to mitochondria, or either to other locations. What was altered in the protein was the targeting signal, which conferred on the protein a new function, important in animal evolution.

Trypsin and chymotrypsin

The proteases trypsin and chymotrypsin are both secreted from the pancreas into the intestinal tract and are essential for the hydrolysis of proteins. Trypsin cleaves peptide chains at the carboxyl side of lysine or arginine, whereas chymotrypsin cleaves at the carboxyl side of phenylalanine and tyrosine. The two proteins show homology and are derived from the same ancestral gene. Trypsinogen, the precursor of trypsin, is found to be ancestral to chymotrypsinogen, the precursor of chymotrypsin. Multiple forbidden mutations in the active site of the redundant duplicate of the trypsinogen gene led to the creation of chymotrypsinogen (Ohno, 1970).
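The difference in specificity between the two proteases can be illustrated with a simplified Python sketch that cuts a peptide after the residues named above. The function and the ten-residue peptide are hypothetical, and the rule deliberately ignores exceptions seen with the real enzymes.

```python
def cleave(peptide, residues):
    """Cut a peptide after every residue in `residues` (the carboxyl side).

    Simplified: ignores real-world exceptions such as a following proline.
    """
    fragments, current = [], ""
    for amino_acid in peptide:
        current += amino_acid
        if amino_acid in residues:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


TRYPSIN_SITES = {"K", "R"}       # lysine, arginine
CHYMOTRYPSIN_SITES = {"F", "Y"}  # phenylalanine, tyrosine

peptide = "GAKFLYSRTE"  # hypothetical sequence in one-letter code
print(cleave(peptide, TRYPSIN_SITES))       # ['GAK', 'FLYSR', 'TE']
print(cleave(peptide, CHYMOTRYPSIN_SITES))  # ['GAKF', 'LY', 'SRTE']
```

The same substrate yields different fragments with each enzyme, which reflects how changes at the active site of the duplicate gave the two proteases distinct roles.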


Gene duplication and the subsequent divergence of the novel genes are fundamental to the evolution of genomes. The examples that demonstrate the great role of gene duplication in evolution are numerous. Although most duplicated genes are silenced, the most important consequence of duplicating genomic content appears to be the generation of novel genes with novel functions. The relaxed constraint on the accumulation of mutations that the derived gene experiences releases it from its old role. Duplicated genes are important not only in evolution but also for medical purposes, for example in immune defence. By analysing their location, researchers may reveal and map mutations, enabling them to design targeting plasmids (vectors) for embryonic stem cells. Relationships and differences in genotype and phenotype between and within species can also be illustrated, providing significant knowledge about evolutionary rearrangements and the architecture of the genome in many organisms.