The Study Of Genomics Has Helped Gene Research Biology Essay



Whole genome sequence data are a valuable research resource: they reveal all the genes necessary for the survival of a bacterium, the organisation of open reading frames (ORFs), non-coding regions, insertion sequences and horizontally transferred elements. They also form a catalogue containing every virulence determinant, drug target and potential vaccine component. By sequencing and functionally annotating bacterial genes in large databases, it is also possible to identify genetic differences and similarities across entire genomes, to find the biological function of these differences, and to explore the evolutionary pressures that contribute to gene reduction, transfer, or gain and their implications. Additionally, genome sequencing has made it possible to identify specific molecular targets that can be used for typing and identifying bacteria, and that can reveal antibiotic resistance and thus help predict therapeutic outcomes. One field that has benefitted from genome sequencing is the study of bacterial pathogenesis. To understand how bacteria cause disease, it is essential to identify both the genes and the gene products implicated in virulence. Genome sequencing has contributed to this field by generating new ways to screen for virulence genes and link them to specific phenotypes, and by revealing the degree of genetic variability within different pathogenic species.

Virulence gene discovery

In the pre-genomic era, detecting virulence factors involved laborious biochemical approaches or forward/reverse genetic manipulations. Today, bacterial genomics has not only facilitated many of these common techniques, but has also been the foundation for novel strategies for virulence gene identification. These methods use comparative genomics, computational methods, genetic signatures, and linkage of genes to additional genetic elements. Some of these approaches are based on direct analysis of genome sequence data; others require some genome sequence information but do not need specific sequence analysis. In all of these strategies, however, complete genome sequence data have proven invaluable for the discovery and functional analysis of potential virulence genes(1).

Database searches

Since a bacterium's genome is the determining factor for its diverse behaviours and pathogenic traits, comparing the genetic content of different bacteria and other microorganisms can yield valuable information. For example, computational methods make it possible to screen for prospective virulence genes by their sequence similarity to known genes annotated in several bioinformatic databases. The value of this approach became evident after the sequencing of the entire H. influenzae genome, where 25 new genes involved in the synthesis of lipopolysaccharide (LPS), a virulence factor, were found in a homology search against known LPS genes in other bacteria(2). Further studies looking at the genomic conservation of these LPS genes among different H. influenzae strains eventually identified a core LPS structure which could potentially be used as a broad-range acapsular H. influenzae vaccine(3). Similarly, a homology search by Lin et al. revealed a toxin gene cluster in the V. cholerae strains El Tor O1 and O139 with high similarity to genes that assemble RTX toxins in gram-negative bacteria. Further studies linked this toxin gene cluster to V. cholerae cytotoxicity in vitro(4). Moreover, the research group was able to demonstrate through representational difference analysis, another type of genomic comparison method, that these genes were deleted in the classical V. cholerae biotype, which has been replaced by both of the strains described above in cholera pandemics. These findings were significant because they not only revealed a previously unknown pathogenic mechanism, but also suggested that this gene cluster in the newer cholera biotypes might have provided a selective advantage during displacement of the classical biotype.
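
The screening idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: real homology searches rely on alignment tools and curated databases, whereas this sketch swaps in a much cruder measure (shared k-mers scored by Jaccard similarity). The function names, the threshold and the toy sequences are all hypothetical and are not taken from the studies cited.

```python
def kmers(seq, k=3):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity between the k-mer sets of two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def screen_candidates(orfs, known_genes, k=3, threshold=0.5):
    """List (orf_id, best_known_id, score) for every ORF whose best match
    against the known virulence genes reaches the threshold."""
    hits = []
    for orf_id, orf_seq in orfs.items():
        best_id, best_score = None, 0.0
        for known_id, known_seq in known_genes.items():
            score = jaccard(orf_seq, known_seq, k)
            if score > best_score:
                best_id, best_score = known_id, score
        if best_score >= threshold:
            hits.append((orf_id, best_id, round(best_score, 2)))
    return hits


if __name__ == "__main__":
    # Toy protein fragments, invented for illustration only.
    orfs = {"orf1": "MKTLLVAGGS", "orf2": "QQQPPPWWW"}
    known = {"toxA": "MKTLLVAGGA"}
    print(screen_candidates(orfs, known))
```

In this toy run only orf1 is reported, because it shares most of its 3-mers with the hypothetical known gene. A k-mer score ignores gaps and substitution patterns, so it should be read as a stand-in for a proper alignment score, not a replacement for one.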

Comparative genomics can also be used to detect virulence genes specific to bacteria within a genus, a species or a strain. Xue et al., for example, performed a comparative analysis of three species of Leptospira: L. borgpetersenii, L. interrogans, and L. biflexa. These species have different lifestyles, from the non-pathogenic soil bacterium L. biflexa to the pathogenic L. interrogans and L. borgpetersenii. Comparative genomics allowed the identification of 656 genes found only in L. interrogans and L. borgpetersenii, with no orthologs in the soil bacterium L. biflexa, suggesting their potential as specific virulence factors. The analysis also showed few orthologs among genes encoding possible outer membrane antigens across pathogenic and nonpathogenic Leptospira, suggesting that these contribute to virulence and host tropism. Genomic analysis also indicated that L. borgpetersenii, the only one of the three species studied that cannot survive outside its host, is missing genes found in the other two that are implicated in their increased survival in different environments (including a DNA repair protein) and thus in better transmission. This finding suggests factors that may form the genomic basis for differences in transmission potential, an important pathogenic factor, although it requires further testing(6,7). The extent of genomic diversity and its link to pathogenicity is also shown in studies of E. coli, where genome comparison between E. coli O157:H7 and E. coli K12 revealed that the former contained more than 1,300 strain-specific genes; these genes coded for extended metabolic mechanisms and putative virulence factors(8,9).
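
Once orthologs have been assigned, the kind of comparison described above reduces to set operations on gene content. The following Python sketch illustrates this, assuming each genome has already been reduced to a set of ortholog-group identifiers. The identifiers (og1, og2, ...) and the function name are hypothetical, and the ortholog assignment itself, which is the hard part, is not shown.

```python
def present_in_all_absent_from(gene_sets, present_in, absent_from):
    """Ortholog groups found in every genome listed in `present_in`
    and in none of the genomes listed in `absent_from`."""
    shared = set.intersection(*(gene_sets[name] for name in present_in))
    excluded = set().union(*(gene_sets[name] for name in absent_from))
    return shared - excluded


if __name__ == "__main__":
    # Invented ortholog-group identifiers, for illustration only.
    gene_sets = {
        "L. interrogans": {"og1", "og2", "og3", "og4", "og6"},
        "L. borgpetersenii": {"og1", "og2", "og4"},
        "L. biflexa": {"og1", "og5", "og6"},
    }
    # Candidate virulence genes: shared by both pathogens, absent from the saprophyte.
    print(present_in_all_absent_from(
        gene_sets, ["L. interrogans", "L. borgpetersenii"], ["L. biflexa"]))
    # Candidate environmental-survival genes: missing only from L. borgpetersenii.
    print(present_in_all_absent_from(
        gene_sets, ["L. interrogans", "L. biflexa"], ["L. borgpetersenii"]))
```

The same function answers both questions in the paragraph: which genes are pathogen-specific, and which genes the host-restricted species has lost. The output is only a list of candidates, and each one would still need experimental confirmation.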

Sequence similarity studies are not limited to bacterial genes, but can also be applied to host proteins or functional motifs to detect novel virulence genes. Vance et al. used this technique when they found an ORF in Pseudomonas aeruginosa with high sequence similarity to eukaryotic lipoxygenases. Further characterization showed that this bacterial lipoxygenase was not only secreted, but also synthesized anti-inflammatory mediators from eukaryotic substrates. Given that P. aeruginosa is clinically significant in cystic fibrosis, where it is associated with chronic lung inflammation, the presence of a lipoxygenase that synthesizes mediators known to modulate inflammatory responses is significant and points to further pathogenic mechanisms(10). Additionally, this finding hints that acquiring or producing machinery that synthesizes anti-inflammatory lipid mediators may be a general scheme by which pathogens manipulate the course of infection in their hosts.

Genetic signatures

As bacteria adapt to a range of environments, many genes are subjected to host selective pressures. By using genomic sequence information, it is possible to identify potential virulence genes by distinguishing the genetic signatures of such selective pressure, otherwise known as virulence gene markers(1). For example, simple sequence repeats (SSRs) found in several bacterial genomes have been shown to regulate the on/off expression of various surface antigens by a mechanism involving polymerase slippage. Since these SSRs can be located within the ORF of a gene or in its promoter region, their high mutation rate can affect transcription, translation, or both. These repeats have been associated with bacterial phase variation, since they are mostly located in or near genes encoding surface molecules. Such rapid, reversible mutations allow rapid phenotypic changes, leading to immune evasion, host persistence, and survival of bacteria in a constantly changing environment(10, 11). Examples include H. influenzae, which is able to produce variant clones by reversibly switching on and off the expression of LPS genes that are required for the addition of sugars to the backbone of the LPS molecule, resulting in antigenic variation and immune evasion(10). Similarly, H. pylori possesses homopolymeric repeats in or near seventeen genes that code for surface proteins. Of these, it is particularly significant that eight encode adhesins, which are vital for H. pylori attachment to gastric epithelial cells and are therefore significant in its phase variation(12,13). Genome sequencing therefore makes it possible to identify potential virulence genes that contain SSRs in or near their sequences.
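
Scanning a genome sequence for such repeats is simple to automate. The Python sketch below is a minimal illustration, assuming a plain DNA string and using a regular expression with a back-reference to find tandem repeats of short units. The function name, the maximum unit length of 4 and the minimum of 4 copies are hypothetical choices made for the example, not thresholds from the studies cited.

```python
import re


def find_ssrs(seq, max_unit=4, min_copies=4):
    """Return (start, unit, copies) for each tandem repeat of a unit of
    1..max_unit bases that occurs at least min_copies times in a row.
    Start positions are 0-based."""
    if min_copies < 2:
        raise ValueError("min_copies must be at least 2")
    # The lazy quantifier tries the shortest unit first, so a run such as
    # CCCCCCCC is reported as unit C rather than unit CC.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    ssrs = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        ssrs.append((match.start(), unit, copies))
    return ssrs


if __name__ == "__main__":
    # Invented sequence: a homopolymeric C tract and a CAAT tetranucleotide repeat.
    toy = "ATG" + "C" * 8 + "GTA" + "CAAT" * 4 + "GG"
    print(find_ssrs(toy))
```

To turn this into a screen for candidate phase-variable genes, the reported positions would then be compared with annotated ORF and promoter coordinates, a step not shown here.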

Another mechanism for differential gene expression is DNA inversion, used by Bacteroides fragilis to regulate genes encoding surface proteins, regulatory molecules, and polysaccharides. This again allows immune evasion and an increased ability to colonize new sites. This mechanism of phase variation in this opportunistic pathogen was unknown before the sequencing of its genome(14).

Additional genetic elements

Certain genetic elements such as plasmids, prophages, transposons, and pathogenicity islands are known to frequently encode virulence factors, including antibiotic resistance. Such elements can exist as separate entities (such as a plasmid) or be chromosomally integrated. These elements are recognized in part by the proteins they encode and in part by a GC content and codon usage that differ clearly from the rest of the genome, which indicates their recent acquisition(1). Thus, potential virulence genes can be recognized by looking for genes (encoding, for example, membrane-bound proteins) that are linked to these insertion elements. This method of virulence gene recognition was used in the discovery of the RTX gene cluster in V. cholerae. When sequence analysis exposed a TLC element (which encodes a gene homologue found in filamentous phages) and an upstream CTX prophage (part of a filamentous bacteriophage), the team asked whether genes downstream could encode further phage-related proteins. By analyzing the ORFs of contigs downstream of these elements, the RTX virulence gene cluster was exposed, and further testing confirmed its role as a virulence factor(4).
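
The GC-content signature mentioned above can be illustrated with a sliding-window scan. The Python sketch below is a simplified example under stated assumptions: it flags windows whose GC fraction deviates from the genome-wide average by more than a chosen amount and ignores codon usage entirely. The function names, window size, step and deviation threshold are hypothetical and would need tuning for a real genome.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome, window=1000, step=500, max_deviation=0.10):
    """Return (start, gc) for each window whose GC fraction differs from
    the genome-wide GC fraction by more than max_deviation."""
    overall = gc_fraction(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - overall) > max_deviation:
            flagged.append((start, round(gc, 2)))
    return flagged


if __name__ == "__main__":
    # Invented sequence: a GC-rich backbone with an AT-rich insert in the middle.
    toy = "GC" * 50 + "AT" * 10 + "GC" * 50
    print(atypical_windows(toy, window=20, step=20, max_deviation=0.2))
```

In the toy sequence only the AT-rich insert is flagged. An atypical window is a hint of recent acquisition, not proof of it, since base composition also varies within a genome for other reasons.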

These elements are often recognized after comparative genomics between different strains identifies them as variable regions(1). Additionally, by studying the genome sequences of multiple bacteria, it is possible to identify where these genes originated and so determine how easily they can be distributed within bacterial populations. This topic is discussed further below.

Detecting the genomic basis of different phenotypes and understanding genomic dynamics of pathogenicity

Genome sequencing has provided important clues to the evolutionary dynamics of bacteria. It has not only exposed the basic genetic capacity of an organism, but has also allowed researchers to gain insight into the genetic susceptibility of bacteria to evolutionary change, the forces that drive these changes, and which of them have biological significance. These studies teach us that various forces have shaped the bacterial genome throughout its evolution, i.e. genome reduction, gene change, and gene acquisition, and that different forces predominate in different bacterial lineages. Furthermore, genomic projects can now help pinpoint the potential genetic basis for differences in disease severity and host range, allowing us to better understand pathogenicity(1).

Detecting gene deletion/loss of function

A clear example of the role of genomics in understanding the development of pathogenicity is the study of reductive evolution. By sequencing various intracellular microorganisms, researchers have discovered that these highly host-adapted pathogens usually have reduced genomes, with deletions of genes for various metabolic enzymes. While it was first thought that genes were deleted simply because they were no longer required for the survival of a pathogen in its host, further comparative genomics has revealed that this gene loss may actually benefit survival, a clear pathogenic trait(1). This hypothesis has been applied to Mycobacterium tuberculosis, where comparative genomic microarray analysis of 100 clinical isolates showed that deletions of genes coding for certain cell wall components might reduce the number of immunologically recognisable antigens. Additionally, other deletions confer clear pathogenic benefits by promoting antibiotic resistance (katG, inhA, rpoB) or by limiting latency and promoting faster transmission(2). Similarly, it was found that the genetic basis for differences in tropism and pathogenicity among Bordetella pertussis, Bordetella parapertussis and Bordetella bronchiseptica might be large-scale gene silencing and deletion in B. parapertussis and B. pertussis(3).

A second example of positive selection for genome reduction detected by genome sequencing is pathoadaptive mutation. In these mutations, loss of "antivirulence genes" is thought to mediate the evolution of bacteria from their ancestral commensal origins to newly evolved pathogens. In this sense, genes that are absent from all virulent clones of a species, but consistently expressed in related non-pathogenic ancestor species, are likely candidates for such "antivirulence" genes lost during evolution(4). Evidence in support of this model has come from the genomic sequencing of Shigella spp. and comparison with its ancestor, E. coli. In Shigella, experimental evidence has shown that the loss of the cadA gene, which encodes a lysine decarboxylase (LDC) and is expressed in >90% of E. coli, provided a unique selective advantage. Cadaverine, the product of LDC activity, has been shown to inhibit the plasmid-mediated virulence factors that are essential for Shigella's gut invasiveness, diminishing its enterotoxic activity; this would have favoured inactivation or deletion of cadA from the genome. These genetic studies therefore reveal that, after evolving from E. coli, separate Shigella species underwent selective pressures that favoured clones lacking the ancestral cadA gene(5).

Detecting genes gained

Horizontal gene transfer is mediated by various mobile genetic elements, including plasmids, transposons, prophages, and pathogenicity islands(6). These elements often carry virulence genes that facilitate the colonization of new niches by bacteria, as well as cause damage to the infected host. In many cases, as with gene loss, gene gain can also be responsible for the transformation of commensal microorganisms into pathogenic bacteria. A clear example exists in Shigella spp.: although genome sequencing reveals 90% homology to E. coli(7), genomic analysis has confirmed that the acquisition of a large virulence plasmid encoding most of the genes for invasion, induction of inflammation, replication, and transmission was the main evolutionary event separating it from its E. coli ancestors. Accessory genetic elements have also been responsible for many of the differences in pathogenicity traits between strains within a species. For example, studies of four strains of Staphylococcus aureus and two Staphylococcus epidermidis strains revealed seven pathogenicity islands specific to S. aureus, which carried a myriad of toxins and other virulence determinants. Additionally, various integrated elements made up 7% of the genome of one S. aureus strain, COL, indicating their importance in virulence(17). Antibiotic resistance specific to S. aureus is also dictated by a chromosomally inserted genetic element, SCCmec, which is responsible for the rise of multi-drug resistant S. aureus, a bacterium that causes serious hospital-acquired infections worldwide(18). Genome sequencing thus helps bring to light potential gains of virulence genes by contributing genetic maps that expose the clear markers separating these horizontally transferred genes from the rest of the genome.

Detecting genes changed

The presence of significant sequence diversity in similar bacteria provides a foundation for understanding pathogenesis, disease severity, tropism, and bacterial evolution. As more genomes are sequenced, scientists have acquired a better understanding of intra- and inter-species variation, and of why certain strains within a species are pathogenic and others are not. Genome sequencing, and thus comparative genomics, can for example detect polymorphic genes. Polymorphic genes are often potential virulence factors or immune stimulators, because antigens that interact directly with the host are known to be genetically heterogeneous. Polymorphic sequences can also serve as markers for evolutionary studies and can provide clues to the evolution of pathogenicity in some strains. Similarly, only direct comparison of whole genomes can detect single-nucleotide polymorphisms (SNPs)(1). SNP detection has been applied to find clues to the genetic cause of different phenotypes in Mycobacterium tuberculosis, B. anthracis, and Escherichia coli(8,9,10). These SNPs have become important not only in typing methods that have aided disease diagnosis, but also in understanding genetic changes and their implications, e.g. in Burkholderia mallei after accidental human infection(11)....
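
At its simplest, SNP detection is a position-by-position comparison of aligned sequences. The Python sketch below illustrates this, assuming the two sequences have already been aligned to the same length by some other tool. The function name and the toy sequences are hypothetical. Gaps and ambiguous bases are skipped, so insertions and deletions are deliberately not reported.

```python
def find_snps(reference, query):
    """Return (position, ref_base, query_base) for each single-base
    mismatch between two pre-aligned sequences of equal length.
    Positions are 1-based; gaps and ambiguous bases are ignored."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    pairs = zip(reference.upper(), query.upper())
    for position, (ref_base, query_base) in enumerate(pairs, start=1):
        if ref_base != query_base and ref_base in "ACGT" and query_base in "ACGT":
            snps.append((position, ref_base, query_base))
    return snps


if __name__ == "__main__":
    # Invented aligned fragments: one substitution and one gap.
    print(find_snps("ATGGCTAACG", "ATGACTA-CG"))
```

In the toy example only the G-to-A substitution at position 4 is reported, and the gapped column is ignored. Whole-genome comparisons add alignment, quality filtering and annotation around this core step, none of which is shown here.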

Beyond Sequence: Genomics

Although analysing DNA sequences yields important hypothetical information on pathogenesis, further biological and physiological studies are needed to link a gene to a particular pathogenic phenotype. The availability of sequence information greatly facilitates this task by augmenting conventional methods for virulence-gene discovery, while leading to the development of other high-throughput techniques(8). Genome sequencing has allowed one of these methods, signature-tagged mutagenesis, to evolve into a high-throughput technique that can simultaneously examine all the genes in a microorganism. In a similar method developed after genome sequencing efforts, targeted genetic footprinting, the complete genome sequence is used to create PCR primers to ...(9). Other post-genomic methods include genomic microarrays, used for examining virulence gene expression of pathogens during infection and for enhancing the understanding of pathogenesis by comparing genome data from pathogenic and nonpathogenic strains(10). Similarly, genome sequence analysis can provide a database of all hypothesized gene products, which contributes to proteomic analysis of proteins induced under different experimental conditions. This combination of proteomic and sequence analysis was used to define a new virulence-associated protein-secretion pathway that functions in V. cholerae strain V52(11).

Further Clinical Applications and Final Thoughts

Complete bacterial sequencing yields information that is unobtainable by any other means. Not only does it provide data on the organisation of ORFs, noncoding regions, orphan genes, and insertion sequences, but it also reveals potential virulence genes, virulence markers, vaccine candidates, and drug targets. Adaptation by many commensal and virulent microorganisms is a dynamic and constantly changing process, because the host environment is itself a living organism and therefore subject to the same selective pressures faced by bacteria. Genomic sequencing can therefore be used as a tool not only to unearth the complexity of genomic dynamics and bacterial evolution, but also to offer insight into host-pathogen relationships in relation to environmental adaptation. In this same context, genomics also reveals the extent of variability within bacterial species and allows the exploration of why these differences are important for both the pathogenicity of the bacterium and its biology. Knowledge of these putative resistance mechanisms will promote better use of existing drugs and facilitate the conception of new therapies. Although a basic step, genome sequencing and virulence gene identification can only provide clues on mechanisms of pathogenesis, and further biological and physiological studies need to be performed. ... are allowing for correlations to be made between genotype and phenotype in many instances.

Indeed, the availability of only one complete genome sequence for a given taxon, which was a dream only two decades ago, is now considered inadequate for describing the complexity of species and genera and their inter-relationships.

Genomic analysis allows the determination of which factors are unique to certain bacterial species or strains and which pathogenic factors are conserved among bacteria. Additionally, bacterial genomics has revolutionized virulence gene discovery, not only through the methods employed, but also by allowing genetic signatures to be detected, such as SSRs... These data provide an unprecedented opportunity to examine how bacterial pathogens evolved from their commensal ancestors.

As genes and combinations of genes that contribute to pathogenesis are identified, it is also essential to understand where these genes originate in order to assess how easily they can be transferred among members of bacterial populations.

Whole genome sequencing has led to a broader understanding of microbial pathogenesis and the complex interactions between bacteria and their hosts. The study of over 250 genome sequences by both computational and experimental methods has allowed the identification of genes involved in the process of pathogenicity, i.e. those responsible for colonisation, toxin production, effective transmission, and persistence, and has provided immeasurable insight into pathogen diversity and evolution.

Genome sequencing has been able to bring examples of this to light, not only by unearthing the suspected evolutionary origin of a pathogen, but also by describing how and why the pathogen became virulent.

It has also prompted a re-evaluation of bacterial diversity, in which the complete sequence of a few chosen representatives of the bacterial domain is no longer deemed sufficient for understanding the variety found in the prokaryotic world. It is also hoped that an understanding of the metabolic requirements of fastidious organisms will allow their culture in the lab, which is currently an obstacle to the study of most intracellular pathogens.