Form Of Molecular Medicine Is Dna Biology Essay


This is an introductory chapter. It covers the basic definitions and terms as are necessary for understanding molecular medicine. These terms will be frequently used in later chapters.

'Molecular medicine' refers to the application of DNA-based knowledge in medical practice, whereas molecular biology is concerned with the use of DNA in research. What the average pathologist requires is therefore a knowledge of molecular medicine rather than molecular biology in its true sense, because few pathologists have the time or inclination to venture into the realm of pure research.


The starting material for any form of molecular medicine is DNA. DNA was discovered in 1869 by a Swiss physician, Friedrich Miescher, who isolated DNA from cell nuclei.

Friedrich Miescher (1844-1895) decided to study leucocytes from a sample of pus. He tried to isolate the leucocytes using various salt solutions, but the cells swelled and gave rise to a highly viscous porridge that was impossible to handle. He then separated the nuclei from the leucocytes and subjected them to an alkaline extraction procedure. This resulted in the formation of a precipitate. Obviously, this material was obtained from the nucleus, and he, therefore, named it 'nuclein'. Chemical analysis showed that this was not a protein. This meant that it was some other material, which was later found to be DNA. The term 'nucleic acid' was derived from Miescher's 'nuclein'.

In 1944, Oswald Avery and colleagues showed that the genetic information in the bacterium Pneumococcus was present within its DNA. The double helical structure of DNA was proposed by James Watson and Francis Crick in 1953, a hypothesis that has been confirmed several times since then. In 1956, Arthur Kornberg discovered an enzyme called DNA polymerase, which enabled small segments of double-stranded DNA to be synthesized.

Arthur Kornberg (March 3, 1918 - October 26, 2007) was an American biochemist who shared the 1959 Nobel Prize with Dr. Severo Ochoa of New York University for the discovery of the mechanisms in the biological synthesis of RNA and DNA. Kornberg's initial work included the feeding of specialized diets to rats to discover new vitamins. However, he found this work boring and was far more fascinated by the study of enzymes. After this, he never looked back. It can therefore be confidently stated that his life was made after he left the rat race.

Other significant discoveries include that of mRNA and the finding that each amino acid is encoded in DNA by a nucleotide triplet. The concept of the enzyme reverse transcriptase came later, in 1970, when Howard Temin and David Baltimore showed that RNA can be copied back into DNA. The discovery of this enzyme made it possible to make complementary DNA (cDNA) copies from RNA templates.
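The triplet code and the copying of RNA back into DNA can be illustrated with a short Python sketch. It is a minimal illustration only: the codon table is a small hand-picked subset of the standard genetic code, and the function names `translate` and `reverse_transcribe` are hypothetical names chosen for this example.

```python
# Minimal sketch: triplet decoding and cDNA synthesis (illustrative only).

# Small subset of the standard genetic code (mRNA codon -> amino acid).
CODON_TABLE = {
    "AUG": "Met",   # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "UAA": "Stop",
}

# Base-pairing rules used when an RNA template is copied into DNA.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def translate(mrna):
    """Read an mRNA three bases at a time and return the amino acids."""
    peptide = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon is not in the subset
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


def reverse_transcribe(mrna):
    """Return the cDNA strand complementary to the mRNA, written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))


mrna = "AUGUUUGGCGCUUAA"
print(translate(mrna))           # ['Met', 'Phe', 'Gly', 'Ala']
print(reverse_transcribe(mrna))  # TTAAGCGCCAAACAT
```

Each group of three bases maps to one amino acid, and the cDNA is simply the base-paired complement of the RNA template read in the opposite direction.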

The discovery of restriction endonucleases was another turning point in the saga of molecular medicine. Restriction enzymes cut DNA at specific recognition sequences. As we will see later in the book, each of these discoveries has helped pathologists and clinicians in one way or another. The discovery of DNA ligase, which stitches DNA molecules together, allowed the creation of recombinant DNA. This led to the idea of DNA cloning, a concept that will be elaborated later in this book.
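Site-specific digestion can be sketched as a simple string operation. The example below assumes the well-known enzyme EcoRI, which recognises GAATTC and cuts between the G and the first A; the enzyme is chosen here purely for illustration, and the function name `digest` is hypothetical. Only one strand is modelled, so the staggered cut on the opposite strand is ignored.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut `dna` at every occurrence of `site`, `cut_offset` bases into the site.

    The defaults model EcoRI (G^AATTC) on a single strand; this is an
    illustrative assumption, not a full model of a restriction digest.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


print(digest("TTGAATTCAAGGGAATTCCC"))
# ['TTG', 'AATTCAAGGG', 'AATTCCC']
```

Two recognition sites give three fragments, and joining the fragments back together reproduces the original sequence, which is the step DNA ligase performs in the laboratory.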

The idea of DNA cloning was followed by the notion of DNA probes: small segments of DNA labeled with a radioactive marker that could identify specific regions in genomic DNA through annealing or hybridisation. Hybridisation in solution gave way to hybridisation on solid support membranes once DNA digested with restriction endonucleases could be transferred to these membranes by Southern blotting. This discovery enabled the construction of DNA maps and DNA mutation analysis.


In 1977, it was discovered that eukaryotic genes are discontinuous and that their coding regions are interrupted by intervening non-coding segments of DNA. These coding and non-coding regions were named 'exons' and 'introns' respectively. It was later found that the cell uses a mechanism called splicing to remove the intervening segments. The concept of variation in genetic structure, also called 'DNA polymorphism', was introduced in the 1970s. It has been studied extensively since then, as it appears to be the basis of differences in human phenotype.


Until the mid-1980s, the understanding of human genetic diseases relied entirely on the identification and characterisation of abnormal proteins. With molecular methods, it became possible to clone the relevant gene by working from the structure of the protein. From the cloned gene, more information could be obtained about the underlying genetic disorder. This method was called functional cloning. An example of its use was the discovery of the role of the α and β globin genes in the pathogenesis of the thalassemias.

The reverse of functional cloning was called positional cloning. In this method, the protein is entirely bypassed and the gene is recognized on the basis of its chromosomal location and certain characteristics that identify the DNA segment as 'gene-like'. The identification of the mutant gene as well as the knowledge of the genetic disorder could be inferred from the DNA sequence. This was also called 'reverse genetics.' The term is synonymous with 'positional cloning' because the identification of the gene was based on its position in the genome. Genes like the DMD (Duchenne muscular dystrophy) gene and the gene for chronic granulomatous disease were identified based on this method.

Guillaume Benjamin Amand Duchenne


Although Duchenne muscular dystrophy was first described by the Neapolitan physicians Giovanni Semmola in 1834 and Gaetano Conte in 1836, the disease is named after Guillaume Duchenne, a French neurologist. He described a child who had this condition and later, in 1868, described 13 other children with the same condition. Duchenne was the first to perform a biopsy to obtain tissue from a living patient for microscopic examination.

The gene for DMD was discovered in 1986-87 through the independent efforts of Dr. Ronald Worton at the Hospital for Sick Children in Toronto and Dr. Lou Kunkel at Boston Children's Hospital. As the first disease-causing gene to be identified from scratch, it paved the way for the golden age of gene discovery. Since the disorder affects mainly boys, it was clear that the gene responsible for DMD must lie on the X chromosome, and by studying the X chromosomes of children with DMD, researchers were able to pinpoint it. Mapping the DMD gene was an enormous task as the gene turned out to be massive - even today, it is the largest human gene discovered so far (more than 100 times the average size). This gene was later found to be responsible for the production of a muscle protein, 'dystrophin', which has a key role in muscle function. An almost complete absence of dystrophin was at the root of DMD (Fig 1.1). Later, a milder form of the muscular dystrophy was identified and named after Dr Peter Emil Becker, who first described it.


Fig 1.1: Mutations that result in the occurrence of Becker and Duchenne Muscular Dystrophy


The Polymerase Chain Reaction (PCR) was first described by Kary Mullis. In this method, segments of DNA are targeted with oligonucleotide primers and subsequently amplified. The method has several diagnostic and research applications, and it can be used to target not only DNA sequences but also RNA sequences. The PCR will be elaborated upon in subsequent chapters; suffice it to say for the present that this technology has revolutionized molecular medicine.

<H2>Polymerase Chain Reaction - Xeroxing DNA

The polymerase chain reaction relies on the ability of certain DNA-copying enzymes to remain stable at high temperatures. Its ability to amplify DNA extracted from fossils is, in reality, the basis of a new scientific discipline, paleobiology.
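The idea of primer-targeted amplification can be sketched in Python. The example is a deliberately idealised model: it assumes perfect primer matches and a perfect doubling of the target in every cycle, which real reactions only approximate. The function names `amplicon` and `ideal_copies`, and the sequences used, are hypothetical.

```python
def reverse_complement(dna):
    """Return the opposite strand of `dna`, written 5'->3'."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(dna))


def amplicon(template, forward_primer, reverse_primer):
    """Return the template segment bounded by the two primers, or None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the strand
    # given here we search for its reverse complement downstream.
    target = reverse_complement(reverse_primer)
    end = template.find(target, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(target)]


def ideal_copies(initial_copies, cycles):
    """Copies after `cycles` rounds, assuming perfect doubling per cycle."""
    return initial_copies * 2 ** cycles


template = "GGATCGATTACGCTAGGCTTAACCGG"
print(amplicon(template, "ATCG", "GGTTAA"))  # ATCGATTACGCTAGGCTTAACC
print(ideal_copies(1, 30))                   # 1073741824
```

Under the doubling assumption, a single target molecule yields roughly a billion copies after 30 cycles, which is why the technique is likened to photocopying DNA.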


In 1910, Peyton Rous proved that a filterable agent (virus) was capable of causing cancer in chickens.

Peyton Rous

During his second year in the Johns Hopkins Medical College, Peyton Rous scraped the skin of a finger on a tuberculous bone while doing an autopsy and soon a tubercle formed there. He dropped out of Medical School for a year and graduated in 1905. Finding himself unfit to be a 'Real doctor', he turned to medical research. He took over a laboratory for cancer research and was able to prove that some 'spontaneous' chicken tumours, to all appearances classical neoplasms, are actually started off and driven by viruses which determine their forms as well.

We now know that the region of the viral genome (DNA in DNA tumor viruses or RNA in RNA tumor viruses) that can cause a tumor is called an 'oncogene.' This foreign gene can be carried into a cell by the virus and cause the host cell to take on new properties. The discovery of viral oncogenes in retroviruses led to the finding that they are not unique to viruses and that homologous genes (called proto-oncogenes) are found in all cells. Indeed, it is likely that the virus picked up a cellular gene during its evolution and that this gene subsequently became altered. Normally, the cellular proto-oncogenes are either not expressed in a quiescent cell, since they are involved in growth (which is not occurring in most cells of the body) and development, or they are expressed under strict control by the cell. However, they may become aberrantly expressed when the cell is infected by tumor viruses that do not themselves carry a viral oncogene. Peyton Rous published his findings in the Journal of Experimental Medicine in 1910 (Fig. 1.3).

Soon after the discovery of the oncogenes, another set of genes that control cellular growth was discovered. These genes were called 'tumour suppressor genes.' It was also seen that mutations in both these classes of genes were responsible for causing cancer.

In addition, genes regulating apoptosis are now known to be dysregulated in cancer.

Fig. 1.3: Peyton Rous's original paper on a transmissible avian neoplasm. J Exp Med 1910 Sep 1;12(5):696-705.


Over a period of time, it was realised that mutation analysis is a very useful source of information for the management of patients with genetic or infectious disorders. It also provides a great deal of information about the therapy and prognosis of neoplastic lesions.

The gold standard method for mutation detection is sequencing, but it is technically difficult and is therefore not performed in every case. Several other methodologies are also available and will be elaborated upon in subsequent chapters. The PCR provided some respite when it came to the diagnosis of mutations, but its very specificity became a disadvantage. Therefore, as screening strategies, SSCP (Single Strand Conformational Polymorphism), DGGE (Denaturing Gradient Gel Electrophoresis) and CCM (Chemical Cleavage of Mismatch) were adopted.

However, the need for mutation analysis in a specific case should be evaluated from the following perspective:

Is the test of any use to the patient or is it being done merely out of scientific curiosity?

Will the test help in prognosticating the disease?

Is the cost-benefit ratio for carrying out the test reasonable?

All these questions need to be answered before a mutation analysis is carried out.


With rapid developments in molecular biology, it was realised that knowing the complete sequence of the human genome would help immensely in understanding the pathogenesis of diseases. The Human Genome Project (HGP) was launched in October 1990 to decode that sequence. The United States Department of Energy (DOE) was a key player in proposing the project in 1987 (Fig. 1.4). The DOE's interest arose from nuclear weapons: the only way to understand their mutagenic potential was to decipher the entire DNA sequence of the human genome.

In the mid-1980s it was believed that there were between 50,000 and 100,000 genes in the human genome. The sequence of most of these genes was not available at that time. However, it was later realised that there were only about 30,000 genes and that the great majority of the DNA in the genome did not contain gene sequences. This was called junk DNA and was not explored for a long time.

Fig. 1.4: Logo of the Human Genome Project

In 1988, the DOE and the NIH (National Institutes of Health) decided to further explore the potential of the HGP. This included the creation of physical and genetic maps of the human genome (which was accomplished in the mid-1990s) as well as the mapping and sequencing of a set of five model organisms including the mouse. The HGP started in 1990 with a budget of $3 billion and a time estimate of about 15 years. The HGP is not 'Human' in the strict sense; it also set out to characterise the genomes of other organisms such as the mouse, the fruit fly and the fish.

The HGP faced impediments beyond the biological ones. There were legal and ethical issues involved. There was fear that the data generated would be used inappropriately. There were also issues related to privacy and confidentiality. Therefore, the Ethical, Legal and Social Implications (ELSI) cell was set up to study these issues. The budget of the ELSI cell was 3% of the total budget of the HGP. The ELSI cell also took on tasks such as the education of the public and of professionals.


The components of the HGP were as follows:

Mapping and sequencing of the human genome with the ultimate aim of determining the sequence of the 3 billion base pairs that make up the human genome.

Mapping and sequencing the genomes of model organisms like Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster.

Identifying the 30,000 or so genes that make up the human genome.

Developing tools for large scale data analysis. This includes developing software and databases to support large scale data collection, distribution and access.

Providing training courses and creating training posts.

Transferring technologies to the private sector. This included rapid dissemination of the knowledge to the users. This also included developing a flexible distribution system.

Addressing the ethical, legal and social implications of the HGP. This included issues like privacy, confidentiality, stigmatisation, discrimination, equity and education of the public and health care workers.


The achievements of the HGP can be discussed broadly in two steps. Between 1991 and 1995, the achievements were steady. Many cooperating laboratories constructed maps of the genome and identified the DNA sequences. The Human Genome Organisation (HUGO) coordinated all international efforts for analysing the DNA sequences.

Between 1996 and 2000, there were several developments in technology. Automated sequencers had become available by that time, and the speed of analysis of the genome therefore increased significantly. A commercial company called Celera began to compete with the HGP in terms of the speed and accuracy of genome sequencing. Celera used the shotgun approach: it broke the entire human genome into fragments and sequenced each fragment individually. The problem with this approach was that although the speed was good, the accuracy was not as good as that of the standard methods of genome sequencing.
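The principle behind the shotgun approach, that a sequence can be rebuilt from overlapping fragments, can be shown with a toy example. The sketch below uses a naive greedy overlap merge on three short, error-free reads. It is an assumption-laden illustration and not Celera's actual assembly method; the function names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining overlaps; leave the contigs separate
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


fragments = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
print(greedy_assemble(fragments))  # ['ATGCGTACGTTAGC']
```

With repeated sequences or sequencing errors, a greedy merge like this can join the wrong fragments, which hints at why accuracy was the weak point of the approach.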

A working draft of the human genome sequence was announced in June 2000, and in February 2001 Science and Nature published the first draft of the sequence. In April 2003, 50 years after the structure of DNA was unraveled, the NIH announced the completion of the high-quality comprehensive sequencing of the human genome.

<H2>What has the Human Genome Project taught us?

We now know that the human genome contains 3164.7 million chemical nucleotide bases (A, C, T, and G). An average gene consists of 3000 bases, but sizes vary greatly. The largest known human gene is dystrophin, which consists of 2.4 million bases. The total number of genes is estimated at approximately 30,000 as has been previously mentioned. Almost all (99.9%) nucleotide bases are exactly the same in all people. Variations in the genetic makeup of two individuals are only to the tune of about 0.1%. This difference determines variations in facial features, mental functions or even in resistance to disease. The HGP also taught us that the functions are unknown for over 50% of the discovered genes.
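The figures quoted above can be put into perspective with a little arithmetic. The short Python sketch below uses only the numbers given in the text; the variable names are chosen for this example.

```python
GENOME_SIZE_BASES = 3_164_700_000  # 3164.7 million bases, as quoted in the text
VARIATION_FRACTION = 0.001         # 0.1% difference between two individuals
AVERAGE_GENE_BASES = 3_000         # average gene size quoted in the text
DYSTROPHIN_BASES = 2_400_000       # largest known human gene

differing_bases = GENOME_SIZE_BASES * VARIATION_FRACTION
size_ratio = DYSTROPHIN_BASES / AVERAGE_GENE_BASES

print(f"Bases differing between two people: about {differing_bases:,.0f}")
print(f"Dystrophin is {size_ratio:.0f} times the average gene size")
```

A difference of 0.1% therefore still corresponds to roughly 3 million bases, and dystrophin is about 800 times the size of an average gene.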

We also know that less than 2% of the genome codes for proteins. A large number of repeated sequences that do not code for proteins ("junk DNA") make up at least 50% of the human genome. We are still not sure as to what the functions of these repetitive sequences are, but it is believed that over a period of time, these repeats reshape the genome by rearranging it, creating entirely new genes, and modifying and reshuffling existing genes. During the past 50 million years, a dramatic decrease seems to have occurred in the rate of accumulation of repeats in the human genome.

The gene-dense areas are predominantly composed of the DNA building blocks G and C. The gene-poor areas are composed predominantly of the building blocks A and T. GC- and AT-rich regions can usually be seen through a microscope as light and dark bands respectively on the chromosomes. Genes appear to be concentrated in random areas along the genome, with vast expanses of noncoding DNA interspersed in between. Stretches of up to 30,000 C and G bases repeating over and over often occur adjacent to gene-rich areas, forming a barrier between the genes and the junk DNA. These CpG (cytosine-phosphate-guanine) islands are believed to help regulate gene activity. Chromosome 1 has the most genes (2968), and the Y chromosome has the fewest (231).
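The contrast between GC-rich and AT-rich regions is usually quantified as GC content, the fraction of bases in a stretch that are G or C. The following Python sketch computes it over fixed-size windows. The sequence and window size are made up for illustration, and the function names `gc_content` and `windowed_gc` are hypothetical.

```python
def gc_content(dna):
    """Fraction of bases in `dna` that are G or C (0.0 for an empty string)."""
    if not dna:
        return 0.0
    gc = sum(1 for base in dna.upper() if base in "GC")
    return gc / len(dna)


def windowed_gc(dna, window):
    """GC content of consecutive, non-overlapping windows of length `window`."""
    return [
        gc_content(dna[i:i + window])
        for i in range(0, len(dna) - window + 1, window)
    ]


sequence = "GCGCGGCCATATTAAT"
print(windowed_gc(sequence, 8))  # [1.0, 0.0]
```

In this toy sequence the first window is entirely G and C and the second entirely A and T, mimicking in miniature the contrast between gene-dense and gene-poor regions.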

In comparison to other species, the human genome shows a random distribution of gene-rich areas. Humans have on average three times as many kinds of proteins as the fly or worm because of 'alternative splicing', in which the transcript of a single gene can be spliced in different ways to yield more than one protein. The number of gene families has expanded in humans as compared to worms, flies and plants. The human genome also has a much larger number of repeat sequences than other species.

What are the variations and mutations that have been identified in the genome?

Scientists have identified about 1.4 million locations where single-base DNA differences (single nucleotide polymorphisms, or SNPs) occur in humans. SNPs are crucial in the age of personalised medicine and will be elaborated upon later. Germline mutations are far more common in the sperm than in the egg. This is probably because the sperm undergoes a far greater number of divisions than the egg during its formation.
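A single-base difference is easy to picture as a position at which two aligned sequences disagree. The Python sketch below lists such positions for two short, made-up sequences of equal length. It assumes the sequences are already aligned, and it only compares two sequences rather than establishing a polymorphism in a population; the function name `single_base_differences` is hypothetical.

```python
def single_base_differences(seq_a, seq_b):
    """Return (position, base_a, base_b) for every mismatching position.

    Positions are 0-based. The sequences must already be aligned to the
    same length; insertions and deletions are not handled.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


person_1 = "ATGCCGTA"
person_2 = "ATGTCGTA"
print(single_base_differences(person_1, person_2))  # [(3, 'C', 'T')]
```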

What are the applications of the Human Genome Project?

When the project began, it was feared that the entire project was nothing but a huge exercise in data collection. However, we now know that a knowledge of the DNA sequence allows us to identify biological systems. It also allows us to identify genes associated with different diseases. A knowledge of these genes will drive research and allow us to provide focused targets for the development of effective new therapies.


Knowledge of the genome is important in helping us understand human biology. However, genomics does not provide a complete explanation of the intricacies of the functioning of the human body. The DNA in genes is transcribed into RNA, which is then translated into proteins, and proteins undergo various changes even after being translated. It is therefore necessary to understand how DNA and proteins work with each other and with the environment to create complex, dynamic living systems. The fields that address these questions include transcriptomics, proteomics, structural genomics, new experimental methodologies, and comparative genomics.


The transcription of genes to produce RNA is the first stage of gene expression. The transcriptome is the complete set of RNA transcripts produced by the genome at any one time. Unlike the genome, the transcriptome is extremely dynamic: the genome in the cell is constant, whereas the transcriptome varies considerably in differing circumstances because of different patterns of gene expression. Transcriptomics, the study of the transcriptome, is therefore a global way of looking at gene expression patterns.


The study of the proteome, the complete set of proteins produced by a species, uses the technology of large-scale protein separation and identification. The term proteomics was coined in 1994 by Marc Wilkins who defined it as "the study of proteins, how they're modified, when and where they're expressed, how they're involved in metabolic pathways and how they interact with one another."

<H2>Structural Genomics

This describes the three-dimensional structure of every protein that is encoded by the genome. The key word here is genome: structural genomics does not focus on any one protein but seeks to elucidate the structure of every protein the genome encodes. Since a protein's structure determines its function, structural genomics also has the potential to provide knowledge of protein function. It also identifies potential targets for drug discovery.

<H2>Comparative genomics

This analyses DNA sequence patterns of humans and well-studied model organisms side-by-side. It has become one of the most powerful strategies for identifying human genes and interpreting their function.


What does the future of Molecular Medicine hold?

The cloning of Dolly, the sheep, proved that the cloning of humans is perhaps not very far away. In a sense, would that create immortality? If so, what are the ethical issues involved?

Ian Wilmut, a British embryologist, became in 1996 the first to clone a mammal from fully differentiated adult cells: a Finn Dorset lamb named Dolly, derived from adult mammary cells (Fig. 1.5). Wilmut's work, published in 1997, pushed the concept of cloning into the news and public debate.

Fig 1.5 - Dolly the sheep; the world's first mammal cloned from an adult cell

Fig. 1.6 - A schematic representation of a gene being cut.

In the near future, will it be possible to remove potential disease causing genes and leave the embryo healthy with a prospect of a disease-free life? In other words, would it be possible to have perfect human beings and would that render doctors redundant?

Microdissection of chromosomes may prove that such ideas will not remain in the realms of science fiction for long (Fig 1.6).

Will 3-D imaging change our understanding of protein chemistry? Very likely. With improved computer programmes which are capable of generating 3-D images, it will be possible for the pathologist to predict the surface proteins a tumor has. Will this lead to improved diagnostic techniques? Very likely. Only the future will tell.

As we can see, the questions are endless, the answers are limited. Molecular Medicine is at the threshold of an exciting future. Perhaps, that is one reason why this book should be read. The book will cover some aspects of all this and more.