The Human Epigenome Project Biology Essay


The Human Genome Project provided a map of the human genome. However, it did not reveal how the genome is packaged into chromatin. This packaging matters because it dictates which genes are expressed at different times in the course of development. The study of this packaging is, in essence, the study of the epigenome. The Human Epigenome Project was therefore launched to provide a better understanding of the human epigenome. Epigenetic processes are increasingly recognised as being involved in modulating the phenotype.

The Human Epigenome Project describes the relationships that exist between the major epigenetic players, which are collectively called the 'epigenetic code'. It explains the concept of methylation and provides comprehensive DNA methylation maps, which are collectively called the 'methylome'. It also provides an understanding of other epigenetic mechanisms such as histone modifications.

The aim of the human epigenome project was to identify the chemical changes and relationships that exist between chromatin constituents that provide function to the DNA code. This allows us to understand the physiology of normal development, aging, abnormal gene control in cancer and other diseases as well as environmental health.


Genomic imprinting is an epigenetic phenomenon by which epigenetic chromosomal modifications drive differential gene expression according to the parent of origin. Two alleles of each gene are inherited, one from each parent. Usually, both alleles are expressed. However, in the case of imprinted genes, only one allele is expressed, and which one depends entirely on the parent of origin. The expressed allele may be the one inherited from the mother (as in the H19 and CDKN1C genes, where the paternal allele is imprinted) or the one inherited from the father (as in the IGF2 gene, where the maternal allele is imprinted). This inheritance does not follow classical Mendelian genetics. Usually, imprinted genes are involved in a particular stage of development.

Imprinting is essentially a dynamic process. The profile of the imprinted genes varies during development. One of the main mechanisms involved in the control of imprinting is DNA methylation. Histone modifications can also play a role in imprinting. Imprinted genes tend to occur in clusters, and the genes in a cluster are controlled by common regulatory elements. The regulatory elements may be noncoding RNAs or Differentially Methylated Regions (DMRs). Differentially methylated regions are segments of DNA rich in cytosine and guanine, with the cytosine nucleotides methylated on one parental copy but not on the other. These regulatory elements are clustered together in regions called 'Imprinting Control Regions' or ICRs. Any change in the methylation pattern of an ICR can lead to a loss of imprinting and abnormal expression of the parental gene.

<H2>Imprinted Genes and Human Genetic Diseases

Expression of imprinted genes is essentially monoallelic. Although two copies of the gene are present, only one copy is active, and that copy is inherited from one particular parent. So, any defect in the active copy cannot be compensated for by the silent copy, and the result resembles a recessive mutation.
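The consequence of monoallelic expression can be illustrated with a minimal sketch. It uses only the three imprinted genes named earlier in this essay (H19, CDKN1C and IGF2); the function name and the True/False encoding of an "intact" allele are assumptions made purely for illustration and are not part of any standard tool.

```python
# Which parental allele is expressed for each imprinted gene named in the text.
# The other allele is imprinted (silenced).
EXPRESSED_ALLELE = {
    "H19": "maternal",     # paternal allele imprinted
    "CDKN1C": "maternal",  # paternal allele imprinted
    "IGF2": "paternal",    # maternal allele imprinted
}


def has_functional_expression(gene, maternal_ok, paternal_ok):
    """Return True if the single expressed allele of an imprinted gene is intact.

    maternal_ok / paternal_ok are hypothetical flags saying whether the copy
    inherited from that parent is free of a deletion or mutation.
    """
    expressed = EXPRESSED_ALLELE[gene]
    return maternal_ok if expressed == "maternal" else paternal_ok


# A defect in the paternal IGF2 copy removes all expression, even though the
# maternal copy is intact, because the maternal copy is silent.
print(has_functional_expression("IGF2", maternal_ok=True, paternal_ok=False))  # False
# The same defect on the silent maternal copy has no effect on expression.
print(has_functional_expression("IGF2", maternal_ok=False, paternal_ok=True))  # True
```

The sketch shows why a single defective allele is enough to abolish expression of an imprinted gene, which is the situation seen in the syndromes described next.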

Prader-Willi syndrome (PWS) is a complex genetic condition. It is believed to be the most common genetically identified cause of life-threatening obesity in humans. Approximately 350,000 to 400,000 people are affected worldwide, and it is seen in approximately one in 10,000 to 20,000 live births. It occurs in all races and ethnic groups but is more common in Caucasians. The clinical symptoms include:

Obesity including early childhood obesity

Short stature

Small hands and feet

Growth hormone deficiency

Mental retardation and behavioral problems

Infantile hypotonia


The patients have a characteristic facial appearance with a narrow bifrontal diameter, short upturned nose, triangular mouth, almond-shaped eyes, and oral findings (sticky saliva, enamel hypoplasia).

The origins of this complex genetic syndrome were elucidated by Butler and Palmer in 1983. They reported that the deletion on chromosome 15 was a de novo event and that the deleted chromosome was always inherited from the father. In about 70% of cases of Prader-Willi syndrome, the 15q11-q13 deletion occurs in the chromosome inherited from the father. In about 25% of cases, the patients have maternal disomy, meaning that both copies of chromosome 15 are inherited from the mother. In effect, there is a complete absence of the paternal chromosome 15. In the remaining 5% of cases, there is a defect in the imprinting centre controlling the activity of the genes on chromosome 15.

In this last 5% of cases, there would be a defect in the ICR as referred to previously and then there would be a change in the methylation pattern of the gene leading to loss of imprinting. Several paternal genes are expressed in this region and so it is difficult to pinpoint one gene as the cause of all the problems.

Angelman syndrome (AS) is another disorder of chromosome 15, but it has an entirely different clinical presentation. Its prevalence is not precisely known. The clinical features are as follows:

Seizures, severe mental retardation, ataxia and jerky hand movements.


Inappropriate laughter and lack of speech


Maxillary hypoplasia, a large mouth with a protruding tongue.

A prominent nose and widely spaced teeth.

In contrast to Prader-Willi syndrome, it is the maternal copy of 15q11-q13 that is deleted in Angelman syndrome. As mentioned earlier, the exact gene involved in the pathogenesis of Prader-Willi syndrome is not known, since several genes at that locus are expressed from the paternal chromosome. The gene involved in the pathogenesis of Angelman syndrome, however, is known. Angelman syndrome results from the loss of expression of a single imprinted gene, UBE3A, a ubiquitin ligase gene involved in early brain development.


Some genes are constitutive genes and are expressed all the time. However, there are some genes which are expressed only at certain times. These genes can be expressed only in specific tissues (spatial expression) or they may be expressed at specific times (temporal expression). The most widely studied epigenetic modification is the cytosine methylation of DNA within the CpG dinucleotide.

The CpG dinucleotide is the sequence 5'-CG-3'. The "p" in CpG refers to the phosphodiester bond between the cytosine and the guanine. The notation indicates that the cytosine and guanine are next to each other on the same nucleotide strand, regardless of whether the DNA is single or double stranded. During evolution, the CpG dinucleotide has been progressively eliminated from the genome of higher eukaryotes and is now present at only 5% to 10% of its predicted frequency. The exceptions are short CpG-rich stretches called CpG islands. In the genome, the size of these CpG islands varies between 0.5 and 5 kb, and they occur on average once every 100 kb. CpG islands are usually found in the promoter region of genes, where they are responsible for turning gene expression on and off. Chromatin containing CpG islands is generally heavily acetylated, lacks histone H1, and includes a nucleosome-free region. This essentially means that the chromatin is in an open state: the DNA is not tightly bound and is therefore available for transcription.

Approximately half of all genes in mouse and humans (i.e., 40,000 to 50,000 genes) contain CpG islands. These are mainly housekeeping genes. Housekeeping genes are genes which are required for the basic maintenance of cell function. They are expressed in all cells of an organism under normal or pathological conditions. However, approximately 40% of genes with a tissue-restricted pattern of expression also contain CpG islands. Usually, methylation is inversely correlated with the transcriptional status of a gene, i.e. if the gene is methylated, it is not expressed, and vice versa.
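The idea of a "predicted frequency" of CpG can be made concrete with a short sketch. The observed/expected ratio used here, (number of CpG × sequence length) / (number of C × number of G), and the thresholds of at least 200 bp, GC content above 50% and a ratio above 0.6 are commonly used conventions for flagging candidate CpG islands; they are not taken from this essay and are stated here as assumptions. The function names are illustrative only.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence.

    Expected CpG count assumes C and G are placed independently:
    expected = (#C * #G) / length.
    """
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0
    return cpg * len(seq) / (c * g)


def gc_content(seq):
    """Fraction of bases that are C or G."""
    seq = seq.upper()
    return (seq.count("C") + seq.count("G")) / len(seq) if seq else 0.0


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Apply the assumed length, GC-content and obs/exp thresholds."""
    return (
        len(seq) >= min_len
        and gc_content(seq) > min_gc
        and cpg_obs_exp(seq) > min_oe
    )


# Toy sequences, not real genomic DNA.
cpg_rich = "CG" * 100   # 200 bp made only of CpG dinucleotides
cpg_free = "AT" * 100   # 200 bp with no C or G at all

print(cpg_obs_exp(cpg_rich))            # 2.0
print(looks_like_cpg_island(cpg_rich))  # True
print(looks_like_cpg_island(cpg_free))  # False
```

A ratio near 1 means CpG occurs about as often as chance predicts, whereas the bulk of the genome, where CpG has been depleted, gives values far below 1.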

The enzymes that transfer methyl groups to the cytosine ring are called cytosine 5-methyltransferases, or DNA methyltransferases (DNMTs). There are currently three known catalytically active DNMTs (DNMT1, DNMT3a and DNMT3b), and each appears to play a distinct and critical role in the cell. Three possible mechanisms have been proposed to account for transcriptional repression by DNA methylation. These mechanisms are as follows:

The first mechanism is direct interference with the binding of specific transcription factors to their recognition sites in their respective promoters. The transcription of a gene begins when a transcription factor binds to the promoter region of the gene. Several transcription factors are known, including AP-2, cMyc/Myn, E2F and NFκB. It is likely that these transcription factors bind to sequences in the CpG islands, and their binding has been shown to be inhibited by methylation.

The second mechanism is the direct binding of specific transcriptional repressors to methylated DNA. Two such factors are MeCP-1 and MeCP-2 (methyl cytosine binding proteins 1 and 2). These factors bind to methylated CpG sites and repress transcription of the associated genes.

The third mechanism of repression is the alteration of chromatin structure. Experiments show that methylation inhibits transcription only after chromatin is assembled. Once chromatin has assumed its inactive state after DNA methylation, the repression cannot be counteracted even by strong transcriptional activators. Therefore, methylation stabilizes the inactive state. In addition, it prevents activation by blocking access of transcription factors to the promoter.

It is important to realise that methylation turns off genes. Methylation of the CpG islands serves as a locking mechanism that may follow or precede other events that turn a gene on or off. Once the methylation mechanism is in place, it can prevent activation even if the nuclear environment is optimum for transcription.

<H2>DNA demethylation during development and tissue specific differentiation

After implantation, most of the genomic DNA is in the methylated state, whereas tissue-specific genes undergo demethylation in the tissues in which they are expressed. This essentially means that some genes can be expressed while others are repressed. This allows step-wise development, which accounts for the ordered structure of the tissues of the human body. If this system of methylation did not exist, tissues would develop randomly and the human body would never reach its proper form.

<H2>DNA methylation in cancer

A role for DNA methylation in oncogenesis has been hypothesized for many years. Numerous studies have suggested aberrations in DNA methyltransferase activity in tumor cells. Neoplastic cells may show hypermethylation of tumour suppressor genes or hypomethylation of oncogenes. This leads to repression of tumour suppressor genes or increased expression of oncogenes, and thus to the development of cancer.

<H3>DNA hypomethylation in cancer

DNA may show hypomethylation in cancer. A decreased level of overall genomic methylation is a common finding in tumorigenesis. This decrease in global methylation appears to begin early, well before frank tumor formation. Specific oncogenes are also hypomethylated. This leads to an increase in the expression of oncogenes and the development of cancer. A good inverse correlation between methylation and gene expression has been observed for the antiapoptotic bcl-2 gene in B-cell chronic lymphocytic leukemia and for the k-ras proto-oncogene in lung and colon carcinomas.

<H3>Hypermethylation of tumor-suppressor genes

An additional means of inactivating tumour suppressor genes is by hypermethylation of the promoter sequences of the tumour suppressor genes in cancer. The retinoblastoma gene (Rb) was the first classic tumor-suppressor gene in which CpG island hypermethylation was detected.

<H2>Clinical and therapeutic implications of DNA methylation

In recent years, several attempts have been made to use methylation in a therapeutic scenario. The vertebrate globin genes were among the targets for clinical intervention based on drugs that affect methylation. Treatment with 5-azacytidine has been attempted. This drug is an irreversible inhibitor of DNA methyltransferase and therefore inhibits methylation. Since methylation is inhibited, genes which were previously silenced can now be expressed. Among these genes is the fetal γ globin gene. 5-azacytidine can thus increase the expression of the γ globin gene, which can correct the imbalance between the α chains and the non-α chains in thalassemias. Unfortunately, 5-azacytidine is mutagenic. Because of its mutagenicity, and the observation that other S-phase active cytotoxic agents that do not inhibit DNA methylation could induce a similar increase in γ globin gene expression, 5-azacytidine has not been widely used for this application. This points to the limitations of agents that cause global DNA demethylation.

The recent advances in the understanding of altered DNA methylation in cancer also have potential clinical implications. Because methylation of many of the genes involved may be specific to neoplastic cells, it may be possible to detect micrometastases by looking for the presence of methylated genes.


Histones form the protein backbone of chromatin and are an important component of epigenetics. They act as important translators between genotypes and phenotypes. They are known to have a dynamic function. Compared with DNA methylation, not much work has been done on histones.

In eukaryotic cells, DNA and histone proteins form chromatin, and it is in this context that transcription takes place. The basic unit of chromatin is the nucleosome, which consists of an octamer of two molecules of each of the four core histones (H2A, H2B, H3 and H4), around which is wrapped 147 bp of DNA. Histones help package DNA so that it can be contained in the nucleus. In addition, they may also perform important functions in gene regulation.

The core histones are highly conserved basic proteins with globular domains. Highly conserved means that similar or identical sequences are seen in the same proteins across different species. DNA is wrapped around the globular domains. The histones also contain a relatively unstructured, flexible tail which protrudes from the nucleosome. These tails are subject to a variety of post translational modifications (PTMs) such as methylation, acetylation and phosphorylation. Other changes which can take place in the tail are ubiquitination, sumoylation, ADP ribosylation and deimination, and the non-covalent proline isomerization that occurs in histone H3. Most histone PTMs are dynamic and are regulated by families of enzymes that promote or reverse the modifications.

How do histones influence transcription? Histones influence the higher order chromatin structure. They do this by affecting contacts between different histones and between histones and DNA. Specific histone modifications are responsible for dividing the genome into two parts. The first part is the transcriptionally silent heterochromatin and the second is the transcriptionally active euchromatin. Thus, these histone-histone and histone-DNA interactions decide whether a gene is transcriptionally active or inactive. They regulate nuclear processes like replication, transcription, DNA repair and chromosome condensation. The most common and perhaps best studied histone modifications are acetylation and methylation. Ranking next to DNA methylation, histone acetylation and histone methylation are well-characterized epigenetic markers. Methylation at some histone residues (H3K4, H3K36 or H3K79) results in an open chromatin configuration and is, therefore, characteristic of euchromatin. Acetylation mediated by histone acetyl transferases (HATs) also results in an open chromatin pattern, or euchromatin. In contrast, histone deacetylases remove these acetyl groups and cause transcriptional repression.

An analogy of the relationship between DNA and histones can be found in any 'C' grade movie. The histones are akin to the big brother and their job is to protect the DNA or the younger sister. Histones allow access to the DNA only under certain circumstances and prevent access under a different set of circumstances. Since these changes are independent of the genetic code, they come under the ambit of epigenetic changes.

Essentially, three general principles are thought to be involved in histone modifications and gene expression. These principles are:

PTMs directly affect the structure of chromatin, regulating its higher order conformation and thus acting in cis to regulate transcription. The word 'cis' means 'on the same side of'. A cis effect therefore means that the PTM regulates the activity of the DNA on the same chromosome, in the chromatin where the modification sits.

PTMs disrupt the binding of proteins that associate with chromatin (trans effect). 'Trans' is the opposite of 'cis': here the action is exerted through separate molecules rather than through the chromatin itself. Essentially, this means that the PTMs cause a change in the binding of various transcription factors to the chromatin.

PTMs attract certain effector proteins to the chromatin (trans effect). As in the previous case, the effect is mediated by other proteins: the effector proteins bind to the promoter regions and this regulates transcription.


MicroRNAs (miRNAs) were discovered in the early 1990s by Victor Ambros and colleagues, who found that miRNAs act as gene regulators. Gene hunters at that time were mainly interested in long mRNA molecules because these were the ones which were translated into proteins. Small RNA fragments such as the microRNAs were disregarded since at that time it was believed that they did not have any function. This has now been proved wrong.

MicroRNAs are approximately 22 nucleotides in length. They are single stranded and they inhibit the expression of specific mRNA targets. They do this by binding to sequences usually located in the 3' untranslated regions or UTRs. The portion of miRNA which binds to the 3'UTR is called the 'seed region'. The human genome is believed to code for up to 1000 miRNAs.
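Seed-based target recognition can be sketched in a few lines. The sketch assumes the common convention that the seed is nucleotides 2 to 8 counted from the 5' end of the miRNA, and that a candidate site in the 3' UTR is a perfect reverse complement of that seed. The mature let-7a sequence below is the commonly cited one and should be checked against a sequence database before any real use; the UTR is a made-up toy sequence and the function names are illustrative.

```python
# RNA base-pairing table (Watson-Crick only; G:U wobble pairs are ignored).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=1, end=8):
    """Return the mRNA sequence (5'->3') that pairs with the miRNA seed.

    start/end are 0-based slice bounds, so the default covers nucleotides
    2-8 of the miRNA (an assumed convention).
    """
    seed = mirna.upper()[start:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna, utr):
    """Return 0-based start positions of perfect seed matches in a 3' UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")  # accept DNA-style input as well
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)
    return positions


let7a = "UGAGGUAGUAGGUUGUAUAGUU"      # commonly cited mature let-7a sequence
toy_utr = "AAACUACCUCAGGGCUACCUCUU"   # hypothetical 3' UTR with two sites

print(seed_site(let7a))                 # CUACCUC
print(find_seed_sites(let7a, toy_utr))  # [3, 14]
```

Real target prediction also weighs pairing outside the seed, site context and conservation, so a seed match alone should be read as a candidate site rather than a confirmed target.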

miRNA coding sequences can be found in the introns or exons of protein-coding genes. They can also be found in intergenic regions. Several miRNA genes can be clustered along the genome and may share the same promoter, or they can be present individually. miRNA genes are transcribed into large non-coding RNA strands called primary miRNA transcripts. The primary miRNA is processed and then exported across the nuclear membrane. The exported miRNA is then incorporated into a RISC-like ribonucleoprotein (RNP) complex, also known as the microRNA-induced silencing complex (miRISC). RISC stands for RNA Induced Silencing Complex. The miRNA strand guides the RISC to its target mRNA. The RISC then cleaves the target mRNA or silences it by preventing its translation. The degree of complementarity between the miRNA and the target RNA is the main factor upon which the interference depends.

Bioinformatics and cloning studies have estimated that miRNAs may regulate 30% of all human genes and that each miRNA can control hundreds of gene targets. miRNAs are highly conserved between distantly related organisms. This indicates that they participate in essential biological processes. It is now known that miRNAs have a very important regulatory function in basic biological processes like development, cellular differentiation, proliferation and apoptosis.

<H2>MicroRNA biology and function

Through their binding to target mRNA sequences, microRNAs have a large number of biologically diverse functions. They are capable of controlling the expression of several genes. They are also capable of regulating several cell regulatory pathways, including cell growth, cell differentiation, cell mobility and apoptosis. It has now been shown that miRNAs are involved in the pathogenesis of several cancers. They perform this function by regulating the translation of oncogenes and tumour suppressor genes.

<H2>Detection of MicroRNA Expression

miRNAs can be isolated from tissues as well as from blood. It is important to remember that miRNAs are essentially RNA molecules. The same techniques which are used for detection of RNA can be used for the detection of miRNAs. Several techniques such as microarrays are available to look for miRNA expression. The major utility of microarrays is that the expression of thousands of miRNAs can be analysed all at one time.

Quantitative reverse transcription polymerase chain reaction (qRT-PCR) is another reliable and highly sensitive technique for microRNA detection. The advantage is that only small quantities of total RNA are required as an initial template. The disadvantage compared with microarrays is that only one miRNA can be evaluated at a time. Northern blotting can also be used to look for miRNA expression. Finally, in situ hybridisation can be used to evaluate miRNA expression.

<H2>Role of MicroRNAs in Cancer

Not much is known about the role of miRNAs in cancer. However, in recent times, a spectrum of cancer associated miRNAs has been identified. Some of the miRNAs function as tumour suppressors and are down regulated in cancer. Other miRNAs act as oncogenes. These oncogenes are capable of inducing and promoting cancer development and progression.

The expression patterns, function and regulation of microRNAs in normal and neoplastic human cells are not quite clear. miRNA genes tend to be located at fragile sites, common breakpoints and regions of amplification. Why does this matter? Fragile sites and common breakpoints are places in the genome where translocation, deletion, amplification, or integration of exogenous genetic material occurs. miRNAs located near fragile sites could therefore be targets of such genomic alterations. This evidence suggests that miRNAs may have a significant role to play in carcinogenesis. Abnormal expression of miRNAs has been seen in several neoplasms including Burkitt's lymphoma and carcinomas of the breast, liver, ovary and prostate.

Some specific examples of miRNA expression in different cancers are discussed here. In breast cancer, miR-10b, miR-21, miR-22 and miR-27a are upregulated while let-7, miR-7, miR-9-1, and miR-146 are downregulated. This list is not exhaustive. To take another example, in glioblastomas, miR-21, miR-221 and miR-222 are upregulated whereas miR-7 is downregulated. An individual miRNA targets multiple genes involved in various signaling pathways, so aberrant expression or malfunction of a single miRNA can have disastrous consequences. The impact of miRNAs on oncogenes and tumour suppressor genes can be bidirectional. Whether the net effect of such modulation is oncogenic or tumour suppressive has to be interpreted in the context of the particular microenvironment.

miRNAs also have a role to play in virus induced human cancers. Several human cancer associated viruses have been found to express miRNAs. A number of cellular transcripts are potential targets of these miRNAs. Once these targets are validated, it will be clearer whether the miRNAs play an important role in viral infection.

miRNAs are also important in angiogenesis in tumours. miR-126 is abundantly expressed in endothelial cells and plays an important role in vascular development in mice and zebrafish. miR-126 enhances the pro-angiogenic effects of VEGF and FGF. Other miRNAs such as miR-130a and miR-296 are also involved in tumor angiogenesis.

miRNAs are also involved in the development of metastasis. miR-200 is one such molecule. Loss of expression of miR-200 is associated with an increase in metastases, especially in ovarian and breast cancers. miR-10b and miR-373 have been associated with the development of metastases in breast and ovarian cancers.

It can be concluded that the role of miRNAs is important not only in the development of tumours, but also in the development of angiogenesis and metastases.

<H2>Clinical Applications of MicroRNAs

Since the expression of microRNAs is altered in cancers, it is thought that they may function as suitable biomarkers for disease state and progression. Tumour classification is not new in medicine. Clinicians and pathologists alike are constantly on the lookout for new biomarkers which may help in tumour classification. This is done primarily in order to develop treatment protocols and to aid prognostication.

<H3>Diagnostic MicroRNAs

MicroRNAs can be used to distinguish between tumour cells and normal cells. They can also be used to identify the tissue of origin in tumours of unknown origin. Finally, they may also be used to distinguish between different subtypes of tumours.

Alterations occur in microRNAs during the early stages of cancer and these changes may be used for early detection of neoplasms. MicroRNAs can also be detected in small tissues and so large resection specimens are not required for diagnosis. Since miRNAs can be detected in formalin fixed paraffin embedded tissues, archival tissues may be used to reach diagnosis.

MicroRNAs can also be detected in serum. Detection of these miRNAs can be used as a screening procedure since the sample collection is less invasive.

Mature microRNAs are relatively stable. These properties make microRNAs superior molecular markers and, as such, microRNA expression profiling can be utilized as a tool for cancer diagnosis.

<H3>Prognostic MicroRNAs

MicroRNAs are useful indicators of clinical outcome in a number of cancer types. As has been discussed earlier, miRNAs are involved in every stage of carcinogenesis, from initiation to progression and to the development of metastases. Therefore, these miRNA changes can be used in prognostication. They can be used to determine the tendency for recurrence and metastases. miRNAs have also been found in the surrounding non neoplastic tissues. Changes in these miRNAs can be used to detect changes in the cancer microenvironment.

<H2>Therapeutic Application of MicroRNAs

Since miRNAs are involved in almost every stage of carcinogenesis, would it be possible to use miRNAs in the treatment of cancer?

It is thought that since miRNAs are dysregulated in cancer, normalisation of their expression may be a potential method of intervention.

One way of regulating the action of microRNAs is to use anti-microRNA oligonucleotides (AMOs), which have been generated to compete directly with endogenous microRNAs. Unfortunately, unmodified AMOs inactivate their target miRNAs quite inefficiently. Several modifications have been made to AMOs, including the addition of 2'-O-methyl and 2'-O-methoxyethyl groups to the 5' end of the molecule. AMOs can also be conjugated to cholesterol to generate molecules called antagomirs. These have been shown to inhibit miRNA activity effectively in vivo.

Another therapeutic strategy is to use locked nucleic acid (LNA) antisense oligonucleotides. Locked nucleic acids are modified RNA nucleotides. They are chemically synthesised molecules. LNA residues can be combined with DNA or RNA residues when required. They can be used for the in situ detection of miRNAs. They are also capable of inhibiting miRNA activity in vivo. They have been designed to increase stability and have been shown to be highly soluble in aqueous solution and to exhibit low toxicity in vivo.

Another method for reducing the interaction between microRNAs and their targets is the use of microRNA sponges. These sponges are synthetic mRNAs that contain multiple binding sites for an endogenous microRNA. The binding sites are complementary to the seed sequence, or 'seed region', a stretch of about 6 to 8 nucleotides at the 5' end of the microRNA that is important in determining the specificity of binding. Sponges designed with several such sites have been shown to effectively repress microRNA families sharing the same seed sequence. Although the in vitro performance of microRNA sponges is similar to that of chemically modified AMOs, their efficacy in vivo remains to be tested.
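The sponge idea can be illustrated with a toy construction. This is only a sketch under simplifying assumptions: each binding site is just the reverse complement of nucleotides 2 to 8 of the miRNA, the sites are joined by an arbitrary spacer, and the two miRNA sequences are hypothetical family members invented to share a seed. Real sponge designs are more elaborate, and all names here are illustrative.

```python
# RNA base-pairing table (Watson-Crick only).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def seed_binding_site(mirna):
    """Site that pairs with nucleotides 2-8 of the miRNA (assumed seed)."""
    return reverse_complement(mirna[1:8])


def build_sponge(mirna, n_sites=4, spacer="AAUU"):
    """Concatenate n_sites seed-binding sites separated by a spacer."""
    return spacer.join([seed_binding_site(mirna)] * n_sites)


def sponge_binds(sponge, mirna):
    """True if the sponge carries at least one site matching this miRNA's seed."""
    return seed_binding_site(mirna) in sponge


# Hypothetical family members: identical seed (positions 2-8), different 3' ends.
family_member_a = "UAGCAGCACAUAAUGG"
family_member_b = "UAGCAGCACGUAAAUA"
unrelated = "UGAGGUAGUAGGUUGU"

sponge = build_sponge(family_member_a, n_sites=3)
print(sponge)                                 # UGCUGCUAAUUUGCUGCUAAUUUGCUGCU
print(sponge_binds(sponge, family_member_b))  # True
print(sponge_binds(sponge, unrelated))        # False
```

Because the sites are built from the seed alone, any miRNA sharing that seed is captured, which mirrors the observation that one sponge can repress a whole microRNA family.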

Although these oligonucleotide-based methods have been shown to work, they do elicit off-target side effects and unwanted toxicity. This is due to the capability of microRNAs to regulate hundreds of genes.

In the process of activating or repressing a single miRNA in one tissue, several other tissues may be affected. A strategy called miR-masking has been designed to combat this effect. This method utilizes an oligonucleotide with perfect complementarity to the miRNA binding site in the target gene, so that it blocks the interaction of the miRNA with that particular gene only.

Another strategy to increase specificity of effects is the use of small molecule inhibitors against specific microRNAs. Azobenzene, for example, has been identified as a specific and efficient inhibitor of miR-21. These methods are promising tools for cancer therapy.

<H3>Strategies to overexpress microRNAs

It may be necessary to overexpress miRNAs in some cases. This is particularly relevant for microRNAs with tumor suppressive roles, where raising expression restores tumor inhibitory functions in the cell. This can be achieved through the use of viral or liposomal delivery mechanisms. Several microRNAs have been introduced into tumor cells in this way, including miR-34, miR-15, miR-16 and let-7. When adeno-associated virus (AAV) vectors are used, toxicity is reduced since AAV vectors do not integrate into the host genome and are eventually eliminated. The non-viral methods of gene transfer include cationic liposome mediated systems. These lipoplexes lack tumor specificity and have relatively low efficiency when compared to viral vectors.

MicroRNA mimics have also been used to increase microRNA expression. These small, chemically modified double-stranded RNA molecules mimic endogenous mature microRNA. These mimics are now commercially available. They do not have vector-based toxicity and are therefore promising tools for therapeutic treatment of tumors.

The description of miRNAs covered in this chapter is far from complete. It is possible that miRNAs will be used in therapeutics in the future. It is also likely that they will be used for the diagnosis and prognostication of diseases. A lot remains to be elucidated about these enigmatic molecules.