Cis Regulatory Regions Controlling Protein Coding Regions Biology Essay

The cis-regulatory regions control the expression of protein-coding regions. Each cis-regulatory region consists mainly of two parts: 1) a promoter region and 2) an enhancer region. There is usually one basal promoter, whereas there can be one or more enhancers. The structure of the cis-regulatory region is similar in all eukaryotes, including Drosophila. Although the promoter is required for transcription, it does not provide any spatiotemporal information. It simply contains binding sites for the TATA-binding protein, the RNA polymerase II protein complex and the associated transcription machinery. The binding of these proteins to promoters is required throughout the genome, because assembly of this protein complex is essential for the production of mRNA. For this reason, the basal promoter sequences and the proteins that bind them are under strong functional constraint. Kohn et al (2004) detected such constraint in some regions of Drosophila as reduced levels of polymorphism and divergence. Brown and Feder (2005) reported that some polymorphisms do nevertheless occur at basal promoters and cause variable gene expression. A gene can have a single promoter or multiple (alternative) promoters, and which one is used depends on cellular conditions. The contribution of alternative promoters to regulatory divergence is still not clearly known. If promoter sequences show little divergence, changes in enhancer sequences are a likely cause of cis-regulatory divergence (Fang and Brennan, 1992). Enhancers largely control when and where mRNA is transcribed. They generally function independently of one another, and many genes have more than one. Each enhancer controls one subset of the gene's expression pattern.
Unlike promoters, enhancers contain binding sites for a specific combination of transcription factor proteins rather than for the general transcription machinery. Once the transcription factors have bound, they interact with one another, and the polymerase protein complex is then assembled at the promoter to activate and sustain transcription. Because enhancers are modular, their activity can be studied with a visible transgenic reporter gene (Barolo et al, 2000). A reporter construct combines a putative cis-regulatory sequence with a basal promoter and an easily visualised reporter gene. The construct is transformed into a host species and the resulting expression pattern is determined. P-element transformation, introduced by Spradling and Rubin in 1982, has been widely used in Drosophila melanogaster, and it has been used to study cis-regulatory sequences from across Diptera. When a heterologous sequence is transferred into D. melanogaster, it is regulated by the host's trans-acting factors. This has advantages and disadvantages. When orthologous cis-regulatory elements are placed in a common trans-regulatory background, their functions can be compared directly. But if the trans-regulators have diverged between the donor and host species, the regulatory activity observed in D. melanogaster may differ from that in the donor. Regulatory evolution could be studied more clearly if cis-regulatory sequences from D. melanogaster could also be transformed into various other species to analyse their activity (Lombardo et al, 2005). Enhancers and basal promoters undergo the same processes of molecular evolution as other sequences in the genome: nucleotide substitution, insertion, deletion, rearrangement, selection and drift (Li, 1997). Studies of polymorphism and divergence in the cis-regulatory regions of Diptera have reached conflicting conclusions (Ludwig and Kreitman, 1995; Kohn et al, 2004).
Each of these studies used different population genetic tests, and there is still no appropriate model of neutral evolution for cis-regulatory regions. Mutations in a cis-regulatory region can affect the phenotype considerably because they can alter gene expression, so the selection coefficient of a cis-regulatory mutation depends on its effect on expression. A mutation that does not alter expression is silent (neutral), whereas one that disrupts cis-regulatory function is deleterious. Transcription factor binding sites are therefore expected to be more constrained than the surrounding sequence. Several studies have challenged this expectation (Emberly et al, 2003; Costas et al, 2004; Phinchongsakuldit et al, 2004; Balhoff and Wray, 2005), finding that rates of nucleotide substitution are similar in binding sites and in adjacent regions. It is still not known whether the adjacent sequences contain unidentified binding sites, or how this divergence affects gene expression. Expression patterns specified by cis-regulatory elements are often conserved between species. When cis-regulatory sequences from other species, within or outside Diptera, are introduced into D. melanogaster as transgenes, they often show the same activity as in their native species. This holds even for distantly related flies such as the house fly (Musca domestica) and the black fly (Simulium vittatum), and for animals outside Diptera (Ludwig et al, 1998; Wittkopp et al, 2002). This evidence was taken to suggest that the sequences must be conserved in order to maintain these activities. However, sequence comparisons of orthologous cis-regulatory elements showed that conserved sequences are interspersed with regions of diverged sequence (Langeland and Carroll, 1993; Ludwig et al, 1998; Wolff et al, 1999).
Comparison with D. melanogaster enhancers was helpful in identifying enhancers in the genomes of other dipterans. Biologists had therefore expected that most cis-regulatory regions could be found by looking for conserved non-coding sequence. But when the complete genome sequence of Drosophila pseudoobscura was analysed computationally, only a small proportion of enhancers were found to be conserved (Richards et al, 2005). This suggests that the conservation of cis-regulatory regions had been overestimated. The enhancers that show remarkable identity with D. melanogaster cis-regulatory regions may simply be ones that evolve more slowly than other sequences. Whatever the degree of sequence divergence, the enhancer activity of cis-regulatory elements is often conserved across Diptera (Wolff et al, 1999; Ludwig et al, 2000). The stripe 2 enhancer of the even-skipped gene illustrates this. In dipterans including Drosophila and Anopheles, the even-skipped (eve) gene encodes a transcription factor involved in embryonic patterning (Goltsev et al, 2004). eve is expressed in seven transverse stripes in the embryo, and this pattern is controlled by five independent enhancers. The eve stripe 2 enhancer of D. melanogaster contains binding sites for five transcription factors: two activators and three repressors. All of these are needed for correct eve expression, and the binding sites are necessary for enhancer activity in D. melanogaster. Sequence variation in this enhancer is comparable to that in other non-coding regions, which is consistent with neutral sequence evolution (Ludwig and Kreitman, 1995).
To study the functional consequences of sequence divergence, Ludwig et al isolated the eve stripe 2 enhancers of D. yakuba, D. erecta and D. pseudoobscura that are orthologous to the D. melanogaster enhancer, and studied their activity using reporter genes in transgenic D. melanogaster. Because the binding sites had diverged, the enhancers shared only limited sequence similarity. Nevertheless, all of the orthologous enhancers drove gene expression in the same pattern as the D. melanogaster enhancer (Ludwig et al, 1998). In 2000, Ludwig et al constructed chimeric enhancers by joining 5' and 3' regions from the D. melanogaster and D. pseudoobscura enhancers, and introduced them into D. melanogaster. The chimeric enhancers did not function correctly, which indicates that compensatory changes must have occurred during evolution: the D. melanogaster and D. pseudoobscura enhancers each accumulated changes along their separate lineages. Unlike the chimeric enhancers, the orthologous enhancers gave the same expression pattern despite differences in the arrangement of their binding sites, which suggests that some form of stabilising selection maintains their function. In a more recent study, the eve stripe 2 enhancers of D. melanogaster, D. erecta, D. yakuba and D. pseudoobscura were tested for their ability to rescue an eve mutant genotype (Ludwig et al, 2005). The D. pseudoobscura and D. yakuba enhancers restored the wild-type phenotype. The D. erecta enhancer could not complement the mutation, even though D. erecta is closely related to D. melanogaster. This suggests that the D. erecta stripe 2 enhancer has diverged in sequence and may require sequences near the region orthologous to the D. melanogaster enhancer in order to function. Alternatively, transcription factor activity may also have diverged.
If function is maintained while transcription factor binding sites diverge, then polymorphisms affecting binding sites should be segregating within populations. Consistent with this, a phylogenetically conserved binding site in an eve enhancer was found to be polymorphic in a natural population (A. Palsson, M. Ludwig and M. Kreitman, personal communication). This indicates that experimentally validated binding sites are not always fixed within a species. A recent study of sea urchin cis-regulatory elements likewise found that clusters of binding sites are polymorphic (Balhoff and Wray, 2005). Although this intraspecific variation changes the enhancer sequence, it is the raw material from which function can be maintained as sequences diverge. This raises the question of why enhancer activity is maintained despite so much sequence divergence. The answer is that the molecular mechanisms that translate cis-regulatory sequence into function allow sequence and function to evolve at different rates. The factors that decouple functional divergence from sequence divergence in cis-regulatory sequences are:

Biochemical properties of transcription factors

Redundant binding sites and enhancers

Change in transcription factor inputs

Co-evolution of transcription factors and their binding sites

1) Biochemical properties of transcription factors

Transcription factors tolerate variation in the sequences they bind, a property known as degeneracy, and many enhancers are also flexible in the arrangement and spacing of their binding sites (Arnone and Davidson, 1997). Degeneracy allows the sequence to evolve without altering enhancer function, because a transcription factor can continue to regulate the enhancer even as its binding sites diverge. A flexible cis-regulatory architecture likewise allows binding sites to be reshuffled without changing function.

2) Redundant binding sites and enhancers

If a mutation occurs in an individual binding site without disrupting enhancer function, because another binding site compensates for it, the cis-regulatory region can be restructured. For example, the spalt and knot genes of Drosophila are repressed in the developing haltere by the Ultrabithorax (Ubx) homeodomain protein. The enhancers of both genes contain multiple Ubx binding sites, so the loss of an individual site has only a minimal effect (Carroll, 2005). Redundant binding sites have also been studied in the eve stripe 2 enhancer, where they were found to promote sequence divergence and the reorganisation of the enhancer. The same property was found in yolk protein genes. These findings were reported by Ludwig et al (2000) for eve stripe 2 and by Piano et al (1999) for the yolk protein genes. Redundancy among enhancer modules similarly allows sequence changes to accumulate without affecting regulation: if one element is altered, a redundant element can compensate for it.

3) Change in transcription factor inputs

This phenomenon is described by a mechanism proposed by True and Haag (2001), called developmental system drift (DSD). DSD is the idea that developmental mechanisms can evolve while their output remains the same, and it can produce enhancers with diverged sequences but conserved output. Embryonic patterning in Anopheles and Drosophila melanogaster is an example: the eve stripes are conserved in both, but the regulation that produces them differs (Goltsev et al, 2004). The cis-regulatory elements of eve in Anopheles and in D. melanogaster are controlled by different transcription factors.

4) Co-evolution of transcription factors and their binding sites

The part of a transcription factor that binds DNA is called the DNA-binding domain. Evolutionary changes in this domain can facilitate the divergence of cis-regulatory sequences. The function of the hunchback (hb) protein during early embryonic development has been observed to be conserved across flies ranging from the house fly to Drosophila (McGregor et al, 2001a). Despite this conservation, the cis-regulatory elements of hunchback have undergone many changes in their primary sequence, which have affected the number and organisation of Bicoid binding sites in particular (McGregor et al, 2001b). This suggests that the DNA-binding domain of the Bicoid transcription factor and the cis-regulatory sequence of the hb enhancer have evolved together: the DNA-binding domain of the Bicoid protein has co-evolved with the hb promoter to maintain the regulatory interaction (Shaw et al, 2002). Although cis-regulatory elements can maintain their function over time, changes in gene expression are nevertheless common among dipteran species. About two million years after D. melanogaster and D. simulans diverged, roughly half of the genes in their genomes show differences in expression level, and it has been proposed that these differences are the result of evolutionary change (Rifkin et al, 2003). The majority of these changes are attributed to functional divergence of cis-regulatory sequences, an idea supported by Wittkopp et al (2004). Cis-regulatory elements that give an altered expression pattern could have evolved de novo, could have arisen by divergence of paralogous enhancers after duplication, or could be modifications of existing enhancers.

In theory, an evolutionarily novel expression pattern can arise de novo. MacArthur and Brookfield (2004) simulated enhancer evolution using a model that incorporated positive selection. Such studies conclude that transcription factor binding sites can arise frequently and become fixed in a population over time. So far, however, there is no empirical evidence for this.

Enhancer function can also change following gene duplication. When a gene is duplicated together with its cis-regulatory sequence, the two copies are initially redundant. Gu et al (2004) proposed that duplicated genes are more likely than single-copy genes to evolve expression differences between Drosophila species. Single-copy genes remain more constrained, whereas when several copies exist one of them is free to change. When paralogous genes were analysed in D. melanogaster, the primary difference between them was found to be a change in expression. It was later found in D. melanogaster that paralogous regulatory elements can differ in cis-regulatory activity even though their sequences remain similar. These evolved differences are tissue-specific and sex-specific.

Gene expression can also diverge through changes to existing cis-regulatory elements. One example is the Alcohol dehydrogenase (Adh) gene, whose spatiotemporal expression pattern differs among Drosophila species. The cause has been found to lie in the cis-regulatory sequences themselves (Dickinson et al, 1984; Fang et al, 1991; Papaceit et al, 2004). Another example is the yellow protein, which is responsible for abdominal pigmentation. Functional divergence of orthologous enhancers changes the expression of yellow among D. melanogaster, D. subobscura and D. virilis (Wittkopp et al, 2002). Changes in the expression of Glucose dehydrogenase (Gld) are likewise caused by cis-regulatory changes (Schiff et al, 1992). Elements that control lineage-specific patterns could also evolve from existing cis-regulatory elements, which have the advantage of already containing binding sites. We now turn to computational prediction. Computational methods can be applied to cis-regulatory elements to study the relationship between sequence and function. Current computational methods fall into two groups:

1) Phylogenetic Footprinting

Phylogenetic footprinting compares sequences between species and identifies cis-regulatory sequences as regions of sequence conservation (Moses et al, 2004).
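
The idea can be illustrated with a minimal Python sketch. It is not the method of Moses et al (2004); it only assumes that two orthologous sequences have already been aligned to the same length (with "-" for gaps) and reports stretches where a sliding window reaches a chosen identity. The function names, window size, threshold and toy sequences are all illustrative assumptions.

```python
def window_identity(a, b):
    """Fraction of aligned positions that match (gap characters never match)."""
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def conserved_blocks(seq1, seq2, window=10, min_identity=0.9):
    """Return (start, end) half-open intervals of conserved sequence.

    seq1 and seq2 must already be aligned, so they have equal length.
    Overlapping or adjacent conserved windows are merged into one block.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    blocks = []
    for start in range(len(seq1) - window + 1):
        end = start + window
        if window_identity(seq1[start:end], seq2[start:end]) >= min_identity:
            if blocks and start <= blocks[-1][1]:
                # Extend the previous block instead of opening a new one.
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks


# Toy aligned sequences (invented, not real enhancer data):
# the first ten positions are identical, the last ten differ.
species_a = "AAAAAAAAAACCCCCCCCCC"
species_b = "AAAAAAAAAAGGGGGGGGGG"
print(conserved_blocks(species_a, species_b, window=5, min_identity=1.0))  # [(0, 10)]
```

A real analysis would start from a proper alignment and would compare conservation against a background rate, but the sketch shows why an enhancer whose binding sites have been reshuffled can escape detection: no window reaches the identity threshold.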

2) Motif Detection

In this method, statistical models are used to recognise the binding sites of transcription factors, including sites that are shared among co-regulated genes. Motif detection was proposed by Markstein et al, 2002. The main disadvantage of phylogenetic footprinting is that it misses some cis-regulatory elements, namely those whose sequence has diverged, while sequence can also be conserved in regions that are not cis-regulatory. Motif detection algorithms work differently: they search for binding sites that were determined experimentally beforehand, and can therefore identify new enhancers without first checking for sequence conservation (Berman et al, 2004). This approach is limited to finding enhancers used by transcription factors whose binding sites are already known. The two approaches can also be combined, which increases accuracy.
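
One common statistical model for this kind of search is a position weight matrix. The sketch below is a generic illustration and is not the algorithm of Markstein et al (2002) or Berman et al (2004). It assumes a small set of equal-length known sites, a uniform background of 0.25 per base and a pseudocount of 1; the site sequences, the scanned sequence and the score threshold are invented for the example.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length binding sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        # log2 of (smoothed base frequency / background frequency)
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, site, score) for every window scoring at least threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(ch not in BASES for ch in window):
            continue  # skip windows containing gaps or ambiguous bases
        score = sum(pwm[i][ch] for i, ch in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


# Hypothetical, hand-made example sites (not taken from any database).
known_sites = ["TAATCC", "TAATCC", "TAATCT", "TAAGCC"]
pwm = build_pwm(known_sites)
print(scan("GGGGTAATCCGGGG", pwm, threshold=5.0))  # [(4, 'TAATCC', 7.29)]
```

Because the matrix is built only from known sites, the scan cannot find sites for transcription factors that have not been characterised, which is the limitation described above. Clusters of hits within a short region are what would mark a candidate enhancer.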

In our project we look for enhancers with unknown binding sites, or for the unknown binding sites themselves. Computational approaches exist for studying cis-regulatory evolution and can support, for example, the study of enhancers as well as biochemical and genetic studies. Thus, even though computational approaches are available, transgenic tools remain essential.