The central dogma of molecular biology stipulates that DNA gives rise to RNA which in turn gives rise to protein. It follows that a primary process in the molecular biology of the cell is production of RNA from DNA, i.e. the process of transcription. The failure of this process to occur, or to decode faithfully the information encoded in the DNA, would render the cell non-viable. In most eukaryotic cells, most genes are regulated (during development and differentiation, in response to specific cellular signals), at least in part, at the level of transcription initiation. Transcription and its regulation are fundamentally essential processes in the cell.
The process of transcription involves polymerisation of nucleoside triphosphates into RNA, in a DNA template-dependent manner, and in eukaryotes it is catalysed by three RNA polymerase enzymes: RNA polymerase I, RNA polymerase II and RNA polymerase III. RNA polymerase I accounts for the largest share of polymerase activity in the cell, followed by polymerase II, while polymerase III accounts for the smallest.
RNA polymerases are huge multi-component complexes of over 10 protein subunits and are around 500 kDa in size. There is considerable relatedness among the three eukaryotic RNA polymerases, and between them and the prokaryotic E. coli RNA polymerase, especially in the largest and second largest subunits. RNA polymerase II differs from the others in that the largest subunit has a carboxy terminal extension called the carboxy terminal domain (CTD). The CTD contains a highly repeated heptapeptide (consensus Tyr-Ser-Pro-Thr-Ser-Pro-Ser) which can be heavily phosphorylated. This phosphorylated domain is essential for transcription by RNA polymerase II in most eukaryotes, and it also links the processes of transcription and RNA processing.
Initiation, elongation and termination
RNA polymerases alone cannot recognise an initiation site in the genome. This requires the ordered formation of a stable transcription initiation complex composed of the RNA polymerase complex and another protein complex of a similar size, the pre-initiation complex, consisting of a number of general transcription factors (see below). In addition, experiments in yeast have shown that a further "Mediator" complex associates with the CTD of RNA polymerase II, and this 60-70 protein complex, termed the holoenzyme complex, binds to the promoter and locates the polymerase to the gene's initiation site. At this stage the CTD is unphosphorylated; however, upon transcription initiation the holoenzyme complex dissociates in part, the CTD of the RNA polymerase becomes highly phosphorylated and some new transcription factors associate with it. This new elongation complex transcribes RNA processively along the gene at a rate of up to 2000 nucleotides per minute. At the 3' end of the gene the elongation complex transcribes through the polyadenylation site and terminates at some point downstream, where it dissociates from the DNA. There are defined termination signals for RNA polymerases I and III, but the site(s) and mechanism of termination for RNA polymerase II have not yet been worked out. However, it is now clear that termination is linked with formation of the 3' end of the transcript via the CTD. The process of termination is important because it prevents the elongation complex from travelling into the promoter of the next gene downstream, which would probably result in decreased transcription initiation for that gene.
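To give a feel for the elongation rate quoted above, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name and the example gene lengths are hypothetical, and because 2000 nucleotides per minute is an upper figure, the result is a lower bound on elongation time rather than a measured value.

```python
def min_elongation_minutes(gene_length_nt, rate_nt_per_min=2000):
    """Shortest time (in minutes) needed to transcribe a gene.

    rate_nt_per_min defaults to the 'up to 2000 nucleotides per minute'
    figure from the text, so the value returned is a lower bound.
    """
    if gene_length_nt < 0:
        raise ValueError("gene length cannot be negative")
    if rate_nt_per_min <= 0:
        raise ValueError("elongation rate must be positive")
    return gene_length_nt / rate_nt_per_min


if __name__ == "__main__":
    # Hypothetical gene lengths, chosen only to illustrate the scale.
    for length in (2_000, 10_000, 100_000):
        minutes = min_elongation_minutes(length)
        print(f"{length} nt: at least {minutes:g} min")
```

On these assumptions, a transcription unit of 100,000 nucleotides would occupy a polymerase for at least 50 minutes, which illustrates why processive elongation matters for long genes.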
The basic transcription machinery
One key general transcription factor common to all three RNA polymerase complexes is TATA binding protein, or TBP. TBP is a 38 kDa saddle-shaped monomer which can contact TATA-containing DNA in the minor groove and bend it severely. TBP therefore changes the conformation of the DNA and this is thought to facilitate transcription factor binding. It is highly conserved from yeast to mammals and is evolutionarily ancient, a related protein being found in archaebacteria. TBP presents a wide outer surface for simultaneous binding of a number of TBP-associated factors (TAFs) and the complexes these form are the positioning factors for RNA polymerase. TAFs appear to regulate the activity of TBP and together they determine the specificity of polymerase binding to promoters.
RNA polymerase II
Formation of the stable transcription initiation complex on the promoter of a gene is directed by a DNA sequence with the consensus TATA(A/T)A(A/T), around 30 nucleotides upstream of the transcription initiation site of the gene, the so-called TATA box. Mutations in the TATA box sequence have a striking effect on transcription; few mutations are tolerated, demonstrating that this is a crucial sequence. The primary transcription complex, TFIID, consisting of up to 12 different TAFs, forms on TBP bound to the TATA box. It directs RNA polymerase II to the correct transcription initiation site and acts as a focus for formation of the stable transcription initiation complex, a huge multi-protein structure which is assembled in a highly ordered manner on TFIID.
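The idea of a consensus sequence at a fixed distance from the start site can be illustrated with a short Python sketch. It assumes the commonly cited TATA box consensus TATA(A/T)A(A/T) and an arbitrary search window of 40 to 20 nucleotides upstream of the start site; the function name, the window and the test sequence are all hypothetical. An exact match to a consensus is also a much cruder test than real TBP binding, which tolerates some variation.

```python
import re

# Assumed consensus TATA(A/T)A(A/T), written as a regular expression.
# The lookahead lets overlapping candidate sites be reported as well.
TATA_BOX = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(promoter, tss_index, window=(-40, -20)):
    """Return (offset, sequence) pairs for TATA-like sites near the start site.

    tss_index is the 0-based index of the +1 nucleotide in `promoter`.
    The offset is the index of the first base of the match minus tss_index,
    so -30 means the site begins 30 nucleotides upstream of +1.
    """
    hits = []
    for match in TATA_BOX.finditer(promoter.upper()):
        offset = match.start() - tss_index
        if window[0] <= offset <= window[1]:
            hits.append((offset, match.group(1)))
    return hits


if __name__ == "__main__":
    # Made-up sequence: a TATA box 30 nt upstream of an 'A' at +1.
    seq = "G" * 20 + "TATAAAA" + "G" * 23 + "A" + "G" * 10
    print(find_tata_boxes(seq, tss_index=50))
```

A promoter lacking any match in the window simply returns an empty list.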
RNA polymerase I
Promoters for RNA polymerase I contain two important regulatory elements: a core promoter extending from around 20 nucleotides downstream of the transcription initiation site to around 40 nucleotides upstream, and an upstream control element (UCE) situated some 100 nucleotides upstream of the transcription initiation site. The first step in formation of the polymerase I transcription initiation complex is binding of two molecules of Upstream Binding Factor (UBF), one to the UCE and the other to the core promoter element. Interaction between the two UBFs bound to the promoter results in looping out of the intervening promoter DNA and provides a promoter structure that, in humans, can be recognised and bound by the core promoter element-binding factor, selectivity factor 1 (SL1). SL1 is a multi-subunit protein composed of TBP and three TBP-associated factors, to which RNA polymerase I binds to complete the transcription initiation complex.
RNA polymerase III
The promoters for genes transcribed by RNA polymerase III differ from those for Pol I and Pol II in that the essential promoter elements are located downstream of the transcription initiation site, within the coding region of the gene. For tRNA genes a large multi-subunit protein, TFIIIC, binds with high affinity to the so-called "B-box" within the gene, and with lower affinity to the "A-box" upstream. TFIIIC acts as an assembly site for recruitment of the TFIIIB trimeric complex (containing TBP), which has no distinct sequence requirement for binding. The TFIIIB/C complex is capable of binding RNA polymerase III and initiating transcription. For 5S RNA there is only one promoter box within the gene, the "C-box", which binds TFIIIA. This factor binds TFIIIC, as for the tRNA genes, and positions it at a similar distance with respect to the transcription initiation site. TFIIIB can then bind, followed by RNA polymerase III as before.
Promoter elements for RNA polymerase II
Not all promoters contain TATA box sequences. In Drosophila, roughly half of all core promoters contain a TATA box 25-30 nucleotides upstream of the transcription start site combined with an initiator element (Inr) overlapping the start site. Inr elements have the consensus sequence YYA(+1)N(T/A)YY, where Y denotes a pyrimidine (C or T), N denotes any nucleotide and +1 denotes the transcription initiation nucleotide. The other half contain an Inr element combined with a downstream promoter element (DPE) which is located around 30 nucleotides downstream of the start site. All three of these elements act as recognition sites for subunits of TFIID, which contains the TATA-binding protein (TBP) and several TBP-associated factors. In mammals, core promoter structures appear to be even more diverse. Fewer data are available as yet; however, it is clear that a smaller percentage of mammalian promoters than Drosophila promoters contain TATA boxes. TATA boxes can operate on their own; a smaller percentage act in concert with Inr elements. DPEs exist, but whether they work in combination with other core promoter elements has not yet been worked out. Finally, many promoters seem to lack all three of these core elements. Examples are promoters driving a low rate of transcription, and those for genes encoding enzymes of intermediary metabolism. In this case it seems that a GC-rich sequence in the promoter, 20-50 nucleotides 5' of the initiation site, binds the transcription factor Sp1 (see below), and a tethering factor links Sp1 and TBP, providing a pseudopromoter structure for transcription initiation.
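The Y/N shorthand used for the Inr consensus maps directly onto a regular expression, which the following Python sketch illustrates. It assumes the commonly cited Inr consensus YYA(+1)N(T/A)YY, with the A at position +1; the function name and the example sequences are hypothetical, and a strict consensus match is only a rough stand-in for recognition by TFIID.

```python
import re

# Assumed Inr consensus YYA(+1)N(T/A)YY expanded into character classes:
# Y = pyrimidine [CT], N = any nucleotide [ACGT], T/A = [AT].
INR = re.compile(r"[CT][CT]A[ACGT][AT][CT][CT]")


def has_inr_at_start_site(seq, tss_index):
    """Return True if an Inr consensus sits with its A on the +1 nucleotide.

    tss_index is the 0-based index of the +1 nucleotide, so the 7-nt
    element would span tss_index - 2 to tss_index + 4 inclusive.
    """
    start = tss_index - 2
    if start < 0:
        return False  # not enough upstream sequence to hold the element
    candidate = seq.upper()[start:start + 7]
    return INR.fullmatch(candidate) is not None


if __name__ == "__main__":
    # Made-up sequences: the first carries TCAGTCT with its A at index 4.
    print(has_inr_at_start_site("GGTCAGTCTGG", tss_index=4))
    print(has_inr_at_start_site("GGGGAGGGGGG", tss_index=4))
```

Combining a check like this with a TATA box search would be one way to sort promoters into the TATA-plus-Inr and Inr-only classes described above, although the DPE would need its own consensus, which is not given here.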
TFIID consists of TBP and a number of other proteins known as TBP-associated factors (TAFs) which appear to allow TFIID to respond to stimulation by transcriptional activators. The other major general transcription factors are shown in the table below.