Advancement of the field of biology

Leaps in Recombinant DNA Technology due to studies in Gene Expression and Regulation

The 21st century shows great promise for the advancement of the field of Biology. The application of technology to the field has not only allowed for new, fast methods of detection and analysis but has also facilitated the large-scale accumulation and sharing of data.

Molecular biology revolves around the study of genes, their expression and regulation. The central dogma of molecular biology describes the detailed, residue-by-residue transfer of sequential information. This information, however, cannot be transferred from a protein back to nucleic acids. It is represented by four major steps: DNA replication, transcription of DNA to mRNA, mRNA processing and translation of mRNA to a polypeptide. (Crick F. - 1970)
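The flow of information from DNA to mRNA to polypeptide can be illustrated with a minimal Python sketch. It assumes the standard genetic code, a coding-strand DNA input and no mRNA processing; the function names and the example sequence are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, built in U, C, A, G order ('*' marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA that matches a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")   # hypothetical 3-codon gene
protein = translate(mrna)        # one-letter amino acid codes
```

The sketch only moves information forward (DNA to RNA to protein); there is deliberately no function that goes from protein back to nucleic acid, mirroring the statement of the dogma.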

Regulation can take place at multiple stages, such as transcription, RNA processing and transport, translation, mRNA degradation, protein degradation and control of protein activity. In most cases, the most effective steps to regulate are transcription and post-transcriptional modifications. (Choudhuri S. - 2004)

Regulation is of two types. Blocking of transcription by the binding of a repressor protein is termed 'Negative Regulation'. The repressor is regulated further by the binding of small "effector" molecules, and the repressor-effector interaction results in an increase or decrease of transcription. Activator proteins, on the other hand, enhance the activity of RNA polymerase when they bind near the promoter region, and this is called 'Positive Regulation'. As observed with repressors, activators also require the binding of signal molecules to help them bind to their DNA site, which is upstream of the promoter. If the organism does not require the target protein, turning off the gene altogether saves energy. These regulation mechanisms are vital in cell development, where turning genes on and off allows the cells to differentiate and carry out specialized functions. (Nelson and Cox - 2004)

Gene expression regulation in bacteria can be understood by two primary systems: 'lac operon' and 'trp operon'.

The lac operon is a gene sequence in Escherichia coli, responsible for lactose metabolism. This regulation mechanism was first discovered by Jacob and Monod in 1960, and is applicable to many different genes.

Four genes are involved. The lacI gene codes for the lac repressor protein. The lacZ gene codes for β-galactosidase, a lactose-hydrolysing enzyme. The lacY gene codes for galactoside permease, a membrane-bound transport protein that pumps lactose into the cell. The lacA gene codes for thiogalactoside transacetylase, an enzyme that transfers acetyl groups to galactosides. The system has two promoter regions where RNA polymerase can bind: P1 is upstream of the lacI gene, and the other, P, is upstream of an operator sequence adjacent to the lacZ gene. The operon is 'off' by default and is turned on under certain environmental conditions by 'allolactose', which acts as an inducer molecule. The 'off' configuration is maintained in the absence of lactose (for example, when a good carbon source such as glucose is being used) by the constant binding of the repressor protein to the operator site, which does not let RNA polymerase bind to the promoter for transcription. In the absence of glucose, E. coli can use an alternative carbon source, i.e. lactose. Allolactose, which is a lactose metabolite, binds to the operator-bound repressor and causes a conformational change in the protein, so that it can no longer bind to the operator. RNA polymerase is now able to transcribe the genes and produce the required proteins. (Jacob and Monod - 1961)

Another way in which this is controlled is by the presence of an activator molecule, Cyclic Adenosine Monophosphate (cAMP). Low glucose levels correspond to low levels of ATP in the organism, which in turn correspond to high levels of cAMP. The binding of cAMP to a cAMP Receptor Protein (CRP) allows the complex to bind to a CRP site, upstream of promoter P. This cAMP-CRP complex promotes binding of RNA polymerase to promoter P and thus increases the rate of transcription of the genes.

Conversely, when glucose levels are high, large amounts of ATP are synthesized, which in turn results in low levels of cAMP. Without bound activator (cAMP-CRP), there is poor transcription of the gene, even when lactose is present. (Jacob and Monod - 1961)
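The combined effect of the repressor (negative regulation) and the cAMP-CRP complex (positive regulation) can be summarised as a small truth table. The Python sketch below is a qualitative simplification: it treats lactose and glucose as simply present or absent, and the function name and the labels "off", "low" and "high" are assumptions made for illustration.

```python
def lac_operon_transcription(lactose_present, glucose_present):
    """Return a qualitative transcription level for the lac structural genes."""
    # Negative regulation: without lactose there is no allolactose,
    # so the repressor stays bound to the operator.
    repressor_bound = not lactose_present
    # Positive regulation: cAMP is high only when glucose is low,
    # and the cAMP-CRP complex then helps RNA polymerase bind promoter P.
    camp_crp_bound = not glucose_present

    if repressor_bound:
        return "off"
    if camp_crp_bound:
        return "high"
    return "low"  # lactose present but glucose still available


# Print the level for every combination of the two sugars.
for lactose in (False, True):
    for glucose in (False, True):
        level = lac_operon_transcription(lactose, glucose)
        print(f"lactose={lactose}, glucose={glucose}: {level}")
```

Strong transcription therefore needs both conditions at once: lactose to remove the repressor and low glucose to supply the activator.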

Another well-studied gene regulation mechanism is the trp operon, which is found in many prokaryotic organisms. It was first studied in E. coli by Morse and colleagues in 1969. The trp operon in E. coli codes for 5 enzymes that are required to convert chorismate to tryptophan. When present, tryptophan binds to the trp repressor molecule, causing a change in conformation which allows it to bind to the trp operator. The operator lies adjacent to the promoter site, and the repressor binds in such a way that RNA polymerase cannot transcribe the genes. When tryptophan levels fall below a critical level, the repressor is released and transcription takes place. The rate of transcription of these enzymes depends on the concentration of available tryptophan and is further regulated by a 'transcription attenuation' process. (Nelson and Cox - 2004)

Gene regulation mechanisms such as these have been taken advantage of in the field of Recombinant DNA Technology. Recombinant DNA is DNA that does not naturally exist but is artificially synthesized by joining DNA sequences that do not necessarily have a common origin or even occur together. Introduction of a foreign gene into a host is done for the expression of a specific trait, commonly in the form of a protein (Berg et al. - 2007). This technique was first successfully carried out by Cohen, Chang, Boyer and Helling in 1973.

This technology has become a part of almost every biological field, is essential for scientific development and has various applications in society. In the Food Industry, recombinant DNA has been used to increase the quality and efficiency of strains of fermenting organisms such as yeast, which is very important for breweries and bakeries. In the Agriculture Industry, Genetically Modified Plants have been produced that are more resilient against drought, nutrient deficiencies, and fungal and bacterial diseases. It has been used to produce less environmentally hazardous insecticides and pesticides, and also to produce growth promoters and plant hormones as fertilizers. It has, however, had the most impact on the field of Medicine, where it is used for the large-scale production of proteins for disease diagnosis, vaccines for prevention, and treatments in the form of drugs and antibiotics. Recombinant human proteins such as Insulin, Human Growth Factors, Somatostatin, Cytokines, Interleukins and Interferons are a few of the products that have greatly improved the quality of medical treatment today (CMR Institute - 2004). A number of discoveries and novel product syntheses have taken place with the help of this technology in Research and Development. Therefore, the process of production of such products must be carefully designed and stringently monitored.

The process to produce recombinant proteins is multi-staged, with vigilant screening and verification at every step. There are three factors that greatly influence the expression of a protein: the expression vector, the host cell and the growth conditions. Altering any one of these factors can greatly influence the expression of your target protein (Novagen - 2002). I am now going to consider the challenge of expressing a eukaryotic protein in a prokaryotic host.

STEP 1: Isolating your target gene of interest.

Ideally, when trying to express a particular protein, the location of its coding gene and its sequence should be known. However, this is not always the case.

If the gene sequence of the target eukaryotic protein is known, chemically synthesized tagged probes complementary to the gene of interest are usually used to screen the DNA library of the organism. A DNA library is a complete set of fragments which represent the entire genome of the organism (Old, Primrose & Twyman - 2001). The target fragment is then identified using techniques such as the 'Southern Blot', developed by Edwin Southern in 1975. In the case of eukaryotes, however, we must remember that the genome is largely non-coding: in Humans, only about 1-2% of the genome codes for protein, and genes are interrupted by introns (non-coding regions that are transcribed but removed from the mRNA). These introns are removed prior to translation by a 'splicing mechanism' which joins together the coding sequences to give an mRNA that encodes a functional protein. This is a post-transcriptional modification that takes place in a eukaryotic cell. Prokaryotic genomes, on the other hand, are largely coding and their genes generally lack introns, so prokaryotes are not capable of such modifications (Nelson and Cox - 2004). The solution to this is the use of cDNA libraries, which are made by reverse transcription of the total mRNAs produced by a cell. These will lack the sequences for introns. Cloning of such DNA into a prokaryotic vector allows successful expression of the protein (Old, Primrose & Twyman - 2001). These isolated fragments then need to be prepared for insertion into an appropriate vector. One of the steps involved in this is amplification of the fragment using 'Polymerase Chain Reaction' (PCR) techniques, developed by Kary Mullis in 1984. The choice of vector will depend on the size of your gene insert and the host in which it is to be expressed.
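The reason a cDNA copy can be expressed in a prokaryote, while the genomic copy cannot, is that splicing has already removed the introns. A minimal Python sketch of this idea follows; the pre-mRNA sequence, the exon coordinates and the function names are hypothetical, and real reverse transcription first produces the complementary strand rather than the coding strand shown here.

```python
def splice(pre_mrna, exons):
    """Join exon segments given as (start, end) half-open index pairs."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


def cdna_coding_strand(mature_mrna):
    """Return the DNA version of a mature mRNA (U replaced by T).

    Simplification: a real cDNA is double-stranded, and its first strand
    is the complement of the mRNA; only the coding strand is shown here.
    """
    return mature_mrna.replace("U", "T")


# Hypothetical transcript: exon 1, one intron, exon 2.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
exons = [(0, 6), (18, 24)]

mature_mrna = splice(pre_mrna, exons)     # intron removed
cdna = cdna_coding_strand(mature_mrna)    # intron-free DNA for cloning
```

Because `cdna` contains only exon sequence, a bacterial host that cannot splice can still read it as one continuous open reading frame.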

STEP 2: Selection of an appropriate vector

There are four main types of vectors commonly used: bacterial plasmids, bacteriophages or viruses, cosmids (plasmids with a cos site) and artificial chromosomes such as the Yeast Artificial Chromosome (YAC). Selecting a particular vector depends on the desired host cell for growth, the gene insert size and the copy number. In the case of a prokaryotic host such as bacteria, plasmids (natural or recombinant) are a good choice as they have a high copy number and are easy to extract, handle and manipulate.

Usually, transcription vectors are used to code for prokaryotic proteins as they carry compatible Ribosome Binding Sites. Translation vectors are good for the expression of eukaryotic proteins. However, because we have to consider transcription, translation and their respective modifications, the vectors needed are more complex. (Rapley - 2000)

The vector needs to have a high copy number to ensure adequate expression. The vector must not produce endonucleases, as they recognize the eukaryotic recombinant DNA as foreign and degrade it (Old, Primrose & Twyman - 2001). The vector also needs unique restriction endonuclease cut sites. Restriction endonucleases are enzymes that cleave double-stranded DNA at specific nucleotide sequences (called restriction sites). They occur naturally in bacteria as a defense mechanism (Smith and Wilcox - 1970). In plasmids, these sites need to be unique to prevent the loss of vector DNA fragments when the plasmid is cut. Both the vector and the isolated gene of interest need to be cut with the same restriction enzyme, so that their free ends can pair by complementary base-pairing and then be joined by the enzyme DNA ligase. DNA ligases are enzymes that link two double-stranded DNA fragments together by repairing the broken phosphodiester linkage between two adjacent nucleotides (Olivera & Lehman - 1967). If the target gene does not have the required restriction sites flanking it, these can be chemically produced and added to the free ends of the fragment in the form of 'Adapters' or 'Linkers'.
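The requirement for a unique cut site can be checked computationally before any bench work. The Python sketch below uses the EcoRI recognition sequence GAATTC (cut between G and AATTC) purely as an example enzyme, which is an assumption not taken from the text above; the sequences are hypothetical, and the DNA is treated as linear and single-stranded for simplicity, whereas a real plasmid is circular and double-stranded.

```python
def find_sites(dna, site):
    """Return the start index of every occurrence of a recognition site."""
    positions = []
    index = dna.find(site)
    while index != -1:
        positions.append(index)
        index = dna.find(site, index + 1)
    return positions


def has_unique_site(dna, site):
    """True when the enzyme would cut this sequence exactly once."""
    return len(find_sites(dna, site)) == 1


def digest(dna, site, cut_offset):
    """Cut a linear sequence at each site, cut_offset bases into the site."""
    fragments = []
    previous = 0
    for index in find_sites(dna, site):
        cut = index + cut_offset
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


ECORI_SITE = "GAATTC"    # example enzyme, assumed for illustration
ECORI_CUT_OFFSET = 1     # EcoRI cuts after the first G: G^AATTC

vector = "AAGAATTCTT"    # hypothetical vector fragment
fragments = digest(vector, ECORI_SITE, ECORI_CUT_OFFSET)
```

Cutting the vector and the insert with the same enzyme leaves matching AATT overhangs on both, which is what allows the ends to base-pair before ligation.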

[Figure: Linkers and adapters (Old, Primrose & Twyman - 2001; page 39)]

Recently, in 1994, Shuman described a new approach to incorporate a foreign gene into a vector without the use of DNA ligase. He described a single enzyme, vaccinia DNA topoisomerase, which has the dual role of cutting as well as joining dsDNA (Old, Primrose & Twyman - 2001).

The vector also needs to have Antibiotic Resistance Genes. Antibiotic resistance is a property conferred on the prokaryote by genes on its plasmid DNA (Cohen et al. - 1972). Each antibiotic-producing species protects itself from the action of its own antibiotic by coding for resistance genes. Thus, by using a vector that has a known antibiotic resistance gene on it, the transformed host cells can be selected for by growing the cells in a medium containing the antibiotic. Only the cells containing the recombinant gene and the resistance gene will survive. Thus antibiotic resistance acts as a good marker.

Another screening mechanism, which makes use of the lacZ region of the lac operon, is α-complementation: the screening of bacterial colonies using X-Gal and IPTG. Here, the β-galactosidase gene is split into two parts: the fragment coding for the C-terminal portion is carried by the host cell, while the fragment coding for the N-terminal portion is incorporated into the recombinant vector, giving a fusion protein with the gene of interest (Sambrook and Russell - 2001). To check for successful transformation of the plasmid into the host, the cells are grown on a medium containing X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside). This is a colourless compound that forms a blue indolyl derivative on cleavage by β-galactosidase. The host cells containing the plasmid have both fragments of the lacZ gene and thus express an active β-galactosidase enzyme, giving blue colonies on X-Gal, while those that have only part of the gene give white colonies (Old, Primrose & Twyman - 2001).

An important characteristic to consider is the ribosome binding site carried by the vector. For prokaryotes, the sequence required is the Shine-Dalgarno sequence, AGGAGG, which has to be present on the target mRNA as a ribosome binding site. This 6-base sequence is present upstream of the start codon (Shine & Dalgarno - 1975). The 16S rRNA of the small ribosomal subunit binds to it and facilitates translation. For eukaryotes, however, this sequence is different and is known as the Kozak sequence.

It is therefore essential that, for a eukaryotic mRNA to be translated into a protein in a prokaryote, this prokaryote-specific ribosome binding sequence is included in the construct.
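Whether a designed transcript carries this prokaryotic signal can be checked with a simple search for AGGAGG upstream of the start codon. In the Python sketch below, the 20-base search window, the use of the first AUG as the start codon, the function name and the example sequences are all assumptions made for illustration.

```python
SHINE_DALGARNO = "AGGAGG"


def has_ribosome_binding_site(mrna, window=20):
    """True if AGGAGG lies within `window` bases upstream of the first AUG."""
    start = mrna.find("AUG")
    if start == -1:
        return False  # no start codon found at all
    upstream = mrna[max(0, start - window):start]
    return SHINE_DALGARNO in upstream


# Hypothetical transcripts with and without the prokaryotic signal.
with_site = "UUAGGAGGUUUUUUAUGAAA"
without_site = "CCCCCCCCAUGGCU"

print(has_ribosome_binding_site(with_site))
print(has_ribosome_binding_site(without_site))
```

A eukaryotic cDNA would normally fail this check, which is why the expression vector has to supply the sequence ahead of the cloning site.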

The purpose of using an expression vector is to increase the quantity of protein produced. This can be facilitated by using a strong promoter such as the trp operon promoter. A strong promoter would mean an increased rate of RNA polymerase binding, which would give more mRNA and thus more protein.

Using an inducible promoter such as the lac promoter P upstream of your gene of interest means you can choose when you want your protein to be expressed. The gene can be turned on using an inducer like Isopropyl β-D-1-thiogalactopyranoside (IPTG), which is often used to mimic allolactose. This is very useful for preventing host cell lysis due to overproduction of recombinant protein (Molecular Station - 2006).

BL21(DE3) is an excellent and very commonly used E. coli strain for protein expression, since it lacks the lon and ompT proteases. It enables high protein expression. It codes for a T7 RNA polymerase, which binds to a T7 promoter. The expression of this enzyme is under the control of a lacUV5 promoter, and is further regulated by the lac repressor protein, which binds at this promoter and prevents expression of the polymerase. This repression is released in the presence of IPTG, whose binding to the repressor protein causes a conformational change. The T7 polymerase is then expressed. This level of regulation is necessary to prevent host cell death by production of toxic proteins. Excessive activity of the T7 RNA polymerase is controlled by an inhibitory interaction with T7 lysozyme. (Doherty et al. - 1995)

Novagen in 2002 introduced 'Rosetta Technology'. This was the first system to supply 6 rare tRNAs in a single strain, thus making translation "universal" (Novagen - 2002). When a eukaryotic gene sequence is expressed in a prokaryote, there is a discrepancy in how frequently the different triplet codons for the same amino acid are used. Some eukaryotic codons do not have sufficient complementary tRNA counterparts in their prokaryotic hosts, making the translation process difficult. This can be overcome by optimizing codon sequences and also by using Rosetta competent cells. These cells enhance the expression of eukaryotic proteins that are coded for by codons not commonly used in E. coli. The plasmid coding for these tRNAs carries a chloramphenicol resistance gene, which acts as a marker gene. As with BL21(DE3), there are Rosetta cells available that express the tRNA genes on a plasmid which also codes for T7 lysozyme (an inhibitor of T7 RNA polymerase) and the lac repressor. They follow the same mechanism of induction in the presence of IPTG. The absence of the lon and ompT proteases ensures good protein expression and stability. (Novagen - 2002)
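The codon usage problem can be estimated before choosing a host by counting how many codons in the target sequence are rare in E. coli. The Python sketch below is only illustrative: the set of six rare codons (AGG, AGA, AUA, CUA, CCC and GGA, written here as DNA) is an assumption that is not listed in the text above, and the function names and example sequence are hypothetical.

```python
# Codons assumed to be rare in E. coli (DNA form); illustrative only.
RARE_CODONS = {"AGG", "AGA", "ATA", "CTA", "CCC", "GGA"}


def split_codons(cds):
    """Split a coding sequence into complete triplets, dropping any remainder."""
    cds = cds.upper()
    usable_length = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable_length, 3)]


def rare_codon_positions(cds):
    """Return the codon indices (0-based) that use an assumed-rare codon."""
    return [
        index
        for index, codon in enumerate(split_codons(cds))
        if codon in RARE_CODONS
    ]


def rare_codon_fraction(cds):
    """Fraction of codons in the sequence that are in the rare set."""
    codons = split_codons(cds)
    if not codons:
        return 0.0
    return len(rare_codon_positions(cds)) / len(codons)


gene = "ATGAGGAGACTGTAA"   # hypothetical 5-codon coding sequence
positions = rare_codon_positions(gene)
fraction = rare_codon_fraction(gene)
```

A high fraction, or several rare codons in a row, would suggest either rewriting those codons (codon optimization) or using a host strain that supplies the missing tRNAs.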

The recombinant vectors are then 'transformed' into specific host cells whose properties will complement the increased protein expression.

There are various modes by which transformation can be achieved. Some techniques commonly used are heat shock, electroporation and treatment with salts of divalent cations such as CaCl2.

STEP 3: Selecting a host organism

The choice of a host system affects the time, cost, quantity and quality of expression of your gene of interest. The advantage of a prokaryotic host such as E. coli is that, compared to eukaryotic hosts such as yeast cells or mammalian cells, it can be cultured very easily. Prokaryotes can be grown on a small scale (in a petri dish) as well as on a large scale (industrial fermentors), and the scaling-up is a simple process. One of the biggest advantages in using prokaryotes is the time saved and the fact that results can be obtained very quickly. This is due to the short doubling time of these organisms. E. coli, for example, has a doubling time of 20 minutes under optimum conditions. A further advantage of E. coli is the fact that numerous recombinant plasmids have already been engineered for it.
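The effect of a 20-minute doubling time is easier to appreciate with a short calculation. The Python sketch below assumes ideal, uninterrupted exponential growth (no lag phase and no nutrient limitation), which real cultures only approximate; the function names and the target of one billion cells are chosen for illustration.

```python
import math


def cells_after(initial_cells, minutes, doubling_time=20):
    """Cell count after `minutes` of ideal exponential growth."""
    return initial_cells * 2 ** (minutes / doubling_time)


def hours_to_reach(target_cells, initial_cells=1, doubling_time=20):
    """Hours of ideal growth needed to go from initial_cells to target_cells."""
    doublings = math.log2(target_cells / initial_cells)
    return doublings * doubling_time / 60


one_hour = cells_after(1, 60)          # three doublings from a single cell
to_a_billion = hours_to_reach(1e9)     # roughly ten hours under these assumptions
```

Under these idealised assumptions a single cell gives rise to about a billion cells in roughly ten hours, which is why results from a bacterial culture are typically available the next day.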

Nowadays, companies custom-make oligonucleotide sequences to order. They can modify existing plasmids to suit your specific need and optimize codons in the target sequence specifically for your choice of host. They can even insert the synthesized fragment into the vector and deliver it to you. You only need to know your protein of interest and they take care of the rest.

We can see that the gene selection and cloning process, though highly specific and tedious, is revolutionary and absolutely vital for this age of Science. One of the first and most well-known success stories of this technology is the synthesis of recombinant human insulin, first developed by Genentech in 1978 and then taken up by Eli Lilly and Company.

The sequences of the A and B polypeptide chains of insulin were identified and determined. A methionine codon was added at the start of each of the chains. To compensate for the lack of post-translational processing in the bacterial host, i.e. the folding of proteins (which is essential for the protein to be active), the genes for the two functional chains, A and B, were inserted separately into recombinant plasmids (having the lac promoter and lacZ gene), transformed into E. coli hosts, and grown in separate culture media. The proteins expressed were fused to β-galactosidase. The insulin chains were then extracted, purified and mixed together. Disulphide bridges form between the two chains, resulting in the formation of a 'Humulin' molecule.

N-glycosylation of proteins (a post-translational modification) was believed to be exclusive to eukaryotes and archaea, and was not expected to take place for recombinant proteins in bacterial hosts. However, this glycosylation system has now been identified in Campylobacter jejuni. It might therefore be possible to glycosylate eukaryotic proteins in prokaryotic hosts by introducing the gene sequences that are responsible for this system on a plasmid. Thus there is a possible future in the expression of recombinant glycoproteins (such as antibodies and vaccines) in prokaryotes. (Szymanski et al. - 2003)

This technology, though in its infancy, has already made great progress and still has tremendous potential. New, more compatible vectors that produce fusion proteins to improve the quality of proteins and their purification are being designed for both eukaryotic and prokaryotic systems.
