Recombinant Protein Expression Systems Biology Essay

To determine the function of a protein, it is essential to purify it. This can be an intricate task, so protein expression is usually studied in a heterologous system within a cell or 'host', e.g. yeast or bacterial cells. The best system for a given expression study is difficult to anticipate, as it is hard to predict how the protein will behave within the host. Thanks to the molecular biology revolution, the field of protein expression and protein biochemistry has advanced enormously. The most widely exploited system is E. coli. Being a prokaryotic recombinant protein system, it has several advantages: it is easy to culture, the cells grow rapidly, and the protein can be expressed soon after the cDNA is cloned. Protein expression is induced by IPTG, which allows the gene on the recombinant plasmid to be expressed, producing anything from a peptide to a complete protein. Some cautionary points apply: prokaryotes such as E. coli do not have introns and so cannot splice them out, and they may not carry out the post-translational modifications the protein needs, which could lead to the protein not working as expected. Purification from bacterial expression systems is, however, relatively simple. The protein may also become insoluble in inclusion bodies, from which it cannot readily be recovered to study its function.

The most used eukaryotic expression systems involve hosts such as the yeast Pichia pastoris (Cereghino et al., 2000), which have the advantage that they can glycosylate proteins, so the activity of the protein is not affected (Hannig & Makrides, 1998; Yokoyama, 2003). Mammalian cells and baculovirus-infected insect cells are also good expression systems, as they give high levels of expression compared with Escherichia coli, where recovering sufficient amounts can be hard (Kost et al., 2005; Rooser et al., 2005). However, growth is generally slower in these systems than in prokaryotic cells. Inclusion bodies are not formed, and the protein retains its post-translational modifications, which is valuable when studying protein function.

Protein expression starts with cDNA isolated from the organism Caenorhabditis elegans. The cDNA is PCR amplified and then cloned into an expression vector. The construct is completed by adding an affinity tag such as a His-tag (Baneyx, 1999). Expression of the protein is induced by adding an inducer to the culture medium. Because the vector adds a His-tag, the protein is easy to purify from the host cells by affinity chromatography. Factors affecting the level of expression include the toxicity of the protein, the growth temperature and the host strain. Conditions can be optimised to give the maximum amount of recombinant protein. The current experiment aims to express a protein from the nematode Caenorhabditis elegans in E. coli using an inducible recombinant protein expression system known as pET.


As per the protocol, IPTG was added to a final concentration of 0.4 mM (the IPTG stock is 50 mg/ml, given as 20 mM); therefore 40 µl was added.
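The volume of IPTG stock follows from the dilution relation C1 x V1 = C2 x V2. The sketch below is illustrative only: the culture volume of 2 ml is an assumption (the text does not state it), chosen because it reproduces the 40 µl quoted above from a 20 mM stock. The helper names are hypothetical.

```python
IPTG_MOLAR_MASS = 238.3  # g/mol, molar mass of IPTG


def stock_volume_ul(final_mM, stock_mM, culture_ml):
    """Volume of stock (µl) needed for a target final concentration.

    Uses C1 * V1 = C2 * V2 and ignores the small volume that the
    stock itself adds to the culture.
    """
    return final_mM * culture_ml * 1000.0 / stock_mM


def mg_per_ml_to_mM(mg_per_ml, molar_mass=IPTG_MOLAR_MASS):
    """Convert a mass concentration (mg/ml) to a molar one (mM)."""
    return mg_per_ml / molar_mass * 1000.0


# culture_ml=2.0 is an assumed culture volume, not a value from the protocol
volume = stock_volume_ul(final_mM=0.4, stock_mM=20.0, culture_ml=2.0)
print(f"IPTG stock to add: {volume:.0f} µl")
print(f"50 mg/ml IPTG in molar terms: {mg_per_ml_to_mM(50.0):.0f} mM")
```

The second helper converts a mass concentration to a molar one. With a molar mass of about 238.3 g/mol, a 50 mg/ml IPTG solution works out nearer 210 mM than 20 mM, so the stock figures are worth checking against the protocol sheet.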


Molecular weight of the recombinant protein

Sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) was used to separate the protein markers and the recombinant protein.


Figure 1: Results of the SDS-PAGE. Lane I contains the protein standards in Tris-Glycine buffer. Lane II acts as a control, whereas lane III shows expression of the recombinant protein. Lane III contains the same sample as lane II, with the addition of isopropyl-beta-D-thiogalactopyranoside (IPTG).

Quantification of the levels of the recombinant protein expression

To estimate the levels of protein, data provided by Invitrogen were used (Table 1). For some bands in the marker lane (see fig 1, Lane I) the amount of protein has been quantified in micrograms. Using the data from Table 1 and the bands in Figure 1, one can estimate the levels of protein.

Table 1: Estimated amounts of the protein standards per 10 µl

Protein standard (corresponding band in kDa): amount per 10 µl

Alcohol Dehydrogenase: 50 µg
Carbonic Anhydrase: 36 µg
Myoglobin (blue)

6 µg
4 µg

The estimated level of recombinant protein is 16 µg. In Figure 1 the band for the recombinant protein is stronger than that of the lysozyme standard by approximately 10x, and the value was estimated from this comparison. However, the figure may not be precise, because it rests on a visual estimate of the protein.

The amount of protein loaded on the gel is that in the well of Lane III, and this corresponds to the micrograms of protein estimated above. The 10 µl loaded came from 1 ml of culture, so 16 µg came from 1 ml of culture. To work out the amount of protein per litre, the volume is multiplied by 1000, so the amount of protein must also be multiplied by 1000: 16 µg x 1000 = 16,000 µg, and 16,000/1000 = 16 mg per litre of culture.
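The scaling from the gel estimate to a per-litre yield can be written out as a short calculation. This is a minimal sketch using the figures quoted above (16 µg from 1 ml of culture); the function name is hypothetical.

```python
def yield_mg_per_litre(protein_ug, culture_ml):
    """Scale an amount of protein (µg) recovered from a culture volume (ml)
    to a yield in mg per litre of culture."""
    ug_per_ml = protein_ug / culture_ml
    ug_per_litre = ug_per_ml * 1000.0   # 1000 ml in a litre
    return ug_per_litre / 1000.0        # 1000 µg in a mg


print(f"{yield_mg_per_litre(16.0, 1.0):.1f} mg per litre of culture")
```

Because both conversion factors are 1000, a value in µg per ml is numerically the same as the value in mg per litre.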

The foremost weakness of the experiment is the lack of a clear control for comparison. The control (lane II) is visibly very similar to the experimental lane III, which suggests either a misinterpretation of the experiment or a flaw in its design. There are several possible reasons for the similarity: first, the samples may have been mislabelled, so that the true control was not actually present; second, the control may have been contaminated with the experimental material; or third, the control sample (culture broth) may have been accidentally induced with IPTG.

The graphical data were constructed using the Microsoft Excel package, having been provided with the molecular weights of the protein bands (see fig 1, Lane I) in the SeeBlue Plus2 Pre-Stained Standard. A standard curve (see fig 3) was constructed by plotting the molecular weights (kDa) of the protein standards against their migration distance (mm), measured with the help of the software available on the Thermo Scientific website. The image below shows the image analysis, on the basis of which the migration was calculated. The line of best fit on the graph is not very close to the data points, suggesting that some error is involved. This could have arisen when measuring the migration of the protein standards, which depends on the region of the band one measured to. Using the standard curve one can then estimate the molecular weight of the recombinant protein (see fig 1).

Figure 2: Image analysis using the Thermo Scientific myImageAnalysis software


Figure 3: Standard curve of log(MW) of the protein standards against their migration.
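A standard curve of this kind is a straight-line fit of log10(MW) against migration distance, which is then inverted to estimate an unknown band. The sketch below shows the idea with a plain least-squares fit. The marker readings and the migration of the unknown band are hypothetical placeholders, not the measurements from this gel, and the function names are illustrative.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mw_kda(migration_mm, slope, intercept):
    """Invert the standard curve: log10(MW) = slope * migration + intercept."""
    return 10 ** (slope * migration_mm + intercept)


# Hypothetical marker readings: (migration in mm, molecular weight in kDa)
markers = [(10.0, 98.0), (18.0, 64.0), (24.0, 50.0),
           (31.0, 36.0), (48.0, 16.0), (66.0, 6.0)]

migrations = [m for m, _ in markers]
log_mw = [math.log10(mw) for _, mw in markers]
slope, intercept = fit_line(migrations, log_mw)

unknown_migration_mm = 47.0  # hypothetical reading for the recombinant band
estimate = estimate_mw_kda(unknown_migration_mm, slope, intercept)
print(f"Estimated MW: {estimate:.1f} kDa")
```

Fitting log10(MW) rather than MW itself is what makes the relationship roughly linear over the working range of the gel. Scatter of the points around the line reflects the measurement error discussed above.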


How the levels of protein expression could be improved and how the protein may have been purified from the bacterial lysate

The level of protein expression can be improved by attending to several factors: promoter strength, the ribosome binding site (rbs), the spacing between the rbs and the AUG start codon, the stop codon, the transcription terminator, the replication origin, the selection marker of the expression vector, the GC content at the 5' end and, most importantly, the host cell strain and host system. Together these determine the mRNA level and the yield of the recombinant protein.

Under usual circumstances, the stronger the promoter, the higher the protein yield that may be obtained. The best translation initiation is obtained from the consensus rbs sequence, so protein yield may be increased accordingly. For a given recombinant protein, a different rbs may give a different expression level. Most expression vectors contain multiple stop codons in all three reading frames and efficient transcription terminators. To express a gene in E. coli, changing the stop codon to TAA during cloning may increase the efficiency and accuracy of translation termination (Pfeiffer, 2012). The replication origin determines the copy number of the expression vector in a host cell. High copy number expression vectors normally give a high protein yield for non-toxic proteins. Many expression vectors use low or intermediate copy number replication origins derived from the pBR322 or pACYC plasmids (Hanif, 2010). The commonest selection markers used on expression vectors are the ampicillin, chloramphenicol, kanamycin and tetracycline resistance genes. Expressing a protein from expression vectors with different selection markers can result in a noticeably different protein yield even when all other conditions are the same.

The levels of protein expression could also have been improved by varying the time and temperature of induction, as well as the concentration of the inducer, IPTG. One could also reduce the GC content at the 5' region; looking at the gene sequence for the recombinant protein, this content is high. A high GC content lets the mRNA form secondary structures that can interrupt translation, which means lower levels of expression, so minimising it should help. Using more A and T residues without changing the encoded amino acids could be a potential approach.

Purification relies on the fact that enzymes (E) and other proteins have a high affinity for certain substrates (S). If the substrate is bound to the beads of a column, the enzyme will bind to the column. To remove the enzyme from the column, one can add a large concentration of free substrate (or a modified substrate which binds the enzyme even more tightly). The advantage of the method is that it is highly specific, in particular substrate specific: if there are thousands of proteins in the original extract but only one of them binds that substrate, only that one will bind to the column (Gräslund et al., 2008). In the case of this experiment, the recombinant protein is tagged with a His-tag, meaning the protein can be rapidly purified from E. coli using affinity chromatography (O'Shaughnessy & Doyle, 2011). His-tags have an affinity for metal ions, so these are supplied in excess on the column. The tag binds the ion, meaning only the tagged protein will bind to the column.
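Returning to the point about GC content at the 5' region, a quick check of a coding sequence can be sketched as follows. The sequence and the 30-nucleotide window are hypothetical choices for illustration only; the actual M01B12.3 sequence is not reproduced here.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def five_prime_gc(coding_seq, window=30):
    """GC fraction of the first `window` nucleotides of a coding sequence."""
    return gc_fraction(coding_seq[:window])


# Hypothetical coding sequence, used only to show the calculation
example_cds = "ATGGCCGGCGCGCTGCCGGAGCGCCTGGGCAAAGAATTTGATACCATTAAA"

print(f"5' GC content: {five_prime_gc(example_cds):.0%}")
print(f"Overall GC content: {gc_fraction(example_cds):.0%}")
```

Comparing the 5' window with the overall figure shows whether the start of the gene is unusually GC rich, which is where synonymous A/T-rich codons would be substituted.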

Suggesting the identity of the recombinant protein

Having estimated the size of the recombinant protein to be 16 kDa, this value can be searched on the National Center for Biotechnology Information (NCBI) website using the protein database section. The results list proteins which have a molecular weight of, or roughly close to, 16 kDa. The protein sequence of actin related protein 2/3 complex, subunit 5, 16kDa [Xenopus (Silurana) tropicalis] was chosen, as shown in Figure 4.

Figure 4: Search for 16 kDa at the NCBI website.

The protein sequence was obtained in FASTA format, to be put into the BLAST tool.

The figure below shows the result of the query entered at the NCBI website for a 16 kDa protein. The experiment involved expressing a C. elegans protein of around 16 kDa in E. coli.

The figure below confirms that the gene is present in C. elegans. Using this search, the protein was found to be present in C. elegans. Two results came up, and the best one was chosen, with a score of 112 and an E value of 1e-25. The similarity indicated by these values is appreciable, so it is a good candidate.

Figure 5: Search and confirmation of the protein M01B12.3 in C. elegans


Further evidence on the gene summary page revealed the estimated molecular weight of the protein to be approximately 17 kDa, with a chain of 152 amino acids. It is therefore a very good candidate, as the size estimated for the recombinant protein was 16 kDa. The protein is encoded by the gene M01B12.3, which is arx-7, also known as p16Arc. It encodes a subunit of the conserved actin-related protein 2/3 (Arp2/3) complex, with 60% amino acid similarity to the homologous human gene. arx-7 is an essential gene, as disruption of its expression by RNAi results in embryonic arrest due to a failure of ventral enclosure during morphogenesis; Arp2/3 depletion results in partial shrinking of the Ea/p apical surfaces and incomplete Ea/p cell internalization; WSP-1 activates the Arp2/3 complex and their function in ventral enclosure is cell autonomous; Arp2/3 regulates some apical junction components in embryos and adults; Arp2/3 is required during intestinal morphogenesis for regulation of intestinal lumen width in embryos and for apical F-actin accumulation during larval and adult growth. Mutations in arx-7 cause PDE axon guidance defects; arx-7 mutants affect PQR growth cone morphology and filopodia formation; arx-7 acts cell-autonomously in PQR filopodia regulation.
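The agreement between chain length and molecular weight can be sanity-checked with the common rule of thumb of roughly 110 Da per amino acid residue. This is an approximation rather than a calculation from the actual sequence, and the function name is hypothetical.

```python
AVERAGE_RESIDUE_DA = 110.0  # rule-of-thumb average mass per residue, in Da


def approx_mw_kda(n_residues, avg_residue_da=AVERAGE_RESIDUE_DA):
    """Rough molecular weight (kDa) of a protein from its residue count."""
    return n_residues * avg_residue_da / 1000.0


print(f"152 residues: about {approx_mw_kda(152):.1f} kDa")
```

For a 152-residue chain this gives a value between the 16 kDa gel estimate and the approximately 17 kDa quoted on the gene summary page, which is consistent with the candidate.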

In addition, the information retrieved from WormBase revealed its similarity to the human protein, with ARPC5 as the best human ortholog (E value: 1.2e-20), as seen in figure 6.

Figure 6: Explaining the similarity of the C.elegans and human actin related protein 2/3 complex.

Designing PCR primers to amplify the suggested gene for cloning into the pET bacterial expression plasmid

The spliced coding region will be used as the target for PCR. However, one cannot PCR mRNA directly; instead, complementary DNA (cDNA) derived from the mRNA is amplified, so a cDNA clone would be requested.

>M01B12.3 spliced + UTR











Figure 7: The M01B12.3 spliced + UTR sequence (723 bp), which is available for primer design.

Figure 8 below shows software which can be used to design primers for a given sequence. It can help to determine the Tm and the conditions suitable for the PCR. The software used to design the primers is Primer3Plus.

Figure 8: For primer design.

The conventional way of designing PCR primers is to select oligonucleotides, ideally 18-26 nucleotides long, placed at the 5' end and the 3' end of the target. Adding restriction sites allows the PCR product to be cloned into the pET-19b vector. A map provided (not shown) shows the cloning/expression region, which offers a selection of restriction sites for cloning the product: Xba I, Nco I, Nde I, Xho I, BamH I and Bpu1102 I. Before using any of these, however, one must check whether the site is also present within the spliced coding region sequence. This was done with a tool that checks for restriction sites, the New England BioLabs NEBcutter tool, as seen in figure 9. The zero-cutter results gave five possible candidates; Bpu1102 I was not among them (it must cut the fragment). Nco I was chosen because its restriction site does not occur later in the cloning region, meaning the whole sequence will be present in the PCR product.

Figure 9: NEB cutter tool
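The zero-cutter check described above can be approximated with a simple pattern search. The sketch below is not a substitute for NEBcutter: it only scans for the recognition sequences of the six enzymes named in the text, and the insert sequence is a hypothetical placeholder. All six recognition sites are palindromic, so scanning one strand is sufficient.

```python
import re

# Recognition sequences of the enzymes in the pET-19b cloning region
RECOGNITION_SITES = {
    "XbaI": "TCTAGA",
    "NcoI": "CCATGG",
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "Bpu1102I": "GCT[ACGT]AGC",  # GCTNAGC, where N is any base
}


def count_sites(seq, pattern):
    """Number of (possibly overlapping) matches of a site in a sequence."""
    return len(re.findall(f"(?=({pattern}))", seq.upper()))


def zero_cutters(seq, sites=RECOGNITION_SITES):
    """Enzymes whose recognition site does not occur in the sequence."""
    return [name for name, pattern in sites.items()
            if count_sites(seq, pattern) == 0]


# Hypothetical insert sequence, used only to show the check
example_insert = "ATGGCTAAAGCTTAGCGAAGAACTGCGTAAAGCTATCGATGAACTGTGA"

print("Zero cutters:", ", ".join(zero_cutters(example_insert)))
```

An enzyme is usable for cloning only if it appears in the zero-cutter list, since a site inside the insert would cut the PCR product itself.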


Now the primers are designed as follows: the sequences are picked for the forward and reverse primers. The restriction sites are added to the 5' end of each sequence. The reverse primer is reverse complemented, and the sequences are sent to the primer synthesis company to be prepared. On arrival the primers should appear as shown below in figure 10.


Figure 10: The forward and reverse primers to be used for cloning.

However, there is a problem with this primer. The vector provides a T7 promoter and a His-tag. The tag allows the protein to be purified rapidly from E. coli culture by affinity chromatography, and once purified the protein can be used in biochemical assays, enzyme assays and binding assays, or to inoculate mice to raise antibodies. Therefore one wants to keep the tag. At the end of the sequence there is a TGA stop codon, which means the tag would not be kept. T7 RNA polymerase would bind to the T7 promoter and transcribe through the sequence to make mRNA, but the ribosome would stop translating at this TGA, so it would not read through to the His-tag. One must therefore remove the stop codon: going back to the reverse primer (see fig 10), the reverse complement of the stop codon is removed, so that read-through can occur once the product is cloned into the plasmid. T7 polymerase will transcribe the PCR product and the His-tag until it reaches the T7 terminator sequence, and translation will continue through the tag. Because three nucleotides are taken off, one must add three nucleotides based on the sequence to maintain the length of the primer. This will generate a PCR product encoding a protein with a C-terminal His-tag. If BamH I, downstream of the His-tag, had been chosen instead, the protein would have been generated with an N-terminal tag. The His-tag also adds slightly to the size of the recombinant protein, on the order of 1 kDa.
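The primer construction steps described above (pick the ends of the coding sequence, drop the stop codon, reverse complement the 3' end and add restriction sites at the 5' ends) can be sketched as follows. Everything specific here is an assumption for illustration: the coding sequence is hypothetical, XhoI is used as the reverse-primer site only as an example, and details that matter in practice (reading frame across the site, extra 5' bases for efficient digestion, Tm matching) are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def strip_stop_codon(cds):
    """Remove a terminal stop codon so translation can read into a 3' tag."""
    cds = cds.upper()
    return cds[:-3] if cds[-3:] in STOP_CODONS else cds


def design_primers(cds, fwd_site, rev_site, length=20, keep_stop=False):
    """Return (forward, reverse) primers with restriction sites at the 5' ends.

    With keep_stop=False the annealing part of the reverse primer is taken
    from the stop-less sequence, so it extends three bases further upstream
    and the primer keeps its length.
    """
    template = cds.upper() if keep_stop else strip_stop_codon(cds)
    forward = fwd_site + template[:length]
    reverse = rev_site + reverse_complement(template[-length:])
    return forward, reverse


# Hypothetical coding sequence ending in a TGA stop codon
example_cds = "ATGGCTAAAGGTCTGGAAGAACTGCGTAAAGCTATCGATGAACTGTGA"

# NcoI (CCATGG) on the forward primer; XhoI (CTCGAG) is an assumed choice
forward, reverse = design_primers(example_cds, "CCATGG", "CTCGAG")
print("Forward primer 5'-3':", forward)
print("Reverse primer 5'-3':", reverse)
```

Setting `keep_stop=True` reproduces the original primer pair, which would end translation before any C-terminal tag.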