Primary Structure Analysis Of Proteins Biology Essay


Proteins are an important group of biomolecules present in all living organisms and perform vital functions in the body. Chemically, a protein is a polymer of amino acids linked by peptide bonds and arranged in a defined order. This sequential arrangement of amino acids is referred to as the primary structure. The primary structure of a protein is determined by the gene that encodes it: a specific sequence of nucleotides in DNA is transcribed into mRNA, which is then read by the ribosome in a process called translation. The sequence of a protein is characteristic of that protein and largely determines its structure and function.

Structurally, the polypeptide chain of a protein has an N-terminus and a C-terminus, defined by the direction of the peptide linkages between successive amino acids. The properties of a protein are determined mainly by the types of amino acids present in its primary structure. The amino acid sequence of a protein or polypeptide chain can be determined experimentally by Edman degradation or by mass spectrometry.

In recent years, advances in protein sequencing techniques have generated a large amount of data, which is deposited in protein sequence databases. The growth of these databases has driven the development of bioinformatics tools for data analysis. These tools help in understanding the properties of the protein whose sequence is under consideration. Analysis of the primary sequence also helps in planning laboratory experiments such as purification, and in estimating physicochemical properties of the protein such as its amino acid composition.

Retrieving sequences from databases

Introduction: Sequence data generated by high-throughput techniques are stored in databases so that they are readily available for analysis. Primary databases hold the original sequence records as submitted. Protein sequence data can be downloaded from them and analysed with various tools.

Exercise: To retrieve the protein sequence data from NCBI's protein database

Protocol:

Go to NCBI: http://www.ncbi.nlm.nih.gov/

Select "Protein" from the drop-down menu of databases

Type the name of the protein, e.g. myoglobin

Click on the link for the required entry

Note the details of the protein, such as the locus and accession number, and observe the protein sequence in GenPept format

Click on "FASTA" and observe the sequence in FASTA format

Copy the sequence and paste it into a Word document, OR click on "Send to", select "File" as the destination, choose FASTA as the format and click on "Create File". Save the file at a known location for further use
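The manual steps above can also be scripted. The following is a minimal sketch, assuming the NCBI E-utilities `efetch` endpoint is used in place of the web interface; the function names `build_efetch_url` and `fetch_fasta` are illustrative and not part of the original protocol.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities endpoint for fetching records (assumed alternative to the web pages).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="protein"):
    """Build a URL that requests one record in FASTA format as plain text."""
    query = urlencode(
        {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    )
    return f"{EFETCH_URL}?{query}"


def fetch_fasta(accession, db="protein", timeout=30):
    """Download the FASTA record for an accession number (needs network access)."""
    with urlopen(build_efetch_url(accession, db), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_fasta("AAL83649.1")` would request the silk fibroin record used in the result below, and the returned text can be saved to a file for later analysis.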

Result: Silk produced by the silkworm consists of two main proteins, sericin and fibroin. The sequence of the silk fibroin L-chain was retrieved from the NCBI database, and details such as the accession number and the GenPept format were observed. Its FASTA format is as follows:

>gi|19221230|gb|AAL83649.1| silk fibroin [Bombyx mori]

MKPIFLVLLVATSAYAAPSVTINQYSDNEIPRDIDDGKASSVISRAWDYVDDTDKSIAILNVQEILKDMA

SQGDYASQASAVAQTAGIIAHLSAGIPGDACAAANVINSYTDGVRSGNFAGFRQSLGPFFGHVGQNLNLI

NQLVINPGQLRYSVGPALGCAGGGRIYDFEAAWDAILASSDSGFLNEEYCIVKRLYNSRNSQSNNIAAYI

TAHLLPPVAQVFHQSAGSITDLLRGVGNGNDATGLVANAQRYIAQAASQVHV
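A FASTA record such as the one above has a header line starting with ">" followed by one or more lines of sequence. As a small sketch of how a downloaded file could be read before further analysis (the helper name `parse_fasta` is illustrative), blank lines are skipped and the sequence lines of each record are joined:

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between sequence lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

Applied to the record above, the function would return one tuple whose second element is the full fibroin sequence as a single string.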

Translation of DNA / RNA sequences into protein

Introduction: Translate is an online tool that translates a DNA or RNA sequence into a protein sequence. It is provided on ExPASy (Expert Protein Analysis System), the server of the Swiss Institute of Bioinformatics.

Exercise: You are provided with the sequence of a gene. Translate the sequence and identify the protein product.

>gi|50540477|ref|NM_001002706.1| Danio rerio lysozyme g-like 1 (lygl1), mRNA

TTTCAGCTGCAATCACAATAGCACAACATTCAGTGGACTGACTAGTCACAGCTCGAACATTTTGTTTTGT

TCACAGCAGCTTTGAGTCATCATGGGCATTCCGGTGATACTTACCATGTATTTTCTAGCATGCATTTATG

GAGATATCATGAAAATAGACACCACTGGGGCATCAGAGGTGACAGCAAAACAGGACAAGTTAACTGTAAA

GGGAGTTGAAGCCTCTAAAAAACTGGCTGAGCATGATCTGGCCCGAATGGAACAATACAAGTCCAAAATC

CTCAAAGTTGCCCGAGCAAAGCAGATGGACCCGGCTGTGATTGCTGCCATCATATCCAGAGAGTCCAGGG

CTGGAGCGGCACTGAAGGATGGATGGGGTGACCACGGCAATGGCTTTGGTCTCATGCAGGTTGACAAACG

CTACCACAAACTGGTAGGTGCGTGGGACAGCGAGGAACATCTCACACAAGGAACTGAGATACTCATTGGT

TATATTAAAGATATTAAAGCAAAGTTTCCCACATGGACCAAGGAGCAATGCTTTAAAGGTGGAATATCAG

CGTATAATGCAGGTGTGAAGAACGTGCAAACATATGAGCGCATGGATGTGGGCACCACAGGCGGTGATTA

CGCTAATGATGTTGTTGCCCGAGCCCAGTGGTTCAAAAGTAAAGGTTACTGAGGAATAAATGTAGTCTAA

TGCTATTTTTAATAGCTCAGTCTAACCACTGATCACAGTTTTATACTTTATTTTGTATTTGCTGGAAATA

AATAAAATGTCTTTATTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Protocol:

Go to http://web.expasy.org/translate/

Paste the given DNA/RNA sequence into the input box

Click on translate sequence

Note down the results

Result: The ExPASy tool translates the input sequence in all six possible reading frames (i.e. reading the sequence from 5' to 3' and from 3' to 5', starting with nt 1, nt 2 and nt 3). One of the resulting frames is shown below.

5'3' Frame 1

X X X X X X X X X X X X X F S C N H N S T T F S G L T S H S S N I L F C S Q Q L Stop V I Met G I P V I L T Met Y F L A C I Y G D I Met K I D T T G A S E V T A K Q D K L T V K G V E A S K K L A E H D L A R Met E Q Y K S K I L K V A R A K Q Met D P A V I A A I I S R E S R A G A A L K D G W G D H G N G F G L Met Q V D K R Y H K L V G A W D S E E H L T Q G T E I L I G Y I K D I K A K F P T W T K E Q C F K G G I S A Y N A G V K N V Q T Y E R Met D V G T T G G D Y A N D V V A R A Q W F K S K G Y Stop G I N V V Stop C Y F Stop Stop L S L T T D H S F I L Y F V F A G N K Stop N V F I Q K K K K K K K K K K
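The six-frame translation described above can be sketched in a few lines. This is an illustrative implementation using the standard genetic code, not the ExPASy tool's own code; it prints one-letter codes with "*" for stop codons, whereas the tool output above writes "Met" and "Stop" in full.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate from the first base; unknown codons become 'X', stops '*'."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate three forward frames and three reverse-complement frames."""
    dna = dna.upper().replace("U", "T")
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"5'3' Frame {offset + 1}"] = translate(dna[offset:])
        frames[f"3'5' Frame {offset + 1}"] = translate(rc[offset:])
    return frames
```

For instance, `translate("ATGGGCATTCCGGTGATACTTACCATG")`, a stretch taken from the mRNA above, gives `MGIPVILTM`, which corresponds to the "Met G I P V I L T Met" run in the frame shown.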

Finding the isoelectric point and molecular weight

Introduction: Compute pI/Mw is a tool that calculates the estimated isoelectric point (pI) and molecular weight (Mw) of a specified Swiss-Prot/TrEMBL entry or of a user-entered amino acid sequence. These parameters are useful for predicting the approximate region of a 2-D gel in which a protein may be found.

Exercise: You are given a protein sequence. Find the theoretical pI and molecular weight of the sequence.

Sequence:

V I M G I P V I L T M Y F L A C I Y G D I M K I D T T G A S E V T A K Q D K L T V K G V E A S K K L A E H D L A R M E Q Y K S K I L K V A R A K Q M D P A V I A A I I S R E S R A G A A L K D G W G D H G N G F G L M Q V D K R Y H K L V G A W D S E E H L T Q G T E I L I G Y I K D I K A K F P T W T K E Q C F K G G I S A Y N A G V K N V Q T Y E R M D V G T T G G D Y A N D V V A R A Q W F K S K G Y

Protocol:

Go to http://web.expasy.org/compute_pi/

Paste the single-letter amino acid sequence of the protein, upload the sequence from a file, or specify an entry from the UniProt database.

Click on compute pI/MW

Note the results

Result: The theoretical isoelectric point and molecular weight of the given protein sequence were estimated with Compute pI/Mw to be 9.04 and 21859.19 Da, respectively.

10 20 30 40 50 60

VIMGIPVILT MYFLACIYGD IMKIDTTGAS EVTAKQDKLT VKGVEASKKL AEHDLARMEQ

70 80 90 100 110 120

YKSKILKVAR AKQMDPAVIA AIISRESRAG AALKDGWGDH GNGFGLMQVD KRYHKLVGAW

130 140 150 160 170 180

DSEEHLTQGT EILIGYIKDI KAKFPTWTKE QCFKGGISAY NAGVKNVQTY ERMDVGTTGG

190

DYANDVVARA QWFKSKGY

Theoretical pI/Mw: 9.04 / 21859.19 
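To show what lies behind such an estimate, here is a rough sketch. The molecular weight is the sum of average residue masses plus one water molecule; the pI is found by bisection as the pH at which the net charge is zero. The pKa values below are generic approximate values chosen for illustration (an assumption, not the set used by Compute pI/Mw), so the pI will be close to, but not necessarily identical with, the tool's output.

```python
# Average residue masses in Da (free amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528

# ASSUMPTION: approximate pKa values for illustration only.
PKA_BASIC = {"N_TERM": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PKA_ACIDIC = {"C_TERM": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def clean(seq):
    """Remove whitespace and convert to upper case."""
    return "".join(seq.split()).upper()


def molecular_weight(seq):
    """Average molecular weight in Da of an unmodified peptide."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in clean(seq)) + WATER


def net_charge(seq, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    seq = clean(seq)
    charge = 0.0
    for group, pka in PKA_BASIC.items():
        n = 1 if group == "N_TERM" else seq.count(group)
        charge += n / (1 + 10 ** (ph - pka))   # fraction protonated (+1)
    for group, pka in PKA_ACIDIC.items():
        n = 1 if group == "C_TERM" else seq.count(group)
        charge -= n / (1 + 10 ** (pka - ph))   # fraction deprotonated (-1)
    return charge


def isoelectric_point(seq, tolerance=0.001):
    """Bisection search for the pH at which the net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(seq, mid) > 0:
            low = mid   # still positive, so the pI lies at higher pH
        else:
            high = mid
    return round((low + high) / 2, 2)
```

Both functions accept the spaced single-letter sequence given in the exercise, since whitespace is removed before calculation.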

Study of peptides

PeptideCutter predicts the potential sites at which proteases or chemicals cleave a given protein sequence. It returns the query sequence with the possible cleavage sites mapped onto it and/or a table of cleavage site positions.

PeptideCutter searches a protein sequence from the SWISS-PROT and/or TrEMBL databases or a user-entered protein sequence for protease cleavage sites. Single proteases and chemicals, a selection or the whole list of proteases and chemicals can be used. Different forms of output of the results are available: Tables of cleavage sites either grouped alphabetically according to enzyme names or sequentially according to the amino acid number. A third option for output is a map of cleavage sites. The sequence and the cleavage sites mapped onto it are grouped in blocks, the size of which can be chosen by the user to provide a convenient form of print-out.

The program treats the complete input as one single sequence, even if several are entered.

Numbers and space characters are ignored.

If a sequence in FASTA format is entered, the header line is ignored in all further steps of the program.

If letters are entered that do not denote a single amino acid (B, J, X or Z), the user is asked for a correction.

The program is case insensitive.

Sequence:

MESLKKLFQPVHEKVDETWSKVTIVGVGQVGMAAAFSMLTQNVTNNIALVDMMADKLKGEMMDLQHGSAF

MRNAKIQSSTDYSITAGSKICVVTAGVRQREGESRLDLVQRNTDVLKQIIPQLIKYSPDTILVIASNPVD

ILTYVTWKISGLPKHRVIGSGTNLDSARFRYLLSDRLGIATTSCHGYIIGEHGDSSVPVWSAVNIAGVRL

SDLNNQIGTDDDPENWKELHENVVKSAYEVIKLKGYTSWAIGLSLAQIVRAILTNANSVHAVSTYLKGEH

GIEDEVFLSLPCVLSHCGVSDVIRQPLTELEVAQLRKSAKVMAKVQNDIKF

Protocol:

Go to http://web.expasy.org/peptide_cutter/

Paste the given sequence

Select the enzyme or chemical to be used for the cleavage

Click on perform

Note the results

Result: PeptideCutter predicted 9 potential CNBr cleavage sites in the given protein sequence.

10 20 30 40 50 60

MESLKKLFQP VHEKVDETWS KVTIVGVGQV GMAAAFSMLT QNVTNNIALV DMMADKLKGE

70 80 90 100 110 120

MMDLQHGSAF MRNAKIQSST DYSITAGSKI CVVTAGVRQR EGESRLDLVQ RNTDVLKQII

130 140 150 160 170 180

PQLIKYSPDT ILVIASNPVD ILTYVTWKIS GLPKHRVIGS GTNLDSARFR YLLSDRLGIA

190 200 210 220 230 240

TTSCHGYIIG EHGDSSVPVW SAVNIAGVRL SDLNNQIGTD DDPENWKELH ENVVKSAYEV

250 260 270 280 290 300

IKLKGYTSWA IGLSLAQIVR AILTNANSVH AVSTYLKGEH GIEDEVFLSL PCVLSHCGVS

310 320 330

DVIRQPLTEL EVAQLRKSAK VMAKVQNDIK F

The sequence is 331 amino acids long.

Name of enzyme: CNBr

No. of cleavages: 9

Positions of cleavage sites: 1 32 38 52 53 61 62 71 322
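The CNBr result can be checked by hand, because cyanogen bromide cleaves on the C-terminal side of methionine. The sketch below applies only that simple rule (it is not PeptideCutter's own code and ignores any special cases); the function names are illustrative.

```python
SEQUENCE = (
    "MESLKKLFQPVHEKVDETWSKVTIVGVGQVGMAAAFSMLTQNVTNNIALVDMMADKLKGEMMDLQHGSAF"
    "MRNAKIQSSTDYSITAGSKICVVTAGVRQREGESRLDLVQRNTDVLKQIIPQLIKYSPDTILVIASNPVD"
    "ILTYVTWKISGLPKHRVIGSGTNLDSARFRYLLSDRLGIATTSCHGYIIGEHGDSSVPVWSAVNIAGVRL"
    "SDLNNQIGTDDDPENWKELHENVVKSAYEVIKLKGYTSWAIGLSLAQIVRAILTNANSVHAVSTYLKGEH"
    "GIEDEVFLSLPCVLSHCGVSDVIRQPLTELEVAQLRKSAKVMAKVQNDIKF"
)


def cnbr_cleavage_sites(seq):
    """1-based positions of Met residues; cleavage occurs after each one."""
    return [pos for pos, aa in enumerate(seq.upper(), start=1) if aa == "M"]


def cleave(seq, sites):
    """Split a sequence after each listed 1-based position."""
    fragments, start = [], 0
    for pos in sites:
        fragments.append(seq[start:pos])
        start = pos
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


sites = cnbr_cleavage_sites(SEQUENCE)
print(len(SEQUENCE), sites)
```

Running it on the 331-residue sequence lists the same nine positions as reported above: 1, 32, 38, 52, 53, 61, 62, 71 and 322.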

Study of physical and chemical properties of proteins

ProtParam is a tool that computes various physical and chemical parameters for a protein stored in Swiss-Prot or TrEMBL, or for a user-entered sequence. The computed parameters include the molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY).

Sequence:

MESLKKLFQPVHEKVDETWSKVTIVGVGQVGMAAAFSMLTQNVTNNIALVDMMADKLKGEMMDLQHGSAF

MRNAKIQSSTDYSITAGSKICVVTAGVRQREGESRLDLVQRNTDVLKQIIPQLIKYSPDTILVIASNPVD

ILTYVTWKISGLPKHRVIGSGTNLDSARFRYLLSDRLGIATTSCHGYIIGEHGDSSVPVWSAVNIAGVRL

SDLNNQIGTDDDPENWKELHENVVKSAYEVIKLKGYTSWAIGLSLAQIVRAILTNANSVHAVSTYLKGEH

GIEDEVFLSLPCVLSHCGVSDVIRQPLTELEVAQLRKSAKVMAKVQNDIKF

Protocol:

Go to http://web.expasy.org/protparam/

Enter the sequence provided

Click on compute parameters

Analyse and record the results.

Result: The physical and chemical properties of the given protein sequence were computed by ProtParam. Selected values are listed below.

Number of amino acids: 331

Molecular weight: 36362.8

Theoretical pI: 6.76


Amino acid composition: 

Ala (A) 23 6.9%

Arg (R) 13 3.9%

Asn (N) 15 4.5%

Asp (D) 19 5.7%

Cys (C) 4 1.2%

Gln (Q) 14 4.2%

Glu (E) 16 4.8%

Gly (G) 22 6.6%

His (H) 9 2.7%

Ile (I) 25 7.6%

Leu (L) 31 9.4%

Lys (K) 21 6.3%

Met (M) 9 2.7%

Phe (F) 6 1.8%

Pro (P) 9 2.7%

Ser (S) 29 8.8%

Thr (T) 19 5.7%

Trp (W) 5 1.5%

Tyr (Y) 8 2.4%

Val (V) 34 10.3%

Pyl (O) 0 0.0%

Sec (U) 0 0.0%

(B) 0 0.0%

(Z) 0 0.0%

(X) 0 0.0%


Total number of negatively charged residues (Asp + Glu): 35

Total number of positively charged residues (Arg + Lys): 34
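The composition table and the charged-residue totals above are simple counts over the sequence. A minimal sketch of the counting (function names are illustrative) is:

```python
from collections import Counter


def composition(seq):
    """Map each residue to (count, percentage rounded to one decimal)."""
    seq = "".join(seq.split()).upper()
    counts = Counter(seq)
    total = len(seq)
    return {aa: (n, round(100 * n / total, 1)) for aa, n in sorted(counts.items())}


def charged_residue_totals(seq):
    """Return (Asp + Glu, Arg + Lys) counts for the sequence."""
    counts = Counter("".join(seq.split()).upper())
    return counts["D"] + counts["E"], counts["R"] + counts["K"]
```

With the counts in the table, Asp + Glu is 19 + 16 = 35 and Arg + Lys is 13 + 21 = 34, in agreement with the totals reported by ProtParam.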

Atomic composition:

Carbon C 1609

Hydrogen H 2603

Nitrogen N 443

Oxygen O 487

Sulfur S 13

Formula: C1609H2603N443O487S13

Total number of atoms: 5155

Extinction coefficients:

Extinction coefficients are in units of M-1 cm-1, at 280 nm measured in water.

Ext. coefficient 39670

Abs 0.1% (=1 g/l) 1.091, assuming all pairs of Cys residues form cystines

Ext. coefficient 39420

Abs 0.1% (=1 g/l) 1.084, assuming all Cys residues are reduced
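The two extinction coefficients above can be reproduced from the composition alone, assuming the commonly used molar values at 280 nm of 5500 for Trp, 1490 for Tyr and 125 for each cystine (disulfide-bonded Cys pair). The sketch below uses the Trp, Tyr and Cys counts and the molecular weight from the results above; the function names are illustrative.

```python
def extinction_coefficients(n_trp, n_tyr, n_cys):
    """Return (all Cys paired as cystine, all Cys reduced) in M-1 cm-1 at 280 nm."""
    reduced = 5500 * n_trp + 1490 * n_tyr
    with_cystines = reduced + 125 * (n_cys // 2)
    return with_cystines, reduced


def absorbance_0_1_percent(extinction, molecular_weight):
    """Absorbance of a 1 g/l solution: extinction coefficient / molecular weight."""
    return extinction / molecular_weight


# Counts from the composition table: Trp 5, Tyr 8, Cys 4; Mw 36362.8.
oxidised, reduced = extinction_coefficients(5, 8, 4)
print(oxidised, round(absorbance_0_1_percent(oxidised, 36362.8), 3))
print(reduced, round(absorbance_0_1_percent(reduced, 36362.8), 3))
```

This gives 39670 and 39420 M-1 cm-1, with Abs 0.1% values of 1.091 and 1.084, matching the ProtParam output.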

Estimated half-life:

The N-terminal of the sequence considered is M (Met).

The estimated half-life is: 30 hours (mammalian reticulocytes, in vitro).

>20 hours (yeast, in vivo).

>10 hours (Escherichia coli, in vivo).

Instability index:

The instability index (II) is computed to be 15.96

This classifies the protein as stable.

Aliphatic index: 102.72

Grand average of hydropathicity (GRAVY): -0.028
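The aliphatic index and GRAVY values can likewise be recomputed from the composition table. The sketch assumes the usual definitions: aliphatic index = X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), with X in mole percent, and GRAVY as the mean Kyte-Doolittle hydropathy value over all residues. The counts are those listed in the amino acid composition above.

```python
# Residue counts taken from the amino acid composition table (331 residues).
COUNTS = {
    "A": 23, "R": 13, "N": 15, "D": 19, "C": 4, "Q": 14, "E": 16, "G": 22,
    "H": 9, "I": 25, "L": 31, "K": 21, "M": 9, "F": 6, "P": 9, "S": 29,
    "T": 19, "W": 5, "Y": 8, "V": 34,
}

# Kyte-Doolittle hydropathy scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aliphatic_index(counts):
    """Relative volume occupied by aliphatic side chains (Ala, Val, Ile, Leu)."""
    total = sum(counts.values())
    weighted = counts["A"] + 2.9 * counts["V"] + 3.9 * (counts["I"] + counts["L"])
    return 100 * weighted / total


def gravy(counts):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    total = sum(counts.values())
    return sum(HYDROPATHY[aa] * n for aa, n in counts.items()) / total


print(round(aliphatic_index(COUNTS), 2), round(gravy(COUNTS), 3))
```

The printed values are 102.72 and -0.028, the same as those reported by ProtParam.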

Peptide Primary structure

Introduction: PepDraw is a tool that was developed to facilitate the study of the chemical structure and properties of peptides. It allows users to draw the primary chemical structure of an amino acid sequence and predict some chemical properties such as mass, charge, and hydrophobicity.  PepDraw was designed to be a powerful yet user-friendly tool for peptide analysis. It is especially useful for teaching students about the structure and properties of the amino acids.

Sequence:

SVDHGFLVTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAGSCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAPITGENIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWIGYSFVMHTSAGAEGSGQALASPGSCLEEFRSAPFIECHGRGTCNYYANAYSFWLATIERSEMFKKPTPSTLKAGELRTHVSRCQVCMRRTSVSIGYLLVKHSQTDQEPMCPVGMNKLWSGYSLLYFEGQEKAHNQDLGLAGSCLARFSTMPFLYCNPGDVCYYASRNDKSYWLSTTAPLPMMPVAEDEIKPYISRCSVCEAPAIAIAVHSQDVSIPHCPAGWRSLWIGYSFLMHTAAGDEGGGQSLVSPGSCLEDFRATPFIECNGGRGTCHYYANKYSFWLTTIPEQSFQGSPSADTLKAGLIRTHISRCQVCMKNLVDHGFLVTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAGSCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAPITGENIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWIGYSFVMHTSAGAEGSGQALASPGSCLEEFRSAPFIECHGRGTCNYYANAYSFWLATIERSEMFKKPTPSTLKAGELRTHVSRCQVCMRRT

Protocol:

Go to http://www.tulane.edu/~biochem/WW/PepDraw/index.html

Paste the given sequence

Click on draw peptide

Record the properties.

Result: The peptide properties computed by PepDraw are as follows. The sequence reported in the output corresponds to the first 60 residues of the input.

Peptide properties

Sequence:

SVDHGFLVTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAGSCLRKFSTM

Length:

60

Mass:

6591.1809

Isoelectric point (pI):

7.32

Net charge:

0

Hydrophobicity:

+46.33 Kcal * mol -1

Extinction coefficient1:

4595 M-1 * cm-1

Extinction coefficient2:

4470 M-1 * cm-1

Peptide structure image: (not reproduced here; original file peptide.png)
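The mass reported by PepDraw can be approximated by summing monoisotopic residue masses and adding one water molecule. The sketch below is an independent calculation with standard monoisotopic values, not PepDraw's own code, so a difference in the second decimal place from rounding of the underlying atomic masses is to be expected.

```python
# Monoisotopic residue masses in Da (free amino acid minus one water).
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MONOISOTOPIC = 18.01056

# The 60-residue peptide reported in the PepDraw output above.
PEPTIDE = "SVDHGFLVTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAGSCLRKFSTM"


def monoisotopic_mass(seq):
    """Monoisotopic mass in Da of a linear, unmodified peptide."""
    seq = "".join(seq.split()).upper()
    return sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in seq) + WATER_MONOISOTOPIC


print(len(PEPTIDE), round(monoisotopic_mass(PEPTIDE), 2))
```

For the 60-residue peptide this comes to about 6591.19 Da, close to the 6591.1809 shown in the PepDraw result.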

Random protein sequence generation

Introduction: RandSeq is a tool that generates a random protein sequence. The sequence can be generated with equal proportions of all amino acids or with user-specified percentages of each amino acid. The resulting sequences can then be analysed with other tools.

Protocol:

Go to http://web.expasy.org/randseq/

Select the parameters/ composition of each amino acid

Click on submit

Analyse the results

Result: A random protein sequence, generated with equal composition of all amino acids for further analysis, is as follows:

Virtual Sequence: RND29006

ID RND_29006 Unreviewed; 200 AA.

AC RND29006;

DT 26-Aug-2012.

DE Randomly generated sequence, created by ExPASy WWW server tool

DE RandSeq for 103.29.118.216.

CC -!- MISCELLANEOUS: This sequence was generated using equal composition for all amino acids.

SQ SEQUENCE 200 AA; 23795 MW; FED65773033E0235 CRC64;

WFWYDMPEME QDMDSKQVYM GRGKDDIICT INNRYPAFHC LNCPNMQMTE NNRFGRCRDS

TLWWSQHASA NCPQMYRCKP NGEAHIWEEY VCNWTWKKIK GFPGMVYKIP WPDHSITLFI

DMELGLQCLT KSSHAFPLMV PFARGHYETS WHHGYCQVGT VVDQFAWSQQ TCFEAHVIFI

YDAAYLVKLR KRNELHVSTR

//
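A comparable random sequence can be generated locally. This is a sketch of the general idea only: each position is drawn independently, with equal probability for all 20 amino acids by default or with user-supplied relative weights. How RandSeq itself samples is not described here, so the behaviour may differ in detail; the function name `random_protein` is illustrative.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_protein(length=200, weights=None, seed=None):
    """Generate a random protein sequence.

    weights: optional dict of relative frequencies per residue, for example
    {"A": 10, "G": 5}; residues not listed get weight 0. With no weights,
    all 20 amino acids are equally likely.
    """
    rng = random.Random(seed)  # a seed makes the result reproducible
    weight_list = None
    if weights:
        weight_list = [weights.get(aa, 0) for aa in AMINO_ACIDS]
    return "".join(rng.choices(AMINO_ACIDS, weights=weight_list, k=length))
```

The generated string can be passed on to the composition, molecular weight or cleavage-site calculations in the same way as a sequence retrieved from a database.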

CONCLUSION

Primary structure is the linear arrangement of amino acids in a protein, together with the location of covalent linkages such as disulfide bonds between amino acids. Protein sequences can be retrieved from databases such as PDB, UniProt and NCBI's Protein database. Bioinformatics tools such as those on ExPASy and PepDraw can be used to analyse the primary structure and to predict physical and chemical properties such as the isoelectric point, molecular weight, number of atoms and cleavage sites. The same kinds of tools can be applied to predict the primary structure and physicochemical properties of uncharacterised proteins. The mRNA content of a cell at a given time can be translated in silico to infer its protein content, and the reverse can be done to obtain cDNA.
