Proteomics deals with the large-scale determination of cellular and gene function directly at the protein level. Proteomics studies face several challenges because cellular proteomes are highly complex and many proteins are of low abundance. Liquid chromatography-mass spectrometry based proteomics has become key to overcoming these issues. It improves identification and data quality and thereby increases the overall coverage of proteomics research. This chapter describes the basic principles and instruments used for protein characterization with bottom-up and top-down approaches to protein identification, as well as those used for quantitation and post-translational modification proteomics. Several recent mass spectrometers are also described briefly to keep the reader up to date with recent advances in liquid chromatography-mass spectrometry based proteomics.
Key words: Proteomics, Liquid chromatography-mass spectrometry, Electrospray ionization, Matrix-assisted laser desorption ionization, bottom-up, top-down, Post-translational modification
Over the past couple of decades, liquid chromatography-mass spectrometry (LC-MS) based proteomics has emerged as an effective means for the identification, characterization and quantification of proteins, which are integral components of the essential processes of life. To date, MS-based proteomics has proved an indispensable technology for the analysis of primary sequence, protein-protein interactions and post-translational modifications (PTMs) when applied to small sets of proteins. More recently, it has been applied to the mapping of numerous organelles, to affinity-based protein-protein interaction studies and to the description of a parasite genome and proteome. It can identify and quantify thousands of proteins from complex samples and hence can be expected to have a broad impact on medicine and biology. LC-MS based proteomics rests on the discovery and development of protein ionization methods, which were recognized by the Nobel Prize in Chemistry in 2002. This field is too expansive for a comprehensive exploration in a single chapter, and I apologize in advance for the many omissions. However, I do hope that this chapter captures the excitement of the fundamental achievements in MS-based proteomics and points towards future developments.
Principles and Instrumentation
The term 'proteomics', a newcomer to the -omics era, was first coined in the context of two-dimensional (2D) gel electrophoresis (Wilkins et al., 1996), and proteomics has always been a broad, instrument-intensive research area. Two-dimensional gel electrophoresis fell short for protein identification because it has limited resolving power and visualizes mainly the most abundant proteins, while membrane proteins are poorly represented. Owing to these limitations, 2D gel electrophoresis has gradually and widely been replaced by LC-MS based proteomics.
Liquid chromatography-mass spectrometry (LC-MS) and liquid chromatography-tandem mass spectrometry (LC-MS/MS) are the major tools used in protein identification, which is relatively straightforward because two unique peptides are usually sufficient to recognize a protein. Top-down and bottom-up sequencing are the two approaches used for protein identification. The first is a relatively new approach and involves fragmenting intact proteins directly. The second, bottom-up sequencing, also known as shotgun sequencing (Fig 1), is the traditional approach: proteins are first digested, and the resulting peptides are fragmented in the gas phase. Because digestion yields a complex peptide mixture, liquid chromatography (or a related method) is used to fractionate it. Liquid chromatography coupled with MS, termed LC-MS, has evolved into a major analytical platform for proteomics because of its sensitivity, selectivity, accuracy, speed and throughput (Chen et al., 2007a; Chen & Pramanik, 2008; Chen et al., 2007b; Choudhary & Mann, 2010; Cravatt et al., 2007; Domon & Aebersold, 2006; Domon & Aebersold, 2010; Ong & Mann, 2005; Qian et al., 2006; Wilm, 2009). Historically, the development of non-destructive ionization methods that can analyze intact biomolecules without significant degradation made efficient LC-MS methods possible and played a vital role in the field of proteomics.
By definition, a mass spectrometer consists of three basic components. The first is an ion source, which converts analyte molecules into gas-phase ions. The second is a mass analyzer, which separates the ionized analytes according to their mass-to-charge ratio (m/z). The third is a detector, which registers the number of ions at each m/z value. In a mass spectrometric analysis, analyte molecules are thus transferred from solution or the solid phase into the gas phase as charged ions, which are then separated in the mass analyzer on the basis of their m/z.
Ionization Techniques: These techniques convert molecules into ions, which can then be manipulated by magnetic or electric fields. They are critical because biological molecules such as peptides and proteins, which are polar and zwitterionic in nature, are difficult to convert into gas-phase ions without fragmentation or degradation.
The most commonly applied soft ionization processes are matrix-assisted laser desorption ionization (MALDI) (Karas & Hillenkamp, 1988) and electrospray ionization (ESI) (Fenn et al., 1989). The power of LC-MS lies in its ability to provide mass measurement accuracy at the parts per million (ppm) level and, through tandem MS (MS/MS) measurements, additional information about the amino acid sequence of peptides.
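To make the notion of ppm-level mass accuracy concrete, the following minimal Python sketch computes the relative mass error between a measured and a theoretical m/z value. The function names and the example values are hypothetical and chosen only for illustration; they are not taken from any instrument software.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Return the mass measurement error in parts per million (ppm)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz: float, theoretical_mz: float,
                     tol_ppm: float) -> bool:
    """True if the measured value lies within +/- tol_ppm of the theoretical one."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


# Hypothetical example: an ion expected at m/z 1000.0000, observed at 1000.0020.
error = ppm_error(1000.0020, 1000.0000)
print(f"mass error: {error:.2f} ppm")
print("accepted at 5 ppm:", within_tolerance(1000.0020, 1000.0000, 5.0))
```

A tolerance window of this kind is what a database search would typically apply when matching measured peptide masses to candidate sequences.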
Matrix-Assisted Laser Desorption Ionization (MALDI): A pulsed laser deposits energy into a "matrix" that absorbs the laser energy and promotes thermal desorption of molecules and ions into the gas phase (Karas & Hillenkamp, 1988). MALDI produces ions in packets rather than as a continuous beam, so it requires a mass spectrometer that can either trap all the ions for subsequent mass analysis or measure a complete mass spectrum without scanning a mass range. The speed of analysis depends on the pulse rate of the laser. Lasers operating at 200 Hz are now used for MALDI, compared with the typical 20-Hz nitrogen laser (337 nm), and this increase has improved the speed of proteomics experiments. MALDI became more effective once it was demonstrated that separations can be deposited onto a sample plate (Walker et al., 1995). Since the pulsing of the laser is controlled by the operator, the data can be evaluated after each pulse and decisions can be made as to how to proceed.
Electrospray Ionization (ESI): An electrically generated fine mist of charged droplets is sprayed into the inlet of a mass spectrometer at atmospheric pressure (Fenn et al., 1989) to create ions of the analyte. A potential difference between the capillary through which the liquid flows and the inlet of the mass spectrometer forms small droplets of liquid, which are then passed either into a device that breaks the droplets into increasingly smaller sizes or into a heating device that evaporates the solvent. When the droplets reach the Rayleigh limit, the point at which charge repulsion exceeds the surface tension of the droplet, ions are desorbed from the droplet as bare ions. These bare ions are then transmitted into the ion optics of the mass spectrometer. Desorption proceeds rapidly when the electrospray parameters, i.e. heat and gas pressure, are properly tuned. ESI readily converts solution-phase molecules into gas-phase ions, but the ions produced are multiply charged. This complicates the calculation of molecular weight, because the one-to-one relationship between m/z value and molecular weight that holds for singly charged ions (z = 1) is lost. The application of ESI to peptides and proteins advanced considerably once it was recognized that reducing the flow rate of the liquid used to create the electrospray produces ions more efficiently (Emmett & Caprioli, 1994; Smith et al., 1990).
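The effect of multiple charging on molecular weight determination can be illustrated with a short Python sketch. It assumes positive-mode ions formed purely by protonation and two neighbouring peaks that differ by exactly one charge; the function names and the example mass are hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def mz_from_mass(neutral_mass: float, charge: int) -> float:
    """m/z of an ion carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz: float, charge: int) -> float:
    """Neutral mass recovered from an m/z value and a known charge."""
    return charge * (mz - PROTON_MASS)


def charge_from_adjacent_peaks(mz_low: float, mz_high: float) -> int:
    """Charge of the peak at mz_high, assuming the peak at mz_low
    belongs to the same molecule and carries one more proton."""
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


# Hypothetical protein of 16951.5 Da observed in two adjacent charge states.
peak_10 = mz_from_mass(16951.5, 10)
peak_11 = mz_from_mass(16951.5, 11)
z = charge_from_adjacent_peaks(peak_11, peak_10)
print("charge of higher-m/z peak:", z)
print("recovered mass:", round(mass_from_mz(peak_10, z), 3))
```

In practice a whole envelope of charge states is observed, and using several peak pairs rather than one makes the mass estimate more robust.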
Mass analyzer: Relatively simple peptide mixtures are normally analyzed by MALDI-based MS, whereas ESI-based MS is preferred for complex samples. Nevertheless, MALDI is similar in character to ESI in that both are soft ionization techniques: ions are created with low internal energy and thus undergo little fragmentation, although MALDI produces far fewer multiply charged ions. In addition, MALDI is very sensitive and more tolerant than ESI of contaminants such as small amounts of detergents or salts.
LC-MS instruments are used, alone or in combination, for the following studies: (i) to measure the molecular masses of peptides and proteins, (ii) to determine additional structural features including the amino acid sequence, (iii) to locate the attachment sites of modifications and (iv) to reveal the type of post-translational modification. Mass-to-charge ratios can be measured readily and accurately for intact ions, but this information says nothing about the covalent structure of the ion. Hence, after the initial mass determination, selected ions are subjected to fragmentation, referred to as tandem mass spectrometry (MS/MS), and the masses of the resulting fragments reveal the detailed structural features of the peptides.
The mass analyzer is, literally and figuratively, central to LC-MS technology. Four basic types of mass analyzer are currently used in proteomics research: quadrupole (Q), ion trap (linear ion trap, LIT; quadrupole ion trap, QIT), time of flight (ToF) and Fourier transform ion cyclotron resonance (FT-ICR) mass analyzers. Their capabilities and analytical characteristics are summarized in Table 1.
Since these analyzers differ greatly in design and performance, they can be used alone or combined in tandem to exploit the strengths and offset the weaknesses of each. Consequently, 'hybrid' instruments such as Q-q-Q, Q-LIT, Q-TOF, TOF-TOF and LIT-FT-ICR have been designed and used fruitfully to tackle the challenges of proteomics studies (Aebersold & Goodlett, 2001; Aebersold & Mann, 2003) by combining the capabilities of various mass analyzers. Several mass analyzers and hybrid mass analyzers in current use are described below.
Ion trap (IT) mass analyzers: In IT mass analyzers, the ions are first 'trapped' or 'captured' for a certain time interval and are then subjected to MS or MS/MS analysis (Cooks et al., 1991), which provides high sensitivity and fast data acquisition. ITs are sensitive, robust and relatively inexpensive and hence suit high-throughput proteomic analyses. However, ITs are limited by their low ion-trapping capacity and by their relatively low mass accuracy, which results from space-charge effects that distort the distribution of ions.
The 'linear' (Hager, 2002) or 'two-dimensional' ion trap (Schwartz et al., 2002) (LIT) is a more recent development in which ions are stored in a cylindrical volume with a higher ion-trapping capacity, almost 10 times that of traditional three-dimensional ion traps, which expands the dynamic range, resolution, mass accuracy and overall sensitivity of the technique. LITs also have multiple-stage sequential MS/MS capabilities and have been used in phosphoproteomics, one branch of PTM analysis (Olsen & Mann, 2004). LITs can scan at much faster speeds (15,000 AMU/s vs 5500 AMU/s), which increases the number of scans that can be acquired over the course of an LC analysis. Their mass resolution and accuracy are limited: at normal scan speeds unit resolution (~2000) is obtained, although slowing the scan can yield much higher resolution (15,000 over a 10 AMU window). When the scan speed is decreased, the mass range has to be narrowed, which forces ions of like charge closer together. MS/MS is performed in LITs by measuring m/z values with respect to time rather than space, and it therefore benefits from better ion statistics and increased scan speed. Consequently, more data of better quality can be acquired than with a three-dimensional ion trap, albeit at the expense of mass accuracy and resolution.
Time of flight (ToF) mass analyzer: In a ToF analyzer, a mass spectrum is measured by determining the flight time of ions through an evacuated tube of known length. The flight time of an ion is related to its m/z value, so a mass spectrum can be acquired from the arrival times. Mass accuracy in the parts per million (ppm) range can be achieved, with resolving power above 12,000. ToF mass analyzers serve as basic analytical platforms for both ESI and MALDI. In the MS mode of a Q-q-ToF instrument, the quadrupole guides the ionized analytes to the ToF analyzer for mass analysis. In the MS/MS mode, the first quadrupole selects the precursor ions, which are fragmented in the second quadrupole by collision-induced dissociation (CID), and the product ions are then analyzed in the ToF device. The spectra obtained show high resolution and good mass accuracy in both modes, which makes the instrument useful for quantitative proteomics, protein identification and PTM proteomics.
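The relationship between flight time and m/z can be sketched for an idealized linear ToF analyzer, in which an ion starting at rest is accelerated through a potential V and then drifts over a field-free length L, giving t = L * sqrt((m/z) * u / (2 e V)). The drift length and acceleration voltage below are assumed values for illustration only, and real instruments (with reflectrons and delayed extraction) are calibrated empirically rather than computed this way.

```python
import math

AMU_KG = 1.66053906660e-27    # kg per unified atomic mass unit
E_CHARGE = 1.602176634e-19    # C, elementary charge


def flight_time(mz: float, drift_length_m: float = 1.0,
                accel_voltage_v: float = 20000.0) -> float:
    """Flight time in seconds of an ion of given m/z in an idealized linear ToF."""
    return drift_length_m * math.sqrt(
        mz * AMU_KG / (2.0 * E_CHARGE * accel_voltage_v))


def mz_from_flight_time(t_s: float, drift_length_m: float = 1.0,
                        accel_voltage_v: float = 20000.0) -> float:
    """Invert the relation above to recover m/z from a measured flight time."""
    return 2.0 * E_CHARGE * accel_voltage_v * (t_s / drift_length_m) ** 2 / AMU_KG


for mz in (500.0, 1000.0, 4000.0):
    print(f"m/z {mz:7.1f} -> {flight_time(mz) * 1e6:.2f} microseconds")
```

Because the flight time scales with the square root of m/z, quadrupling m/z only doubles the flight time, which is one reason ToF analyzers cover a wide mass range in a single acquisition.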
MS/MS on a ToF-ToF instrument (Medzihradszky et al., 2000b) did not show a dramatic increase in speed over the IT mass spectrometer until the ToF-ToF instrument was equipped with a MALDI source (Bienvenut et al., 2002), which delivers enough sample to generate a large ion signal without having to accumulate signal over long periods to improve the signal-to-noise ratio.
Quadrupole-Quadrupole Ion Trap (Q TRAP) mass analyzer: A novel use of the LIT is to combine it with two quadrupoles in a configuration similar to a triple quadrupole, i.e. the second mass analyzer is replaced by a LIT (Hager, 2002; Le Blanc et al., 2003). An advantage of this configuration is its ability to use alternative scan modes by trapping the ions and measuring each trapped m/z value in turn, so the Q TRAP geometry offers several scanning capabilities including product ion (Hager & Yves Le Blanc, 2003), precursor ion and neutral loss scanning. The Q TRAP also offers multiple reaction monitoring, which allows the detection of specific transitions between a precursor and one fragment of a given peptide. The LIT has an additional advantage (Hager, 2003): product ion spectra can be acquired as a function of the internal energy of the fragment ions by monitoring the stability of the ions as a function of time, which simplifies the tandem mass spectra of peptides.
MALDI has recently been coupled to the Q TRAP (Krutchinsky et al., 2001) and to two types of ToF instrument. The first (Medzihradszky et al., 2000a) is the ToF-ToF instrument, in which two ToF sections are separated by a collision cell. The second is the hybrid Q-ToF instrument, in which the collision cell is placed between a Q mass filter and a ToF analyzer (Loboda et al., 2000). Ions of a desired m/z are selected in the first mass analyzer (ToF or Q) and fragmented in the collision cell, and the fragment masses are 'read out' by a ToF analyzer. These instruments have high mass accuracy, sensitivity and resolution. ESI can also be coupled interchangeably with the Q-ToF instrument.
Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass analyzer: In this analyzer the ions are captured under high vacuum in a high magnetic field. It provided a breakthrough in sensitivity, resolving power and mass accuracy, reaching the low-ppm to sub-ppm range (Lipton et al., 2002; Marshall et al., 1998; Martin et al., 2000; Senko et al., 1997; Valaskovic et al., 1996). The result is better data quality and peak capacity, so more signals are detected than with instruments of lower resolving power.
Hybrid FT-ICR instruments have been developed by coupling the ICR cell with an external device such as a LIT to generate MS/MS spectra together with accurate masses of the precursor ions. One strategy uses two Q mass filters to isolate and fragment the ions, which are then injected into the ICR cell and collected in the final stage. The second uses a LIT outside the FT analyzer to collect, isolate and fragment the ions before they are injected into the ICR cell (Wilcox et al., 2002). Overfilling the ICR cell results in poor signal-to-noise because it leads to space charging, in which ions repel one another and distort the natural motion of ions within the magnetic field. The LIT hybrid overcomes this shortcoming, because ions can be counted inside the LIT and an appropriate number can then be injected into the ICR cell. This additional advantage is expected to have a high impact on proteomics in the future. However, the slow acquisition rate, expense, operational complexity, low peptide-fragmentation efficiency and restricted dynamic range of these instruments have limited their routine use in proteomics research.
Orbitrap mass analyzer: More recently, a new type of mass analyzer, the Orbitrap, was invented (Makarov, 2000). Its characteristics (i.e. resolution and mass accuracy) are similar to those of an FT-ICR mass analyzer, but it does not require an expensive superconducting magnet, and it has been used in proteomics research (Hu et al., 2005; Yates et al., 2005). The analyzer is built around a central spindle-like electrode. Trapped ions orbit this electrode and oscillate harmonically along its axis with a frequency characteristic of their m/z values. The oscillation induces an image current in the outer electrodes, and this time-domain signal is Fourier transformed into the frequency domain to give the mass spectrum.
A hybrid Orbitrap instrument, the LTQ-Orbitrap, has been introduced by coupling the Orbitrap with a LIT (Makarov et al., 2006; Venable et al., 2007). It combines the capabilities of the LIT (i.e. robustness, sensitivity and MS/MS capability) with those of the Orbitrap (i.e. very high mass accuracy and high resolution) to provide an advanced tool for proteomics research with resolution up to 40,000 and mass accuracy within 2 ppm.
Protein Identification Proteomics
The very first step in protein identification is molecular weight determination. MALDI-MS and ESI-MS can both be used to analyze intact proteins or peptides. Protein identification or sequence determination by LC-MS can be achieved using either 'bottom-up' proteomics, i.e. the analysis of enzymatically or chemically produced peptides (Henzel et al., 1993), or 'top-down' proteomics, i.e. whole-protein analysis (Bogdanov & Smith, 2005; Ge et al., 2002; Kelleher, 2004). The various LC-MS based proteomic strategies developed for protein identification and characterization fall broadly (Fig 2) into the following two approaches.
Bottom-up Proteomics: Broadly, in 'bottom-up' proteomics the presence of a protein is inferred from the detection of its peptides. Such studies are usually conducted in one of two ways. The first is the 'sort-then-break' strategy, which is applied to purified proteins or very simple protein mixtures: the proteins are fractionated and separated off-line and then digested into peptides. The resulting peptides (e.g. tryptic peptides) are then analyzed directly by 'peptide mass fingerprinting' (Henzel et al., 1993) or by LC separation with MS/MS detection (Ogorzalek Loo et al., 2005). The second is the 'break-then-sort' strategy, well known as 'shotgun proteomics' (McDonald & Yates III, 2003), in which the proteins are digested into smaller peptides by proteolytic cleavage before any prefractionation or separation. The peptides are then separated by online multidimensional chromatography and analyzed by MS/MS (Fig 2) (Hunt et al., 1986; Yates III, 1998) under CID to obtain fragment ions for a database search (sequence tag). Rapidly scanning analyzers such as ITs are typically used in this strategy. The amino acid-specific fragment ions, b-ions (containing the N-terminus) and y-ions (containing the C-terminus), which arise from cleavage of amide bonds in polypeptide ions under typical CID conditions (Roepstorff & Fohlman, 1984), can be used to derive the amino acid sequences of polypeptides, leading to the identification of proteins via a database search.
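How b- and y-ion masses relate to a peptide sequence can be shown with a minimal Python sketch. It assumes singly protonated fragments, monoisotopic residue masses and no modifications; the peptide "PEPTIDE" is a hypothetical example and the function name is illustrative.

```python
# Monoisotopic residue masses in Da (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da
PROTON = 1.007276   # Da


def fragment_ions(peptide: str):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i] is b(i+1), built from the N-terminus;
    y_ions[i] is y(i+1), built from the C-terminus.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("PEPTIDE")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} = {b:9.4f}    y{i} = {y:9.4f}")
```

The mass differences between consecutive b-ions (or y-ions) correspond to residue masses, which is what allows a sequence to be read from the fragment ladder. Leucine and isoleucine have identical masses and cannot be distinguished this way.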
The separation step in the 'sort-then-break' approach uses 1D or 2D gel electrophoresis with in-gel digestion of spots representing intact proteins, followed by capillary LC-MS/MS experiments. It can also provide the molecular weight and (for 2D gels) the isoelectric point of proteins to support identification. It has several shortcomings: it is labor-intensive, it gives poor recovery of large or hydrophobic proteins, and proteins are lost during gel separation and in-gel digestion. In the 'break-then-sort' approach (shotgun proteomics), by contrast, the digested proteins are analyzed by online 1D- or 2D-LC-MS (Hancock et al., 2002; McDonald & Yates III, 2002). The resulting MS/MS spectra of individual eluting peptides are searched and correlated against a database of proteins derived from genomic sequencing by a computer algorithm such as SEQUEST (Eng et al., 1994) or the somewhat newer Mascot (Perkins et al., 1999). The online 2D-LC-MS method is termed multidimensional protein identification technology (MudPIT) (Washburn et al., 2001) and is now widely implemented in proteomics research. It has been used to analyze whole organisms, complete cell lysates, tissue extracts, subcellular fractions and other subproteomes (Delahunty & Yates III, 2007). More recently, ultra-high-pressure LC (Livesay et al., 2007) and mixed-bed anion and cation exchange techniques (Motoyama et al., 2007) have been used with MudPIT to improve peptide separation.
Several of the MS systems described earlier in this chapter, i.e. Q-ToF (Chernushevich et al., 2001), LIT (Schwartz et al., 2002), FT-ICR (Bogdanov & Smith, 2005), Orbitrap (Hu et al., 2005), LTQ-Orbitrap (Makarov et al., 2006) and LIT-FT-ICR (May et al., 2007), have been used successfully for 'bottom-up' proteomics.
'Bottom-up' proteomics has a few shortcomings. The intact protein species is not measured directly, so its molecular form remains unknown. For extremely complex peptide mixtures, dynamic range can be an issue. Moreover, too many peptides are generated for all of them to be analyzed directly by MS, which limits the typical sequence coverage to 5-70 %. A further challenge is protein inference: the same peptide sequence may be present in several distinct proteins or protein isoforms, which can make it impossible to determine the identities of the proteins in the sample. In addition, post-translational modifications are likely to be lost during MS/MS fragmentation at the peptide level. In spite of this, 'bottom-up' has become the most widely used approach in proteomics research owing to its relatively high-throughput format, the availability of mature LC-MS instrumentation and excellent software development.
Top-down Proteomics: To overcome the shortcomings of 'bottom-up' proteomics, an alternative and relatively new approach, 'top-down' proteomics, has been adopted in the last few years (Collier et al., 2008; Kelleher, 2004; Liu et al., 2009a; Liu et al., 2009b; McLafferty et al., 2007). In 'top-down' proteomics, intact proteins are ionized into the gas phase, their masses are measured at high resolution, and the molecular ions are then fragmented directly inside an advanced mass spectrometer (i.e. FT-ICR, LTQ-FT-ICR or LTQ-Orbitrap) without prior digestion. Protein purification and fractionation are often performed offline. In theory the approach covers the entire sequence of the protein under examination (i.e. 100 % coverage), and consequently it may provide a more complete characterization of protein isoforms and PTMs. Modifications can be examined directly, because they show up as a discrepancy between the mass predicted from the DNA sequence and the measured mass. In addition, the time-consuming protein digestion step is eliminated. Thus 'top-down' proteomics increases experimental efficiency and at the same time decreases error rates in protein identification and quantitation.
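The idea of reading modifications from a mass discrepancy can be sketched as a simple lookup. The example below considers only a single modification per protein and a very small table of common monoisotopic PTM mass shifts; the tolerance, the function name and the example masses are assumptions made for illustration, not values from the cited studies.

```python
# Monoisotopic mass shifts (Da) of a few common modifications.
PTM_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
}


def explain_mass_shift(predicted_mass: float, measured_mass: float,
                       tol_da: float = 0.02) -> list:
    """Return names of single modifications whose mass shift matches the
    difference between measured and sequence-predicted mass."""
    delta = measured_mass - predicted_mass
    return [name for name, shift in PTM_SHIFTS.items()
            if abs(delta - shift) <= tol_da]


# Hypothetical protein: predicted 10000.000 Da, measured 10079.966 Da.
print(explain_mass_shift(10000.000, 10079.966))
```

A real analysis would also have to consider combinations of modifications, sequence variants and the mass accuracy achievable for the intact protein, and would confirm the assignment and its site from the fragment ions.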
The success of 'top-down' proteomics depends on the ability to fragment intact proteins. Dissociation efficiency has greatly improved with the development of two dissociation techniques. The first is electron-capture dissociation (ECD) (Cooper et al., 2005; Ge et al., 2002; Shi et al., 2000; Zabrouskov & Whitelegge, 2007; Zubarev, 2006; Zubarev, 2004; Zubarev et al., 1998), in which low-energy electrons are captured by multiply charged protein ions; fragmentation occurs by cleavage of the N-Cα bond and generates mainly c- and z-ions. The second is electron-transfer dissociation (ETD) (Mikesh et al., 2006; Syka et al., 2004), in which singly charged anions transfer an electron to multiply charged protein ions and induce fragmentation similar to that of ECD. An advantage of ECD is that post-translational modifications are preserved during fragmentation.
However, large proteins (> 50 kDa) cannot be analyzed readily because of the increasing complexity that results from non-covalent interactions within the tertiary structure of the gas-phase protein ion. Despite this, the 'top-down' approach has recently been used to analyze a protein of relatively large molecular weight (229 kDa) using 'prefolding dissociation' to dissociate residues from each terminus (Han et al., 2006). A 'middle-down' strategy has also been introduced (Garcia et al., 2007c), which uses limited digestion to produce larger peptides (>5 kDa) from very large proteins or from certain specific proteins. The top-down and middle-down approaches have further been applied to the post-translational modifications of histones, where they have improved the characterization of histone variants and their multiply modified forms (Garcia et al., 2007a; Garcia et al., 2007b; Pesavento et al., 2004; Siuti & Kelleher, 2007). In addition, several advanced instruments, i.e. TOF (Ogorzalek Loo et al., 2001), LTQ-Orbitrap (Macek et al., 2006) and LTQ-FT-ICR (Parks et al., 2007), have been used to demonstrate the efficacy of 'top-down' proteomics.
Although 'top-down' proteomics is a powerful tool for protein modification analysis, the development of suitable MS activation methods and instrumentation for efficient MS/MS data acquisition and a better understanding of protein fragmentation patterns remain significant challenges. It is primarily performed by directly infusing a single protein or a simple protein mixture that has been separated off-line, so improvements are needed both in the coupling of online separation techniques and in bioinformatics tools for database searching.
Successful detection and identification of protein-specific peptides confirm their presence within the sample. However, the failure to identify or detect a peptide does not necessarily mean that the protein is absent, as the peptides may simply be below the threshold of detection. Therefore, MS protein identification schemes provide a very limited picture of protein abundance in a sample because of its Boolean nature. Thus, although sensitive MS-based proteomic approaches readily identify a large number of proteins, bypassing the gel-visualization step deprives us of any measure of protein abundance in the sample. Moreover, most changes resulting from a targeted perturbation of a biological system are only detectable if some quantitative information is obtained.
Several factors, i.e. sample handling, digestion efficiency and separation, affect quantification, and the physicochemical properties of peptides and proteins lead to large differences in MS response (Bantscheff et al., 2007). Relative peptide intensities therefore do not necessarily reflect the relative abundances of different proteins. ESI-based LC-MS, for instance, is strongly influenced by ion suppression (Tang et al., 2004). Peptide intensity is governed by the quantity of peptide being ionized, by the ionization efficiency and, under some conditions, by the properties of co-eluting peptides. Ion suppression can be alleviated either by using internal standards (Annesley, 2003; Elliott et al., 2009) or by using lower flow rates (Schmidt et al., 2003; Tang et al., 2004). LC-MS based quantitative proteomics is also affected by the separation peak capacity and reproducibility of the LC and by the mass measurement accuracy and resolving power of the MS. To address these concerns, ultra-performance LC and high mass accuracy, high-resolution mass spectrometers have been developed and commercialized, and these have made LC-MS based quantitative proteomics more accessible and reliable for biologists. Even so, there is no "one-size-fits-all" method (Table 2) that fulfills every quantitative need and answers every biological question. The approaches currently used for quantitative proteomics are as follows:
2-D Gel Electrophoresis Approach: This gel-based approach compares two sets of protein mixtures (from different cell states) run under standard operating conditions (Berkelman, 2008) and visualized either by immunological approaches or with conventional protein stains. Gel spots that differ in intensity are cut out and digested, and 'bottom-up' proteomics is then applied to the resulting peptides. The method is readily applicable to proteins with post-translational modifications. Its main drawback is poor gel-to-gel reproducibility. A related technique, differential gel electrophoresis, has been introduced that provides a dynamic range of over four orders of magnitude. In addition, 2D gel electrophoresis performs poorly for hydrophobic proteins, very large and very small proteins, and very acidic or basic species.
Stable isotopes (H2, C13, N15 and O18) can be incorporated into samples as labels. The resulting mass shift clearly distinguishes otherwise identical peptides from different samples within a single LC-MS analysis, and relatively new isotopic labeling methods have been introduced on this basis. So far, isotope labeling approaches have been described that are specific for sulphydryl groups (Gygi et al., 1999; Zhou et al., 2002), amino groups (Münchbach et al., 2000), the active sites of serine (Liu et al., 1999) and cysteine hydrolases (Greenbaum et al., 2000), phosphate ester groups (Oda et al., 2001; Zhou et al., 2001) and N-linked carbohydrates (Hui et al., 2003). The main approaches are explained below.
Metabolic Isotopic Labeling Approach: In this approach cells are grown in the presence of isotopically labeled amino acids. After digestion of the mixed cell lysates, peptides with identical primary sequence are distinguished purely by the mass difference conferred by the incorporated isotopic labels. The approach was first applied to proteomic analysis by LC-MS using N15 (Oda et al., 1999) and has since been applied successfully (Dong et al., 2007; McClatchy et al., 2007; Sechi et al., 2007), typically using heavy salts or amino acids (Conrads et al., 2002). When N15 is used for labeling, the amount of stable isotope in each amino acid depends on the number of nitrogen atoms it contains, which can make the downstream analysis more complex. A popular alternative is therefore stable isotope labeling of amino acids in cell culture (Ong et al., 2002), termed SILAC, in which only selected amino acids are labeled. SILAC and super-SILAC have proved fruitful for measuring the output of signaling networks and for recognizing protein interactions (Geiger et al., 2010; Guo et al., 2008; Krüger et al., 2008; Mann, 2006; Vermeulen et al., 2007). Metabolic isotopic labeling requires much less sample preparation for quantitative studies, although it cannot be used on tissue samples or body fluids.
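The difference between uniform N15 labeling and SILAC can be made concrete by computing the expected mass shift of a peptide under each scheme. The sketch below assumes complete incorporation; the choice of C13-labeled (six carbons) lysine and arginine as the SILAC labels is an illustrative assumption, since the text only states that selected amino acids are labeled. Function names are hypothetical.

```python
# Number of nitrogen atoms per amino acid residue.
NITROGEN_COUNT = {
    "G": 1, "A": 1, "S": 1, "P": 1, "V": 1, "T": 1, "C": 1, "L": 1,
    "I": 1, "N": 2, "D": 1, "Q": 2, "K": 2, "E": 1, "M": 1, "H": 3,
    "F": 1, "R": 4, "Y": 1, "W": 2,
}
N15_MINUS_N14 = 0.997035  # Da, mass difference between N15 and N14

# Assumed SILAC labels: lysine and arginine with six C13 atoms each.
SILAC_SHIFT = {"K": 6.020129, "R": 6.020129}


def n15_mass_shift(peptide: str) -> float:
    """Mass shift (Da) when every nitrogen in the peptide is N15."""
    return sum(NITROGEN_COUNT[aa] for aa in peptide) * N15_MINUS_N14


def silac_mass_shift(peptide: str) -> float:
    """Mass shift (Da) when only the labeled residues are heavy."""
    return sum(SILAC_SHIFT.get(aa, 0.0) for aa in peptide)


# Two hypothetical tryptic peptides, each ending in lysine.
for pep in ("GGK", "AAAAK"):
    print(pep, round(n15_mass_shift(pep), 4), round(silac_mass_shift(pep), 4))
```

With N15 the shift differs for every sequence, whereas with the assumed SILAC labels a tryptic peptide carrying one lysine or arginine shows the same fixed shift, which is what simplifies the downstream pairing of light and heavy peaks.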
Post-biosynthetic Tagging Approach: These labeling approaches can be applied to all types of sample because they are performed after lysis. The first is 'enzymatic labeling', in which the peptide bond hydrolysis step of the digestion is performed in 18O-labeled water, followed by a carboxyl oxygen exchange that adds a second 18O (Mirgorodskaya et al., 2000; Schnölzer et al., 1996; Yao et al., 2001). However, this reaction is enzyme-dependent and significantly less efficient, so it requires an instrument with high mass accuracy and high mass resolution to distinguish the small mass difference accurately.
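To give a sense of how small this mass difference is, the following sketch computes the spacing between unlabeled and 18O-labeled peptide signals on the m/z axis. The function names are illustrative only, and the charge states shown are assumed examples.

```python
# Monoisotopic masses (Da) of the two oxygen isotopes.
MASS_16O = 15.9949146
MASS_18O = 17.9991610


def o18_mass_shift(n_exchanged: int) -> float:
    """Mass added (Da) when n carboxyl oxygens are replaced by 18O."""
    return n_exchanged * (MASS_18O - MASS_16O)


def mz_spacing(n_exchanged: int, charge: int) -> float:
    """Spacing between labeled and unlabeled signals for a given charge."""
    return o18_mass_shift(n_exchanged) / charge


if __name__ == "__main__":
    for charge in (1, 2, 3):
        one = round(mz_spacing(1, charge), 4)
        two = round(mz_spacing(2, charge), 4)
        print(f"z={charge}: one 18O -> {one}, two 18O -> {two}")
```

Because the spacing shrinks with increasing charge and overlaps the natural isotope envelope of the unlabeled peptide, incomplete exchange (one 18O instead of two) is difficult to resolve on a low-resolution instrument.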
Another is the 'chemical tagging approach', which relies on the addition of an isotopically distinct functional group to a peptide so that separate samples can be distinguished by mass. This allows changes in expression levels between two proteomes to be measured in a single experiment. Several methods have been developed for quantitative proteomics, including isotope-coded affinity tags (Gygi et al., 1999) or similar reagents (Zhou et al., 2002), 18O/16O-labeled water (Stewart et al., 2001) and especially isobaric mass tags, where the different mass tags are observed only upon fragmentation (Thompson et al., 2003). An isobaric mass tag consists of three sections: a reporter mass group, a mass balancer group and an amine-specific reactive group. The combined mass of the balancer and reporter groups is kept constant using differential isotopic enrichment with 13C, 15N and 18O, while the amine-specific reactive group targets peptide amino-termini and lysine side-chains. Isobaric tags for relative and absolute quantification (iTRAQ) (Ross et al., 2004), in which the number of reporter mass ions has been increased to eight, have become a popular isobaric mass tag because they reduce the run time required to analyze multiple samples. iTRAQ is used especially for labeling primary amino groups, i.e. peptide N-termini and lysine side chains (Aggarwal et al., 2006; Yates et al., 2009). Absolute quantification with iTRAQ is possible by differentially tagging and spiking in known amounts of standard peptides, but this is not commonly done. In addition to possible side reactions, another important drawback of the chemical tagging approach is that its dynamic range may not be sufficient to cover expression levels spread over a few orders of magnitude.
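Relative quantification with isobaric tags comes down to comparing reporter-ion intensities within one MS/MS spectrum. The sketch below shows this step only; the channel names and intensities are invented for illustration, and real workflows also apply isotope-impurity correction and normalization, which are omitted here.

```python
from typing import Dict


def reporter_ratios(intensities: Dict[str, float], reference: str) -> Dict[str, float]:
    """Divide each reporter-ion intensity by that of the reference channel."""
    ref = intensities[reference]
    if ref <= 0:
        raise ValueError("reference channel has no signal")
    return {channel: value / ref for channel, value in intensities.items()}


if __name__ == "__main__":
    # Hypothetical reporter intensities from a single MS/MS spectrum.
    spectrum = {"114": 1000.0, "115": 2000.0, "116": 500.0, "117": 1000.0}
    print(reporter_ratios(spectrum, reference="114"))
```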
Targeted Approach: In this approach a synthetic stable isotope-labeled peptide of known concentration, chosen on the basis of previous results, is added to the sample. The protein is then quantified by MS by monitoring the intact peptide mass and specific fragment ions during the course of the run, i.e. selected reaction monitoring (Gerber et al., 2003). This approach has been used to monitor phosphorylation related to cell cycle transitions (Mayya et al., 2006).
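The arithmetic behind this kind of quantification can be sketched as follows, assuming that the endogenous (light) and synthetic (heavy) peptides respond identically and that peak areas have already been integrated for the monitored transition. The function name, units and numbers are hypothetical.

```python
def absolute_amount(light_area: float, heavy_area: float, spiked_fmol: float) -> float:
    """Estimate the endogenous peptide amount from the light/heavy area ratio.

    Assumes equal response for the light and heavy forms of the peptide.
    """
    if heavy_area <= 0:
        raise ValueError("no signal for the heavy standard")
    return light_area / heavy_area * spiked_fmol


if __name__ == "__main__":
    # Hypothetical peak areas for one transition and a 50 fmol spike.
    print(absolute_amount(light_area=3.0e5, heavy_area=1.5e5, spiked_fmol=50.0))
```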
Label-free Approach: This approach relies on the correlation between peptide mass spectral peak data and the abundance of the protein in the sample (Old et al., 2005). Protein abundance can be measured in two ways: from the mass spectral peak intensities of peptide ions, or from the number of MS/MS spectra assigned to a protein. The first measure relies on at least one peptide being shared by a pair of samples, whose peak area is calculated from extracted ion chromatograms, and three or more shared peptides per protein are needed for reproducible quantitation. The second measure uses the MS/MS spectra of any peptide of a given protein in either sample of the pair, and four or more spectra per protein are often required for accurate quantitation. A fully automated and computationally efficient label-free LC-MS method was developed for determining differences in complex mixtures, followed by targeted MS/MS analysis of the identified differences (Meng et al., 2007; Petyuk et al., 2010; Wiener et al., 2004). This method selectively finds statistically significant differences in the intensity of both high- and low-abundance ions while accounting for the variability of measured intensities, which is an important concern in label-free LC-MS studies. The label-free approach does not require metabolic labeling or labor-intensive stable isotope labeling and is applicable to various sample types (tissues and body fluids). Its principal limitation is under-sampling, where peptide co-elution and sample complexity result in an underestimation of abundance. To overcome this and improve label-free quantification, an accurate mass and time tagging approach was incorporated (Zimmer et al., 2006).
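The two label-free measures can be sketched in a few lines. This is a minimal illustration rather than a published algorithm: the trapezoidal integration, the way the four-spectra threshold is applied (to the larger of the two counts) and all names are assumptions.

```python
from typing import List, Optional, Tuple


def xic_peak_area(points: List[Tuple[float, float]]) -> float:
    """Trapezoidal area under an extracted ion chromatogram.

    `points` is a list of (retention time, intensity) pairs sorted by time.
    """
    area = 0.0
    for (t0, i0), (t1, i1) in zip(points, points[1:]):
        area += (t1 - t0) * (i0 + i1) / 2.0
    return area


def spectral_count_ratio(count_a: int, count_b: int, min_spectra: int = 4) -> Optional[float]:
    """Ratio of MS/MS spectral counts for one protein in two samples.

    Returns None when neither sample reaches `min_spectra` (assumed reading
    of the threshold) or when the ratio is undefined.
    """
    if max(count_a, count_b) < min_spectra or count_b == 0:
        return None
    return count_a / count_b


if __name__ == "__main__":
    xic = [(10.0, 0.0), (10.1, 4.0e4), (10.2, 9.0e4), (10.3, 3.0e4), (10.4, 0.0)]
    print(xic_peak_area(xic))
    print(spectral_count_ratio(8, 4))
    print(spectral_count_ratio(2, 1))
```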
Post-Translational Modification Proteomics
Post-translational modifications (PTMs) control many biological processes, and PTM proteomics examines the nature of these modifications to help in the fundamental understanding of mechanisms of cell regulation. It has been used to determine protein function by analyzing changes in cellular location, turnover, activity and interactions with other proteins (Choudhary & Mann, 2010). However, standard protein profiling techniques cannot easily detect alterations in the chemical state of a protein (Simon & Cravatt, 2008). Although recent developments in efficient large-scale PTM proteomics have yielded much important information about cellular processes (Choudhary & Mann, 2010), this is not yet sufficient. Capture and enrichment play a key role in PTM analysis. The challenges of PTM analysis are the chemical diversity, the sub-stoichiometric levels and the labile nature of the modifications. Comprehensive PTM analysis at any significant depth of coverage would be impractical, as more than 300 chemical PTMs have been identified (Armah et al., 2007).
Many PTMs occur in proteins, such as ubiquitylation, deamidation, acetylation and methylation. The study of two common PTMs, glycosylation and phosphorylation, through glycoproteomics and phosphoproteomics is important for understanding biology and for the preparation of therapeutic proteins. Advanced MS technologies allow identification of these two PTMs and provide a significant advantage over, and complement, antibody-based approaches. Other affinity-based enrichment methods are also coupled with MS: lectin affinity chromatography and hydrazide capture for glycoproteomics studies, and immobilized metal ion affinity chromatography (IMAC) and metal oxide affinity chromatography (MOAC) for phosphoproteomics studies. These have been applied in their respective fields with significant success and are discussed below in detail. Because a PTM decreases or increases the molecular weight of a peptide, modification-specific signals result in MS/MS.
Glycosylation is the modification of proteins with carbohydrates. It is an important PTM that plays crucial roles in biological events such as modulating protein structure, cell recognition and cell-cell interaction, as well as in protein stability, secretion, localization, function and turnover (Marino et al., 2010). It can introduce heterogeneity into a protein through the generation of various glycoforms. Changes in the levels and types of glycosylation can be associated with disease, and glycosylation is the most common modification of recombinant protein products expressed in insect and mammalian cell lines. The two major types of glycosylation are N-linked and O-linked. N-linked glycans are most commonly attached to asparagine residues, while O-linked glycans are attached to the hydroxyl side chains of serine and threonine residues. Glycoproteomics is typically performed using the following three approaches: (i) characterization of glycopeptides; (ii) characterization of the glycans on intact glycoproteins; and (iii) structural analysis of enzymatically or chemically released glycans.
A typical glycoproteomics workflow involves enzymatic digestion of the glycoprotein so that each glycosylation site is located within a separate peptide, followed by LC-MS and LC-MS/MS analysis (An et al., 2003; Wuhrer et al., 2007; Wuhrer et al., 2005) (Fig 3). Upon CID, glycopeptides produce sugar-specific oxonium ions; detecting these in precursor-ion mode allows the glycopeptides to be singled out from the mixture of peptides for further study. The LC-MS/MS data also provide information on the primary sequence of the peptide, the type of sugar attached and the modified amino acid residue (Carr et al., 1993; Colangelo et al., 1999; Huddleston et al., 1993; Jedrzejewski & Lehmann, 1997).
Glycoproteomics faces some critical issues, namely extensive gas-phase deglycosylation, which hampers location of the site of sugar attachment, and the ionization efficiency of glycopeptides in comparison to their unmodified forms (Carr et al., 1993; Nemeth et al., 2001). Several methods have been employed to overcome these issues:
• Removing the glycans by enzymatic digestion with N-glycosidase (for N-linked glycans) or by β-elimination (for O-linked glycans) (Bond & Kohler, 2007). In the former case deglycosylation converts asparagine to aspartate with a mass increase of 1 Da, which simplifies the downstream determination of the glycosylation site by LC-MS/MS. In the latter case the elimination converts serine to alanine and threonine to aminobutyric acid, each with a mass loss of 16 Da, establishing the site of sugar attachment.
• Using lectin column-mediated affinity purification of the N-linked glycopeptides followed by LC-MS analysis (Bunkenborg et al., 2004; Kaji et al., 2003; Posch et al., 2008; Xiong & Regnier, 2002) (Fig 3).
• Performing top-down sequence analysis of whole glycoprotein ions, either using CID (Reid et al., 2002), which avoids the gas-phase deglycosylation seen with N-linked glycopeptides, or using ETD (Schroeder et al., 2005), which was applied to characterize the glycosylation of paxillin and revealed a novel glycosylation site on Ser 74. In addition, further development of suitable software tools will greatly enhance throughput in glycoproteomics (Ashline et al., 2007).
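The mass changes quoted for deglycosylation (about +1 Da for asparagine to aspartate, about 16 Da lost for serine to alanine and threonine to aminobutyric acid) can be checked from the elemental compositions of the residues. The sketch below does this with monoisotopic atomic masses; the dictionary layout and function names are illustrative only.

```python
# Monoisotopic atomic masses (Da).
ATOM = {"C": 12.0, "H": 1.0078250, "N": 14.0030740, "O": 15.9949146}

# Elemental compositions of the amino acid residues involved.
RESIDUE = {
    "Asn": {"C": 4, "H": 6, "N": 2, "O": 2},
    "Asp": {"C": 4, "H": 5, "N": 1, "O": 3},
    "Ser": {"C": 3, "H": 5, "N": 1, "O": 2},
    "Ala": {"C": 3, "H": 5, "N": 1, "O": 1},
    "Thr": {"C": 4, "H": 7, "N": 1, "O": 2},
    "Abu": {"C": 4, "H": 7, "N": 1, "O": 1},  # aminobutyric acid
}


def residue_mass(name: str) -> float:
    """Monoisotopic mass (Da) of a residue from its elemental composition."""
    return sum(ATOM[element] * count for element, count in RESIDUE[name].items())


def conversion_shift(before: str, after: str) -> float:
    """Mass change (Da) when residue `before` is converted to `after`."""
    return residue_mass(after) - residue_mass(before)


if __name__ == "__main__":
    for before, after in (("Asn", "Asp"), ("Ser", "Ala"), ("Thr", "Abu")):
        print(before, "->", after, round(conversion_shift(before, after), 4))
```

A peptide that differs from its predicted mass by one of these shifts at a candidate residue is thus flagged as a formerly glycosylated site.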
Phosphorylation is a reversible PTM involved in the majority of cellular processes; it plays a vital role in intercellular communication and in the functioning of the nervous and immune systems. Up to 30 % of the proteins encoded by the human genome may be phosphorylated (Cohen, 2002), and abnormal phosphorylation has been linked to various human disease states (Blume-Jensen & Hunter, 2001). In eukaryotic cells it occurs mostly on serine, threonine and tyrosine residues and to a much lesser extent on histidine, arginine, lysine, cysteine, glutamic acid and aspartic acid. Elucidation of protein phosphorylation sites may reveal potential drug targets and contribute to the understanding of cell-signaling mechanisms.
The general approach in phosphoproteomics is based on detection of the phosphate moiety of phosphorylated peptides under CID conditions in MS/MS experiments (Gafken & Lampe, 2006; Goshe, 2006). Neutral loss scanning in positive ion mode (98 Da, H3PO4 or HPO3 and H2O) (Chang et al., 2004; Covey et al., 1991) and precursor ion scanning in negative ion mode (79 Da, PO3) (Annan et al., 2001; Huddleston et al., 1993; Neubauer & Mann, 1999) detect the loss of the phosphate group and so identify phosphopeptides in complex samples using a Q TRAP (Le Blanc et al., 2003). The top-down approach has been applied to characterize phosphopeptides using ECD (Shi et al., 2001) and to detect PTMs of paxillin (a focal adhesion adapter protein) using ETD, which revealed 29 phosphorylation sites (Schroeder et al., 2005). Multiple reaction monitoring has been used in phosphoproteomics to monitor the loss of a phosphate group (a transition of 98 Da) with increased sensitivity (Unwin et al., 2005).
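The neutral loss test can be illustrated with a short sketch: for a precursor of charge z, loss of H3PO4 lowers the observed m/z by about 98/z. The tolerance, function names and peak values below are assumptions for illustration and do not describe any instrument vendor's implementation.

```python
from typing import Iterable

H3PO4 = 97.9769  # monoisotopic mass of phosphoric acid (Da)


def neutral_loss_mz(precursor_mz: float, charge: int, loss: float = H3PO4) -> float:
    """m/z of the ion formed when the precursor loses a neutral molecule."""
    return precursor_mz - loss / charge


def shows_neutral_loss(precursor_mz: float, charge: int,
                       fragment_mzs: Iterable[float], tol: float = 0.02) -> bool:
    """True if any fragment peak matches the expected neutral-loss m/z."""
    target = neutral_loss_mz(precursor_mz, charge)
    return any(abs(mz - target) <= tol for mz in fragment_mzs)


if __name__ == "__main__":
    # Hypothetical doubly charged precursor and fragment peak list.
    peaks = [300.10, 451.22, 651.01]
    print(round(neutral_loss_mz(700.0, 2), 4))
    print(shows_neutral_loss(700.0, 2, peaks))
```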
Phosphoproteomics faces several challenges, including low stoichiometry due to the transient nature of this PTM, signal suppression of phosphate-containing peptides in the positive ion mode, the inherently labile nature of the phosphate group, which leads to the neutral loss of HPO3 (80 Da) upon CID, and the difficulty of achieving full sequence coverage for long phosphopeptides. To overcome these shortcomings, relatively mature techniques have been used for selective enrichment of phosphopeptides: immobilized metal ion affinity chromatography (IMAC), typically with Fe3+ or Ga3+ ions (Andersson & Porath, 1986; Corthals et al., 2005; Ficarro et al., 2002; Posewitz & Tempst, 1999), and metal oxide affinity chromatography (MOAC), typically with TiO2 or ZrO2 (Aryal & Ross, 2010; Cantin et al., 2007; Kweon & Håkansson, 2008; Kweon & Håkansson, 2006). Affinity-based enrichment of phosphotyrosine-containing proteins has also been explored using immunoprecipitation with anti-phosphotyrosine antibodies (Pandey & Mann, 2000; Zheng et al., 2005).
Proteomics has become an essential part of systems biology research because proteins carry extremely valuable information for the description of biological processes. It faces several challenges, including the alteration of proteins: roughly 200 forms of post-translational modification (PTM) can be found in proteins. Another challenge is that proteins are not expressed at equal, or even similar, levels in the proteome. For instance, the 12 most abundant proteins (albumin, immunoglobulin G, immunoglobulin A, immunoglobulin M, fibrinogen, transferrin, haptoglobin, α2-macroglobulin, α-1-acid glycoprotein, α-1-antitrypsin and the high density lipoproteins Apo A-I and Apo A-II) contribute most of the protein mass of human blood (approximately 95 % of the total) (Zhang et al., 2010), and hence prior removal of these proteins from a biological sample is a must. This prior removal is essential because, during ionization, peptides generated from these proteins may interfere with peptides generated from less abundant proteins, so that the low-abundance proteins may not be detected by MS. The fact that the majority of proteins are in the low-abundance class makes the situation worse. Moreover, the concentration range of protein expression levels in human plasma is very wide (10^11) compared with the dynamic range of LC-MS (10^4 to 10^6) (Anderson & Anderson, 2002). Therefore, for effective analysis the proteome must be fractionated prior to detection and quantification by MS. However, rapid advances in LC-MS technologies suggest that these limitations may be transient.
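The size of this mismatch can be expressed in orders of magnitude. The short sketch below takes the plasma concentration range as 10^11 and the LC-MS dynamic range as 10^4 to 10^6, the figures attributed above to Anderson & Anderson (2002); the function name is illustrative.

```python
import math


def orders_of_magnitude_gap(sample_range: float, instrument_range: float) -> float:
    """Orders of magnitude by which the sample range exceeds the instrument's."""
    return math.log10(sample_range) - math.log10(instrument_range)


if __name__ == "__main__":
    plasma_range = 1e11
    for lcms_range in (1e4, 1e6):
        gap = orders_of_magnitude_gap(plasma_range, lcms_range)
        print(f"LC-MS range {lcms_range:.0e}: gap of {gap:.0f} orders of magnitude")
```

A gap of this size is what depletion of abundant proteins and fractionation are meant to close before MS detection.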
Several key issues are being taken into consideration now and will remain important in the near future.
• Improved LC systems in LC-MS: As illustrated in this chapter, LC-MS has proved to be an effective tool in proteomics studies. Coupling ultra high-performance liquid chromatography (Motoyama et al., 2006; Plumb et al., 2004), or strong cation exchange and reversed-phase LC with ion mobility (Liu et al., 2007), to MS has enhanced the effectiveness of LC-MS in proteomics studies. Additionally, LC using microfluidic devices (Fortier et al., 2005; Hardouin et al., 2006; Yin et al., 2004) can be a valuable tool for identifying low-abundance proteins.
• Optimized use of newer MS in LC-MS: To analyze complex protein samples better, the use of newer types of mass spectrometer, including ToF MS (Liu et al., 2004), MALDI and SELDI (Kiehntopf et al., 2007), has increased in both 'bottom-up' and 'top-down' proteomics.
• Advancement of 'top-down' proteomics in LC-MS: With new advances in sample preparation, ion activation methods and software development, the use of LC-MS in top-down proteomics can become very effective.
• Use of 'bottom-up' and 'top-down' approaches as complementary LC-MS techniques: 'Top-down' proteomics is an emerging field in which intact proteins are analyzed directly using high-resolution MS and subsequently dissociated by MS/MS (Ge et al., 2002; McLafferty et al., 2007; Parks et al., 2007; Siuti & Kelleher, 2007; Syka et al., 2004). It can be combined with the conventional and successful 'bottom-up' proteomics to offer complementary approaches (Deterding et al., 2007; Strader et al., 2004) and achieve the best results for proteomics studies in complex biological systems.
This chapter concludes that LC-MS has become a prerequisite proteomics tool for investigating protein interactions, protein expression and protein modifications. The role of LC-MS has increased dramatically with the introduction of new instruments based on new fragmentation strategies and analysis methods. Nevertheless, complete qualitative and quantitative proteomic analysis remains a tremendous challenge because of the limited sensitivity and dynamic range of the currently available and popular methods. Continued improvements in LC-MS based proteomics will be crucial, and the biggest challenge for the future will be making use of the data that have been gathered so far.