Application of Proteomics in Disease Diagnosis
Published: Tue, 08 May 2018
- 1.1 Brief history of the development of proteomics
- 1.2 Two-dimensional protein electrophoresis
- 1.3 Mass spectrometry
- 1.4 Protein arrays
- 1.5 Protein databases
- 2.1.1 Hodgkin’s lymphoma
- 2.1.2 Acute leukaemias
- 2.1.2.1 Acute myeloid leukaemia
- 2.1.2.2 Childhood acute lymphoblastic leukaemia
- 2.2.1 Ovarian cancer
- 2.2.2 Prostate cancer
- 2.2.3 Breast cancer
- 2.2.4 Colorectal cancer
- 2.2.5 Bladder cancer
- 2.2.6 Melanoma
- 2.2.7 Other cancers
- 2.4.1 Cardiovascular disease
- 2.4.1.1 Cardiovascular disease biomarkers
- 2.4.2 Alzheimer’s disease
- 2.4.3 Inflammatory bowel disease
Although genomics has been the focus of considerable research in recent years, it is now widely acknowledged that proteins are responsible for the biochemical complexity which exists within the cells of the human body due to their dynamic nature and interactions with other proteins. Advancing techniques in proteomics offer a broad range of tools for the identification and characterisation of proteins from human cell and tissue samples. Proteomics plays a major role in the identification and validation of new diagnostic biomarkers for many types of blood-related and infectious diseases and various cancers. Current proteomics techniques and the application of proteomics in each of these disease areas will be examined.
The term ‘proteome’ was first used in 1994 and refers to all of the proteins within a cell or tissue (Cristea et al. 2004). Proteomics aims to determine not only the level of expression of a given protein, but also its structure, modifications, localisation and interaction with other proteins (Rezaul et al. 2008). The ultimate goal of proteomics is to study all of the expressed proteins within a given biological system (de Hoog and Mann 1997; Bertucci et al. 2006). Genomics has been the focus of intense research in recent years; sequencing of the human genome is now complete and its size has been shown to be approximately 3,200 Mb. A total of 35,000 protein-coding genes have been identified, comprising less than 2% of the total genomic DNA. By studying the genome to better understand the function of these genes, it became apparent that it is difficult to accurately predict gene products from the gene sequence alone, since genes are transcribed into messenger RNA (mRNA) and cells use a process known as alternative splicing to generate diversity. This yields a family of structurally related mRNAs, each containing a subset of exons and each encoding one member of a family of protein isoforms, so there is no one-to-one relationship between gene sequence and mRNA transcript. Investigating gene expression at the protein level is therefore more informative: it can be argued that proteins are responsible for the true biochemical complexity within cells and tissues, since a single gene can result in many different protein isoforms which can become active by post-translational modification or interaction with other proteins. In many human diseases, incorrect modification or conformation leads to the disease state (e.g. in Alzheimer’s disease and Parkinson’s disease, both associated with abnormal protein folding), and these modifications cannot be detected at the gene level.
In contrast to the human genome, a complete proteome has not yet been characterised for any organism. Further, while the genome is relatively static and unchanging, cellular proteins are highly dynamic and are constantly undergoing changes such as cell membrane binding, interactions with other proteins and/or gaining or losing various chemical groups. Proteins also play a vital role in intra- and inter-cellular communication networks to enable them to respond to the changing needs of the organism. Proteomics has the potential for broad applicability across many areas of medicine and biology and is becoming increasingly widely used in the study of human disease to identify novel biomarkers for use in screening, diagnosis and early detection, and the prediction of therapeutic response (Azad et al. 2006). Although a few clinically approved protein biomarkers are currently available, effective biomarkers are not yet available for many cancers and other diseases and further research is needed to identify suitable markers (Cho 2007). Other promising areas of proteomics research include the investigation of patterns of altered protein expression at the cellular, tissue and subcellular levels; and the identification of new therapeutic protein targets (Hanash 2003). The international Human Proteome Organisation (HUPO) was established in 2001 with the aim of creating a global network to encourage the development of proteomics technologies and sharing knowledge and expertise. A number of databases have also been established to store the sequences of the proteomes of both normal and disease human cells and tissues.
This paper examines techniques for proteome profiling in current practice and discusses present and future applications of proteomics in the diagnosis of human diseases including blood-related disease, various types of cancer, and infectious disease.
The earliest technique for mass spectrometry was described by Thomson in 1899; however, it was not until 1975 that two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) was developed by O’Farrell as a method for separating proteins from Escherichia coli. In 1988, matrix-assisted laser desorption ionisation (MALDI) was developed, a technique that overcame the technical difficulties associated with earlier ionisation methods used in mass spectrometry and allowed large biomolecules to be analysed. One year later, in 1989, the use of electrospray ionisation (ESI) was described, offering an additional ionisation method for large biomolecules. In the same year, peptide mass fingerprinting was developed, but this method was not widely used because it required specialised instruments, and it was not until the introduction of the first commercial instruments for MALDI mass spectrometry in 1992 that its use became widespread. The use of isotope affinity tags for the quantitative analysis of complex protein mixtures was developed in 1999, followed in 2001 by the development of the first highly sensitive techniques for protein detection. These included difference-gel 2D-electrophoresis (DIGE), surface enhanced laser desorption/ionisation time of flight (SELDI-TOF) mass spectrometry and protein arrays. In 2003, techniques for immunodepletion (i.e. the removal of high abundance proteins from protein mixtures to allow the identification and analysis of low abundance proteins) were introduced. Those techniques which have had the greatest impact on furthering our understanding of proteins involved in human disease are discussed below.
Two-dimensional (2D) protein electrophoresis was traditionally the method of choice for protein separation, allowing up to 10,000 proteins to be distinguished visually from complex protein mixtures from cells, tissues or body fluids (Andreoli et al. 2006). In the first dimension, proteins are separated on a thin gel layer according to their isoelectric point (i.e. the pH value at which the overall charge on the protein is equal to zero). A pH gradient is created within the gel such that when an electric current is applied, proteins migrate towards either the anode or cathode, according to their overall charge, to the point where the pH of the gel is equal to the isoelectric point of the protein. In the second dimension, polyacrylamide gel electrophoresis (PAGE) is used to separate proteins according to their size. After both stages of the separation, proteins are visualised by either staining or radioactive labelling. Individual proteins can then be excised and identified, for example by mass spectrometry (Kavallaris and Marshall 2005).
While undoubtedly still a core technique for the analysis of complex protein mixtures in human samples, the value of 2D electrophoresis in modern proteomics research is the subject of debate. It is argued that this technique is time-consuming and still lacks adequate automation (Cristea et al. 2004). Further concerns include poor reproducibility; the possible loss of certain proteins within a sample during the extraction process; and a lack of sensitivity, since certain types of proteins, such as membrane proteins, those present in very small amounts, or those very small in size, are difficult to visualise.
Mass spectrometry has been increasingly used over the past decade for analyses involving complex samples of proteins. Many of the recent advances in proteomics are down to the increased sensitivity of mass spectrometry and the ability of this technique to detect and characterise low levels of proteins (Cristea et al. 2004). Mass spectrometry measures the mass-to-charge ratio (m/z) of a mixture of ions in the gas phase under vacuum, and the number of ions present at each m/z value. The mass spectrometer consists of a source of ionisation, a mass analyser and a detector (Aebersold and Mann 2003). Proteins are first digested with an enzyme (usually trypsin) which cleaves the protein at specific amino acid sequences. The ionisation source removes electrons from the protein fragments to produce positively charged particles. The charged peptide fragments are separated in the spectrometer according to their m/z, using a method that varies according to the type of spectrometer (e.g. time of flight [TOF] through a flight tube), while the detector measures the signal intensity of each fragment. The resulting mass spectrum consists of a series of peaks, each of which corresponds to a particular peptide fragment present within the sample. The height of the peak is related to the abundance of protein in the sample (Aebersold and Mann 2003).
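The digestion and m/z steps described above can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not state: trypsin is modelled with the common simplified rule of cleaving after lysine (K) or arginine (R) unless the next residue is proline (P), masses are standard monoisotopic residue masses, and ions are assumed to carry charge as added protons. The function names and the example sequence are hypothetical and purely illustrative.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # added once per peptide (free N- and C-termini)
PROTON = 1.007276   # mass of each charge-carrying proton (assumption)


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P.

    This is a simplified cleavage rule; missed cleavages are ignored.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def peptide_mz(peptide, charge=1):
    """Return the m/z of a peptide ion carrying `charge` protons."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical sequence, chosen only to exercise the cleavage rule.
    for pep in tryptic_digest("AKRPGK"):
        print(pep, round(peptide_mz(pep), 4))
```

Each peptide produced this way would correspond to one expected peak position in the spectrum; real analyses also have to account for missed cleavages, modifications and multiple charge states.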
A major drawback of the older mass spectrometry techniques was that proteins became too fragmented. The development of two newer low-energy ionisation techniques, matrix-assisted laser desorption/ionisation (MALDI) and electrospray ionisation (ESI), has been beneficial since they can produce ions in the gas phase from a variety of biomolecules including peptides, proteins, drug metabolites and oligonucleotides, without excessive fragmentation (Cristea et al. 2004). A number of commercial mass spectrometers are now available that use either MALDI or ESI.
In MALDI-TOF mass spectrometry, protein samples are mixed with a laser-reactive matrix solution. The resulting solution is then applied to the surface of a metal MALDI plate and the proteins that are bound to the substrate are ionised. SELDI-TOF mass spectrometry is a combination of chromatography and mass spectrometry that uses an affinity-based method of mass spectrometry comprising a protein chip modified with a chromatographic affinity surface. The protein sample is applied to the chip and captured proteins are ionised using matrix-assisted ionisation. A spectrum of ion m/z ratios is generated from the sample. While individual proteins within the sample cannot be identified from these spectra, the relative abundance of proteins with each m/z ratio can be determined.
Protein arrays or protein chips first became available in 2001 and allow the detection and comparison of large numbers of different proteins. Early approaches attempted to reproduce existing biochemical and immunological assays, such as the enzyme-linked immunosorbent assay (ELISA), in a miniature form (Emili and Cagney 2000; Laurell and Marko-Varga 2002). Protein arrays comprise a library of proteins immobilised on a 2D grid contained on a biochip. These chips are able to extract and retain proteins from media
Various types of protein chips are now available including antibody arrays, antigen arrays and tissue arrays. The major advantage of this type of technology is that it allows large numbers of proteins on a single chip to be scanned simultaneously and data to be obtained on individual proteins. Limitations of existing protein array techniques include the method of attachment of proteins to the chip, which can result in proteins being fixed in random orientations that can cause damage and inactivation; and the need for prior purification of proteins, which adds time and cost to this methodology.
Identification of proteins from the sequence obtained by mass spectrometry has been made possible by the development of a number of protein databases. The first major protein database, Swiss-Prot, was established in 1986 and is maintained by a number of institutions across Europe. Since then, a number of other databases including the Protein Information Resource (PIR), Protein Research Foundation (PRF) and the Protein Data Bank (PDB) have been set up in a variety of different countries.
As Lion and Tissot (2008) note in their editorial: “…proteomics is pervading all fields of blood-related disciplines…and is beginning to be used as a qualification tool for clinical practices in transfusion services”. Plasma is among the most widely studied of all body fluids. In 1977, a total of just 40 proteins had been identified, while the HUPO Plasma Proteome Project now holds data on a total of 3020 non-redundant proteins (Omenn et al. 2005; States et al. 2006). In contrast to the plasma proteome, the red blood cell and platelet proteomes are both much less well researched. By 2006, a total of 566 red blood cell membrane proteins had been identified (Pasini et al. 2006), with 641 platelet proteins identified in 2005 (Moebius et al. 2005). Proteomics can be used to diagnose blood-related diseases, by comparison with reference cell lines or healthy control individuals, and to study disease pathology. Two examples of the application of proteomics in the diagnosis of blood-related disease are discussed below.
Hodgkin’s lymphoma is a cancer originating in the lymphocytes that is characterised by the presence of large multinucleate cells known as Reed-Sternberg cells. An experiment was conducted to investigate various lymphoid neoplasms including Hodgkin’s lymphoma, Burkitt’s lymphoma, B-cell acute lymphoblastic leukaemia, anaplastic large-cell lymphoma, adult T-cell leukaemia and cutaneous T-cell lymphoma (Fujii et al. 2005; Fujii et al. 2006). Results showed that the protein profile of Hodgkin’s lymphoma cells was different from that of other lymphoid neoplasm cells and also from that of the control cells used in the experiment (i.e. reference B and T cells). Although Hodgkin’s lymphoma cells develop from B cells, they appear to develop their own distinct pattern of protein expression throughout the progression of the disease.
The protein profile of Hodgkin and Reed-Sternberg cells in culture has been analysed in order to investigate the secretory factors that interact with inflammatory cells and which may act as biomarkers (Ma et al. 2008). A total of 1290 proteins were identified in the cell culture medium, of which 368 were believed to be secretory proteins; among these, 37 were classed as inflammation or immune response proteins and 26 as cell communication proteins. Further assessment of the inflammation/immune response proteins revealed the presence of classical Hodgkin’s lymphoma-associated chemokines (e.g. TARC, which recruits Th2 and CD4+CD25+ regulatory T cells; IP-10; and RANTES, which attracts T cells, natural killer cells and eosinophils) and some new ones (e.g. fractalkine), together with a group of cytokine-related proteins (e.g. IL-1R2, MIF, CD44 and IL-25). The presence of these new secretory proteins in the supernatant of Hodgkin’s lymphoma cells allows hypotheses to be proposed on the mechanisms by which these cells evade the cytotoxic T-lymphocyte response and trigger the inflammation and proliferation of reactive cells. These proteins may also serve as biomarkers.
Acute myeloid leukaemia (AML) is an aggressive blood cancer that is characterised by the accumulation of myeloid precursor cells in the bone marrow (Sjøholt et al. 2008). Five-year survival rates are typically poor (approximately 50%) with current therapies. Proteomic analysis using 2-DE and MALDI-TOF analysis identified a number of proteins that are differentially expressed between different subtypes of AML cells (López-Pedrera et al. 2006). These included suppressor gene proteins, metabolic enzymes, antioxidants, structural proteins and signal transduction mediators. Of these, seven proteins were found to be significantly altered in AML blast cells compared with normal mononuclear blood cells: alpha-enolase, RhoGDI2, annexin A10, catalase, peroxiredoxin 2, tropomyosin 3, and lipocortin 1 (annexin 1). All of these are known to play major roles in cellular functions such as glycolysis, tumour suppression, apoptosis, angiogenesis and metastasis, and may serve as candidate biomarkers for the diagnosis of AML.
In childhood acute lymphoblastic leukaemia (ALL), there is an accumulation of immature lymphocyte cells, known as blast cells, within the bone marrow. Response to initial glucocorticoid therapy is a reliable predictor of response to multiagent chemotherapy (Lauten et al. 2006). Patients who are resistant to glucocorticoids (i.e. prednisone poor responders, PPR) show poorer survival than glucocorticoid sensitive (i.e. prednisone good responders, PGR) patients. Experiments using 2-DE and ELDI-TOF mass spectrometry were performed to investigate differential protein expression in leukaemic blasts from PPR and PGR patients. Results showed that those proteins overexpressed in the PPR group included catalase, RING finger protein 22 alpha, valosin-containing protein (VCP) and a G-protein-coupled receptor. Further validation of VCP showed that high levels of protein expression are associated with poor prednisone response, and this protein may be a suitable candidate biomarker for the diagnosis of glucocorticoid resistance in ALL.
Over 11 million people worldwide are currently diagnosed with cancer every year, a figure estimated to rise to 16 million by the year 2020 (Cho 2007). Biomarkers offer enormous potential in oncology, both as diagnostic tools and in monitoring disease progression and the efficacy and safety of cancer therapies. The application of proteomics to this field provides the opportunity to develop a full understanding of disease pathology, and develop new biomarkers for detection and early diagnosis. The use of proteomics to identify and characterise potential new biomarkers for various blood-related cancers was discussed in the previous chapter. This chapter will focus on the use of proteomics in other types of cancer.
Epithelial ovarian cancer is the leading cause of mortality among the gynaecological cancers due to its asymptomatic nature, lack of early detection markers and resistance to currently available chemotherapies (Jurisicova et al. 2008; Nossov et al. 2008). Ovarian cancer is typically diagnosed at an advanced stage with a survival rate after 5 years of just 20%. Currently available tests for ovarian cancer include CA-125, transvaginal ultrasound, or a combination of both of these but all lack adequate sensitivity and specificity for general screening. There is therefore an urgent need for new biomarkers for diagnosis. A number of proteomics-based research projects have identified several potential candidates. For example, glycoproteomic analyses were used to investigate glycosylated proteins present in media from ovarian cell lines, in sera from patients with ovarian cancer, and in sera from healthy control patients (Li et al. 2008). Analysis of sera from cancer patients showed the presence of 4 abundant glycoproteins. Of these, fibronectin and immunoglobulin A1 were shown to produce N-linked glycan fragments that differed from those in control serum samples and may have potential as disease biomarkers.
Prostate cancer is the most common non-cutaneous male cancer in Western Europe and the United States (Byrne et al. 2008; Shariat et al. 2008). While the discovery of prostate-specific antigen (PSA) significantly improved management of this disease during the 1990s, the lack of specificity of PSA and its inability to differentiate between prostate cancer and other benign conditions have led to over-detection and over-treatment in some patients (Byrne et al. 2008; Shariat et al. 2008). There is therefore a need for new prostate cancer biomarkers with improved sensitivity to improve the accuracy of diagnosis. One promising biomarker identified using 2-DE and mass spectrometry is annexin I, which experiments have shown is underexpressed in early stage prostate cancer (Cho 2007). In other research, proteomic analysis of serum samples from 12 patients with either of two grades of prostate cancer (Gleason score 5 or 7) was performed using immunoaffinity depletion and 2D-DIGE. Results showed differential expression of 63 proteins between the 2 groups, with 13 of these differences reaching statistical significance (p < 0.05) (Byrne et al. 2008). Two of these differentially expressed proteins, pigment epithelium-derived factor (PEDF) and zinc-alpha2-glycoprotein (ZAG), have undergone extensive validation and it has been found that PEDF is a more accurate predictor of early stage prostate cancer than ZAG.
Other blood-based biomarkers that have shown promise in early phase studies include human glandular kallikrein; early prostate cancer antigens; insulin-like growth factor-I (IGF-I) and its binding proteins, IGFBP-2 and IGFBP-3; urokinase plasminogen activation system; transforming growth factor beta1; interleukin-6; chromogranin A; prostate secretory protein; prostate-specific membrane antigen; PCa-specific autoantibodies; and alpha-methylacyl-CoA racemase (Bensalah et al. 2008).
Breast cancer is a significant cause of morbidity and mortality worldwide. Early detection is key in reducing mortality, but the lack of efficient detection methods hinders diagnosis of this disease (Gast et al. 2008). Isotope-coded affinity tag tandem mass spectrometry has been used to show that alpha2 HS-glycoprotein is under-expressed in nipple aspirate fluid from tumour-containing breasts, while the proteins lipophilin B, beta-globin, hemopexin and vitamin D-binding protein precursor are all overexpressed (Pawlik et al. 2006). The use of other proteomics techniques, including MALDI-TOF and SELDI-TOF mass spectrometry, has yielded many protein peaks that may be candidate biomarkers, but few of these have been structurally identified (Garrisi et al. 2008). There is therefore considerable potential for new biomarkers for early diagnosis of breast cancer, but further research is needed to fully characterise the candidates identified to date.
Improvements in early detection of colorectal cancer are urgently needed (Liu et al. 2006; Ransohoff et al. 2008). SELDI-TOF mass spectrometry was used to analyse the differential expression of proteins in serum of patients with colorectal cancer and healthy control subjects. Results showed that the pattern of differentially expressed peaks within the spectra obtained could distinguish between patients with colorectal cancer and healthy subjects with 95% sensitivity and specificity (Liu et al. 2006). In other research, immunoblotting and microarray analysis were used to identify six proteins that are overexpressed in colorectal tumour tissues: ANXA3, BMP4, LCN2, SPARC, MMP7 and MMP11 (Madoz-Gurpide et al. 2006). To identify those proteins that are regulated during colorectal cancer, 2-DE was performed using samples of neoplastic colorectal tissue and normal samples for comparison (Xing et al. 2006). The protein secretagogin, which is expressed by the endocrine cells, was shown to be significantly down-regulated in neoplastic colorectal tissue and this protein may therefore be a potential disease biomarker for colorectal cancer.
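To make the sensitivity and specificity figures quoted above concrete, the following short Python sketch shows how both measures are derived from classification counts. The counts used are hypothetical and are not taken from Liu et al. (2006); the function names are likewise illustrative.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of diseased subjects correctly classified as diseased."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Proportion of healthy subjects correctly classified as healthy."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Hypothetical cohort: 100 patients and 100 healthy controls.
    # 95 patients are flagged by the spectral pattern, 5 are missed;
    # 95 controls are correctly cleared, 5 are wrongly flagged.
    print("sensitivity:", sensitivity(true_positives=95, false_negatives=5))
    print("specificity:", specificity(true_negatives=95, false_positives=5))
```

A screening test needs both values to be high: low sensitivity misses disease, while low specificity generates false alarms among healthy subjects.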
A number of proteins have been shown using 2-DE and mass spectrometry to be associated with bladder cancer including fatty acid binding proteins, annexin V, heat shock protein (Hsp) 27 and lactate dehydrogenase (Sheng et al. 2006). The same researchers also identified a group of proteins which show altered patterns of expression in patients with bladder cancer, and this group includes annexin I, 15-hydroxyprostaglandin dehydrogenase, galectin-1, lysophospholipase and mitochondrial short-chain enoyl-coenzyme A hydratase 1 precursor. Further validation of these proteins may identify diagnostic biomarker candidates.
Although melanomas constitute just 4% of all skin cancers, they are responsible for 80% of skin cancer-related deaths and current diagnostic methods are often inadequate (Rezaul et al. 2008). Accurate early diagnosis may reduce the need for adjuvant therapy and long-term follow-up, while the identification of patients at high risk of recurrence may allow for more cost-effective disease management. The use of proteomics to increase understanding of the changes that take place during disease progression and recurrence would therefore be highly beneficial in diagnosis and management.
Proteomics techniques have also been used to investigate differential protein expression and regulation in many other types of cancers in addition to those discussed here. These include oesophageal cancer, gastrointestinal stromal tumours, hepatocellular carcinomas, nasopharyngeal carcinoma, pancreatic cancer, renal cancer, and urothelial carcinoma (Cho 2007).
Many infectious diseases have been investigated using proteomics including HIV/AIDS, tuberculosis, malaria, measles, hepatitis and severe acute respiratory syndrome (SARS) (List et al. 2008). This paper will discuss the application of proteomics in infectious disease in tuberculosis and SARS.
The human immunodeficiency virus (HIV) infects several different types of cells during the progression of infection to full-blown acquired immune deficiency syndrome (AIDS), and a persistent infection is established in CD4+ T lymphocytes and macrophages (Berro et al. 2007). MALDI-TOF has been used to demonstrate the presence of 17 different proteins on the surface of T-cells infected with HIV type 1 (HIV-1), compared with parental cells, and the differential expression of a number of these proteins including Bruton’s tyrosine kinase and X-linked inhibitor of apoptosis. These proteins may be of potential value as diagnostic biomarkers for HIV, but further validation and characterisation are needed.
Proteomics has also been used to study the interaction between HIV and mononuclear phagocytes (Ciborowski and Gendelman 2006). Mononuclear phagocytes (including bone marrow monocyte-derived macrophages; histiocytes; alveolar macrophages; Kupffer cells; perivascular macrophages; and microglia) function as reservoirs for HIV. Proteomics has been used to investigate the way in which the virus alters the immunoregulatory activities of mononuclear phagocytes and to examine the relationship between the virus and the host cell.
Tuberculosis constitutes a considerable global burden. Increased migration has led to growing prevalence rates of TB in countries such as the UK, where this disease has previously been well-controlled through effective screening and immunisation programmes. Further, drug-resistant strains of the causative pathogen, Mycobacterium tuberculosis, are also a growing cause for concern (Kavallaris et al. 2005). A reliable screening test to detect pre-clinical infection is needed to allow early initiation of treatment which could potentially reduce transmission of the disease. The use of proteomics has identified two proteins that are secreted in vitro by common clinical isolates of M. tuberculosis, namely rRv3369 and rRv3874, which may have potential as serodiagnostic antigens (Bank et al. 2004). More recently, proteomic fingerprinting has been used together with mass spectrometry and pattern recognition methods to identify other diagnostic biomarkers for tubercu