Bioinformatic Prediction of Lipoproteins in Gram-Positive Bacteria


Because the features of lipoprotein signal peptides are well established, including a well-conserved lipobox with an invariant cysteine, bioinformatic analysis of signal peptide sequences can be used to predict the lipoproteins encoded by Gram-positive bacteria.

Lipoprotein signal peptides tend to be shorter than secretory signal peptides: analysis of both Gram-positive and Gram-negative signal peptides indicates that the c-region is shorter and contains apolar amino acids. This implies that the c-region is a continuation of the hydrophobic domain, differentiated primarily by the sequence conservation preceding the invariant lipid-modified cysteine.

To predict lipoproteins by bioinformatic analysis, Sutcliffe and Harrington (2002) first compiled a dataset of Gram-positive lipoproteins experimentally verified to be lipidated in vivo. Several types of evidence were accepted as proof of lipidation: (1) metabolic labelling of the protein, in its source organism, with exogenously supplied radiolabelled fatty acid (normally palmitate, incorporation of which is typically detected by autoradiography following protein electrophoresis); (2) inhibition of Lpp signal peptide processing by treatment with the antibiotic globomycin, which specifically inhibits Lsp (Inukai et al., 1978); (3) chemical characterization of the purified protein; and (4) evidence that protein processing is disrupted by mutation in either Lgt or Lsp, or following site-directed mutagenesis to replace the lipobox cysteine. Using these criteria, together with an extensive review of the literature, 33 proteins were identified as proven bacterial lipoproteins and are tabulated below. The Streptococcus pyogenes LppC protein was verified by criterion (2), inhibition of signal peptide processing by globomycin.


It is this conserved sequence, typically Leu(-3)-Ala/Ser(-2)-Gly/Ala(-1)-Cys(+1) at positions -3 to +1, that is referred to as the lipobox. The consensus pattern of amino acid usage is documented in Prosite under the pattern entry PS00013/PDOC00013. Starting from this pattern, several modified pattern expressions have been proposed to reduce the number of false-positive lipoproteins predicted in genome-wide searches of Gram-positive bacteria.

In the initial Prosite consensus pattern PS00013 (the sequence motif determining lipidation), {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C, the permitted lipobox amino acids preceding the cysteine at positions -1 to -4 are indicated and the absence of charged residues (i.e. no D, E, R or K) within the signal peptide h-region is prescribed. Furthermore, the pattern is subject to two additional rules: first, the cysteine must be between positions 15 and 35 of the sequence under consideration; second, there must be at least one lysine or one arginine in the first seven positions of the sequence. These rules localize the pattern to N-terminal sequences with n-regions characteristic of signal peptides. Membrane-spanning domains (MSDs) in the protein sequences were also predicted, with the hydrophobic domain length set to 14 aa.
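The pattern and its two positional rules can be sketched in Python. This is a minimal illustration and not the Prosite implementation: the regular expression is a hand translation of the pattern quoted above, and the function name `ps00013_cysteines` and the example peptide are hypothetical.

```python
import re

# Hand translation of {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C.
# The lookahead lets overlapping candidate sites be reported as well.
PS00013 = re.compile(r"(?=([^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C))")


def ps00013_cysteines(seq):
    """Return 1-based positions of lipobox cysteines that satisfy both rules."""
    seq = seq.upper()
    # Rule 2: at least one Lys or Arg within the first seven residues.
    if not re.search(r"[KR]", seq[:7]):
        return []
    hits = []
    for m in PS00013.finditer(seq):
        # The cysteine is the last residue of the matched stretch.
        cys_pos = m.start(1) + len(m.group(1))
        # Rule 1: the cysteine must lie between positions 15 and 35.
        if 15 <= cys_pos <= 35:
            hits.append(cys_pos)
    return hits


# Hypothetical signal peptide, made up for illustration only.
print(ps00013_cysteines("MKKTLLALALSALLLAGCGN"))
```

A sequence that matches the motif but places the cysteine before position 15, or that lacks a lysine or arginine near the N terminus, returns an empty list.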

A large number of putative lipoprotein genes were identified in various Gram-positive bacterial genomes using the N-terminal signal peptide pattern. To identify false-positives, each sequence was analysed individually with TMpred for an N-terminal MSD and for any additional MSDs. Sequences in which no MSD was predicted, or in which the N-terminal MSD extended beyond the lipobox cysteine, were considered possible false-positives. MSD analysis alone was not sufficient to classify a sequence as a false-positive, because both CtaC and QoxA, which are proven Lpps, had two additional MSDs predicted beyond their N-terminal lipid anchors; sequences were therefore also analysed with SignalP. In this analysis, a sequence was considered a false-positive if signal peptide features were absent altogether and/or the lipobox cysteine was internal rather than C-terminal to an h-region/MSD. Sequences requiring further clarification were analysed with additional programs for predicting transmembrane regions (notably TopPred2 and DAS), and a consensus was taken as to the position of the putative lipobox cysteine relative to the length of the first predicted MSD.
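The MSD-based screening criteria reduce to a small decision rule. The sketch below assumes that MSD spans have already been obtained from a predictor such as TMpred; the function name, the span format, and the returned labels are hypothetical. The check for a first MSD that starts after the cysteine is an interpretation of "no N-terminal MSD", not a rule stated by the authors.

```python
def assess_candidate(cys_pos, msd_spans):
    """Screen one pattern hit using predicted membrane-spanning domains.

    cys_pos   -- 1-based position of the putative lipobox cysteine
    msd_spans -- list of (start, end) spans, 1-based and inclusive,
                 as reported by a transmembrane predictor (assumed input)
    """
    spans = sorted(msd_spans)
    if not spans:
        return "possible false-positive: no MSD predicted"
    first_start, first_end = spans[0]
    if first_start > cys_pos:
        # Interpretation: the first MSD lies after the cysteine, so there
        # is no N-terminal hydrophobic anchor preceding the lipobox.
        return "possible false-positive: no N-terminal MSD precedes the cysteine"
    if first_end > cys_pos:
        return "possible false-positive: N-terminal MSD extends beyond the cysteine"
    if len(spans) > 1:
        # Proven Lpps such as CtaC and QoxA have extra MSDs, so this case
        # is referred for signal peptide analysis, not rejected.
        return "check with SignalP: additional MSDs predicted"
    return "plausible lipoprotein"


print(assess_candidate(20, [(5, 18)]))
print(assess_candidate(20, [(5, 30)]))
```

Returning a label instead of a boolean keeps the borderline case visible, which matches the way the text treats additional MSDs as grounds for further analysis and not as proof of a false-positive.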

The experimentally verified bacterial Lpps were analysed for lipobox features against the Prosite consensus pattern PS00013. This confirmed previously observed preferences (Braun & Wu, 1994; Sutcliffe & Russell, 1995), notably the high frequency of leucine at the -3 position and of alanine or serine at the -2 position of the lipobox.


However, compared with the PS00013 pattern, some deviations and restrictions were apparent. Alanine and glycine were the only amino acids found at the -1 position, as previously noted for proven Lpps generally (von Heijne, 1989). Significantly, the B. subtilis KapB sequence contains a glutamic acid (E) at position -4 of its lipobox, an occurrence proscribed by the PS00013 pattern. The additional rules of PS00013 were also contradicted: two proven Lpps (KapB of B. subtilis and MI43 of Mycobacterium intracellulare) had N termini with no arginine or lysine in the first 7 aa.

Given that certain proven Lpps had signal peptide sequence features clearly contradictory to those described by the PS00013 pattern, and that additional discrimination against false-positives is likely to result from the derivation of a taxon-specific pattern, a modified pattern was defined for the proven Lpp signal peptide sequences of Gram-positive bacteria (Table 2). This pattern, <[MV]-X(0,13)-[RK]-{DERKQ}(6,20)-[LIVMFESTAG]-[LVIAM]-[IVMSTAFG]-[AG]-C (using Prosite syntax), was termed G+LPP (Table 3).

S. pyogenes sequences in the SWISS-PROT/TrEMBL database containing the PS00013 pattern were identified and compared to those identified in a similar pattern search with G+LPP (Table 4). The PS00013 search identified 36 sequences, of which nine (25%) were excluded as unlikely Lpps (false-positives) using the criteria described herein. The G+LPP pattern search again proved more discriminatory than PS00013: 26 sequences were identified, of which only one (4%) was considered an unlikely Lpp. Thus eight out of nine (89%) of the unlikely Lpps initially identified using PS00013 were excluded (Table 4).
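The percentages quoted for the two searches can be checked with a few lines of Python. The counts are the ones given in the text; the helper name `percent` is a hypothetical convenience.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts quoted in the text for the S. pyogenes searches.
ps00013_hits, ps00013_unlikely = 36, 9
glpp_hits, glpp_unlikely = 26, 1

# Share of each search's hits that were judged unlikely Lpps.
print("PS00013:", percent(ps00013_unlikely, ps00013_hits), "%")
print("G+LPP:  ", percent(glpp_unlikely, glpp_hits), "%")

# Share of the PS00013 false-positives that G+LPP no longer reports.
removed = ps00013_unlikely - glpp_unlikely
print("removed:", percent(removed, ps00013_unlikely), "%")
```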

Signal peptide features can be described using 'pattern expressions' written in Prosite syntax as shown in Table I. These patterns can be used for the bioinformatic identification of bacterial lipoproteins. '<' indicates the pattern is restricted to the N-terminus of the protein and at each position thereafter the amino acids shown are either permitted (square brackets) or prohibited (curly brackets). X is any amino acid. Where stretches of amino acids can vary in length, the range is indicated in parentheses. The original G+LPP pattern was described by analysis of the signal peptide features of 33 experimentally verified lipoproteins [54]. An extended dataset of 90 experimentally verified lipoprotein signal peptides was used to revise this pattern (G+LPPv2; [55]). The essential cysteine is considered the +1 position and, along with amino acids at positions -3 to -1, constitutes the 'lipobox'.
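The syntax rules just listed map directly onto regular expressions. The following is a minimal sketch of such a translation in Python, covering only the elements described here ('<', square and curly brackets, X, and ranges in parentheses). The function name `prosite_to_regex` is hypothetical, the example peptide is made up, and other Prosite syntax features are deliberately not handled.

```python
import re

# One pattern element: [permitted], {prohibited}, or a single letter,
# optionally followed by a repeat count (n) or range (n,m).
ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple Prosite pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")
    anchored = pattern.startswith("<")  # '<' restricts the match to the N-terminus
    if anchored:
        pattern = pattern[1:]
    parts = []
    for element in pattern.split("-"):
        m = ELEMENT.fullmatch(element.strip())
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        residue, low, high = m.groups()
        if residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"  # prohibited residues
        elif residue in ("x", "X"):
            residue = "."                         # any amino acid
        if low is not None:
            residue += "{" + low + ("," + high if high else "") + "}"
        parts.append(residue)
    return ("^" if anchored else "") + "".join(parts)


GLPP = "<[MV]-X(0,13)-[RK]-{DERKQ}(6,20)-[LIVMFESTAG]-[LVIAM]-[IVMSTAFG]-[AG]-C"
GLPP_REGEX = prosite_to_regex(GLPP)
print(GLPP_REGEX)

# Hypothetical signal peptide, for illustration only.
match = re.match(GLPP_REGEX, "MKKTLLALALSALLLAGCGN")
print(match.end() if match else None)  # 1-based position of the +1 cysteine
```

Because the pattern ends at the cysteine, the end of the match gives the 1-based position of the +1 residue, which is the point at which a mature lipoprotein would begin.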

The Prosite profile PS51257 (originally pattern PS00013) is based on the analysis of signal peptides from Gram-negative and other bacteria [53] and is notably more relaxed in the -2 and -3 positions.


5. Specific Lipoproteins of Streptococcus pyogenes

Lipoproteins identified by the bioinformatic approach, using patterns that match the signal peptide sequence, and assigned predicted functions may contribute to the virulence of Streptococcus pyogenes.

Using both lists of experimentally verified lipoproteins (Sutcliffe and Harrington, 2002; Lei et al., 2004), I will discuss how these lipoproteins contribute to pathogenesis or to immune protection.

Spy0385 is the lipoprotein component of an ABC transporter encoded by spy0383-spy0386 that is homologous to a ferrichrome ABC transporter, which suggests that Spy0385 is involved in iron acquisition. Moreover, growth in iron-depleted medium results in up-regulation of transcription of spy0385 [50]. Iron is essential for GAS growth and also regulates expression of virulence factors [51-53]. Inhibition of iron uptake might be the basis whereby antibodies raised against Spy0385 detrimentally alter the growth of GAS in blood in vitro. Of note, proteins involved in iron acquisition in other human pathogens are under intense investigation as potential therapeutic targets [30-36, 54].

Spy1390 is a homologue of the protease maturation protein PrtM of L. lactis [49]. PrtM is essential for the production of active forms of the serine protease PrtP. Of note, immunization of mice with the PrtM homologue of Streptococcus pneumoniae elicits opsonophagocytic antibodies and a protective immune response [55].



The spy1390 gene is possibly in an operon with spy1392 that encodes a putative permease, but no ATP-binding protein gene is present nearby.

Spy1558 is a thioredoxin homologue and probably is involved in the repair of oxidative damage to proteins [56].

Spy1274 is a component of a putative amino acid ABC transporter and was the only transporter for amino acids/peptides and sugars in the study of Lei et al. (2004) that induced antibodies that promote significant bactericidal activity. Spy1245 is part of a putative phosphate transporter, and homologues of this protein have not been reported to be protective antigens in other bacterial pathogens.

Spy2007 is a putative laminin adhesion lipoprotein that helps S. pyogenes bind to host epithelial cells. It mediates contact between the host and S. pyogenes and plays a role in diseases such as pharyngitis, impetigo, scarlet fever, and streptococcal toxic shock-like syndrome, all of which require adhesion to the epithelial cells of the host (Terao et al., 2001).

Spy2037 has regions of homology with PrtM, a protease maturation protein of Lactococcus lactis [49]. PrtM is a lipoprotein involved in the proteolytic system of Lactococcus lactis, which supports bacterial growth by supplying nitrogen through the breakdown of proteins into amino acids.


Spy0778, Spy1094, Spy1592, and Spy2032 appear to be the binding proteins of 4 putative ABC transporters.

The transporters containing Spy0293 [46] and Spy2000 [47] function in uptake of oligopeptides and dipeptides, respectively. Recombinant Spy0453 has been shown to bind Zn2+, Cu2+, and Fe3+ [48].

Three putative non-ABC transporter lipoproteins (Spy0457, Spy1882, and Spy2066) have homology with peptidyl-prolyl cis-trans isomerase, acid phosphatase, and dipeptidase enzymes, respectively.

Spy1094 and Spy2007 may contribute to pathogen adhesion to the host. Spy1390 and Spy2037 have regions of homology with a protease maturation protein of Lactococcus lactis [49].

Lei et al. (2004) identified 30 putative GAS lipoproteins by searching for proteins with features of lipoprotein signal sequences among the streptococcal extracellular proteins identified by hydrophilicity plot analysis. Sutcliffe and Harrington [44] recently identified 28 putative lipoprotein genes of GAS using a modified version of the Prosite consensus pattern PS00013 (http://www.expasy.ch/prosite/) for lipoprotein signal sequences, <[MV]-X(0,13)-[RK]-{DERKQ}(6,20)-[LIVMFESTAG]-[LVIAM]-[IVMSTAFG]-[AG]-C.

Four of the proteins identified by Sutcliffe and Harrington (Spy0247, Spy0351, Spy0857, and Spy1792) were not included in the list of Lei et al.

Lei et al. excluded Spy0247 and Spy0351, as well as another protein, Spy1315, because they appear to be integral membrane proteins with several transmembrane domains. Spy0857 has a secretion signal sequence typical of many conventional secreted proteins, and Spy1792 has an LPXTG motif at the carboxy terminus characteristic of cell wall-linked proteins of Gram-positive bacteria [45].

Conversely, Spy0163, Spy0457, Spy0778, Spy1094, Spy1592, and Spy2032 were not identified as putative lipoproteins by Sutcliffe and Harrington [44].

Spy0778, Spy1094, Spy1592, and Spy2032 appear to be the binding proteins of 4 putative ABC transporters. These discrepancies demonstrate that several bioinformatic strategies are useful in searches designed to identify proteins with particular characteristics of interest.
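The overlap between the two lists follows from the counts and discrepancies reported above. The short Python check below uses only the totals and locus names given in the text; the variable names are hypothetical, and the full membership of each list is not reproduced here.

```python
# Proteins reported by only one of the two studies (names from the text).
only_sutcliffe = {"Spy0247", "Spy0351", "Spy0857", "Spy1792"}
only_lei = {"Spy0163", "Spy0457", "Spy0778", "Spy1094", "Spy1592", "Spy2032"}

# List sizes reported in the text.
total_sutcliffe, total_lei = 28, 30

# The shared set must be the same size whichever list it is derived from.
shared_from_sutcliffe = total_sutcliffe - len(only_sutcliffe)
shared_from_lei = total_lei - len(only_lei)
combined = shared_from_lei + len(only_sutcliffe) + len(only_lei)

print("shared:", shared_from_sutcliffe, shared_from_lei)
print("distinct candidates across both lists:", combined)
```

Both subtractions give the same number of shared proteins, so the reported totals and discrepancies are mutually consistent.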

Inferred functions of the putative lipoproteins.

Sixteen of the 30 genes identified by Lei et al. are part of putative operons that also encode 1 or 2 permeases and an ATP-binding protein. These lipoproteins appear to be the binding proteins of putative ABC transporters (table 2) that constitute uptake systems for amino acids, peptides, sugars, metal ions, heme, phosphate, or unknown nutrients.
