Streptococcus Pyogenes Bacterium Human Pathogen Biology Essay

Streptococcus pyogenes, a human pathogen known worldwide, is also called the Group A Streptococcus (GAS). It is a Gram-positive, non-motile, spherical bacterium that grows in long chains or pairs (Figure 1A). S. pyogenes is distinguished from other streptococci by the Lancefield Group A carbohydrate on its cell wall, and it produces a large zone of beta-haemolysis (complete disruption of erythrocytes with release of haemoglobin) when grown on blood agar plates (Figure 1B). It is therefore known as the Group A (beta-haemolytic) Streptococcus.

Gram-positive bacteria lack a retentive outer membrane and have therefore evolved several mechanisms for anchoring proteins in their membrane. A major one is N-terminal lipidation, which anchors proteins in the bacterial membrane; proteins modified in this way are known as bacterial lipoproteins (Lpps). The N-terminus of a bacterial lipoprotein is modified with an N-acyl diacylglyceryl group (Sutcliffe and Harrington 2002).

Lipid modification enables bacterial proteins to carry out their functions efficiently at the interface between the cell wall and the environment (Braun et al. 1993). Lpps perform a wide range of critical functions: they act as substrate-binding proteins (SBPs) in ABC transporter systems and take part in antibiotic resistance, cell signalling, protein export and folding, sporulation and germination, conjugation and various other processes (Sutcliffe and Russell 1995).

Thus, the functions of the lipoproteins of Gram-positive bacteria are comparable to those of the surface proteins of Gram-negative bacteria. For example, the substrate-binding proteins of the ABC transporter system are typically Lpps in Gram-positive bacteria as well as in Gram-negative bacteria (Sutcliffe and Russell 1995).

Structure of signal peptides from bacterial lipoproteins

Proteins exported across the cell membrane by either of the two export pathways require an N-terminal signal peptide for recognition. There are two types of signal peptide, Type I and Type II.

Figure 2: Type I and Type II signal peptides via Sec and Tat Dependent Transport (a) General Secretory Pathway, (b) Tat System

Both Type I and Type II signal peptides are composed of three distinct segments: a positively charged amino-terminal N-region, a central hydrophobic H-region and a more polar carboxy-terminal C-region, which contains the cleavage site (von Heijne 1989).

The Type I signal peptide contains a recognition motif (A-X-A, where X can be any amino acid) for cleavage by type I signal peptidase. The Type II signal peptide contains the recognition motif L(-3)-[A/S/T](-2)-[G/A](-1)-C(+1) for type II signal peptidase; this cleavage site is often referred to as the lipoprotein ‘lipobox’.

Signal peptides recognised by the Tat system differ from those of the Sec pathway in having a conserved S-R-R-X-F-L-K sequence between the N-region and the H-region, within which the twin-arginine (RR) motif is almost invariant (Figure 2).
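The three motifs described in this section can be written as regular expressions. The following is a minimal sketch in Python, assuming fixed-string versions of the motifs exactly as given in the text; the function name, the 40-residue search window and the example sequence are illustrative assumptions and do not come from any published tool.

```python
import re

# Simplified motif definitions taken from the text (assumption: real
# predictors use statistical models rather than fixed regular expressions).
MOTIFS = {
    "type_I_AXA": re.compile(r"A.A"),          # A-X-A, X = any residue
    "lipobox": re.compile(r"L[AST][GA]C"),     # L(-3)-[A/S/T](-2)-[G/A](-1)-C(+1)
    "twin_arginine": re.compile(r"SRR.FLK"),   # S-R-R-X-F-L-K
}

def find_motifs(protein, window=40):
    """Return the 1-based start of each motif in the N-terminal window, or None."""
    n_terminus = protein.upper()[:window]
    positions = {}
    for name, pattern in MOTIFS.items():
        match = pattern.search(n_terminus)
        positions[name] = match.start() + 1 if match else None
    return positions

if __name__ == "__main__":
    # Hypothetical sequence written for this example, not a real protein.
    print(find_motifs("MKKTLLSLALVAGLLTGCSNQ"))
```

Because A-X-A is a very short motif, it will match many sequences by chance, which is one reason practical tools combine motif searches with checks on the N-, H- and C-regions.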

Translocation across the cellular membrane

To reach their site of function, a significant proportion of newly synthesised proteins must be translocated to the outside of the cell. This is done mainly via the general secretory (Sec) pathway and the Tat (twin-arginine protein transport) system. The two systems differ in whether the translocated protein is unfolded (Sec) or folded (Tat).

Of the two pathways, the Sec pathway is the predominant route for protein transport across the cytoplasmic membrane (Driessen and Nouwen 2008). Proteins exported by the Sec pathway are in an unfolded conformation and must be synthesised with an N-terminal signal peptide, which is excised at a later stage of export by a signal peptidase located in the cytoplasmic membrane. In some, but not all, Gram-positive bacteria, certain putative lipoproteins have been found to be exported across the cytoplasmic membrane in an unfolded state via the SecA2-dependent accessory pathway (Lenz, Mohammadi et al. 2003; Gibbons, Wolschendorf et al. 2007).

In the Tat system, proteins in a folded conformation, even oligomeric proteins, are transported across the cytoplasmic membrane (Sargent, Berks et al. 2006). Proteins bearing an N-terminal signal peptide that contains the invariant and essential twin-arginine motif are targeted to the Tat system for export. That lipoprotein precursors exported through the Tat system are fully folded was confirmed by an analysis of dimethyl sulphoxide (Dms) reductase in Gram-negative bacteria (Gralnick, Vali et al. 2006).

This indicates that the Tat system is fundamentally different from the Sec pathway in that it requires proteins to be folded before they cross the cellular membrane.

Both translocation pathways require proteins to bear an N-terminal signal peptide, which is needed for recognition and export across the membrane. Proteins destined for lipidation contain a motif in their signal peptide, the lipobox, which directs them to the lipoprotein biogenesis machinery after export via either the Sec pathway or the Tat system.

Lipoprotein Biogenesis

To become fully functional, newly synthesised pro-lipoproteins exported across the cytoplasmic membrane via the Sec pathway or the Tat system are channelled to the lipoprotein biogenesis machinery. This channelling is guided by the conserved lipobox motif in the signal peptide.

Figure 3: Biogenesis of Lipoproteins

Lipoprotein biogenesis in Gram-negative bacteria is a three-step pathway that proceeds in a strict order. Once the protein has been exported across the cellular membrane via the Sec or Tat system, guided by its Type II signal peptide, the conserved cysteine residue within the lipobox is modified with a diacylglycerol group attached through a thioether linkage. This reaction is catalysed by the enzyme prolipoprotein diacylglyceryl transferase (Lgt), using phospholipid substrates (Qi, Sankaran et al. 1995; Sankaran, Gupta et al. 1995).

Once the conserved cysteine has been lipidated, the signal peptide is cleaved within the lipobox by the specific lipoprotein signal peptidase II (Lsp), leaving the lipidated cysteine as the N-terminus of the mature bacterial lipoprotein (Sankaran and Wu 1995). These two steps have been confirmed to be both essential and sufficient for protein lipidation in Gram-positive bacteria.
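The outcome of the first two steps (lipidation of the lipobox cysteine by Lgt, then cleavage by Lsp immediately before that cysteine) can be illustrated on a sequence. This is a minimal sketch only: it assumes the simplified lipobox L-[A/S/T]-[G/A]-C given earlier in the essay, and the function name, the returned keys and the example sequence are hypothetical.

```python
import re

# Simplified lipobox from the text; real lipoboxes are more variable.
LIPOBOX = re.compile(r"L[AST][GA]C")

def process_prolipoprotein(precursor):
    """Split a precursor at the lipobox cysteine, mimicking Lsp cleavage.

    Returns None when no lipobox is found. The lipid group added by Lgt is
    not modelled; only the position of the modified cysteine is reported.
    """
    seq = precursor.upper()
    match = LIPOBOX.search(seq)
    if match is None:
        return None
    cys_index = match.end() - 1              # 0-based index of the +1 cysteine
    return {
        "signal_peptide": seq[:cys_index],   # removed by Lsp
        "mature": seq[cys_index:],           # begins with the lipidated Cys
        "cysteine_position": cys_index + 1,  # 1-based position in the precursor
    }

if __name__ == "__main__":
    # Hypothetical precursor written for this example.
    print(process_prolipoprotein("MKKTLLSLALVAGLLTGCSNQ"))
```

The mature sequence always begins with cysteine, which is the residue that carries the diacylglycerol group in the processed lipoprotein.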

The third step of the biogenesis pathway is fatty acylation of the amino group of the N-terminal diacylglycerol-modified cysteine to form N-acyl diacylglyceryl cysteine, catalysed by the enzyme lipoprotein N-acyl transferase. This enzyme is not universally conserved: no homologues are found in the genomes of low G+C Gram-positive bacteria (Tjalsma, Kontinen et al. 1999). In Gram-negative bacteria it is an essential step, which allows lipoproteins to be released from the plasma membrane and transported to the outer membrane via the Lol (lipoprotein localisation) pathway.

In the lipoprotein biosynthesis pathway of Gram-positive bacteria, the first two steps are similar to those of Gram-negative bacteria. Intriguingly, recent studies have noted that Lsp shows non-specific cleavage activity towards non-lipidated substrates in Listeria monocytogenes as well as in Streptococcus agalactiae (Baumgartner, Karst et al. 2007; Henneke, Dramsi et al. 2008; Bray, Sutcliffe et al. 2009).

In the Gram-positive bacterium Listeria monocytogenes, the non-lipidated lipoproteins were still cleaved by Lsp after the lgt gene had been deliberately deleted. This shows that lipidation is not essential for cleavage by Lsp and that Lsp can act on non-lipidated substrates in some Gram-positive bacteria (Baumgartner, Karst et al. 2007).

In Streptococcus agalactiae (GBS), when both the lgt and lsp genes were inactivated, the non-lipidated lipoproteins were still cleaved, in this case by the type I signal peptidase. This indicates that alternative peptidases can process lipoproteins when the enzymes of the biosynthesis pathway are inactivated: the type I signal peptidase, or Lsp when only Lgt is inactivated (Henneke, Dramsi et al. 2008).

Taken together, these recent studies suggest that the lipoprotein biogenesis pathway in Gram-positive bacteria does not proceed in a strict order and that alternative processing routes exist when the biosynthetic enzymes are inactivated.

Bioinformatic Prediction of Lipoprotein in Gram-positive Bacteria

Because the features of signal peptides are well established and the lipobox with its invariant cysteine can be recognised, bioinformatic analysis of signal peptides can identify potential lipoproteins in Gram-positive bacteria.

Sequence analysis of signal peptides from Gram-positive and Gram-negative bacteria showed that lipoprotein signal peptides tend to be shorter than secretory signal peptides, because the c-region is shorter and contains apolar amino acids. The c-region therefore appears to be a continuation of the hydrophobic domain, a conclusion based primarily on the sequence conservation preceding the invariant lipid-modified cysteine (von Heijne 1989).

Using signal peptide sequences that contain the lipobox, a consensus pattern describing the sequence motif that determines lipidation was constructed in Prosite syntax as {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C to recognise bacterial lipoprotein sequences. In this pattern, the square brackets list the amino acids allowed at positions -4 to -1 preceding the cysteine, and {DERK} indicates the charged residues (D, E, R or K) that must be absent from the h-region. The pattern carries two additional rules: the cysteine must lie between positions 15 and 35, and there must be an arginine or lysine within the first seven positions of the sequence, so that the pattern is aligned with an N-terminal sequence whose n-region is characteristic of a signal peptide (Sutcliffe and Harrington 2002).
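The pattern and its two additional rules translate directly into a regular expression plus two positional checks. The sketch below is a hand translation written for illustration, assuming the pattern and rules exactly as quoted above; it is not the Prosite scanning software, and the function name and example sequences are hypothetical.

```python
import re

# {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C as a regular expression.
# The lookahead lets overlapping candidate matches be examined one by one.
PS00013_CORE = re.compile(
    r"(?=([^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C))"
)
CORE_LENGTH = 11  # 6 + 2 + 1 + 1 + 1 residues, ending at the cysteine

def ps00013_cysteine_position(protein):
    """Return the 1-based position of the lipobox cysteine, or None."""
    seq = protein.upper()
    # Rule: an arginine or lysine within the first seven positions.
    if not re.search(r"[KR]", seq[:7]):
        return None
    for match in PS00013_CORE.finditer(seq):
        cys_position = match.start() + CORE_LENGTH
        # Rule: the cysteine must lie between positions 15 and 35.
        if 15 <= cys_position <= 35:
            return cys_position
    return None

if __name__ == "__main__":
    # Hypothetical sequence written for this example.
    print(ps00013_cysteine_position("MKKTLLSLALVAGLLTGCSNQ"))
```

Keeping the positional rules outside the regular expression makes each rule easy to relax or tighten independently when the pattern is adapted to another group of bacteria.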

A large number of putative Lpps have been identified through molecular genetic studies, and quite a number of these could be false positives, because a cysteine can also occur within the signal peptide of exported proteins or of proteins targeted for insertion into the plasma membrane. It was also noted that the stretch of amino acids preceding the invariant cysteine differs between the signal peptides of different bacterial taxa.

To improve the bioinformatic prediction of lipoproteins, Sutcliffe and Harrington (2002) created a dataset of experimentally verified Gram-positive lipoproteins. These lipoproteins were identified on the basis of several approaches: (1) metabolic labelling with radiolabelled fatty acid (palmitate); (2) inhibition of Lsp (lipoprotein signal peptidase) by the antibiotic globomycin; (3) biochemical characterisation of the purified protein; and (4) evidence that protein processing is disrupted by mutation of either Lgt or Lsp, or by site-directed mutagenesis replacing the lipobox cysteine. Using these criteria together with an extensive review of the scientific literature, 33 proteins were identified as proven bacterial lipoproteins.

To further validate the 33 lipoproteins identified, several other bioinformatic sequence analyses were performed. Bacterial Lpp sequences were obtained from the Prosite website, with searches restricted to Bacillus subtilis or S. pyogenes. Membrane-spanning domains (MSDs) in these sequences were predicted with the TMpred program, with the minimum length of the hydrophobic domain set to 14 aa, and the signal peptide sequences were analysed using SignalP 2.0 (the refined hidden Markov model version). For further clarification of the Lpp sequences, the TopPred2 (transmembrane predictor) and DAS programs were used.
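The idea of scanning for a hydrophobic domain of at least 14 aa can be illustrated with a simple sliding-window hydropathy average. The sketch below is not TMpred, which uses its own scoring matrices; it swaps in the Kyte-Doolittle hydropathy scale as a stand-in, and the threshold of 1.6, the function name and the example sequences are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values (used here as a stand-in for TMpred scoring).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_windows(protein, window=14, threshold=1.6):
    """List (1-based start, mean hydropathy) for windows at or above threshold."""
    seq = protein.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        # Unknown residue codes contribute 0.0 (an arbitrary neutral choice).
        mean = sum(KYTE_DOOLITTLE.get(aa, 0.0) for aa in segment) / window
        if mean >= threshold:
            hits.append((start + 1, round(mean, 2)))
    return hits

if __name__ == "__main__":
    # Hypothetical sequence written for this example.
    print(hydrophobic_windows("MKKTLLSLALVAGLLTGCSNQ"))
```

A window that scores above the threshold marks a candidate h-region or MSD; whether the lipobox cysteine falls at the end of that stretch or inside it is the kind of check the dedicated programs were used for.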

To exclude false-positive Lpps, the N-terminal sequences of the bacterial Lpps were analysed individually using TMpred and SignalP. Sequences that clearly lacked an MSD, or in which the most N-terminal MSD extended beyond the invariant cysteine, were regarded as possible false positives. TMpred alone was not sufficient, as the proven Lpps CtaC and QoxA contain two additional MSDs beyond their N-terminal lipid anchors. To counter this, SignalP was used to analyse the sequences: Lpps in which signal peptide features were absent, and/or in which the lipobox sequence was internal to an h-region or MSD, were confirmed as false positives. These sequences were examined further using TopPred2 and DAS, and a consensus view was taken of the position of the invariant cysteine relative to the first predicted MSD.

The analysis of signal peptide lipobox features with these programs confirmed the results of previous studies: leucine occurred at high frequency at the -3 position, and alanine or serine at the -2 position, of the lipobox. Compared with the PS00013 pattern, there were clear deviations and restrictions: alanine and glycine were the only amino acids found at the -1 position, and two proven Lpps had no arginine or lysine within the first seven amino acids, which contradicts the PS00013 rules.

Analysis of the signal peptides of the 33 experimentally verified lipoproteins showed that the n-regions had a mean length of 6.7 ± 3.5 aa (range 3-15 aa) and the h-regions a mean length of 12.1 ± 2.3 aa (range 6-20 aa). These values agree with the h-region features reported for the putative Lpps of B. subtilis. The mean position of the invariant cysteine was 24.0 ± 3.6 (range 17-33), which indicates that lipoprotein signal peptides are typically shorter than the signal peptides that direct protein export in Gram-positive bacteria. The mean combined length of the h- and c-regions was 17.1 aa, which is sufficient to span a typical bilayer membrane. These data suggest that the conserved residues are positioned at the outer face of the cytoplasmic membrane, where the Lgt enzyme interacts with the invariant cysteine in the lipobox.

Because the PS00013 pattern conflicts with certain proven Lpps of Gram-positive bacteria, and because further discrepancies are likely to arise from differences in signal peptide features between bacterial taxa, a modified pattern, G+LPP, was constructed to identify the 33 proven bacterial Lpps. In Prosite syntax, the G+LPP pattern is <[MV]-X(0,13)-[RK]-{DERKQ}(6,20)-[LIVMFESTAG]-[LVIAM]-[IVMSTAFG]-[AG]-C.
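Written as a regular expression, the G+LPP pattern makes its constraints easy to test on individual sequences. The sketch below is a hand translation of the pattern as quoted above, in which the leading "<" anchors the match to the N-terminus; the function name and the example sequences are hypothetical and were written for illustration.

```python
import re

# <[MV]-X(0,13)-[RK]-{DERKQ}(6,20)-[LIVMFESTAG]-[LVIAM]-[IVMSTAFG]-[AG]-C
G_LPP = re.compile(
    r"^[MV]"            # "<" anchors the start codon residue at position 1
    r".{0,13}"          # X(0,13): up to 13 residues of any kind
    r"[RK]"             # a positively charged residue closing the n-region
    r"[^DERKQ]{6,20}"   # h-region without D, E, R, K or Q
    r"[LIVMFESTAG]"     # position -4
    r"[LVIAM]"          # position -3
    r"[IVMSTAFG]"       # position -2
    r"[AG]"             # position -1
    r"C"                # invariant cysteine at +1
)

def glpp_cysteine_position(protein):
    """Return the 1-based position of the lipobox cysteine, or None."""
    match = G_LPP.match(protein.upper())
    return match.end() if match else None

if __name__ == "__main__":
    # Hypothetical sequences: the second has a glutamine in the h-region.
    print(glpp_cysteine_position("MKKTLLSLALVAGLLTGCSNQ"))
    print(glpp_cysteine_position("MKKTLLSQALVAGLLTGCSNQ"))
```

The second sequence is rejected because {DERKQ} excludes glutamine from the h-region, a restriction that the PS00013 pattern, which excludes only D, E, R and K, does not impose.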

Compared with the PS00013 pattern, the G+LPP pattern discriminated better against false-positive Lpp sequences when tested on the B. subtilis genome. The PS00013 search identified 103 putative Lpps, whereas the G+LPP search identified 61 probable Lpps together with the 6 proven Lpps of that organism. The G+LPP pattern can therefore be used to predict bacterial Lpps with greater confidence.

Both the Prosite pattern and the G+LPP pattern were applied to the S. pyogenes genome, retrieved from the SWISS-PROT/TrEMBL database. The Prosite pattern search identified 36 sequences, of which 9 were excluded as unlikely Lpps, while the G+LPP pattern search identified 26 sequences, of which only one was considered an unlikely Lpp. Thus, 8 of the 9 unlikely Lpps identified by PS00013 were excluded by G+LPP.

Both search patterns identified the previously proven Lpp LppC, as well as several other Lpps that had already been identified and proven. In total, 24 Lpps were identified in the S. pyogenes genome by the pattern searches, representing 1.5% of the S. pyogenes proteome, which is comparable to the 36 Lpps identified in the S. pneumoniae genome.

Apart from the Lpps identified by both patterns, some sequences were picked up as putative Lpps by one pattern only. The PS00013 search identified three putative Lpp sequences, Spy1972, Spy1361 and Spy2066, that were not found by the G+LPP pattern. The n-region of the Spy1972 signal sequence is of unusual length, and the protein contains an LPXTG motif at its C-terminus. Spy1361 contains a glutamine residue within the h-region, and the signal sequence features of Spy2066 are not clear. Because of these differences they were not picked up by the G+LPP pattern, and further evidence is needed to show that they are indeed Lpps. Likewise, the G+LPP search identified one sequence, Spy0903, that was missed by the PS00013 pattern because of the extended features of its n-region.

Lpp signal sequences missed by both pattern searches were sought using a combination of strategies: analysis of the S. pyogenes genome annotation, homology searches with pneumococcal Lpps, PEDANT searches and low-stringency BLAST searches. These searches identified six further possible Lpps: Spy0163, Spy1592, Spy0778, Spy1306, Spy0457 and Spy2033.

Four of these possible Lpps are substrate-binding proteins (SBPs). Spy0163 is a paralogue of Spy1228; after the n-region of its signal sequence was refined using an alternative start methionine and lysine, it was accepted by the G+LPP pattern, and the motifs were proven by Rosati et al. Spy1592 was excluded by the pattern search because it contains an asparagine at the -4 position of the signal sequence. However, the ORF in the S. pyogenes genome contains a serine at this position, which indicates that it is indeed an Lpp in some strains. Likewise Spy0457, a peptidyl-prolyl isomerase of the cyclophilin family, contains an asparagine at the -4 position, but it is highly homologous to the pneumococcal Sp0771, which suggests that it may assist in the folding of exported proteins.

Both Spy0778 and Spy1306 were excluded by the pattern search because they contain a proline at the -4 position, and further evidence is needed for them. Spy2033 has an abnormally long signal sequence of 64 aa but, intriguingly, its h-region ends at the invariant lipobox cysteine, which agrees with the general consensus for a typical Lpp signal peptide. An alternative start at M41 is consistent with the sequence alignment with its homologue, the Streptococcus cristatus putative Lpp TptA, and this sequence warrants further verification.