Evaluation of the bioethanol fermentation parameters: As mentioned above, unlike other fermentation processes, Brazilian distilleries recycle yeast cells during bioethanol fermentation. However, when industrial fermentation conditions are reproduced in laboratory assays, the first cycle is not considered useful for analysis because it is a round of yeast adaptation, and the data are therefore generally highly variable (Argueso et al., 2009). Accordingly, yeast cells were sampled in the second fermentative cycle rather than in the first (adaptation) cycle, after the harsh acid treatment (see Figure 1). To validate the cellular material that would be used for microarray hybridization analysis, we measured several parameters to verify the fermentation efficiency (Table 1 and Figure 1B). By the end of the 9-hour fermentation process, the ethanol yield and the cell viability were about the same for both industrial strains (Table 1). The consumption of sulfuric acid to adjust the pH during the yeast treatment step, the accumulation of residual sugars, the production of glycerol and ethanol, and the accumulated CO2 loss were also very similar for both strains (Table 1 and Figure 1B). All these parameters indicated that the ethanol fermentation process was proceeding successfully and also showed the superior ability of these industrial strains to produce ethanol, because they use sugars derived from sugarcane more efficiently.
Microarray hybridization analysis: To gain insight into which pathways are modulated during bioethanol fermentation by the industrial strains PE-2 and CAT-1, we determined the transcriptional profile of these strains. Total RNA extracted from these cultures was used to amplify fluorescent-labeled cRNAs for competitive microarray hybridizations. We compared the mRNA expression of the PE-2 and CAT-1 strains grown for 3 and 9 hours (feeding and fermentation) with the 0-hour growth (reference post-acid treatment; Figure 1A). In these experiments, the main aim was to focus on genes that have increased or decreased mRNA accumulation. The full dataset was deposited in the Gene Expression Omnibus (GEO) of the National Center for Biotechnology Information (NCBI) under the accession number GSE26619 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE26619). In the CAT-1 and PE-2 strains, we observed 1,591 genes modulated in at least one time point when compared to the reference time [genes with log ratios ≥ 1 (592 genes) and ≤ -1 (999 genes), respectively, t-test, p-value 0.01]. These genes are involved in a variety of cellular processes with increased or decreased mRNA expression (Supplementary Tables 1S and 2S show the 1,591 genes with log ratios ≥ 1 or ≤ -1, respectively). These genes were classified into Slim Mapper functional categories (http://www.yeastgenome.org; Supplementary Tables 1S and 2S). We investigated in more detail which gene ontology (GO) categories were enriched among the genes whose mRNA accumulation was decreased or increased in both strains (Table 2). In the genes with a decreased mRNA accumulation, there is a significant enrichment for the following GO categories (repressed versus induction conditions): mitochondrion organization (16.9 % versus 2.0 %), cellular lipid metabolic process (6.0 % versus 1.0 %), and transport (7.0 % versus 1.9 %). 
In contrast, in the genes with an increased mRNA accumulation, there is a significant enrichment for the following GO categories (repressed versus induction conditions): cytoskeleton organization (6.0 % versus 0.5 %), fungal-type cell wall organization (1.9 % versus 0.5 %), DNA metabolic process (7.1 % versus 2.0 %), and translation and ribosome biogenesis (10.0 % versus 1.0 %).
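The selection of modulated genes described above can be illustrated with a minimal sketch. It assumes a symmetric cutoff of ±1 on the log ratio relative to the 0-hour reference and omits the t-test step; the gene labels and values are hypothetical and are not taken from GSE26619. For simplicity, a gene that crosses both cutoffs at different time points is labeled as induced here, which may not match how the original analysis handled such cases.

```python
# Hypothetical log expression ratios (time point versus 0-hour reference).
# Gene labels and values are illustrative only, not data from GSE26619.
log_ratios = {
    "GENE_A": {"3h": 1.4, "9h": 0.2},
    "GENE_B": {"3h": -0.3, "9h": -1.8},
    "GENE_C": {"3h": 0.1, "9h": -0.4},
}


def classify(ratios, cutoff=1.0):
    """Label a gene by whether any time point crosses the (assumed) cutoff."""
    if any(value >= cutoff for value in ratios.values()):
        return "induced"
    if any(value <= -cutoff for value in ratios.values()):
        return "repressed"
    return "unchanged"


# Classify every gene that has at least one time point beyond the cutoff.
calls = {gene: classify(ratios) for gene, ratios in log_ratios.items()}
modulated = sorted(gene for gene, call in calls.items() if call != "unchanged")
print(calls)
print(modulated)
```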
The 1,591 modulated genes were also analyzed with a K-means algorithm in an attempt to cluster genes according to the similarities in their expression profiles. Their distribution into 50 distinct clusters shows some genes with minor alterations in their expression levels, while a large number were more severely up- or down-regulated at one or more steps of the fermentation process (Figure 2). We focused our attention on a few clusters that seemed to contain genes with the most intense and consistent up- and down-regulation profiles during the feeding and fermentation steps (Figure 2, clusters 1, 2, 3, 4, 8, and 34 [as counted from left to right, beginning at the top left panel; indicated with bold lines]). In cluster 3, the genes have increased mRNA accumulation during the feeding and fermentation steps, but their accumulation is higher during the fermentation step (Figure 3A). In this cluster, there is an increased mRNA accumulation of HSP26, encoding a small heat shock protein with chaperone activity that suppresses the aggregation of unfolded proteins and that is expressed in stressed cells (Bossier et al. 1989), and of ERD2, encoding an HDEL receptor, an integral membrane protein that binds to the HDEL motif in proteins destined for retention in the endoplasmic reticulum (ER) and plays a role in maintaining normal levels of ER-resident proteins (Semenza et al. 1990). In the same cluster, we also observed CWP2, encoding a covalently linked cell wall mannoprotein that is a major constituent of the cell wall, plays a role in stabilizing it, and is also involved in low pH resistance (van der Vaart et al. 1995), and two other genes encoding proteins important for glutathione metabolism, GTT3 and CYS3 (Samanta and Liang 2003).
Clusters 2 and 8 show genes with increased mRNA accumulation during the feeding and fermentation steps, but their accumulation is higher during the feeding step (Figures 3B-C). Most of the genes in these clusters encode proteins important for transcription and protein synthesis, reflecting the intense metabolic activity of these two strains during bioethanol fermentation, mostly in the feeding step (Figures 3B-C). In addition, several genes from the glycolytic pathway, such as PGK1, GLD3, TPI1, GPM1, ENO2, and PYK1, are present in these clusters. PDC1 and PDC5, the major isoforms of the three pyruvate decarboxylase isoenzymes, which are key enzymes in alcoholic fermentation that decarboxylate pyruvate to acetaldehyde (Pronk et al. 1996), are also present in these clusters; the third isoform, PDC6, also shows increased expression and is present in cluster 14 (Figure 2; Supplementary Table 2S). Two subunits (B and G) of the vacuolar H+-ATPase (V-H+-ATPase), encoded by VMA2 and VMA10, also showed increased mRNA accumulation (cluster 8, Figure 3C); in fact, six of the eight subunits of the V-H+-ATPase showed increased mRNA accumulation in our microarray analysis (Rothman et al. 1989; Supplementary Table 2S). Finally, in cluster 8 (Figure 3C), there is an increased mRNA accumulation of UTR2, encoding a chitin transglycosylase that functions in the transfer of chitin to β-1-6 and β-1-3 glucans in the cell wall and that is similar to and functionally redundant with CHR1, which is more highly expressed in cluster 4 (Rodríguez-Peña et al. 2000; Figure 3D). Cluster 4, which shows increased and constant mRNA accumulation in both the feeding and fermentation steps, contains TIR3, encoding a cell wall mannoprotein of the Srp1p/Tip1p family of serine-alanine-rich proteins that is expressed under anaerobic conditions and required for anaerobic growth (Abramova et al. 2001). 
ADH1, encoding the fermentative isoenzyme that is the main alcohol dehydrogenase required for the reduction of acetaldehyde to ethanol, is also more highly expressed (cluster 4; Figure 3D). Finally, in this cluster, two genes encoding proteins important in the cellular response to oxidative stress, TRX1 and AHP1, encoding a cytoplasmic thioredoxin and a thiol-specific peroxiredoxin, respectively, have increased mRNA accumulation (Grant 2001; cluster 4, Figure 3D).
Clusters 1 and 34 show genes with decreased mRNA accumulation during the feeding and fermentation steps, but in cluster 34 their accumulation is higher during the fermentation step (Figures 3E-F). In cluster 1, there are four genes encoding putative hexose transporters (HXT13, -15, -16, and -17) and RGI2, encoding a protein of unknown function involved in energy metabolism under respiratory conditions, whose expression is induced under carbon limitation and repressed under high glucose (Domitrovic et al. 2010; Figure 3E). In cluster 34, ADR1 and CIT2, encoding a carbon source-responsive zinc-finger transcription factor and citrate synthase, respectively, showed decreased mRNA accumulation. ADR1 is required for transcription of the glucose-repressed gene ADH2, of peroxisomal protein genes, and of genes required for ethanol, glycerol, and fatty acid utilization (Young et al. 2003).
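The K-means grouping of expression profiles used for Figure 2 can be sketched as follows. This is a minimal pure-Python illustration, assuming squared Euclidean distance and random initial centroids; the text does not state which implementation, distance measure, or initialization was actually used, and the six two-point profiles (3-hour and 9-hour log ratios) are hypothetical. The analysis described above used 50 clusters; two are used here only to keep the example readable.

```python
import random


def kmeans(profiles, k, n_iter=50, seed=0):
    """Cluster expression profiles (tuples of log ratios) into k groups."""
    rng = random.Random(seed)
    centroids = rng.sample(profiles, k)  # assumed initialization scheme
    clusters = []
    for _ in range(n_iter):
        # Assignment step: attach each profile to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for profile in profiles:
            dists = [
                sum((a - b) ** 2 for a, b in zip(profile, centroid))
                for centroid in centroids
            ]
            clusters[dists.index(min(dists))].append(profile)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical (3-hour, 9-hour) log ratios: three induced, three repressed.
up = [(1.5, 2.5), (1.2, 2.2), (1.4, 2.6)]
down = [(-1.5, -1.0), (-1.8, -1.2), (-1.6, -0.9)]
centroids, clusters = kmeans(up + down, k=2)
for members in clusters:
    print(sorted(members))
```

With profiles this well separated, the two groups are recovered regardless of which profiles are drawn as initial centroids; real data with 50 clusters would be far more sensitive to initialization.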
Systems analysis for bioethanol production by S. cerevisiae CAT-1 and PE-2 strains: The transcriptomics data obtained from the yeast CAT-1 and PE-2 strains subjected to the treatment conditions described in this work prompted us to ask how the repressed or induced genes affect the different biological processes that alter the fermentative process. We therefore searched for potential proteins and/or mechanisms, and their associated biological processes, that are affected by these conditions commonly used in the bioethanol industry. To achieve this goal, two different PPPI networks based on the yeast transcriptomics data were retrieved from the Saccharomyces Genome Database (SGD): one associated with repressed genes (repressed genes-associated PPPI network) and one associated with induced genes (induced genes-associated PPPI network). The induced genes-associated PPPI network obtained from SGD contains 3,273 nodes and 13,053 connectors, while the repressed genes-associated PPPI network contains 4,342 nodes and 20,122 connectors (data not shown).
The induced and repressed genes-associated PPPI networks are highly overlapping, sharing 2,822 common genes, including repressed and induced genes (Figure 4A; Supplementary Table 3S). The repressed genes-associated PPPI network contains 1,520 unique genes (Figure 4A), while the induced genes-associated network contains 451 unique genes (Figure 4A). The high degree of overlap prompted us to merge both networks into a single graph containing 4,793 nodes and 33,175 connectors (union network; Figure 4B). GO analysis of the union network indicates the participation of important biological processes, such as (i) cell cycle, (ii) vesicle-mediated transport, (iii) mitotic cell cycle, (iv) chromosome organization and biogenesis, (v) cytoskeleton organization and biogenesis, (vi) response to stress, (vii) aging, and (viii) DNA repair (Supplementary Table 4S).
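The node and connector counts reported for the two networks and their union can be checked with simple inclusion-exclusion arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Counts reported in the text for the two PPPI networks.
induced_nodes, induced_edges = 3273, 13053
repressed_nodes, repressed_edges = 4342, 20122
shared_nodes = 2822

# Nodes found in only one of the two networks.
induced_only = induced_nodes - shared_nodes
repressed_only = repressed_nodes - shared_nodes

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|.
union_nodes = induced_nodes + repressed_nodes - shared_nodes

# The reported union connector count equals the plain sum of both networks.
union_edges = induced_edges + repressed_edges

print(induced_only, repressed_only, union_nodes, union_edges)
```

The results (451, 1,520, 4,793, and 33,175) agree with the values reported for Figure 4.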
To obtain information about the global network topology, a strongly connected cluster (SCC) subgraph was obtained (Figure 5A), which is formed mainly by induced and repressed genes (Figure 5A; Supplementary Table 3S). The SCC subgraph is composed of 1,221 nodes and 9,028 connectors, representing approximately 26 % of all nodes found in the union network. A GO analysis of the SCC subgraph indicated the presence of biological processes such as cellular respiration, alcohol biosynthetic pathway, aromatic amino acid catabolic processes, and glucan metabolic processes (associated with the maintenance of cell wall integrity), among others (Supplementary Table 4S).
Taking into account the data gathered from this initial systems biology analysis, we sought more information about the major nodes involved in the information flow inside the SCC network using network centralities. Centralities allow us to identify nodes (and the consequent biological processes) that have a relevant position in the overall network architecture (Borgatti 2005), and many have been developed to evaluate the importance of a node for a given network, e.g., node degree, betweenness, and eigenvector measures (Borgatti 2005). Centralities have been applied to quantify the centrality and prestige of actors in social networks (Borgatti 2005) and to understand the structure and properties of complex biological, technological and infrastructural networks (Estrada and Hatano 2010; Estrada 2006). Many of the nodes with elevated centrality values in a given network are important points of vulnerability, indicating that any attack on these nodes could introduce strong perturbations in the network. This graph principle has been exploited to identify proteins that are essential for an organism or that occupy a central position in a biological process (Borgatti 2005; Yu et al. 2007). Thus, in this work, two major network centralities that could be associated with the SCC network were evaluated: node degree and betweenness (Figure 5B).
Node degree represents the simplest centrality measure in a given network, corresponding to the number of nodes adjacent to a given node, where adjacent means directly connected (Scardoni et al. 2009). The node degree represents the "popularity" of a given node, and highly connected nodes in a network are termed hubs. Hubs can be classified into party hubs (nodes that interact with most of their partners simultaneously, like those found within cliques) and date hubs (nodes that interact asynchronously with other nodes, like those found between motifs) (Yu et al. 2007). Betweenness, in turn, is a measure that indicates to what extent a specific node lies between all other nodes within the network (Newman 2005). In a general sense, betweenness shows the influence of a node over the spread of information throughout the network.
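Both centralities can be illustrated on a toy graph. The sketch below is not the software used in this work: it assumes an unweighted, undirected network stored as an adjacency dictionary and computes unnormalized shortest-path betweenness with Brandes' algorithm. The seven-node graph (two triangles joined through node D) is hypothetical.

```python
from collections import deque


def degree(adj):
    """Number of directly connected neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def betweenness(adj):
    """Unnormalized shortest-path betweenness (Brandes' algorithm)."""
    bc = {node: 0.0 for node in adj}
    for source in adj:
        # Breadth-first search counting shortest paths from the source.
        order = []
        preds = {node: [] for node in adj}
        n_paths = {node: 0 for node in adj}
        dist = {node: -1 for node in adj}
        n_paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes inward.
        dep = {node: 0.0 for node in adj}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += n_paths[v] / n_paths[w] * (1 + dep[w])
            if w != source:
                bc[w] += dep[w]
    # Each undirected pair was visited from both ends, so halve the totals.
    return {node: value / 2 for node, value in bc.items()}


# Hypothetical toy network: triangles A-B-C and E-F-G linked through D.
adj = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D", "F", "G"], "F": ["E", "G"], "G": ["E", "F"],
}
deg = degree(adj)
bc = betweenness(adj)
print(deg)
print(bc)
```

In this toy graph, node D has only two neighbours but the highest betweenness, because every shortest path between the two triangles passes through it.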
A bottleneck, in turn, is a local topological feature defined as any node with a high betweenness value, whatever its node degree; such nodes are central points that control the communication between other nodes within the network. Bottlenecks are also the nodes that lie "between" highly interconnected subgraph clusters, so removing a bottleneck could divide a network (Girvan and Newman 2002; Yu et al. 2007). The measures of betweenness and node degree together allow us to define the bottleneck nodes. Bottleneck nodes correspond to highly central proteins that connect several complexes or are peripheral members of central complexes, being important communication points between two complexes (Yu et al. 2007). Bottleneck nodes tend to be essential proteins in a network (Yu et al. 2007).
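The way betweenness and node degree combine to define bottlenecks can be sketched as below. The cutoff is an assumption made for illustration (values above the network average); the text does not state which threshold was applied in this work. The scores are hypothetical values for a seven-node toy network in which two triangles (A-B-C and E-F-G) are joined through node D.

```python
def above_average(scores):
    """Nodes whose score exceeds the network mean (assumed cutoff)."""
    mean = sum(scores.values()) / len(scores)
    return {node for node, value in scores.items() if value > mean}


# Hypothetical centrality values for a seven-node toy network.
betweenness = {"A": 0.0, "B": 0.0, "C": 8.0, "D": 9.0,
               "E": 8.0, "F": 0.0, "G": 0.0}
degree = {"A": 2, "B": 2, "C": 3, "D": 2, "E": 3, "F": 2, "G": 2}

bottlenecks = above_average(betweenness)  # high betweenness, any degree
hubs = above_average(degree)              # high degree

hub_bottlenecks = bottlenecks & hubs
nonhub_bottlenecks = bottlenecks - hubs

print(sorted(hub_bottlenecks))
print(sorted(nonhub_bottlenecks))
```

Node D is selected as a bottleneck despite its low degree, which illustrates why betweenness rather than degree is the defining criterion.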
The centrality analysis of the SCC subnetwork (Figure 5A) indicated the presence of 178 bottleneck nodes (Figure 5B; Supplementary Table 3S), corresponding to approximately 15 % of all nodes present in the SCC network. From this major bottleneck subnetwork, it was possible to separate all bottleneck nodes associated with transcriptionally induced genes (induced bottleneck network; Figure 6A) from all bottleneck nodes associated with transcriptionally repressed genes (repressed bottleneck network; Figure 6B).
The induced bottleneck network is composed of 78 nodes and 304 connectors (Figure 6A; Supplementary Table 3S), whose proteins participate in different biological processes, such as cell division, cytokinesis, protein retention in the endoplasmic reticulum (ER) lumen, response to oxidative stress, and intracellular pH reduction, among others (Table 3). The repressed bottleneck network, in turn, is composed of 95 nodes and 640 connectors (Figure 6B; Supplementary Table 4S). GO analysis shows that the repressed nodes participate in biological processes such as ubiquitin-dependent protein catabolic process, peroxisome inhibition, response to DNA damage stimulus, and ATP synthesis, among others (Table 4).
Validation of the microarray hybridization analysis: The identification of induced and repressed bottleneck nodes makes it possible to select important genes that can be manipulated experimentally in order to improve ethanol production, yeast cell viability and lifespan, and stress resistance under industrial conditions. In the next two sub-sections, we show validation data about the role played by clusters related to cell wall damage, response to oxidative stress, and protein retention in the endoplasmic reticulum (ER) lumen. The microarray hybridization analysis strongly suggested that the PE-2 and CAT-1 strains have increased adaptation to the stressing conditions present in industrial fermenters during bioethanol fermentation. Based on these data, these conditions are related to stresses that affect the cell wall and oxidative defenses. Thus, we tested the growth of these strains, as well as of the laboratory haploid and diploid strains S288c and BY4743, respectively, using a spot dilution assay in the presence of agents that elicit cell wall perturbation, oxidative stress, and unfolded protein responses. Experiments were carried out in solid or liquid YPD medium supplemented with either 2 or 10 % glucose (Figure 7). We also included in these assays two haploid strains (PE E13 and PE E14) derived from the PE-2 strain. When these strains were grown in the presence of cell wall perturbing agents such as calcofluor white (CFW), caffeine, and Congo Red (Figures 7B-D, respectively), the CAT-1 and PE-2 strains showed much higher tolerance to 30 μM CFW and 10 mM caffeine (at both 2 and 10 % glucose; Figures 7B-C) than the haploid and diploid laboratory strains S288c and BY4743 (Figures 7B-C). Interestingly, both haploid strains PE E13 and PE E14 (derived from PE-2) showed lower growth in the presence of 30 μM CFW than the parental PE-2 strain (Figure 7B). 
Surprisingly, a different behavior was observed for 10 mM caffeine, since the PE E13 strain grew more than the PE-2 and PE E14 strains at both glucose concentrations (Figure 7C).
Next, we evaluated the growth of these strains in the presence of oxidative stressing agents, such as paraquat, t-butyl, and hydrogen peroxide (Figure 8). The CAT-1 and PE-2 strains are more tolerant to 0.1 mM paraquat than the laboratory strain BY4743 (at 2 % glucose; Figures 8A-B); however, both industrial strains are more tolerant than the two laboratory strains to 0.1 mM paraquat added to 10 % glucose medium (Figure 8B). Both haploid strains PE E13 and PE E14 are as tolerant to paraquat as the parental PE-2 strain (Figure 8B). In the presence of 0.5 mM t-butyl, the industrial strains are more tolerant at both glucose concentrations than the haploid and diploid laboratory strains (Figure 8C). Both haploid derivatives PE E13 and PE E14 are more sensitive to 0.5 mM t-butyl than the PE-2 parental strain, but PE E13 is more tolerant than PE E14 (Figure 8C). Finally, in the presence of 5 mM H2O2, the PE-2 strain is more tolerant than CAT-1 and the laboratory strain S288c at 2 % glucose (Figure 8D). However, at 10 % glucose, CAT-1 and PE-2 are more tolerant to 5 mM H2O2 than BY4743 but not S288c (Figure 8D). At 2 % glucose, the PE E13 and PE E14 haploid derivatives are more sensitive to 5 mM H2O2 than the PE-2 parental strain, but at 10 % glucose they are as tolerant as the PE-2 strain (Figure 8D).
These results emphasize the robustness of the CAT-1 and PE-2 industrial strains to a combination of stressing conditions and strongly suggest that the genetics involved in these phenotypes is complex considering the very heterogeneous growth behavior of the PE-2 haploid derivatives in these conditions.
PE-2 and CAT-1 strains have increased unfolded protein response (UPR): The increased mRNA accumulation of genes encoding proteins involved in protein retention in the ER lumen and in retrograde vesicle-mediated transport, Golgi to ER, suggested a possible engagement of the Unfolded Protein Response (UPR). Thus, it is possible that these strains have an increased UPR during this fermentation process. To start addressing this hypothesis, we grew these strains, as well as the haploid and diploid laboratory strains, in the presence of dithiothreitol (DTT), an agent that disrupts endoplasmic reticulum (ER) homeostasis. This agent induces ER stress by impairing the N-linked glycosylation that promotes folding of nascent polypeptide chains in the ER (Helenius and Aebi 2004). The CAT-1 and PE-2 strains showed growth comparable to that of the wild type haploid and diploid strains and of the PE-2 haploid strains in the presence of 20 mM DTT plus 2 % glucose (Figure 9A). In contrast, when the glucose concentration was increased to 10 % in the presence of 20 mM DTT, the CAT-1 and PE-2 strains showed much more robust growth than the wild type and PE-2 haploid strains (Figure 9A). Once more, we observed that the PE-2 haploid strains displayed much reduced growth when compared with the diploid PE-2 strain (Figure 9A).
The accumulation of aberrantly folded proteins in the ER activates the bifunctional transmembrane kinase/endoribonuclease Ire1 (Korennykh et al. 2009). Ire1 excises an intron from the yeast HAC1 cytoplasmic precursor mRNA, HAC1u (uninduced), to generate the induced form of the HAC1 mRNA, HAC1i (induced) (Cox and Walter 1996). This creates a frame-shift in the mRNA, allowing for the translation of a transcription factor that moves to the nucleus and regulates the expression of UPR target genes (Kimata et al. 2006; Travers et al. 2000). In S. cerevisiae, the Ire1p-mediated splicing of a 252-nucleotide (nt) unconventional intron from HAC1 mRNA relieves the transcript from a translational block (Kawahara et al. 1997; Ruegsegger et al. 2001). We adapted an RT-PCR assay developed by Bicknell et al. (2007) to detect the increased accumulation of S. cerevisiae HAC1i (Figure 9B). For this set of experiments, we used S. cerevisiae S288c as a control and the same bioethanol fermentation conditions used for the microarray hybridization analysis to determine the HAC1i mRNA splicing. As a normalizer control, we used TAF10, a gene encoding a subunit (145 kDa) of the TFIID and SAGA complexes, involved in RNA polymerase II transcription initiation and in chromatin modification. This gene was recently shown to be a suitable reference gene for quantitative gene expression analysis by real-time RT-PCR in yeast biological samples covering a large panel of physiological states (Teste et al. 2009). In addition, TAF10 was not modulated in our microarray hybridization experiments (data not shown). At the beginning (0 hours) and at 3 hours of fermentation, the HAC1i mRNA accumulation levels are already twofold higher in the PE-2 and CAT-1 strains than in the laboratory S288c strain (Figure 9C). By 9 hours of fermentation, the HAC1i mRNA accumulation levels are at least 30 times higher than in the laboratory S288c strain (Figure 9C). 
We also investigated the relationship between the UPR and ethanol tolerance in these strains (Figure 9D). The strains were incubated overnight at 30 °C in the absence and presence of 10 % ethanol, and their viability was verified. The CAT-1 and PE-2 strains showed about 40 to 50 % viability, while S288c and BY4743 showed about 28 to 32 % viability (Figure 9D). Accordingly, there is a much higher HAC1 mRNA accumulation during ethanol exposure of CAT-1 and PE-2 (2.5- to 3.5-fold; Figure 9D). In contrast, there is no HAC1 mRNA accumulation in the laboratory S288c strain (Figure 9D).
Taken together, these results strongly indicate that the PE-2 and CAT-1 strains have increased induction of the UPR, and this could be one of the selective advantages that favor these strains during competition in the harsh conditions of industrial fermentation in Brazil.