Sound knowledge of the osmotic pressure and thermodynamic behaviour of protein solutions is vital for designing an efficient protein separation process. It is also important to develop a rapid and inexpensive technique that accurately estimates protein osmotic pressure. This paper proposes an intelligent model based on a feed-forward artificial neural network (ANN) to estimate the osmotic pressure of bovine serum albumin (BSA) in terms of pH, ionic strength and BSA concentration. The osmotic pressure of BSA is also modeled with a colloidal interaction approach. Molecular interaction forces such as electrostatic, London-van der Waals, and hydration forces, along with entropic pressure, are considered in the colloidal model to predict the BSA osmotic pressure. The ANN predictions were compared with the results obtained from the colloidal interaction model and with experimental data. Good agreement was observed between the predicted osmotic pressure values and the experimental data. It is also concluded that the ANN technique exhibits higher accuracy in predicting the osmotic pressure of BSA over wide ranges of the input variables if 8 neurons are selected for the hidden layer in the ANN structure. The results indicate that, among the input parameters selected for the ANN, ionic strength and pH have the greatest impact on the osmotic pressure. The proposed ANN model serves as a reliable tool for fast, low-cost and effective assessment of osmotic pressure in the absence of adequate experimental data.
Keywords: Osmotic Pressure, Artificial Neural Network, Bovine Serum Albumin, Molecular Interaction, Back Propagation
Osmotic pressure of protein molecules plays an important role in design and scale-up of separation and purification processes such as chromatography and membrane filtration. In a membrane separation process, the concentration of the retained solutes at the membrane surface may reach saturation levels which can lead to elevated osmotic pressures. The high osmotic pressure at the membrane surface decreases the efficiency of the process by lowering the permeate flux. Thus, accurate prediction of osmotic pressure of protein solutions especially at high concentrations is crucial in estimating permeate flux during membrane filtration processes (Howell et al., 1993).
The magnitude of osmotic pressure depends on the physicochemical properties of the solution such as pH, ionic strength, protein type and concentration. A thermodynamic relation for osmotic pressure can be derived from the Gibbs free energy equation and the concept of chemical potential (Hiemenz and Rajagopalan, 1977). Because it assumes an ideal and incompressible solution, the van't Hoff equation obtained from the Gibbs free energy equation is applicable only to dilute solutions. According to the van't Hoff equation, osmotic pressure increases linearly with solute concentration. For concentrated solutions, the van't Hoff equation can be extended with a virial expansion whose coefficients are obtained by fitting the truncated virial equation to experimental data. According to statistical mechanics, the second virial coefficient corresponds to the interactions between particle pairs, while higher-order virial coefficients are associated with interactions among larger numbers of particles (Cheryan, 1998; Everett, 1988). The major limitation of this method is that experimental data are usually scarce in the literature and confined to the dilute range.
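For orientation, the two relations described above can be written out in their common textbook form. This is a sketch rather than the notation of the cited references: the symbols c (solute mass concentration), M_w (solute molar mass), R (gas constant), T (absolute temperature), and A_2, A_3 (second and third virial coefficients) are assumptions introduced here.

```latex
% van't Hoff limit (ideal, dilute solution): linear in concentration
\Pi = \frac{c}{M_w}\, R\, T

% Truncated virial expansion for concentrated solutions
\Pi = R\, T \left( \frac{c}{M_w} + A_2\, c^{2} + A_3\, c^{3} + \cdots \right)
```

Setting A_2 = A_3 = 0 recovers the van't Hoff equation, which makes explicit why the linear form only holds where pair and higher-order interactions are negligible.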
Various experimental techniques exist to determine the osmotic pressure. Osmotic pressure is generally measured by a membrane osmometer that is made of two chambers, separated by a semi-permeable membrane (permeable only to the solvent) (Moon et al., 2000). Another common method that estimates the osmotic virial coefficient is the static light scattering (SLS) technique (Ahamed et al., 2005). The SLS is usually used to obtain the molecular weight of the solute and the second virial coefficient. Due to its expediency and physiological importance, bovine serum albumin (BSA) in aqueous solutions has been widely employed in studies related to membrane ultrafiltration processes. However, very few comprehensive investigations have been conducted on the BSA osmotic pressure at high concentrations. The most important research in this regard has been done by Vilker et al. (1984). Osmotic pressure of BSA up to a protein concentration of 475 g/L was measured in their study by a membrane osmometer for limited ranges of pH and ionic strength. A considerable deviation from ideality was experienced even at moderate concentrations of protein. A strong relationship between the osmotic pressure and solution pH was observed. Kanal et al. (1994) examined the effect of pH on the osmotic pressure for a dilute range of BSA concentration up to 100 g/L in 0.1M NaCl buffer solution. Osmotic pressure was found to be minimum at pH of 4.6 and increased by a factor of five when pH dropped from 4.6 to 3. The variation of osmotic pressure with pH was explained by the change in the protein conformation and aggregation.
The concept of molecular interactions was employed by various researchers (Wu and Prausnitz, 1999; Lin et al., 2001; Bowen et al., 1995a, 1996a) to predict the osmotic pressure of BSA. Wu and Prausnitz (1999) developed two van der Waals type models and compared the results against experimental osmotic pressure of dilute BSA solutions (<150 g/L) within the ranges of 4.5-7.4 and 1-5 M for pH and sodium chloride concentration, respectively. In both models, the potential of mean force between proteins consisted of hard-sphere repulsion, van der Waals attraction, and double-layer repulsion. The contribution of the hard-sphere repulsion to the osmotic pressure was represented by the Carnahan-Starling equation of state (EOS) in both models. The van der Waals and double-layer interactions were represented by the second virial coefficient in the first model and by the random-phase approximation (RPA) in the second model. Neither model was successful in predicting the BSA osmotic pressure at high salt concentrations. A more precise potential of mean force was recommended for molecular modeling of the osmotic pressure of protein molecules in salt solution. Lin et al. (2001) proposed an equation of state with one adjustable parameter to predict the BSA osmotic pressure. They expanded the Duh and Mier-Y-Teran equation of state from one Yukawa potential to two Yukawa potentials to represent the repulsive and attractive interactions between charged BSA molecules. The Carnahan-Starling equation was used to calculate the hard-sphere repulsive interaction. The adjustable parameter was the dispersion energy parameter in the Yukawa potentials, which took the place of the Hamaker constant and the minimum distance between two protein surfaces used in the Derjaguin-Landau-Verwey-Overbeek (DLVO) theory. Their model was not valid when the BSA concentration was high or when asymmetric microions existed in the solution.
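The Carnahan-Starling equation of state mentioned above has a simple closed form, sketched below to show how quickly hard-sphere repulsion departs from ideal behaviour as the volume fraction grows. The function name and the argument `eta` (particle volume fraction) are illustrative choices, not taken from the cited models, which add further interaction terms on top of this contribution.

```python
def carnahan_starling_z(eta):
    """Hard-sphere compressibility factor Z = P / (rho * k * T) at volume fraction eta."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("volume fraction must lie in [0, 1)")
    return (1.0 + eta + eta**2 - eta**3) / (1.0 - eta) ** 3


# Z equals 1 in the ideal (infinitely dilute) limit and rises steeply with eta
z_dilute = carnahan_starling_z(0.0)
z_dense = carnahan_starling_z(0.3)
```

Multiplying Z by the ideal term (number density times kT) gives the hard-sphere part of the pressure only.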
A colloidal interaction model was developed by Bowen et al. (1995a; 1996a) to calculate the osmotic pressure of colloidal systems by applying the extended DLVO theory. The DLVO theory considers a linear combination of London-van der Waals attractive forces and electrostatic repulsive forces as a function of the distance separating the particles (Hunter, 2001). The multiparticle nature of the electrostatic interaction between protein molecules was taken into account using the Wigner-Seitz model. The van der Waals attraction force was estimated using the Lifshitz-Hamaker constant. Good agreement between the colloidal model results and the experimental BSA osmotic pressure was found for the studied operating range.
Artificial neural network (ANN) models are potentially reliable tools for the estimation of complex parameters such as osmotic pressure, provided the network is successfully trained. The many interconnections in an ANN offer a large number of degrees of freedom, or fitting variables. Thus, unlike conventional techniques, an ANN is able to represent the non-linearity of a system. Another benefit of ANNs is that they are dynamically adaptive: they can learn and adjust to new conditions for which a network trained on earlier conditions no longer performs adequately. Moreover, an ANN model is able to handle systems with several inputs and outputs.
In this paper, an artificial neural network (ANN) and a colloidal interaction model were employed to predict the osmotic pressure of BSA. Osmotic pressure was considered as a function of pH, ionic strength, and BSA concentration. The colloidal interaction model also included the effect of thermodynamic properties on the magnitude of the BSA osmotic pressure. In the ANN model, the experimental data from the literature were divided into three categories: training, validation, and testing datasets (Maier and Dandy, 2000). The Levenberg-Marquardt optimization algorithm was used for training the network. The optimum ANN structure was determined by minimizing the absolute percent error (between the ANN model predictions and experimental data) and the absolute relative difference between the values calculated by the ANN and colloidal interaction models. The performance of the mathematical models presented in this study was checked using statistical parameters, namely R-squared and the maximum and mean absolute percent errors. Reasonable agreement between the results was obtained, which demonstrated the usefulness of the ANN model in predicting BSA osmotic pressure. The methodologies used in this study and the results obtained are discussed in detail in the following sections.
Artificial Neural Network
The artificial neural network (ANN) is a robust technique that is able to capture and represent complex relationships between inputs and outputs. The ANN was inspired by the information processing mechanisms of the brain (Zurada, 1992; Murray, 1995). According to William James (1980) "the amount of activity at any given point in the brain cortex is the sum of the tendencies of all other points to discharge into it, such tendencies being proportionate 1) to the number of times the excitement of other points may have accompanied that of the point in question; 2) to the intensities of such excitements; and 3) to the absence of any rival point functionally disconnected with the first point". This idea enabled McCulloch and Pitts (1943) to formulate the first models of a biological neuron. Over the last few years, ANNs have been broadly employed for numerous applications such as process control, behavior prediction, model recognition, and system classification.
An ANN model is composed of a large number of interconnected neurons and can be considered as a computing engine that receives inputs, processes them in a hidden layer and then generates an output. Figure 1 depicts a simple architecture of an ANN with one hidden layer. The numbers of neurons and hidden layers depend on the complexity and nonlinearity of the problem. Each neuron is connected to the input and output by a corresponding weight. Inputs to each neuron are multiplied by their corresponding weights (IW). The products are then summed together with a bias (b1), and the sum is processed using a nonlinear transfer function such as the hyperbolic tangent sigmoid (tansig in MATLAB®) to obtain a1 (Torrecilla et al., 2005). The resulting matrix, a1, is multiplied by the layer weights (LW) and added to the bias, b2. The result is passed through a linear transfer function (purelin in MATLAB®) to create an output. Initial values of the weights are randomly assigned from a uniform or Gaussian distribution. There are a number of different neural network structures and learning algorithms (e.g., back propagation (BP) and the multilayer perceptron (MLP)) for the ANN technique. The BP method is recognized as one of the most general learning algorithms. The BP technique is employed in the feed-forward ANN: the neurons are arranged in layers and convey their signals "forward", and the errors are then propagated backwards. The BP algorithm includes a training process in which a series of inputs and outputs is provided and the network predicts the outputs based on the randomly assigned initial weights. The error (mean squared error) between the real and predicted results is then computed. The weights are modified through the training process until the error between the actual output and the predicted output is minimized.
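The forward computation described above (weights IW and bias b1 feeding a tansig hidden layer, then weights LW and bias b2 feeding a linear output) can be sketched in a few lines. This is a minimal NumPy illustration, not the MATLAB implementation used in the study; the 3:8:1 layer sizes follow the text, while the random weights and the input vector are hypothetical placeholders.

```python
import numpy as np


def tansig(x):
    # Hyperbolic tangent sigmoid, the counterpart of MATLAB's tansig
    return np.tanh(x)


def forward(x, IW, b1, LW, b2):
    """Forward pass of a one-hidden-layer network (tansig hidden, linear output)."""
    a1 = tansig(IW @ x + b1)  # hidden-layer activations
    return LW @ a1 + b2       # linear (purelin-style) output


rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 8, 1

# Randomly assigned initial weights (uniform distribution), as in an untrained network
IW = rng.uniform(-1, 1, (n_hidden, n_in))
b1 = rng.uniform(-1, 1, n_hidden)
LW = rng.uniform(-1, 1, (n_out, n_hidden))
b2 = rng.uniform(-1, 1, n_out)

# Hypothetical normalized inputs: pH, ionic strength, BSA concentration
x = np.array([0.5, -0.2, 0.1])
y = forward(x, IW, b1, LW, b2)
```

With untrained weights the output `y` is meaningless; training consists of adjusting IW, b1, LW, and b2 so that the output matches the measured values.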
The BP algorithm adjusts the weights using the gradient descent principle where the change in the weight is proportional to the error gradient with a negative sign (Alshihri et al., 2009; Hagan and Beale, 1996). The validation set prevents overfitting of the network by stopping training once MSE in the validation set begins to increase. The performance of the ANN model is finally checked by introducing new and independent datasets to the trained model.
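The weight update and the validation-based stopping rule described above can be illustrated as follows. This sketch uses plain batch gradient descent rather than the Levenberg-Marquardt variant adopted in the study, and it trains on synthetic data; the learning rate, patience, network size, and target function are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data: 3 inputs in [-1, 1] and one smooth nonlinear target
X = rng.uniform(-1, 1, (120, 3))
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2]
X_tr, y_tr = X[:90], y[:90]   # training set
X_va, y_va = X[90:], y[90:]   # validation set

n_hidden = 4
W1 = rng.normal(0, 0.5, (3, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.5, n_hidden)
b2 = 0.0


def predict(X, W1, b1, W2, b2):
    return np.tanh(X @ W1 + b1) @ W2 + b2


def mse(X, y, params):
    return float(np.mean((predict(X, *params) - y) ** 2))


lr, patience = 0.05, 20
best_va, best_params, wait = np.inf, None, 0
initial_tr = mse(X_tr, y_tr, (W1, b1, W2, b2))

for epoch in range(2000):
    # Forward pass on the training set
    h = np.tanh(X_tr @ W1 + b1)
    err = h @ W2 + b2 - y_tr
    # Backward pass: gradients of the MSE with respect to each parameter
    g_out = 2 * err / len(y_tr)
    gW2 = h.T @ g_out
    gb2 = g_out.sum()
    g_h = np.outer(g_out, W2) * (1 - h ** 2)
    gW1 = X_tr.T @ g_h
    gb1 = g_h.sum(axis=0)
    # Gradient descent: step against the error gradient
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
    b2 -= lr * gb2
    # Early stopping: keep the best weights, stop when validation MSE stops improving
    va = mse(X_va, y_va, (W1, b1, W2, b2))
    if va < best_va:
        best_va, wait = va, 0
        best_params = (W1.copy(), b1.copy(), W2.copy(), b2)
    else:
        wait += 1
        if wait >= patience:
            break

final_tr = mse(X_tr, y_tr, best_params)
```

The weights retained in `best_params` are those with the lowest validation error, not the last ones computed, which is what protects against overfitting.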
An alternative to the MLP is the radial basis function (RBF) network. The RBF network is recommended for function approximation problems that have local minima. The RBF type ANN guarantees convergence to globally optimum parameters. Furthermore, RBF networks perform more robustly than MLP networks when noisy input data are involved. For problems without local minima and for classification problems, MLP networks with the BP training algorithm are preferred.
Colloidal Interaction Model
The methodology proposed by Bowen et al. (1995a; 1996a) was employed in this study to estimate the osmotic pressure of BSA for wide ranges of pH, ionic strength, and BSA concentration. Table 1 summarizes the equations used in the colloidal interaction model. Electrostatic (FELEC), London-van der Waals (FATT), and hydration (FHYD) forces as well as entropic pressure (PENT) were considered in the extended DLVO theory to estimate the osmotic pressure (Equations 1-4). The modified Gouy-Chapman electrical double layer (EDL) model was used to describe the charge and potential distribution around the charged BSA molecules in the electrolyte (Bowen and Williams, 1996a; Hunter, 1993; Brett, 1993). In the EDL model, the electrical double layer is formed by a compact layer of hydrated counterions around the protein surface followed by a diffuse layer extending into the bulk solution. Since the charge on the protein surface is not fully compensated by the compact layer, additional ions are attracted to the surface by weaker electrostatic forces. The ion distribution was expressed by the Poisson-Boltzmann equation (Equations 5-7). When charged particles approach one another, their diffuse layers start to overlap, leading to a repulsive force that prevents further approach. The multi-particle nature of such interactions was considered using a Wigner-Seitz cell model (Figure 2); each cell was presumed to comprise a single charged particle surrounded by a shell of fluid (Bowen and Jenner, 1995a; Wigner and Seitz, 1934). The effective area occupied by the protein at a hypothetical plane (the surface of a hexagonal cell), Ah, was calculated from Equations 2-4. The volume fraction of the protein was calculated from Equation 4. The surface charge and zeta potential of the protein molecule were calculated based on the charge regulation model.
The charge regulation model required the type and number of the amino acid groups of the protein participating in the ionization reaction and also the amount of ions adsorbed on the protein surface (Bowen and Williams, 1996a).
Equation 9 represents the London-van der Waals energy between two spherical particles of similar size with their centers a distance (Dp+2a) apart. Molecules in close proximity induce charge polarization in one another due to electromagnetic fluctuations. The resulting forces, grouped as London-van der Waals forces, are inherently attractive and become effective when the surfaces approach one another. The van der Waals forces (Equations 8-9) required an estimate of the Hamaker constant for BSA, which was obtained from the refractive index data of BSA in solution and the Lorentz-Lorenz equation (Bowen and Williams, 1996b; Hough and White, 1980; Bowen and Jenner, 1995b; Nir, 1977). Other repulsive forces between proteins are the hydration forces, often referred to as polar interactions (Equation 10). Strong polar interactions orient the water molecules adsorbed on the surface of proteins, and the stability of the colloidal system is thus conferred by those hydration water molecules, which force two proteins apart at contact. Entropic pressure (Equation 11) was calculated using an equation proposed by Hall, which offers the best hard-sphere entropic pressure results for both high and low volume fractions (Hall, 1972). Readers interested in the technical details are referred to the following references (Bowen and Jenner, 1995a; Bowen and Jenner, 1995b; Wigner and Seitz, 1934; Bowen and Williams, 1996a; Bowen and Williams, 1996b; Hough and White, 1980).
Results and Discussion
The ANN model included three independent variables (pH, ionic strength and protein concentration). As was discussed earlier, the magnitude of the osmotic pressure is controlled by these three parameters (Bowen and Williams, 1996a; Vilker et al., 1984). Input data was normalized before starting the ANN modeling to avoid any false influence of factors with higher order of magnitude. Data normalization was performed using Equation (12) as follows:
where xmax and xmin are the highest and lowest values of variable x, respectively. The ANN model was trained using experimental data from the literature (Vilker et al., 1984; Wu and Prausnitz, 1999). The training and validation datasets were selected randomly from the available osmotic pressure data. An MLP network was used in the current study to estimate osmotic pressure. The conventional back propagation method was used to modify the weights. Osmotic pressure, expressed in terms of the input parameters selected in this study, is not a multi-peak nonlinear function (Vilker et al., 1984; Wu and Prausnitz, 1999). Thus, there is no concern about converging to local optima rather than the global optimum (it is worth mentioning again that RBF networks are recommended for problems with local minima). Therefore, conventional back propagation is an effective training algorithm that can be used with no risk of divergence. Implementation of different ANN models and comparison of their performance for a particular process or phenomenon will be part of our future research work.
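As an illustration of the normalization step, the sketch below applies one common min-max form, (x - xmin) / (xmax - xmin), which maps each variable onto [0, 1]. That specific form is an assumption: it is consistent with the definitions of xmax and xmin given above, but other scalings (for example onto [-1, 1]) use the same two quantities. The sample rows are hypothetical values, not data from the study.

```python
import numpy as np


def minmax_normalize(x):
    """Scale each column of x to [0, 1] using its own minimum and maximum."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    # Note: a constant column (x_max == x_min) would divide by zero
    return (x - x_min) / (x_max - x_min)


# Hypothetical rows: pH, ionic strength (M), BSA concentration (g/L)
data = np.array([[4.5, 0.03, 50.0],
                 [5.4, 0.15, 200.0],
                 [7.4, 0.15, 450.0]])
scaled = minmax_normalize(data)
```

After scaling, concentration (hundreds of g/L) and ionic strength (hundredths of a mole per litre) span the same numeric range, so neither dominates the weight updates purely because of its units.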
MATLAB® software version 7 from MathWorks, Natick, Massachusetts was employed for the ANN modeling. The Levenberg-Marquardt optimization algorithm was the back propagation technique selected for training the neural network. This algorithm is the fastest back propagation algorithm in the MATLAB® toolbox. Hyperbolic tangent sigmoid and linear functions were the transfer functions for the hidden and output layers, respectively. The performance of the ANN model was tested with 12 new and independent datasets. One hidden layer was selected in the ANN model. According to the universal approximation theorem, one hidden layer with a sufficient number of neurons can model any set of input/output data to a reasonable degree of accuracy (Tambe et al., 1996).
Table 2 shows the performance of neural network models with various numbers of neurons in a single hidden layer. The performance of the network was tested by calculating R-squared, the mean absolute percent error, and the maximum and minimum absolute percent errors for the training (including validation) and testing datasets. The corresponding formulas for these statistical parameters are presented in Appendix A. Generally, an R² value larger than 0.9 shows a satisfactory performance for the proposed model, an R² value in the range of 0.8-0.9 indicates a good performance, and values lower than 0.8 indicate an unacceptable performance for a model suggested to predict a particular variable. R² was higher than 0.998 for neural networks with 4 or more neurons. High values of the mean and maximum percent errors were obtained when the number of neurons was less than 4. Increasing the number of neurons from 2 to 8 lowered the mean percent error by 93%, from 66.4% to 4.5%. The maximum and minimum absolute errors for the network with 8 neurons were 29% and 0.01%, respectively. Increasing the number of neurons from 8 to 10 increased the maximum absolute percent error of the training and testing phases by 30% and 45%, respectively. Higher numbers of neurons were avoided in order to prevent overtraining of the ANN model (Omidbakhsh et al., 2010).
Table 2: Performance of ANN with various numbers of neurons in one hidden layer
Figure 3 shows scatter plots of the predicted (ANN) BSA osmotic pressure versus the experimental data (left) and the percent error for the training (including validation) and testing data points. Despite the limited availability of experimental data, the ANN model (3:8:1) was capable of predicting BSA osmotic pressure with a mean absolute percent error of approximately 5% for both training and testing phases.
Colloidal Interaction Model
Zeta Potential of BSA
Zeta potential of BSA was estimated employing the charge regulation model for a pH range of 4-10 and an ionic strength range of 0.01-1M; results are shown in Figure 4. Surface charge and zeta potential of BSA were calculated based on the ionization reaction of amino acids. Type and number of amino acids participating in the ionization reaction, pH, and ionic strength were required in the development of the charge regulation model. The amino acid sequence of BSA indicated the number of each amino acid in the protein molecule (Bowen and Williams, 1996a). Sodium chloride was assumed to be the only salt in the solution. The BSA surface carried a positive net charge at pH values lower than its isoelectric point and a negative net charge at pH values higher than the isoelectric point. At the isoelectric point, the surface net charge and consequently the zeta potential were zero. The estimated isoelectric point for BSA varied between 4.4 and 5.2, depending on the ionic strength of the solution. The experimental isoelectric point for BSA in 0.15M NaCl solution was reported to be 4.72 (Vilker et al., 1984), which is in good agreement with the value estimated in this study (≈ 4.76). Results showed that zeta potential was strongly dominated by the effect of pH, while ionic strength had a lower influence on the magnitude of zeta potential. The effect of pH on the zeta potential was found to be more significant at low values of ionic strength. The predicted zeta potential of BSA at an ionic strength of 0.03M in sodium chloride solution was compared with the experimental data (Bowen and Williams, 1996a) and acceptable agreement was observed.
Osmotic pressure of BSA as a function of protein concentration was calculated at various pH and ionic strength values based on the colloidal interaction model. Figure 5 shows the results obtained in this phase of work. Electrostatic, London-van der Waals, and hydration forces as well as entropic pressure were considered in the colloidal interaction model. At pH close to the isoelectric point of BSA, the net charge on the protein was small and therefore, electrostatic repulsion and osmotic pressure were low. Osmotic pressure increased as pH diverged from the isoelectric point. Increase in the ionic strength, however, reversed the effect by shielding charges, causing molecular contraction and thereby decreasing the osmotic pressure. At an arbitrary BSA concentration of 300 g/L and ionic strengths of 0.03 and 0.15M, osmotic pressure decreased by approximately 70% and 35%, respectively, when pH dropped from 7.4 to 5.4. It can be concluded that osmotic pressure becomes more sensitive to the ionic strength as pH diverges from the isoelectric point of BSA. At a BSA concentration of 300 g/L and pH of 5.4, osmotic pressure decreased by approximately 60% when ionic strength increased from 0.03 to 0.15M. At the same BSA concentration and a pH value of 9, increasing the ionic strength from 0.03 to 0.15M resulted in the osmotic pressure reduction by approximately 80%.
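The charge-shielding effect of ionic strength noted above can be made concrete with the Debye screening length, the characteristic thickness of the diffuse layer in a 1:1 electrolyte such as NaCl. This quantity is not computed in the text itself; the sketch below is a supplementary illustration that assumes water at 25 °C with a relative permittivity of 78.4, and the function name is hypothetical.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
N_A = 6.02214076e23          # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19   # elementary charge, C


def debye_length_nm(ionic_strength_molar, temperature=298.15, eps_r=78.4):
    """Debye screening length in nm for a given ionic strength in mol/L."""
    i_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    kappa_sq = 2.0 * N_A * E_CHARGE**2 * i_si / (eps_r * EPS0 * K_B * temperature)
    return 1e9 / math.sqrt(kappa_sq)


# Ionic strengths used in the comparison above
lengths = {i: debye_length_nm(i) for i in (0.03, 0.15)}
```

A shorter screening length at 0.15M than at 0.03M means the double layers of neighbouring molecules overlap less at a given separation, which is consistent with the lower electrostatic repulsion and osmotic pressure reported at higher ionic strength.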
Performance of the colloidal interaction model was evaluated by plotting the predicted osmotic pressure of BSA versus experimental data, as shown in Figure 6. The developed colloidal interaction model predicted the osmotic pressure of BSA with an R² of 0.954 and a mean absolute percent error of 29%, without any adjustable parameter. The colloidal model was less accurate at pH 4.5 because it overpredicted the BSA osmotic pressure. The mean absolute percent error was 14%, 8%, and 68% at pH 7.4, 5.4, and 4.5, respectively. The largest percent error, 120%, was observed at pH 4.5 and an ionic strength of 0.15M when the protein concentration was higher than 300 g/L. The maximum absolute percent errors were 34% and 22% at pH 7.4 and 5.4, respectively. Scatter plots of percent error versus protein concentration (Figure 6(C)) and versus ionic strength (data not shown) followed no specific trend.
ANN versus Colloidal Interaction Model
Table 3 lists the osmotic pressure data obtained by the ANN and colloidal interaction models and the percent error for each data point. The results confirm that the osmotic pressure calculated by the neural network model (3:8:1) is more accurate than the osmotic pressure calculated by the colloidal interaction model.
Since the ANN model was trained for limited ranges of pH, ionic strength and BSA concentration, it was essential to evaluate the interpolation/extrapolation power of the neural network model. Neural network predictions at new values of pH, ionic strength, and protein concentration were compared only against colloidal interaction model predictions, because experimental data for BSA osmotic pressure were not available for all values of pH, ionic strength, and BSA concentration. The optimum neural network model, consisting of one hidden layer with 8 neurons (3:8:1), was selected for the comparison. The absolute relative difference was calculated at each data point, as depicted in Figure 7. The comparison was made at ionic strengths of 0.03, 0.15 and 1 molar, with pH varying between 4.5 and 7.4.
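The text does not state the formula for the absolute relative difference. A plausible reading, taken here as an assumption, is the absolute difference between the two model predictions divided by the colloidal model prediction, expressed in percent. The function name and sample values below are hypothetical.

```python
import numpy as np


def absolute_relative_difference(ann, colloidal):
    """Percent difference of ANN predictions relative to colloidal model predictions."""
    ann = np.asarray(ann, dtype=float)
    colloidal = np.asarray(colloidal, dtype=float)
    return np.abs(ann - colloidal) / np.abs(colloidal) * 100.0


# Hypothetical predictions; the second ANN value is negative (non-physical)
ard = absolute_relative_difference([12.0, -3.0], [10.0, 6.0])
```

Under this definition a negative ANN prediction always yields a difference above 100%, and small colloidal model values in the denominator inflate the percentage further.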
The absolute relative difference contour plots in Figure 7 showed maximum peaks at a pH of 7. The absolute relative difference was as high as 800% at an ionic strength of 0.03M. The maximum absolute relative difference of 1×10³% was observed at pH 7 and ionic strengths of 0.1 and 0.15M. Although accurate osmotic pressure values were predicted at pH 4.5, 5.4, and 7.4 by an ANN model consisting of one hidden layer with 8 neurons, the ANN model (3:8:1) failed to interpolate. Negative osmotic pressure values were predicted by the neural network (3:8:1) at pH 7 when the protein concentration was below 300 g/L. The interpolation failure of the ANN model (3:8:1) was due to either limited available experimental data or network overtraining (Omidbakhsh et al., 2010). The overtraining of the neural network was assessed by mapping the absolute relative difference of the ANN models at different neuron numbers. A neural network with 4 neurons (3:4:1) appeared to be the optimum structure, with the lowest absolute relative difference. The absolute relative difference contour for a neural network consisting of one hidden layer with 4 neurons is plotted in Figure 8.
A neural network with 4 neurons (3:4:1) predicted the osmotic pressure with a maximum absolute relative difference of 192%, 186%, and 182% at ionic strengths of 0.03, 0.1 and 0.15M, respectively. The ANN model (3:4:1) was able to predict the osmotic pressure of BSA with a mean absolute relative difference of 41%, 30%, and 25%. The maximum relative difference peaks fell in a region where the BSA concentration was lower than approximately 100 g/L. The maximum absolute relative difference between the ANN model (3:4:1) predictions and the colloidal interaction model predictions for a pH range of 4.5-7.4, an ionic strength range of 0.03-0.15M, and a BSA concentration range of 100-450 g/L was less than 80%. It is worth mentioning again that the colloidal interaction model prediction was subject to a maximum absolute percent error of 120% (compared against experimental data) at pH 4.5, ionic strength 0.15M, and protein concentrations >300 g/L. Osmotic pressure values predicted by the ANN model (3:4:1) were compared against the experimental data, the ANN (3:8:1) results, and the colloidal interaction model predictions, as tabulated in Table 3. Although the ANN (3:4:1) had lower precision than the ANN (3:8:1) in predicting experimental data, the ANN model (3:4:1) interpolated the BSA osmotic pressure with a much smaller percent difference.
Relative Effect of Input Variables
The contribution of each input variable in the neural network on the BSA osmotic pressure was determined by a methodology proposed by Garson for partitioning the neuronal connection weights (Garson, 1991). The relative importance of input parameters, RI, was calculated using the input and output connection weight (Equation (13)). The higher correlation between any input variable and the output variable indicated greater significance of the variable on the magnitude of the dependent parameter.
where n_H is the number of hidden neurons, n_v is the number of input neurons, i_vj is the absolute value of the input connection weights, and O_j is the absolute value of the connection weights between the hidden and output layers. Figure 9 illustrates the relative importance of ionic strength, pH, and BSA concentration on the osmotic pressure. Ionic strength and pH were the most important parameters affecting the osmotic pressure.
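The weight-partitioning calculation can be sketched as follows. Because Equation (13) is not reproduced here, the code assumes the commonly cited form of Garson's method: within each hidden neuron, every input's absolute weight is taken as a share of that neuron's total absolute input weight, the share is scaled by the neuron's absolute output weight, and the results are summed over hidden neurons and normalized. The function name and the example weights are hypothetical.

```python
import numpy as np


def garson_importance(input_weights, output_weights):
    """Relative importance of each input from connection weights.

    input_weights: array of shape (n_inputs, n_hidden)
    output_weights: array of shape (n_hidden,)
    """
    iw = np.abs(np.asarray(input_weights, dtype=float))
    ow = np.abs(np.asarray(output_weights, dtype=float))
    # Share of each input within every hidden neuron, scaled by the output weight
    contrib = iw / iw.sum(axis=0, keepdims=True) * ow
    per_input = contrib.sum(axis=1)
    # Normalize so the importances of all inputs sum to 1
    return per_input / per_input.sum()


# Hypothetical weights for 3 inputs and 2 hidden neurons
ri = garson_importance([[1.0, -1.0], [1.0, 1.0], [-2.0, 2.0]], [1.0, -1.0])
```

Because only absolute values are used, the method ranks the strength of each input's influence but says nothing about its direction.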
Osmotic pressure of BSA was predicted employing artificial neural network and colloidal interaction models. The neural network consisted of an input layer with 3 nodes (pH, ionic strength and BSA concentration), a hidden layer, and an output layer (osmotic pressure). The neural network was trained using the Levenberg-Marquardt optimization algorithm. In the colloidal interaction model, particle-particle interactions such as electrostatic, London-van der Waals, and hydration forces along with entropy pressure were considered. Physicochemical properties of the BSA solution were considered in the colloidal model in order to estimate the osmotic pressure with no adjustable parameters. The following main conclusions may be drawn from the results presented in this study:
Prediction of BSA osmotic pressure was made possible using ANN and the colloidal interaction model. The prediction performance of the proposed neural network was better than that of the colloidal interaction model. The colloidal model experienced low accuracy at pH 4.5 due to over prediction of BSA osmotic pressure. The maximum absolute percent errors of 120% and 29% were observed for the colloidal interaction model and ANN (3:8:1), respectively.
The key aspect of accurate prediction of BSA osmotic pressure is the structure of the neural network model. The optimum configuration for the ANN model within the ranges of input variables included 8 neurons in one hidden layer which was determined using a trial and error technique. The optimum ANN (3:8:1) predicted the BSA osmotic pressure with a mean absolute percent error of 4.5%.
The interpolation performance of the ANN model was investigated by calculating the absolute relative difference of the results obtained from the ANN model and the colloidal model. The ANN (3:8:1) failed to interpolate due to either limited available experimental data or network overtraining. A considerable improvement in the interpolation performance of the ANN model was achieved by reducing the number of neurons from 8 to 4.
The relative importance of the input variables was calculated using the connection weight partitioning method. Results showed that ionic strength and pH have the greatest effect on the magnitude of the osmotic pressure.
Although the proposed ANN model accurately predicts the BSA osmotic pressure, the model suffers from early convergence caused by the degeneracy of several dimensions, even though no local optima exist for the cases considered with this model. Hence, an efficient evolutionary algorithm needs to be combined with the ANN, which is part of our future study.
R-squared (R²), mean absolute percent error, and maximum/minimum absolute percent errors are the statistical parameters used to test the accuracy of the ANN model compared with the colloidal interaction model and experimental data. The equations used to compute these parameters, together with the corresponding descriptions, are as follows:
R² is a statistical parameter that gives information on the goodness of fit of a model. In regression analysis or model fitting, the R² coefficient is a statistical measure of how well the model line approximates the real data points. An R² of 1.0 indicates that the model line fits the data perfectly. The following equation shows the mathematical definition of R² (Montgomery and Runger, 2006; Montgomery, 2008):
where M and P are the measured and predicted osmotic pressure values, respectively; the remaining symbol represents the average of the measured osmotic pressure data.
Mean absolute percentage error (MAPE) is a statistical measure of the accuracy of fitted values, expressed in percent (%), and is defined as (Montgomery and Runger, 2006; Montgomery, 2008):
The Mean Squared Error (MSE) is a measure of how close a fitted line or developed model is to data points. The smaller the Mean Squared Error the closer the fit (or model) is to the actual data. The MSE has the units squared of whatever is plotted on the vertical axis. MSE is described by the following relationship (Montgomery and Runger, 2006; Montgomery, 2008):
where n is the number of samples.
Two other performance measures used in this paper to assess the effectiveness of the training and testing data are the Minimum Absolute Error (MIAE) and the Maximum Absolute Error (MAAE), as follows (Montgomery and Runger, 2006; Montgomery, 2008):
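The statistical parameters described in this appendix can be computed as in the sketch below. The formulas are the standard textbook definitions and are assumptions to the extent that the appendix equations are not reproduced here: R² is taken as one minus the ratio of the residual to the total sum of squares, and MIAE and MAAE are taken as the minimum and maximum of the absolute percent errors, following the way they are reported in the results. The function name and sample values are hypothetical.

```python
import numpy as np


def fit_statistics(measured, predicted):
    """Goodness-of-fit statistics for measured (M) versus predicted (P) values."""
    m = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    ape = np.abs((m - p) / m) * 100.0      # absolute percent error per point
    ss_res = np.sum((m - p) ** 2)          # residual sum of squares
    ss_tot = np.sum((m - m.mean()) ** 2)   # total sum of squares about the mean of M
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mape": ape.mean(),
        "mse": np.mean((m - p) ** 2),
        "min_ape": ape.min(),
        "max_ape": ape.max(),
    }


# Hypothetical measured and predicted osmotic pressures (same units)
stats = fit_statistics([10.0, 20.0, 40.0], [11.0, 19.0, 40.0])
```

Percent errors are undefined when a measured value is zero, so points with M = 0 would have to be excluded or handled separately.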