Predicting the three-dimensional structure of a protein from its linear sequence is a major challenge in computational biology. The problem can be stated as predicting a protein's tertiary structure from its primary structure, that is, from its amino acid sequence.
Protein structures can be obtained in two ways: by experimental methods and by computational methods.
Two main experimental methods are available for determining protein structure: X-ray crystallography and nuclear magnetic resonance (NMR). Unfortunately, these methods are expensive and time-consuming.1 As a result, there is a pressing need for a fast and reliable computational method to predict structures from protein sequences, especially because the number of completely sequenced genomes is growing rapidly.
There are three main computational methods for protein structure prediction; the choice among them depends mainly on how similar the input protein sequence is to sequences already in the database. The first is homology modeling, also known as comparative modeling, which is used when the target sequence is similar to the sequences of proteins that already exist in the protein database.2 The second is fold recognition, also known as protein threading, which is the inverse of the protein folding problem. It is based on the fact that the number of new protein folds is growing slowly compared with the number of new protein sequences, which leads to the observation that any newly predicted structure will almost always adopt a fold that already exists in the database.
The third is ab initio modeling, which seeks to predict the tertiary structure of a protein from its amino acid sequence alone, without knowledge of similar folds. It is also known as de novo modeling, free modeling, or physics-based modeling.3 It is based on the thermodynamic hypothesis, which states that the tertiary structure of a protein is the conformation with the lowest free energy.4 Ab initio modeling is challenging, but it is needed for the following reasons. First, a huge number of proteins have no homology with any protein of known structure. Second, some proteins that show high homology with other proteins nevertheless have different structures. Third, comparative modeling offers no insight into why a protein adopts a specific structure.5
A successful ab initio method for protein structure prediction depends on a powerful conformational search method that finds the minimum of a given energy function. Molecular dynamics (MD), Monte Carlo (MC), and genetic algorithms (GA) are common methods for exploring the conformational search space of a protein.
In this paper, we introduce an ab initio protein structure prediction method that uses an adapted harmony search algorithm (HSA) as the conformational search tool. To our knowledge, this is the first attempt to apply HSA to this problem.
Materials and Methods
We computed energies with SMMP, a modern package for the simulation of proteins.6 It provides a set of energy minimization routines and modern Monte Carlo algorithms, and it offers two parameter sets for calculating the internal energy: the ECEPP/2 potential and the ECEPP/3 potential.
We tested our algorithm on Met-enkephalin, a small peptide of five amino acids (Tyr-Gly-Gly-Phe-Met).
We represented the protein as a vector of torsion angles (Fig. 1), which includes both the main-chain angles (φ, ψ, ω) and the side-chain angles (χ1, χ2, ..., χn). Initially, a conformation is generated by drawing each angle at random from the interval [-π, π]; it is then passed to the harmony memory to be optimized by the adapted harmony search algorithm.
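The initialization described above can be sketched as follows. This is a minimal illustration in Python using only the standard library; the function names, the number of angles, and the memory size are assumptions made for the example and are not taken from the authors' implementation.

```python
import math
import random


def random_conformation(n_angles, rng):
    """Return one conformation: torsion angles drawn uniformly from [-pi, pi]."""
    return [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]


def init_harmony_memory(memory_size, n_angles, seed=None):
    """Fill the harmony memory with `memory_size` random conformations."""
    rng = random.Random(seed)
    return [random_conformation(n_angles, rng) for _ in range(memory_size)]


# Illustrative sizes only (assumed): 10 stored conformations of 24 angles each.
harmony_memory = init_harmony_memory(memory_size=10, n_angles=24, seed=0)
```

Each entry of `harmony_memory` is one candidate conformation; in a full pipeline it would be scored by the energy function before the search starts.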
Harmony Search Algorithm
The harmony search (HS) algorithm is a metaheuristic that mimics the improvisation process of musicians. In that process, each musician plays a note so that together they find the best harmony.7 Likewise, in an optimization process each decision variable takes a value so that together they form the best solution vector.
The harmony search consists of four steps:
Step 1. Initialize the harmony memory (HM) and the parameters, namely the pitch adjusting rate (PAR) and the harmony memory considering rate (HMCR).
Step 2. Improvise a new harmony from the HM.
Step 3. If the new harmony is better than the worst harmony in the HM, include the new harmony in the HM and exclude the worst harmony from it.
Step 4. If the stopping criteria are not satisfied, go to Step 2.
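The four steps above can be sketched in Python as a generic minimizer. This is an illustrative sketch, not the authors' code: the `bandwidth` parameter, the fixed iteration budget used as the stopping criterion, and the `sphere` test function (a stand-in for a real energy function) are all assumptions.

```python
import random


def harmony_search(objective, n_vars, lower, upper, hms=10, hmcr=0.85,
                   par=0.20, bandwidth=0.1, iterations=5000, seed=None):
    """Minimize `objective` over [lower, upper]^n_vars with basic harmony search."""
    rng = random.Random(seed)
    # Step 1: fill the harmony memory (HM) with random vectors and score them.
    memory = [[rng.uniform(lower, upper) for _ in range(n_vars)]
              for _ in range(hms)]
    scores = [objective(h) for h in memory]

    for _ in range(iterations):  # Step 4: repeat until the budget is used up.
        # Step 2: improvise a new harmony, one decision variable at a time.
        new = []
        for i in range(n_vars):
            if rng.random() < hmcr:
                # Memory consideration: reuse a value stored in the HM.
                value = memory[rng.randrange(hms)][i]
                if rng.random() < par:
                    # Pitch adjustment: small perturbation, kept within bounds.
                    value += rng.uniform(-bandwidth, bandwidth)
                    value = min(upper, max(lower, value))
            else:
                # Random selection from the whole allowed range.
                value = rng.uniform(lower, upper)
            new.append(value)

        # Step 3: replace the worst stored harmony if the new one is better.
        new_score = objective(new)
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score

    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


def sphere(x):
    """Toy objective (assumed stand-in for an energy function); minimum at 0."""
    return sum(v * v for v in x)


best_vector, best_score = harmony_search(sphere, n_vars=3, lower=-5.0,
                                         upper=5.0, seed=1)
```

Because the problem here is a minimization, "better" means a lower objective value and the "worst" harmony is the one with the highest value.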
Adapted Harmony Search Algorithm
We introduce an adapted harmony search algorithm (AHSA, Fig. 2) with a new scheme, based on simulated annealing, for selecting the two main parameters of HSA: PAR and HMCR.
Our method starts by picking a protein sequence from the protein sequence database, represents the protein as a vector of torsion angles, and passes it to the harmony memory, from which AHSA retrieves it and optimizes it to minimize the energy. A new harmony vector is improvised by random selection, memory consideration, and pitch adjustment.
The new scheme for selecting the harmony parameters lets PAR decrease and HMCR increase during the optimization process of the adapted algorithm. We propose the following two equations to adapt PAR and HMCR, respectively:
PAR = PAR * exp(-abs(best energy) / Tn)
HMCR = HMCR + (1 - HMCR) * (1 - exp(-abs(best energy) / Tn))
where Tn starts at a high value of 1,000,000 and decreases by a small value α chosen from the interval [0.0005, 0.05].
The adaptation of PAR and HMCR continues until PAR reaches 0.05 and HMCR reaches 0.95.
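The two update equations and their stopping bounds can be sketched in Python as follows. The text does not say exactly how Tn is decreased, so the geometric cooling rule Tn = Tn * (1 - α) used here is an assumption, as are the function names, the constant best energy of -10.0, and the number of steps; they only illustrate the behavior of the updates.

```python
import math

PAR_MIN = 0.05    # lower bound at which PAR stops adapting
HMCR_MAX = 0.95   # upper bound at which HMCR stops adapting


def adapt_parameters(par, hmcr, best_energy, temperature):
    """Apply one step of the PAR and HMCR update equations, with clamping."""
    factor = math.exp(-abs(best_energy) / temperature)
    new_par = max(PAR_MIN, par * factor)
    new_hmcr = min(HMCR_MAX, hmcr + (1.0 - hmcr) * (1.0 - factor))
    return new_par, new_hmcr


def cool(temperature, alpha):
    """Assumed cooling rule: shrink the temperature by the fraction alpha."""
    return temperature * (1.0 - alpha)


# Hypothetical run: starting values from the text, made-up constant best energy.
par, hmcr, temperature = 0.20, 0.85, 1_000_000.0
best_energy = -10.0
for _ in range(20_000):
    par, hmcr = adapt_parameters(par, hmcr, best_energy, temperature)
    temperature = cool(temperature, alpha=0.0005)
```

While Tn is large, the exponential factor is close to 1 and both parameters barely move; as Tn falls, PAR shrinks toward its lower bound and HMCR grows toward its upper bound. In a real search, `best_energy` would be refreshed from the harmony memory at each iteration rather than held constant.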
Results and Discussion
On Met-enkephalin, our adapted algorithm gives results that compare well with previously published work on the same protein. Table 1 compares our results with those of previous studies.
With a harmony memory size of 10 and initial values PAR = 0.20 and HMCR = 0.85, our method finds its best energy after 1,000,000 iterations.
Adapting the two parameters PAR and HMCR during the optimization process helps prevent the search from getting stuck in local optima.