CoMFA, CoMSIA and Molecular Docking Studies of Saquinavir based Peptidomimetic Inhibitors of HIV-1 Protease



HIV protease has been one of the most sought-after target sites for combating HIV infection, and saquinavir is the forerunner of all HIV protease inhibitors. There has always been a quest for new HIV protease inhibitors for AIDS treatment. An in-silico study was carried out to develop 3D-QSAR models based on CoMFA and CoMSIA, in order to design and evaluate new saquinavir-based chemical entities for their anti-HIV protease activity. Optimal CoMFA and CoMSIA models were generated using a set of 23 saquinavir-based molecules (18 in the training set and 5 in the test set). The leave-one-out cross-validated correlation coefficients were q2 = 0.681 (CoMFA) and q2 = 0.684 (CoMSIA). The correlation between the experimental and the cross-validated/predicted activities of the test set molecules was high and reflected the robustness of the models (r2 = 0.967 for CoMFA and r2 = 0.988 for CoMSIA). The CoMFA model suggested 76.4% steric and 23.6% electrostatic field contributions, while the optimal CoMSIA model revealed that electrostatic, hydrophobic and hydrogen bonding interactions were significantly required for HIV protease inhibition. The models were subjected to in-silico validation by molecular docking, using a different set of molecules derived from the ZINC database with ≥ 95% similarity to saquinavir. Seven molecules predicted by both the CoMFA and CoMSIA models to have activity greater than saquinavir were docked against HIV-1 protease. The dock score and the predicted activity were observed to be significantly correlated, with correlation coefficients r = -0.7142 (CoMFA) and r = -0.6219 (CoMSIA), while the binding patterns were observed to be comparable to that of saquinavir.

Keywords: AIDS, QSAR, Comparative Molecular Field Analysis, Comparative Molecular Similarity Indices Analysis


1. Introduction

HIV infection is one of the most dreaded threats to human health because of the absence of an effective vaccine and of drugs that cure AIDS [1]. HIV-1 protease has been a target of choice for designing new chemotherapeutic agents since the success of saquinavir, the first anti-HIV protease drug. On the other hand, being a homodimer, it is highly prone to mutation, and a single mutation in the gene causes a double mutation in the enzyme [2]. Saquinavir, a peptide derivative, inhibits the cleavage of the gag and pol polyproteins of HIV brought about by HIV-1 and HIV-2 protease. It thus restricts the post-translational processing and packaging of the HIV capsid and thereby the spread of the virus [3]. Presently, there are ten FDA-approved HIV protease inhibitors (PIs) used as antiretroviral drugs in the treatment of HIV infection [4]. Although the HIV protease inhibitors on the market are highly specific, they induce side effects such as lipodystrophy, hyperlipidaemia and insulin resistance [5-7], and resistant mutants emerge upon prolonged inhibition [8, 9]. Consequently, there is a steady interest in new HIV protease inhibitors. The use of marketed drug regimens is further complicated by important issues such as tolerance, adherence, chronic toxicity and cross-resistance [10]. There is therefore a continuous need to design, evaluate and develop new, improved antiviral molecules with broad efficacy against PI-resistant HIV mutants as next-generation drugs [11]. Chemoinformatics plays a central role in the drug design and development process, which starts from lead identification and characterization and is followed by in-silico pharmacological evaluation. QSAR methodologies are among the forerunners of chemoinformatics tools.
The present study was carried out to investigate the molecular parameters required for designing novel HIV PI-based drugs for anti-AIDS therapy, using a 3D-QSAR approach with Comparative Molecular Field Analysis (CoMFA) and Comparative Molecular Similarity Indices Analysis (CoMSIA). The QSAR models thus generated were used to predict the HIV protease inhibition activity of structural analogues of saquinavir as potential candidates for AIDS therapy, and their validity and applicability were evaluated using in-silico docking tools, as reported earlier [12].

2. Materials and Methods

2.1 Biological Dataset: A biological dataset of 23 saquinavir derivatives [13] was chosen in the current study to carry out 3D-Quantitative Structure Activity Relationship (3D-QSAR) CoMFA and CoMSIA analyses (Figure 1). Three-dimensional structure building and modeling were performed using the SybylBase package. All compounds were subjected to energy minimization using the standard Tripos force field with a distance-dependent dielectric and the conjugate gradient algorithm, with a convergence criterion of 0.001 kcal/mol. The charges were computed using the Gasteiger-Hückel method as a final step of minimization. The experimentally reported activity [IC50 (nM)] of each compound was converted into pIC50 [-log (IC50)], which was subsequently used as the dependent variable for the CoMFA analysis (Table 1).
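The IC50-to-pIC50 conversion can be illustrated with a minimal Python sketch. It assumes that the nanomolar IC50 is first expressed in mol/L before the negative base-10 logarithm is taken, which the text does not state explicitly; the function name `pic50_from_nM` is hypothetical.

```python
import math


def pic50_from_nM(ic50_nM: float) -> float:
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in nM.

    Assumption: the logarithm is taken of the molar concentration,
    so 1 nM = 1e-9 mol/L corresponds to pIC50 = 9.
    """
    if ic50_nM <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nM)
```

Under this convention a more potent compound (lower IC50) receives a higher pIC50, which is why higher predicted pIC50 values are read as better activity later in the study.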

2.2 Alignment of Dataset: The reliability and efficiency of CoMFA results depend on various factors such as the alignment rule, the probe atom type, the orientation of the aligned molecules and the lattice shifting step size [14]. In the present study, the ALIGN Database command in Sybyl was used for molecular alignment. The compound with maximum activity (Saq11) was used as the alignment template, and the other molecules of the dataset were aligned to it using the maximum common substructure. This orients the molecules such that their electrostatic and steric fields match the fields of the template molecule.

2.3 Generation of 3D-QSAR CoMFA/CoMSIA: The Lennard-Jones and Coulomb potential energies were calculated at every lattice intersection of a regularly spaced (2.0 Å) grid box; they represent the steric and electrostatic CoMFA potential fields, respectively. The CoMFA region was determined automatically, with the region boundary extending 4 Å beyond each structure in all directions. The probe used to generate the steric and electrostatic fields was an sp3-hybridized carbon atom bearing a +1 charge. The steric and electrostatic energies were truncated at the default cutoff of +30.0 kcal/mol.
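As a rough illustration of how such field values arise at a single lattice point, the following Python sketch sums a 6-12 Lennard-Jones term and a Coulomb term between a +1 probe and each atom, then truncates both at 30 kcal/mol. It is not the Sybyl implementation: the function name, the van der Waals radii and well depths, the constant dielectric of 1 and the symmetric electrostatic truncation are all assumptions made for the example.

```python
import math

CUTOFF = 30.0        # kcal/mol, truncation value given in the text
COULOMB_K = 332.06   # converts e^2/Å to kcal/mol


def probe_fields(grid_point, atoms, probe_charge=1.0,
                 probe_radius=1.7, probe_eps=0.107):
    """Steric (6-12 Lennard-Jones) and electrostatic (Coulomb) energies
    of the probe at one grid point, truncated at CUTOFF.

    atoms: iterable of (x, y, z, charge, vdw_radius, epsilon); the
    radius and epsilon values are illustrative assumptions.
    """
    steric = 0.0
    electrostatic = 0.0
    for x, y, z, charge, radius, eps in atoms:
        # Guard against a grid point that coincides with an atom.
        r = max(math.dist(grid_point, (x, y, z)), 1e-3)
        ratio = (radius + probe_radius) / r
        well = math.sqrt(eps * probe_eps)
        steric += well * (ratio ** 12 - 2.0 * ratio ** 6)
        electrostatic += COULOMB_K * probe_charge * charge / r
    # Truncate so that points inside or near atoms do not dominate.
    steric = min(steric, CUTOFF)
    electrostatic = max(-CUTOFF, min(electrostatic, CUTOFF))
    return steric, electrostatic
```

Evaluating this function at every node of the 2.0 Å lattice would give one steric and one electrostatic descriptor per node and per molecule, which is the kind of table that PLS then correlates with pIC50.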

During the CoMFA analysis, many grid points near the molecular surface are discarded because of the steep increase in van der Waals repulsion. Hence, the CoMSIA method was also used to generate a QSAR model, as it avoids sudden changes of potential energy near the molecular surface. By using a Gaussian-type distance-dependent function, CoMSIA is capable of generating more stable models than CoMFA [15-17]. The grid used for field calculation in CoMSIA was the same as that used in CoMFA. The descriptors and the probe atom were selected as per the earlier report [12]. The partial least squares (PLS) method was chosen for establishing the correlation between the biological activity values and the CoMFA/CoMSIA fields, using a two-step process as per earlier reports with minor modifications [18, 19, 20]. The CoMFA and CoMSIA results were finally interpreted through their respective contour-contribution maps.
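The Gaussian-type distance dependence of CoMSIA can be sketched in Python as follows. The form used here, a negative sum of probe and atom property values weighted by exp(-alpha r^2), follows the commonly cited CoMSIA similarity index; the attenuation factor alpha = 0.3, the probe property value of 1 and the function name are assumptions, since the text does not list the settings that were used.

```python
import math


def comsia_index(grid_point, atoms, probe_value=1.0, alpha=0.3):
    """Similarity index of one physicochemical property at one grid point.

    atoms: iterable of (x, y, z, property_value), e.g. a partial charge
    or a hydrophobicity parameter (illustrative input format).
    alpha: Gaussian attenuation factor (assumed value).
    """
    total = 0.0
    for x, y, z, value in atoms:
        r_squared = ((grid_point[0] - x) ** 2
                     + (grid_point[1] - y) ** 2
                     + (grid_point[2] - z) ** 2)
        # The Gaussian stays finite at r = 0, so no cutoff is needed.
        total += probe_value * value * math.exp(-alpha * r_squared)
    return -total
```

Because the exponential is bounded, the index changes smoothly even for grid points inside the molecular surface, in contrast to the Lennard-Jones term, which diverges as the distance approaches zero.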

2.4 Mining of ZINC Database: The ZINC chemical database was searched for structural analogues of saquinavir with 95% similarity, and 40 molecules were obtained as hits. After generating their Sybyl mol2 files and minimizing their 3D structures, these molecules were subjected to anti-HIV-1 protease activity prediction using the CoMFA and the selected CoMSIA models. The compounds with predicted activity values higher than that of saquinavir (pIC50: 7.7939 by CoMFA; 7.7521 by CoMSIA) were chosen for further binding analysis against HIV-1 protease (PDB ID: 3OXC).
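The text does not say which similarity measure underlies the 95% threshold. A common choice for structural similarity searches is the Tanimoto coefficient on binary fingerprints, and the Python sketch below shows that filter under this assumption. Fingerprints are represented as sets of on-bit indices, and all names are hypothetical.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def similar_hits(query_fp: set, library: dict, threshold: float = 0.95) -> list:
    """Names of library entries whose similarity to the query meets the threshold."""
    return [name for name, fp in library.items()
            if tanimoto(query_fp, fp) >= threshold]
```

In practice the fingerprints would come from a cheminformatics toolkit or from the database's own search service; only the thresholding step is shown here.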

2.5 Docking Studies: The 3D structure of HIV-1 protease at a resolution of 1.16 Å was obtained from the Protein Data Bank (PDB ID: 3OXC) (Figure 2) [21]. The water molecules, heteroatoms and ligands were removed from the protein crystal structure, and the protein was further prepared for docking using Dock Prep in UCSF Chimera [22]. For docking, the GUI program AutoDock [23] was used to prepare, run and analyze the docking simulations as per our earlier report [12] with minor modifications. The grid box was cubic (60 x 60 x 60 Å) and included all the amino acid residues present at the reported binding site [21] of the receptor molecule, while the maximum number of energy evaluations for the genetic algorithm (GA) was set to 500000.

3. Results & Discussion

3.1 3D-QSAR CoMFA and CoMSIA model: An attempt was made to identify a CoMFA model depicting an acceptable statistical correlation between the structure and the potential inhibitory activity of saquinavir and its analogues against HIV protease. A set of 5 molecules, excluded from the construction of the CoMFA models, was selected as the test set for validating the stability and predictive ability of the obtained model. Thus, the database of 23 molecules comprised 18 molecules in the training set and 5 in the test set (Table 1). The statistical parameters obtained for the optimal model are given in Table 2. The scatter plot of experimental vs predicted activities (Figure 3) clearly demonstrates that the pIC50 values predicted by the model are in good agreement with the experimental data. These statistical analyses of the selected CoMFA model reflect its acceptability.
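The cross-validated q2 reported for these models is conventionally computed as 1 - PRESS/SS, where PRESS is the sum of squared errors of the leave-one-out predictions and SS is the sum of squared deviations of the observed activities from their mean. The Python sketch below shows that bookkeeping. To stay self-contained it refits a one-descriptor least-squares line in place of the PLS model used in the study, so it illustrates the validation scheme only; all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs, ys):
    """Predict each compound from a model fitted without that compound."""
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    return predictions


def q_squared(observed, loo_predicted):
    """Cross-validated q2 = 1 - PRESS / SS."""
    mean_obs = sum(observed) / len(observed)
    press = sum((y - p) ** 2 for y, p in zip(observed, loo_predicted))
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - press / ss_total
```

A q2 close to 1 means the left-out compounds are predicted almost as well as the fitted ones, while a q2 of 0 means the model predicts no better than the mean activity.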

The CoMFA model, when represented as a 3D coefficient contour map, showed the electrostatic and steric fields from the PLS analysis. The contour maps were plotted as percentage contributions to the QSAR equation and were associated with differences in biological activity. The contour plot reflected that an electrostatic contribution of about 23.61% and a steric contribution of 76.39% were involved in the anti-HIV-1 protease activity of the saquinavir analogues (Figure 4).

The CoMSIA model, built like the CoMFA model but with hydrophobic, steric, electrostatic, hydrogen-bond donor and acceptor fields, resulted in acceptable statistical data, as shown in Table 2 (Model I). Analysis of Model I indicated that the steric and acceptor fields, with contributions of 5.6% and 8.5% respectively in the all-fields model, did not contribute significantly to the affinity of the ligands for the receptor protein. They also increased the complexity of the contour maps and of their interpretation. Hence, CoMSIA was repeated with smaller subsets of fields (Table 2, Models II-V). The internal predictivities (q2) of the different field combinations in Models II-V were observed to decrease slightly compared with the all-fields Model I. Among all the subset models generated, Model IV, with the combination of electrostatic, hydrophobic and donor fields, had the maximum conventional non-cross-validated r2 of 0.988 and an SEE value of 0.103, indicating that it was statistically the most significant model. The donor field contributed most towards affinity, with a 43.9% contribution (Figure 5a), followed by the hydrophobic and electrostatic fields with 40.2% and 15.9% contributions, respectively (Figure 5b-c).

3.2 Screening of ZINC Databank and Docking Analysis of Selected Analogues: A total of 40 molecules were obtained as hits with 95% structural similarity to saquinavir as the query. When these molecules were subjected to activity prediction using the CoMFA and CoMSIA models generated above, 10 and 7 molecules, respectively, showed anti-HIV-1 protease activity (pIC50) better than that of saquinavir, i.e., pIC50: 7.7939 (CoMFA) and 7.7521 (CoMSIA), as shown in Figure 6.
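The selection step, keeping only analogues that both models predict to be more active than saquinavir, amounts to a simple filter. In the Python sketch below the two reference pIC50 values are taken from the text, while the function name and the input format are hypothetical.

```python
SAQUINAVIR_COMFA = 7.7939   # predicted pIC50 of saquinavir, CoMFA model
SAQUINAVIR_COMSIA = 7.7521  # predicted pIC50 of saquinavir, CoMSIA model


def better_than_saquinavir(predictions: dict) -> list:
    """Return the IDs predicted above saquinavir by both models.

    predictions: {molecule_id: (pIC50_CoMFA, pIC50_CoMSIA)}
    """
    return sorted(
        mol_id
        for mol_id, (comfa, comsia) in predictions.items()
        if comfa > SAQUINAVIR_COMFA and comsia > SAQUINAVIR_COMSIA
    )
```

Requiring agreement between the two models is the stricter of the possible criteria, since a molecule that only one model rates above saquinavir is dropped.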

To validate the QSAR models generated above through an in-silico approach, molecular docking analyses were performed against HIV protease for the 7 selected saquinavir analogues screened from the ZINC database. The binding pockets of the docked ligands in the receptor were inferred to be either the same as or similar to that of saquinavir (Figure 4), as observed in previously reported studies [21, 24]. The three-residue sequence within the tunnel of HIV-1 protease, i.e., Asp-Thr-Gly, was observed to be common to the binding sites of all the docked ligands, in accordance with the earlier reports [21, 24]. The dock scores (binding energies) and binding patterns of the selected analogues were observed to be comparable to those of saquinavir (Figure 7), which provides a promising reason to investigate them further in the search for anti-HIV chemotherapy. Also, the dock scores of these molecules were observed to be highly correlated with their anti-HIV-1 protease activity (pIC50) as predicted using the generated CoMFA and the selected CoMSIA models, with correlation coefficients (r) of -0.7142 and -0.62199, respectively. The negative correlation is justified by the fact that the lower the binding energy of a ligand, the better its binding affinity for the receptor (HIV-1 protease) and the higher its biological activity (pIC50).
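Since the reported coefficients are negative, they are read here as Pearson correlation coefficients (r) rather than squared values. A minimal Python sketch of that calculation follows; the function name is hypothetical, and the dock scores and predicted activities would be supplied by the user.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(sxx * syy)
```

With dock scores as `xs` and predicted pIC50 values as `ys`, a negative r indicates that more negative binding energies accompany higher predicted activity; squaring r gives the fraction of variance shared by the two quantities.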

4. Conclusion

The present study was undertaken to generate a 3D-QSAR model based on CoMFA analysis for evaluating chemical entities analogous to saquinavir in the design of new drugs against HIV. The model generated was found to be statistically acceptable and was further validated in silico using the entirely different methodology of molecular docking. The docking results were in good agreement with the anti-HIV protease activity predicted by the QSAR CoMFA model. Thus, the study provides a good platform for further validation, using in-vitro studies, of the generalized CoMFA model for evaluating saquinavir-based protease inhibition, so that it can be used with confidence in the design of novel anti-HIV drugs.