Homology Modelling And Validation Biology Essay

The snake venom 5ʹ nucleotidase (SV-5ʹ NUC) target sequence (accession number A6MFL8) was retrieved from the UniProt database, and its physicochemical properties were computed using the ExPASy ProtParam program. A similarity search for SV-5ʹ NUC against the Protein Data Bank (PDB) was performed using the BLAST server. The crystal structure of human 5ʹ nucleotidase (H-5ʹ NUC; PDB ID: 2J2C) was selected as the template for the target SV-5ʹ NUC on the basis of its sequence and functional homology. The alignment between the target SV-5ʹ NUC sequence and the template H-5ʹ NUC sequence was performed and visualized using ESPript.

2.2. Homology modelling and validation

Homology modelling of the target protein was carried out with MODELLER 9v7, and multiple models were generated. The models were ranked by their Discrete Optimized Protein Energy (DOPE) and MOLPDF scores. DOPE is a statistical potential used to assess homology models in protein structure prediction. It is based on an improved reference state that corresponds to non-interacting atoms in a homogeneous sphere whose radius depends on a sample native structure; it thus accounts for the finite size and spherical shape of native structures. The energy of each model generated through MODELLER's iterations was assessed by its DOPE score, alongside the MOLPDF objective function, which measures how well the spatial restraints are satisfied. The model with the lowest DOPE and MOLPDF scores and acceptable Ramachandran plot statistics was selected for all further studies. The selected SV-5ʹ NUC model was then validated using the NIH SAVES server.
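The ranking step described above can be sketched in a few lines of Python. This is a minimal illustration, not the MODELLER API: the `ModelScore` record, the `rank_models` helper and all score values are hypothetical and are not the values reported in Table S1. The sketch assumes that DOPE is the primary criterion and MOLPDF the tie-breaker, which the text does not specify.

```python
from typing import NamedTuple


class ModelScore(NamedTuple):
    """Scores reported for one candidate model (hypothetical record type)."""
    name: str
    molpdf: float
    dope: float


def rank_models(models: list[ModelScore]) -> list[ModelScore]:
    """Return models sorted by ascending DOPE score, with molpdf as tie-breaker.

    Lower (more negative) DOPE and lower molpdf both indicate a better model.
    """
    return sorted(models, key=lambda m: (m.dope, m.molpdf))


# Hypothetical scores for illustration only.
candidates = [
    ModelScore("model_1", molpdf=2650.4, dope=-56210.7),
    ModelScore("model_2", molpdf=2712.9, dope=-55987.3),
    ModelScore("model_3", molpdf=2688.1, dope=-56102.5),
    ModelScore("model_4", molpdf=2801.6, dope=-55840.2),
]

ranked = rank_models(candidates)
best = ranked[0]
print(f"Selected {best.name}: DOPE={best.dope}, molpdf={best.molpdf}")
```

In practice the selected model would still have to pass the Ramachandran check before being accepted, as the text notes.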

2.3. Molecular dynamics (MD) simulation of the chosen target model

MD simulation was carried out using the GROMOS96 43A1 force field implemented in the GROMACS program. A cubic box filled with SPC water was built and subjected to at most 1000 steps of energy minimization using the steepest-descent algorithm. The leap-frog algorithm was used to integrate Newton's equations of motion. The chosen target model was equilibrated for 1000 steps, followed by a 500 ps MD simulation at 300 K with a 2 fs integration time step. All protein covalent bonds were constrained to maintain constant bond lengths. Berendsen temperature coupling and Parrinello-Rahman pressure coupling were used to suppress drift during equilibration and the MD simulation. Coordinates and energy terms (potential energy of the whole system) were saved every 10000 steps in order to evaluate the stabilization of the protein system throughout the MD simulation.
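The run-length arithmetic implied by these settings can be checked with a short Python sketch. The function name `md_schedule` and its return keys are hypothetical; only the three input values (500 ps, 2 fs, saving every 10000 steps) come from the text.

```python
def md_schedule(total_ps: float, timestep_fs: float, save_every_steps: int) -> dict:
    """Derive step counts and output spacing from basic MD run settings."""
    total_fs = total_ps * 1000.0                      # 1 ps = 1000 fs
    n_steps = round(total_fs / timestep_fs)           # number of integration steps
    save_interval_ps = save_every_steps * timestep_fs / 1000.0
    frames_after_start = n_steps // save_every_steps  # excludes the initial frame
    return {
        "n_steps": n_steps,
        "save_interval_ps": save_interval_ps,
        "frames_after_start": frames_after_start,
    }


schedule = md_schedule(total_ps=500, timestep_fs=2, save_every_steps=10000)
print(schedule)
```

Under these settings the production run spans 250000 integration steps, and output is written once every 20 ps.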

2.4 Binding-site prediction

Binding-site identification plays a major role in structure-based drug design (SBDD). In this study, the binding-site region of the chosen model was identified using the SiteMap program (v2.5), which identifies one or more regions suitable for ligand binding. Hydrophobic and hydrophilic maps (the latter comprising donor, acceptor and metal-binding regions) were then produced as contour maps and scored. Scores were computed with the default parameters of SiteMap (v2.5), and the program was set to generate more than two sites.

2.5. Ligand preparation

The chemical molecules vanillin (CID: 1183) and vanillic acid (CID: 8468) were retrieved from the PubChem database. The ligands were prepared for docking using the LigPrep program (v2.5). Tautomers of each ligand were generated, optimized and neutralized. Partial atomic charges were computed using the OPLS_2005 force field, and the ligands were energy minimized.

2.6. Molecular docking analysis

The "Extra Precision" (XP) mode of Glide (v5.7) was used for all docking calculations with the OPLS-AA 2005 force field. A bounding box of 10 Å × 10 Å × 10 Å was defined and confined to the SiteMap-predicted active-site region of the SV-5ʹ NUC model for docking the ligands. A scale factor of 0.4 was applied to the van der Waals radii of protein atoms with absolute partial charges less than or equal to 0.25. Five thousand poses per ligand were generated during the initial phase of the docking calculation, of which the best 1000 poses per ligand were chosen for energy minimization. The energy minimization protocol used a dielectric constant of 4.0 and 1000 steps of conjugate gradient minimization. Upon completion of each docking calculation, 10000 poses per ligand were generated, and the best docked structure was chosen using the Glide Score function. The best pose is selected using a model energy score that combines the energy grid score, the Glide score and the internal strain of the ligand.
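The two geometric rules stated above (the charge-dependent scaling of van der Waals radii and the 10 Å cubic bounding box) can be illustrated with a small Python sketch. This is only a restatement of the rules as written in the text, not Glide's internal implementation; the function names, the example atoms and the box center are hypothetical.

```python
def scaled_vdw_radius(radius: float, partial_charge: float,
                      scale: float = 0.4, charge_cutoff: float = 0.25) -> float:
    """Scale the radius of weakly charged atoms; leave strongly charged atoms unchanged."""
    if abs(partial_charge) <= charge_cutoff:
        return radius * scale
    return radius


def inside_box(point: tuple, center: tuple, edge: float = 10.0) -> bool:
    """Return True if a point lies within a cube of the given edge length around center."""
    half = edge / 2.0
    return all(abs(p - c) <= half for p, c in zip(point, center))


# Hypothetical atoms: (van der Waals radius in Å, partial charge).
nonpolar_carbon = scaled_vdw_radius(2.0, 0.10)    # scaled, |q| <= 0.25
charged_oxygen = scaled_vdw_radius(1.5, -0.60)    # unchanged, |q| > 0.25

# Hypothetical box center and ligand atom position, in Å.
center = (12.0, 8.0, -3.0)
print(nonpolar_carbon, charged_oxygen, inside_box((15.0, 10.0, -1.0), center))
```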

2.7. ADME analysis

The QikProp program (v3.4) was used to obtain the absorption, distribution, metabolism and excretion (ADME) properties of all molecules. It predicts physically significant descriptors and pharmaceutically relevant properties of a ligand. The program was run in normal mode, and more than 40 chemical and biological properties were analyzed for all the ligands considered in this study. The program also evaluates the drug-likeness of the compounds based on Lipinski's rule of five, which is an important criterion in rational drug design.
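For readers unfamiliar with Lipinski's rule of five, the check can be sketched as follows. This is a minimal illustration and not QikProp's implementation: the `Ligand` class and `lipinski_violations` function are hypothetical, the molecular weights and hydrogen-bond counts follow from the molecular formulas of the two compounds, and the log P values are approximate literature estimates rather than the QikProp predictions discussed in this essay.

```python
from dataclasses import dataclass


@dataclass
class Ligand:
    name: str
    mol_weight: float   # Da
    h_donors: int       # O-H and N-H groups
    h_acceptors: int    # N and O atoms
    log_p: float        # octanol/water partition coefficient


def lipinski_violations(lig: Ligand) -> int:
    """Count how many of the four rule-of-five criteria a ligand violates."""
    checks = [
        lig.mol_weight > 500,
        lig.h_donors > 5,
        lig.h_acceptors > 10,
        lig.log_p > 5,
    ]
    return sum(checks)


ligands = [
    # log_p values are approximate assumptions used for illustration only.
    Ligand("vanillin", mol_weight=152.15, h_donors=1, h_acceptors=3, log_p=1.2),
    Ligand("vanillic acid", mol_weight=168.15, h_donors=2, h_acceptors=4, log_p=1.4),
]

for lig in ligands:
    n = lipinski_violations(lig)
    # A compound is commonly treated as drug-like if it has at most one violation.
    print(f"{lig.name}: {n} violation(s), drug-like={n <= 1}")
```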

2.8. Energy-optimized pharmacophore mapping

Energy-optimized pharmacophore (e-pharmacophore) sites were located using the energy terms obtained from the Glide XP scoring function, in order to characterize the ligand-protein interaction accurately. The e-pharmacophore sites of the ligands were generated from the single-mode energy terms computed for the docking poses of each ligand in the protein. For each pharmacophore site, the Glide XP energies of the atom centers that comprise it (donor, acceptor and aromatic ring pi-pi interactions) were summed. The sites were then ranked by these energies, and the most favorable sites were mapped to produce the common e-pharmacophore.
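The sum-then-rank procedure described above can be sketched in Python. The site labels follow the naming used later in this essay (A2, D4, R6), but the per-atom energy values are hypothetical placeholders and the `rank_sites` helper is not part of any Schrödinger tool.

```python
from collections import defaultdict


def rank_sites(atom_terms: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Sum per-atom energy terms for each pharmacophore site and rank the sites.

    The most favorable (most negative) summed energy is listed first.
    """
    totals: dict[str, float] = defaultdict(float)
    for site, energy in atom_terms:
        totals[site] += energy
    return sorted(totals.items(), key=lambda item: item[1])


# Hypothetical (site label, energy contribution in kcal/mol) pairs.
terms = [
    ("A2", -0.70), ("A2", -0.35),
    ("D4", -0.60),
    ("R6", -0.90), ("R6", -0.25),
]

ranking = rank_sites(terms)
for site, energy in ranking:
    print(f"{site}: {energy:.2f}")
```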

2.9. Molecular electrostatic potential analysis

Molecular electrostatic potential (MEP) analysis of the functional binding pocket of the modeled target protein and of the ligands was carried out using PyMOL (v1.3), based on their surface potential values. The Poisson-Boltzmann-based molecular surface was generated and visualized using PyMOL (v1.3).

3. Results and discussion

3.1 Sequence analysis and physicochemical characterization

Demansia vestigiata SV-5ʹ NUC comprises 559 amino acids (UniProt ID: A6MFL8) with a molecular mass of 64,642 Da and is annotated as having hydrolase activity. Sequence analysis revealed that SV-5ʹ NUC belongs to the 5_nucleotidase superfamily of proteins. Physicochemical characterization of the protein gave the following: theoretical pI, 5.61; total number of negatively charged residues (Asp + Glu), 80; total number of positively charged residues (Arg + Lys), 64; extinction coefficient, 72700 M⁻¹ cm⁻¹; and estimated half-life, 30 hours (mammalian reticulocytes, in vitro). The computed instability index of 40.77, just above the ProtParam threshold of 40, classifies SV-5ʹ NUC as unstable. The grand average of hydropathicity (GRAVY) and the aliphatic index of SV-5ʹ NUC were -0.400 and 77.92, respectively.
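Two of the reported values can be cross-checked with simple arithmetic, sketched below in Python. The residue counts and the instability index are taken from the text; the function names are hypothetical, and the net-charge estimate is deliberately crude (it ignores histidine, the chain termini and pH-dependent ionization).

```python
def net_charge_estimate(n_positive: int, n_negative: int) -> int:
    """Crude net charge: (Arg + Lys) minus (Asp + Glu)."""
    return n_positive - n_negative


def is_predicted_stable(instability_index: float, threshold: float = 40.0) -> bool:
    """ProtParam convention: an instability index below 40 predicts a stable protein."""
    return instability_index < threshold


net = net_charge_estimate(n_positive=64, n_negative=80)
stable = is_predicted_stable(40.77)
print(f"Estimated net charge: {net}; predicted stable: {stable}")
```

The excess of 16 acidic residues is consistent with the acidic theoretical pI of 5.61, and the instability index exceeds the threshold by less than one unit.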

3.2 Homology modeling and validation

From the sequence analysis and the BLAST search against the PDB, the human functional homolog of SV-5ʹ NUC, H-5ʹ NUC (2J2C), whose crystal structure was resolved at 2.2 Å, was identified as the template for homology modelling because of its lowest e-value (zero) and its high sequence coverage/identity of 95%. Fig. 1 shows the highly conserved residues shared by the two proteins at the sequence level. It also illustrates the secondary structure of SV-5ʹ NUC along with the surface accessibility of its residues. The last 77 residues were not modelled owing to the lack of structural information, and they are believed to have no functional role in the enzymatic activity of SV-5ʹ NUC. Four models of the modelled region of SV-5ʹ NUC were generated, and the best of them, designated SV-5ʹ NUC1, was chosen for its lowest molpdf and DOPE scores (Table S1). The modelled structure (SV-5ʹ NUC1) confirmed that the protein is a member of the 5_nucleotidase superfamily of α/β hydrolases containing the HAD-IG-nucleotidase subfamily domain (residues 1-480). The modelled region of SV-5ʹ NUC1 (95% identity and 97% similarity to 2J2C) is depicted in Fig. 1. The 3D structure of SV-5ʹ NUC1 shows a mixture of α/β folds with a single Rossmann-like domain and anti-parallel β sheets (Fig. 2). The sequence also contains the motifs characteristic of haloacid dehalogenase (HAD) family members, {hhhhDxDx(T/V)}, {hhhh(T/S)} and {hhhh(G/N)(D/E)x(3-4)(D/E)}, where "h" stands for a hydrophobic residue.

Validation of the SV-5ʹ NUC1 model with PROCHECK-based Ramachandran statistics revealed 85.1% of amino acid residues in the most favored regions, 14.4% in additionally allowed regions and 0.5% in generously allowed regions. No residue was observed in the disallowed regions (Fig. 3). Thus, the SV-5ʹ NUC1 model is stereochemically sound, with a reasonable distribution of backbone angles. The G-factors for the dihedral angles, the covalent geometry and the overall structure were -0.38, 0.12 and -0.17, respectively. The main-chain and side-chain parameters assessed for SV-5ʹ NUC1 using PROCHECK revealed favorable stereochemical properties (Figs. S2 and S3). The ERRAT plot depicts the non-bonded interactions between different atom types and provides guidance for improving sterically hindered regions of the protein. The overall quality factor of the homology model in the ERRAT plot was 90.48%, with minor 'structure errors' that reflect steric hindrance between a few amino acids (Fig. 3). As expected, Verify-3D revealed that 91.67% of the amino acids in the current structure of SV-5ʹ NUC1 have a compatible 1D-3D score greater than 0.2. The SV-5ʹ NUC1 model has a Z-score of -0.82, within the range of native conformations of crystal structures, which further increased confidence in the model (Fig. 3). The crystal structure of H-5ʹ NUC and the SV-5ʹ NUC1 model were superimposed to confirm the close conformational similarity between them. An RMSD of 0.5 Å was observed for the superimposed structures (Fig. S4). This minimal deviation in both backbone and side chains further supports the quality of the built model.

3.3. MD simulation

To check the stability of SV-5ʹ NUC1, the RMSD of the backbone atoms over the MD production run was plotted as a function of time (Fig. 4). The plot shows a significant change in RMSD during the initial 200 ps, after which the system stabilized with fluctuations of less than 0.3 Å. The RMSD between the energy-minimized model of SV-5ʹ NUC1 and the final structure from the MD simulation was as low as 0.512 Å. Furthermore, comparison of the energy-minimized structure with the structures generated throughout the MD production run indicates that the energy-minimized SV-5ʹ NUC1 model represents a stable conformation. These structure validation results suggest that the energy-minimized SV-5ʹ NUC1 model is suitable for molecular docking.
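The RMSD used throughout this section is the root of the mean squared distance between matched atoms. The following Python sketch shows the calculation for two coordinate sets that are assumed to be already superimposed and listed in the same atom order; it performs no fitting, and the `rmsd` function and the toy coordinates are illustrative only.

```python
import math


def rmsd(coords_a: list[tuple[float, float, float]],
         coords_b: list[tuple[float, float, float]]) -> float:
    """Root-mean-square deviation between two pre-aligned, equally sized coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom is displaced by exactly 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
displaced = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(reference, displaced)
print(value)
```

Applied to the backbone atoms of each saved frame against the starting structure, this quantity gives a time series of the kind plotted in Fig. 4.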

3.4. Binding-pocket prediction

To investigate the interaction between SV-5ʹ NUC1 and the PubChem ligands (Fig. 5), the binding site was defined from the predictions of the SiteMap module in Schrödinger as well as from information available in the literature. The best binding site (SiteMap1) had a higher binding site score (Table S5), calculated from the effective Dscore, size and volume of the cavity. The predicted site comprises residues Asp52, Asp54, Tyr65, Thr72, Phe155 and Asp346, which are believed to be important for the ligand-protein interaction. Because the literature survey and our SiteMap1 results coincide, this site was chosen as the most favorable binding site for docking the ligands (vanillin and vanillic acid) independently with SV-5ʹ NUC1.

3.5. Molecular docking analysis

Glide XP docking was performed for both energy-minimized ligands in the validated binding pocket of the SV-5ʹ NUC1 protein. Vanillin forms a hydrogen bond with the side-chain OH group of Tyr65. Vanillic acid forms hydrogen bonds with the side-chain OH groups of Tyr65 and Thr72. Their binding poses and interaction maps are shown in Fig. S6. The observed binding pattern of vanillic acid leads us to speculate that its -COOH group could provide better interaction for inhibition than the -CHO group of vanillin. The docking results indicate that vanillic acid is the better inhibitor, owing to its lower Glide XP score and Glide energy (Table 1). Moreover, vanillic acid forms two weak hydrogen-bond interactions, with Tyr65 and Thr72, compared with the single hydrogen bond of vanillin with Tyr65. These in silico results agree with the experimental finding that vanillic acid is a better inhibitor than vanillin, based on the SV-5ʹ NUC IC50 values reported for three other snakes, Naja naja, Daboia russellii and Trimeresurus malabaricus.

3.6. ADME analysis

We analyzed more than 40 physically significant descriptors and pharmaceutically relevant properties of the two lead compounds, including molecular weight, H-bond donors, H-bond acceptors, log P (octanol/water), QP log S, QPPCaco and percentage human oral absorption, along with their compliance with Lipinski's rule of five (Table 2). The rule of five is a rule of thumb for evaluating drug-likeness. It describes molecular properties that are important for a drug's pharmacokinetics in the human body, including its ADME. However, the rule does not predict whether a compound is pharmacologically active. In the ADME analysis, both ligands, vanillin and vanillic acid, fell within the acceptable range of Lipinski's rule of five. For the two lead compounds, the partition coefficient (QP log P (o/w)) and the water solubility (QP log S), which are crucial for estimating the absorption and distribution of drugs within the body, ranged from −1.690 to 1.727 and from −1.582 to −4.691, respectively, while the cell permeability (QPPCaco), a key factor governing drug metabolism and access to biological membranes, ranged from 0.345 to 95. Overall, the percentage human oral absorption of the compounds ranged from 65% to 100%. All of these pharmacokinetic parameters are within the acceptable range defined for human use, indicating the potential of the compounds as drug-like molecules.

3.7. E-Pharmacophore mapping

Structure-based ligand docking and ligand-based pharmacophore modelling are two approaches available for the discovery and design of active molecules. Incorporating protein-ligand interactions into ligand-based pharmacophore approaches has been shown to improve on the use of ligand information alone. By incorporating structural and energetic information from the Glide XP scoring function, three common pharmacophore sites were observed in vanillin and vanillic acid (Fig. 6). The energetically favorable pharmacophore sites generated for both ligands consist of an acceptor group (A2), an aromatic ring (R6) and one H-bond donor (D4) (Fig. 6). The distances and angles between them were calculated (Fig. 4). These energetically favorable sites encompass the specific interactions between the ligands and the SV-5ʹ NUC1 protein. This information should prove helpful in the development of new SV-5ʹ NUC inhibitors.

3.8. Molecular electrostatic potential (MEP) analysis

Electrostatic interaction is a crucial part of the non-covalent interaction energy between molecules. The MEP on a molecular surface can be used to compare two molecules visually, guide docking studies and identify sites on a protein that interact with its ligands. Numerous studies have employed the MEP to relate the biological potency of different ligands to their potential values. The color grade for the MEP ranges from deep blue, representing the most negative potential, to deep red, representing the most positive potential. This analysis can also provide the 3D spatial features of the binding cavity involved in the protein-ligand interactions.