Experiment-10: Homology Modeling
OBJECTIVE: To perform homology modeling for the given protein.
INSTRUCTION TO USERS
To understand each experiment clearly and make the best use of the content provided, we suggest that you proceed through the following steps.
- Start with the theory section: recall the technical background for each step, go through the manual, and define an overall design and workflow for conducting the homology modeling experiment.
- In the protocol section, learn the fine details required to perform the experiment by going through the standardized protocol defined for each step.
- Practice: work through each step of the experiment in the simulator section to visualize the entire process.
- Go through the video files to see how the experiment is performed in a real lab.
- In the downloads section, the content used is provided so that users can export files for analysis and for completing assignments.
- By now, the user should be sufficiently equipped to take a quiz for assessment. Answer the questions provided in the quiz and complete the assignment. (Go back to any modules that require better understanding based on your performance!)
- The articles, papers, books and manuals used in writing the content are cross-referenced for the user's information.
- Finally, give your feedback on the overall experiment so that we can improve it.
Homology modeling is a computational approach for three-dimensional protein structure modeling and prediction. Proteins whose structures are still uncharacterized can be modeled using homology modeling. This method builds an atomic model based on experimentally determined structures that share more than 40% sequence identity with the target molecule. Modeling with a template of less than 40% identity gives less reliable models, so such templates are usually not used. Homology modeling is also known as comparative modeling.
The principle governing this approach is that if two proteins share high sequence similarity, they are likely to have very similar three-dimensional structures. If one of the two proteins has a known structure, that structure can be mapped onto the unknown protein with a high degree of confidence. Protein sequences are more conserved than DNA sequences and are therefore of greater evolutionary significance.
While homology modeling predicts the positions of alpha carbons with moderate accuracy, it is not very reliable in predicting side chains and loops. The other approaches to structure prediction are threading and ab initio prediction.
Types of modeling:
- Basic Modeling - Modeling using a template with very high similarity with the target sequence.
- Advanced Modeling - In this case, the target is modeled using more than one template such that regions of the template proteins that share a high identity with portions of the target are used individually to model these sections.
- Iterative Modeling - This method generates models by using data gathered from previous target-template alignments to generate the probable range of values that can be considered significant for each criterion used in the modeling. This increases the accuracy of the model predicted.
The overall homology modeling procedure consists of six steps:
Step I - Template Selection
Template selection involves searching the Protein Data Bank (PDB) for homologous proteins with determined structures. The search can be performed using a heuristic pairwise alignment search program like BLAST or FASTA. As a rule of thumb, a database protein should have at least 40% sequence identity, high resolution and the most appropriate cofactors for it to be considered as a template sequence. The protein sequence whose 3D structure is to be predicted is called the "target sequence".
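The 40% rule of thumb can be sketched as a small filter over candidate templates. The example below is only an illustration: it assumes each candidate has already been aligned to the target (for instance, from the output of a BLAST search), the function names, template names and sequences are hypothetical, and identity is computed over aligned non-gap columns, which is just one of several conventions in use.

```python
def percent_identity(aligned_target, aligned_template):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # gap columns are not counted in this convention
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def select_templates(hits, threshold=40.0):
    """Keep hits at or above the identity threshold, best first.

    hits maps a template name to (aligned_target, aligned_template).
    """
    scored = [(name, percent_identity(t, s)) for name, (t, s) in hits.items()]
    kept = [(name, pid) for name, pid in scored if pid >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical hits for a 10-residue target
hits = {
    "tmplA": ("MKTAYIAKQR", "MKTAYLAKQR"),
    "tmplB": ("MKTAYIAKQR", "MRSG-LPEQR"),
}
print(select_templates(hits))
```

In practice resolution and bound cofactors would also be weighed, as noted above; this sketch ranks by identity alone.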
Step II - Sequence Alignment
Once the template is identified, the full-length sequences of the template and target proteins are realigned using more refined alignment algorithms to obtain an optimal alignment. The quality of the alignment is reported as an alignment score.
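One classical way to obtain an optimal global alignment together with its score is dynamic programming (the Needleman-Wunsch algorithm). The sketch below is a simplified illustration and not the method of any particular modeling package: it assumes a flat match/mismatch score and a linear gap penalty, whereas real tools typically use a substitution matrix and affine gap penalties. The scoring values are arbitrary.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Global alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + s,  # align the two residues
                score[i - 1][j] + gap,    # gap in seq_b
                score[i][j - 1] + gap,    # gap in seq_a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                out_a.append(seq_a[i - 1])
                out_b.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = needleman_wunsch("ACGT", "AGT")
print(score)
print(top)
print(bottom)
```

The gap characters in the two returned strings mark the insertions and deletions that later have to be handled by loop modeling.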
Step III - Backbone Model Building
Once an optimal alignment is achieved, the coordinates of the corresponding residues in the template protein can simply be copied onto the target protein. If the two aligned residues are identical, the coordinates of the side chain atoms are copied along with the main chain atoms.
If multiple templates are selected, the average coordinate values of the templates are used.
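The copy-and-average idea can be sketched as follows. This is a minimal illustration under several assumptions: only one atom per residue (for example the alpha carbon) is handled, all templates are assumed to be already superimposed in a common coordinate frame, and the function name and data layout are hypothetical.

```python
def build_backbone(aligned_target, aligned_templates, template_coords):
    """Give each target residue the mean coordinate of its aligned template residues.

    aligned_target    : target row of the alignment ('-' marks a gap)
    aligned_templates : list of template rows, same length as aligned_target
    template_coords   : one list per template of (x, y, z) tuples, one tuple
                        per ungapped template residue, in sequence order
    Returns one entry per target residue: a coordinate tuple, or None where
    no template covers the residue (such residues are left for loop modeling).
    """
    model = []
    next_idx = [0] * len(aligned_templates)  # next unread residue of each template
    for col, target_res in enumerate(aligned_target):
        column_coords = []
        for k, row in enumerate(aligned_templates):
            if row[col] != "-":
                if target_res != "-":
                    column_coords.append(template_coords[k][next_idx[k]])
                next_idx[k] += 1  # consume this template residue
        if target_res == "-":
            continue  # template-only column: nothing to place in the target
        if column_coords:
            n = len(column_coords)
            model.append(tuple(sum(c[d] for c in column_coords) / n for d in range(3)))
        else:
            model.append(None)
    return model


# Hypothetical alignment of a 3-residue target against two templates
target = "AC-G"
templates = ["A-TG", "ACTG"]
coords = [
    [(0, 0, 0), (9, 9, 9), (4, 0, 0)],
    [(2, 0, 0), (3, 0, 0), (9, 9, 9), (6, 0, 0)],
]
print(build_backbone(target, templates, coords))
```

With a single template the averaging step reduces to a plain copy of the template coordinates.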
Step IV - Loop Modeling
After the sequence alignment, there are often regions where insertions and deletions lead to gaps in the alignment. These gaps are filled by loop modeling, which is less accurate than the other steps and is a major source of error. Currently, two main techniques are used to approach the problem:
- The database searching method - this involves finding loops from known protein structures and superimposing them onto the two stem regions (main chains mostly) of the target protein. Some specialized programs like FREAD and CODA can be used.
- The ab initio method - this generates many random loops and searches for one that has reasonably low energy and φ and ψ angles in the allowable regions in the Ramachandran plot.
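Both techniques rely on the backbone torsion angles φ and ψ when judging a candidate loop. As an illustration, the sketch below computes a signed dihedral angle from four atom positions using plain Python. The helper names are hypothetical, and the decision about which (φ, ψ) pairs count as allowed is deliberately left out, since it depends on the Ramachandran regions adopted by the validation tool.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def backbone_torsions(c_prev, n, ca, c, n_next):
    """phi is the C(i-1)-N-CA-C dihedral, psi is the N-CA-C-N(i+1) dihedral."""
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)


# Simple geometric check: the outer bonds are perpendicular when viewed
# along the central bond
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

A loop builder would call `backbone_torsions` for every residue of each candidate loop and discard candidates whose angles fall outside the allowed regions.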
Step V - Side Chain Refinement
After the main chain atoms are built, the positions of side chains must be determined. This is important in evaluating protein–ligand interactions at active sites and protein–protein interactions at the contact interface.
A side chain can be built by searching every possible conformation for every torsion angle of the side chain and selecting the one that has the lowest interaction energy with neighboring atoms. Alternatively, a rotamer library can be used, which contains the favorable side chain torsion angles extracted from known protein crystal structures.
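The rotamer-based search can be sketched with a toy model. Everything in the example is an assumption made for illustration: the "library" holds only three staggered chi1 angles, the side chain is reduced to a single atom placed on a circle around the beta carbon, and the energy is a bare 1/d^12 repulsion. Real rotamer libraries and energy functions are far richer.

```python
import math

# Three staggered chi1 values in degrees; a toy stand-in for a rotamer library
TOY_ROTAMERS = (-60.0, 60.0, 180.0)


def place_atom(cb, chi_deg, bond_length=1.5):
    """Toy geometry: put one side chain atom on a circle around C-beta."""
    chi = math.radians(chi_deg)
    return (
        cb[0] + bond_length * math.cos(chi),
        cb[1] + bond_length * math.sin(chi),
        cb[2],
    )


def repulsion_energy(atom, neighbors):
    """Toy steric term: sum of 1/d^12 over the neighboring atoms."""
    total = 0.0
    for other in neighbors:
        d = max(math.dist(atom, other), 1e-6)  # guard against division by zero
        total += 1.0 / d ** 12
    return total


def best_rotamer(cb, neighbors, rotamers=TOY_ROTAMERS):
    """Return the rotamer angle with the lowest toy interaction energy."""
    return min(rotamers, key=lambda chi: repulsion_energy(place_atom(cb, chi), neighbors))


# Two hypothetical neighbors crowd the +60 and -60 positions
cb = (0.0, 0.0, 0.0)
neighbors = [(1.0, 1.5, 0.0), (1.0, -1.5, 0.0)]
print(best_rotamer(cb, neighbors))
```

Restricting the search to library angles is what makes this approach much cheaper than scanning every possible torsion value.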
Step VI - Model Refinement and Model Evaluation
This step carries out energy minimization on the entire model, adjusting the relative positions of the atoms so that the overall conformation of the molecule has the lowest possible potential energy. The goal of energy minimization is to relieve steric collisions without altering the overall structure. Potential energy calculations are likewise applied in the loop and side chain modeling steps to improve the model. Model refinement can also be done by molecular dynamics simulation, which moves the atoms toward a global minimum by applying various simulation conditions (heating, cooling, inclusion of water molecules) and thus has a better chance of finding the true structure.
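How minimization relieves a steric collision can be shown with a deliberately tiny example: two atoms interacting through a Lennard-Jones potential, relaxed by steepest descent. This is a sketch under stated assumptions (reduced units with epsilon = sigma = 1, a single distance variable, and an arbitrary step size with a cap on the move per iteration), not the force field or minimizer of any modeling program.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones potential energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative of the Lennard-Jones energy with respect to r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def minimize_distance(r0, step=0.01, max_move=0.05, iterations=500):
    """Steepest descent on the separation, with the move per step capped."""
    r = r0
    for _ in range(iterations):
        move = -step * lj_gradient(r)
        # Cap the displacement so a severe clash cannot fling the atoms apart
        move = max(-max_move, min(max_move, move))
        r += move
    return r


start = 0.9  # atoms overlapping: a steric clash in reduced units
relaxed = minimize_distance(start)
print(lj_energy(start), lj_energy(relaxed), relaxed)
```

The relaxed separation approaches the analytical minimum of the potential, 2^(1/6) times sigma. Steepest descent only finds the nearest local minimum, which is why molecular dynamics is mentioned above as a way to search more broadly.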
The final model has to be evaluated by checking the φ-ψ angles, chirality, bond lengths, close contacts and other stereochemical properties. Various online protein validation software packages are available, such as Procheck, WHATIF, ANOLEA, Verify3D and PROSA.
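Two of these checks, distances between bonded neighbors and close contacts, can be sketched on an alpha-carbon trace. The example is a rough illustration only: the 3.8 Å spacing is the typical distance between consecutive alpha carbons joined by a trans peptide bond, while the tolerance and the clash cutoff are hypothetical values chosen for the example. The validation packages listed above use full-atom geometry and their own criteria.

```python
import math


def check_ca_trace(ca_coords, ideal=3.8, tolerance=0.2, clash_cutoff=3.0):
    """Flag suspicious geometry in a list of C-alpha coordinates (angstroms).

    Returns (bad_bonds, clashes):
      bad_bonds : indices i where the distance between residues i and i+1
                  differs from `ideal` by more than `tolerance`
      clashes   : index pairs (i, j), j >= i + 2, closer than `clash_cutoff`
    """
    n = len(ca_coords)
    bad_bonds = [
        i for i in range(n - 1)
        if abs(math.dist(ca_coords[i], ca_coords[i + 1]) - ideal) > tolerance
    ]
    clashes = [
        (i, j)
        for i in range(n)
        for j in range(i + 2, n)
        if math.dist(ca_coords[i], ca_coords[j]) < clash_cutoff
    ]
    return bad_bonds, clashes


# Hypothetical traces: the first is well formed, the second folds back on itself
good = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (7.6, 3.8, 0)]
folded = [(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0), (0.5, 2.0, 0)]
print(check_ca_trace(good))
print(check_ca_trace(folded))
```

Residues flagged by such a check would be the natural candidates for another round of loop or side chain refinement.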
- Various comprehensive modeling programs are available, such as Modeller, SWISS-MODEL, Schrodinger and 3D-JIGSAW.
- A successful model depends on template selection, algorithm used and the validation of the model.
- Homology modeling can find the location of alpha carbons of key residues inside the folded protein.
- It can help to guide the mutagenesis experiments, or hypothesize structure-function relationships.
- The positions of conserved regions of the protein surface can help identify putative active sites, binding pockets and ligands.
- Homology models are unable to predict conformations of insertions or deletions, or side chain positions with a high level of accuracy.
- Homology models are generally not reliable enough for the modeling and ligand docking studies needed in drug design and development. They may, however, be helpful for this purpose if the sequence identity with the template is greater than 70%.