Let’s break up the protein sequence and model the full structure.
- April 10, 2018
- Posted by: rasa
- Category: Molecular Modelling & Simulation
The three-dimensional structure of a protein plays a central role in understanding the mechanism behind its function. Experimental methods for determining protein structure, such as X-ray crystallography, NMR and electron microscopy, provide more reliable structures than theoretical models, but they are very time-consuming and have their limits, as in the case of membrane proteins. In addition, only a small number of proteins can be made to form crystals, and a crystal is not the protein's native environment. This is why there is a huge gap between the number of known protein sequences and the number of experimentally determined structures in the databases. To bridge this gap, generating protein models computationally via comparative modeling is a handy and reliable way to predict a 3D structure, with an accuracy that can be comparable to that of a low-resolution experimentally determined structure.
Many online servers are available for automated protein modeling, such as SWISS-MODEL, Robetta and Phyre, which work on different principles of protein modeling: homology modeling, fold recognition (threading) and ab initio modeling.
Comparative modeling, or homology modeling, predicts the 3D structure of a given protein sequence (the target) based primarily on its alignment to one or more proteins of known structure (the templates). It consists of four main steps: fold assignment, target-template alignment, model building and model evaluation. Modeller is a software package for comparative modeling by satisfaction of spatial restraints. Depending on the sequence identity between the target and the template, we can choose whether basic or advanced modeling is more suitable for the protein.
For example, if the target sequence has high identity with the template, very few gaps and good coverage, we can use basic modeling with the single best template from the list. If instead we have a list of templates with minimal gaps and only average identity, so that choosing one best template is difficult, we select multiple templates that have the same coverage and belong to the same family, and proceed via advanced modeling.
Sometimes a problem occurs when we get a sequence whose templates look something like this:
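To make the basic-versus-advanced decision above concrete, here is a minimal Python sketch. It computes identity, coverage and gap fraction from a gapped pairwise alignment and then picks a strategy. The function names and the cut-off values (50% identity, 90% coverage, 5% gaps) are illustrative assumptions, not values from Modeller or from our own scripts, and should be tuned for each project.

```python
def alignment_stats(target_aln: str, template_aln: str) -> dict:
    """Identity, coverage and gap fraction for two gapped rows of equal length."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned rows must have the same length")
    target_len = sum(1 for c in target_aln if c != "-")
    aligned = identical = gaps = 0
    for t, s in zip(target_aln, template_aln):
        if t == "-" and s == "-":
            continue  # column is empty in both rows
        if t == "-" or s == "-":
            gaps += 1  # gap in exactly one of the two rows
            continue
        aligned += 1
        if t.upper() == s.upper():
            identical += 1
    return {
        "identity": identical / aligned if aligned else 0.0,
        "coverage": aligned / target_len if target_len else 0.0,
        "gap_fraction": gaps / len(target_aln) if target_aln else 0.0,
    }


def choose_strategy(hits: dict, min_identity=0.5, min_coverage=0.9,
                    max_gap_fraction=0.05):
    """hits maps template name -> alignment_stats() result.

    The thresholds are assumptions for illustration only.
    Returns ("basic", [best_template]) or ("advanced", [all templates, ranked]).
    """
    if not hits:
        raise ValueError("no template hits supplied")
    ranked = sorted(hits, key=lambda name: hits[name]["identity"], reverse=True)
    best = hits[ranked[0]]
    if (best["identity"] >= min_identity
            and best["coverage"] >= min_coverage
            and best["gap_fraction"] <= max_gap_fraction):
        return "basic", [ranked[0]]
    return "advanced", ranked
```

In practice the aligned rows would come from whichever alignment tool was used in the target-template alignment step; the sketch only assumes that gaps are written as "-".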
Seq (target) : aaaaaabbbbcccccaaaaccccccc
Template1 : aaaaa……………………………….
Template2 : ………..bbbbb……………………
Template3 : ………………………..ccccccccc
In such cases, neither the basic nor the advanced modeling scripts are sufficient. So we, here at RASA, experimented with the scripts and developed a new technique to handle such sequences. The idea is to split the target sequence into sections and look for non-overlapping templates for these sections, which are then used for the modeling.
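The splitting idea can be illustrated with a short Python sketch. It is not our actual script: the `Hit` type, the function names and the hit coordinates are hypothetical, and the greedy selection (keep the hit that ends first, skip anything overlapping it) is just one simple way to obtain non-overlapping templates. Positions are assumed to be 1-based and inclusive on the target.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    template: str
    start: int  # first target residue covered (1-based, inclusive)
    end: int    # last target residue covered (inclusive)


def pick_non_overlapping(hits: List[Hit]) -> List[Hit]:
    """Greedy selection: walk the hits by end position, keep those that
    start after the previously kept hit has ended."""
    chosen: List[Hit] = []
    for hit in sorted(hits, key=lambda h: (h.end, h.start)):
        if not chosen or hit.start > chosen[-1].end:
            chosen.append(hit)
    return chosen


def split_target(target: str, hits: List[Hit]):
    """Chop the target into one section per selected template.

    Returns (segments, uncovered):
      segments  -> list of (template, start, end, subsequence)
      uncovered -> list of (start, end) stretches with no template
    """
    segments: List[Tuple[str, int, int, str]] = []
    uncovered: List[Tuple[int, int]] = []
    cursor = 1  # next target position not yet assigned
    for hit in pick_non_overlapping(hits):
        if hit.start < 1 or hit.end > len(target) or hit.start > hit.end:
            raise ValueError(f"hit out of range: {hit}")
        if hit.start > cursor:
            uncovered.append((cursor, hit.start - 1))
        segments.append((hit.template, hit.start, hit.end,
                         target[hit.start - 1:hit.end]))
        cursor = hit.end + 1
    if cursor <= len(target):
        uncovered.append((cursor, len(target)))
    return segments, uncovered


if __name__ == "__main__":
    target = "aaaaaabbbbcccccaaaaccccccc"
    hits = [Hit("Template1", 1, 6), Hit("Template2", 7, 10),
            Hit("Template3", 20, 26)]
    for segment in split_target(target, hits)[0]:
        print(segment)
```

Any stretch reported as uncovered has no template in this scheme and would need to be handled separately.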
The last and most important step in protein modelling is to evaluate the model, which can be done using tools such as gnuplot and the Ramachandran (RC) plot. A good protein model is one which:
- makes chemical sense: normal bond lengths and angles, correct chirality, flat aromatic rings and flat sp2-hybridised carbons;
- makes physical sense: no clashing non-bonded atoms, favourable crystal packing, related atoms displaying similar thermal disorder, and occupancies of alternative conformations that add up to one;
- makes statistical sense: the model is the best hypothesis to explain the available experimental data;
- makes structural sense: a reasonable Ramachandran plot, not too many unusual side-chain conformations or buried charges, and residues that are happy in their environment.
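A Ramachandran plot is simply a scatter of the backbone dihedral angles phi and psi for every residue. As a rough sketch of how these angles can be obtained from atomic coordinates, the pure-Python code below computes a signed dihedral angle and applies it to the backbone atoms: phi uses C(i-1), N, CA, C and psi uses N, CA, C, N(i+1). The data layout (a list of dicts with "N", "CA" and "C" coordinates) is an assumption made for this example; real evaluation tools read the coordinates from a PDB file.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3) -> float:
    """Signed dihedral angle in degrees, in the range -180..180,
    defined by four points (rotation about the p1-p2 bond)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi_psi(residues, i):
    """phi and psi of residue i; residues is a list of dicts holding
    'N', 'CA' and 'C' coordinates (assumed layout for this sketch).
    Only defined for residues that have both neighbours."""
    if i <= 0 or i >= len(residues) - 1:
        raise IndexError("phi/psi need a previous and a next residue")
    prev, cur, nxt = residues[i - 1], residues[i], residues[i + 1]
    phi = dihedral(prev["C"], cur["N"], cur["CA"], cur["C"])
    psi = dihedral(cur["N"], cur["CA"], cur["C"], nxt["N"])
    return phi, psi
```

Collecting the (phi, psi) pairs for all residues and plotting them gives the Ramachandran plot; which regions count as allowed is left to the validation tool of choice.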
1). Xiang, Z. (2006). Advances in homology protein structure modeling. Current Protein and Peptide Science, 7(3), 217-227.
2). Ganugapati, J., Madhurapanthula, R. S., & Sai, K. S. (2013). In Silico Modeling of Human Toll like Receptor 1. International Journal of Computer Applications, 61(8).