
Scientific Report 2005


Molecular Biology




Computational Structural Proteomics and Ligand Discovery

R. Abagyan, J. An, A. Cheltsov, A. Bordner,* C. Cavasotto,* J. Kovacs, J. Fernandez-Recio,** M. Totrov,* X. Zhang,*** M. Dawson,*** A. McCluskey,**** B. Marsden*****

* Molsoft L.L.C., La Jolla, California
** Institut de Recerca Biomèdica, Barcelona, Spain
*** Burnham Institute, La Jolla, California
**** University of Newcastle, Callaghan, Australia
***** Structural Genomics Consortium, Oxford, England

Every day about 15 new crystal structures are deposited in the Protein Data Bank. The 30,000 molecular structures in the bank contain rich information about protein function and provide a unique opportunity for rational search for or design of small molecules that can be used as therapeutic agents. We use computational structural proteomics, bioinformatics, molecular mechanics, and cheminformatics to characterize the function of proteins and to design molecular structures.

Traditionally, we have focused on accurate docking and screening of small molecules and have used internal coordinate mechanics to predict protein association. In 2004, we focused on improving the information content of evolutionary sequence conservation; predicting and classifying ligand-binding pockets and protein-protein interfaces; improving sequence structure alignments for models by homology; and predicting effects of single-point mutations, loop conformations, and protein association geometry. We also improved protocols for predicting receptor flexibility in ligand docking and applied virtual screening to discover inhibitors of important biomedical targets.

Bioinformatics and Prediction of Protein Function

Functional characterization of tens of thousands of proteins is a key computational task. To build 3-dimensional models of structurally uncharacterized protein sequences, we developed a procedure to accurately align those sequences to their Protein Data Bank templates in the areas of weak alignment. The Structural Alignment Database of 1927 alignments was then used to develop improved alignment/threading parameters.

Every molecular biologist is confronted with the task of discovering and annotating the functions of a protein of interest. A strong evolutionary conservation measure in the context of a 3-dimensional model is a powerful source of functional information. However, the measures currently in use depend strongly on the sequence composition biases of alignments. We developed a mathematical formalism that gives a powerful measure of sequence conservation that does not depend on overrepresentation or underrepresentation of certain branches in the alignment. We also used this measure in an improved method to predict novel patches of protein-protein interactions on protein surfaces.
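The composition-bias problem described above can be illustrated with a minimal sketch. It is not the formalism developed in this work; it assumes a swapped-in, widely used technique (position-based sequence weights of the Henikoff type, combined with a weighted Shannon entropy), and the function names are hypothetical. The point is only that sequences from an overrepresented branch are down-weighted before conservation is measured.

```python
import math
from collections import Counter

def henikoff_weights(alignment):
    """Position-based weights: sequences from crowded branches get less weight.

    `alignment` is a list of equal-length strings (gaps are treated as an
    ordinary symbol in this simplified sketch).
    """
    weights = [0.0] * len(alignment)
    for column in zip(*alignment):
        counts = Counter(column)
        n_types = len(counts)
        for i, residue in enumerate(column):
            weights[i] += 1.0 / (n_types * counts[residue])
    total = sum(weights)
    return [w / total for w in weights]

def weighted_conservation(alignment):
    """Per-column conservation in [0, 1]: 1 minus the weighted entropy (base 20)."""
    weights = henikoff_weights(alignment)
    scores = []
    for column in zip(*alignment):
        freq = Counter()
        for w, residue in zip(weights, column):
            freq[residue] += w
        entropy = -sum(p * math.log(p, 20) for p in freq.values() if p > 0)
        scores.append(1.0 - entropy)
    return scores

# Three near-identical sequences (one crowded branch) and one divergent sequence.
toy_alignment = ["ACD", "ACD", "ACD", "AGE"]
weights = henikoff_weights(toy_alignment)
scores = weighted_conservation(toy_alignment)
```

In the toy alignment, the single divergent sequence receives a larger weight than each of the three identical ones, so the second and third columns are scored as less conserved than a simple residue count would suggest.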

Specific association of proteins is a key biological mechanism. However, for most proteins or domains with known 3-dimensional structure, accurately predicting the interfaces and residues involved in an interaction, often with an unknown protein partner, remains a great challenge. The preference for any particular interface is subtle because the same surface is also stable when exposed to water. We approached that problem by using more meaningful surface properties and more sophisticated numerical methods. Using the optimal docking area method, we showed that with optimized desolvation parameters and an adaptive algorithm for finding the optimal interaction patch, the desolvation signal alone, without any other signals, can be strong enough to indicate the interaction site. In other studies, we combined the desolvation signal with the improved sequence conservation signal and applied the method successfully to a benchmark of 1496 interfaces.
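The idea of adaptively searching for the patch with the best desolvation signal can be sketched as follows. This is a simplified illustration, not the published optimal docking area implementation: the per-point desolvation energies, the patch radii, and the function name are all assumptions made for the example.

```python
import math

def optimal_patch_energy(center, surface_points, radii=(5.0, 10.0, 15.0)):
    """Return (best_energy, best_radius) over circular patches around `center`.

    `surface_points` is a list of ((x, y, z), desolvation_energy) pairs.
    For each trial radius, the energies of all points inside the patch are
    summed; the radius giving the lowest (most favorable) sum is kept.
    """
    best_energy, best_radius = None, None
    for radius in radii:
        energy = sum(e for xyz, e in surface_points
                     if math.dist(center, xyz) <= radius)
        if best_energy is None or energy < best_energy:
            best_energy, best_radius = energy, radius
    return best_energy, best_radius

# Hypothetical surface points with made-up desolvation energies.
toy_surface = [
    ((0.0, 0.0, 0.0), -1.0),
    ((4.0, 0.0, 0.0), -2.0),
    ((9.0, 0.0, 0.0), -1.5),
    ((14.0, 0.0, 0.0), 3.0),
]
best = optimal_patch_energy((0.0, 0.0, 0.0), toy_surface)
```

Repeating the search with every surface point as a center gives each point its own optimal patch energy, and points with strongly favorable values can then be flagged as candidate interface residues.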

Predicting Protein Structure and Association

Predicting partial protein structure or molecular association is a critical task in computational biology and chemistry. This past year we proposed a method to predict both geometry and stabilization energy for single mutations, improved protocols for predicting protein loops, and developed a method to predict large-scale protein movements by using simplified protein models represented in internal coordinates.

If both partners of a protein complex are known and their “uncomplexed” 3-dimensional models exist or can be built, attempts can be made to predict the association geometry (also called protein docking). In 2004, we used the internal coordinate mechanics docking method successfully in the Critical Assessment of Prediction of Interactions competition, partially because of the improved docking energetics. Although in the first round we predicted only 3 of 7 complexes, in the second and the third rounds, we were correct in 8 of 9 tasks. We are working on further improvements of the method.

The Cell Pocketome

Proteins also bind small molecules: their natural substrates or cofactors, or specially designed therapeutic agents. Many orphan receptors and uncharacterized surfaces exist. This past year, we further optimized a pocket prediction algorithm and used it successfully on as many as 17,000 pockets from the Protein Data Bank. In this algorithm, a mathematical transformation of the Lennard-Jones potential is used to generate a potential that, contoured at a certain level, specifically locates the potential binding sites with a rather low level of false positives and false negatives (Fig. 1).

Fig. 1. Several representatives of a predicted cell pocketome.
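The contouring step described above can be sketched in a few lines. The example assumes a plain 12-6 Lennard-Jones probe energy, a simple neighborhood average standing in for the mathematical transformation, and arbitrary parameter values; none of these details are taken from the actual algorithm, and all function names are hypothetical.

```python
import math

def lj_energy(point, atoms, sigma=3.5, epsilon=1.0, cap=10.0):
    """Sum of capped 12-6 Lennard-Jones energies between a probe and all atoms."""
    total = 0.0
    for atom in atoms:
        r = math.dist(point, atom)
        if r < 1e-6:
            total += cap
            continue
        ratio6 = (sigma / r) ** 6
        total += min(4.0 * epsilon * (ratio6 * ratio6 - ratio6), cap)
    return total

def smoothed_field(grid, atoms, smooth_radius=1.5):
    """Average the raw energy over grid points within `smooth_radius` (assumed transform)."""
    raw = {p: lj_energy(p, atoms) for p in grid}
    field = {}
    for p in grid:
        near = [raw[q] for q in grid if math.dist(p, q) <= smooth_radius]
        field[p] = sum(near) / len(near)
    return field

def pocket_points(grid, atoms, level=-2.0):
    """Keep grid points whose smoothed energy lies at or below the contour level."""
    field = smoothed_field(grid, atoms)
    return [p for p in grid if field[p] <= level]

# Six atoms surrounding the origin at the Lennard-Jones minimum distance.
d = 2.0 ** (1.0 / 6.0) * 3.5
toy_atoms = [(d, 0, 0), (-d, 0, 0), (0, d, 0), (0, -d, 0), (0, 0, d), (0, 0, -d)]
toy_grid = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
found = pocket_points(toy_grid, toy_atoms)
```

A probe placed at the enclosed origin accumulates favorable contacts from all six atoms and falls below the contour level, whereas the distant probe does not; buried, well-packed cavities are selected in the same way on a full grid.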

Using this algorithm, we predicted as many as 96.8% of experimental binding sites at an overlap level of better than 50%. Furthermore, 95% of the predicted sites from the apo receptors were predicted at the same level. We showed that conformational differences between the apo and bound pockets do not dramatically affect the prediction results. The algorithm can be used to predict ligand-binding pockets of uncharacterized protein structures, suggest new allosteric pockets, evaluate the feasibility of inhibition of protein-protein interactions, and prioritize molecular targets. Finally, we collected and classified data for the human cell pocketome, a database of the known and the predicted binding pockets for the human proteome structures.
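The success criterion quoted above, an overlap of better than 50%, can be made concrete with a small sketch. The overlap definition used here (the fraction of ligand atoms lying within a cutoff distance of any predicted envelope point) and the cutoff value are assumptions for illustration; the report does not specify how overlap was measured.

```python
import math

def overlap_fraction(ligand_atoms, envelope_points, cutoff=2.0):
    """Fraction of ligand atoms within `cutoff` of any predicted envelope point."""
    if not ligand_atoms:
        return 0.0
    covered = sum(
        1 for atom in ligand_atoms
        if any(math.dist(atom, p) <= cutoff for p in envelope_points)
    )
    return covered / len(ligand_atoms)

def fraction_recovered(cases, min_overlap=0.5):
    """Share of (ligand, envelope) cases whose overlap is better than `min_overlap`."""
    hits = sum(1 for ligand, envelope in cases
               if overlap_fraction(ligand, envelope) > min_overlap)
    return hits / len(cases)

# Toy data: one ligand, one good envelope, one envelope far from the ligand.
toy_ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
good_envelope = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
bad_envelope = [(50.0, 0.0, 0.0)]
rate = fraction_recovered([(toy_ligand, good_envelope), (toy_ligand, bad_envelope)])
```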

The pocketome can be used for rapid evaluation of possible binding partners of a given chemical compound. We are using the predicted pockets to develop therapeutic molecules that target unexpected binding pockets. Our first result in using such a strategy was obtained in collaboration with D.A. Lomas, University of Cambridge, Cambridge, England; we identified the first small molecules that block the polymerization of the Z mutant of α1-antitrypsin.

Compound Docking and Virtual Ligand Screening

Small-molecule inhibitors or activators can be discovered rationally by carefully docking them to a target pocket and scoring the result according to the pose and interactions of the small molecule. The virtual screen can be performed against millions of available chemicals or against virtual chemically feasible molecules, and only several dozen computationally selected candidates need to be tested experimentally. We developed and improved different aspects of this strategy and applied it to different drug discovery projects. The docking technology can also help in understanding the structural mechanisms of the actions of small molecules and can be used to rationally design better molecules. Recently, we used the technology to explain the antagonistic effect of an important class of retinoid X receptor antagonists.
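The selection step of a virtual screen, in which a large library is reduced to a short list for experimental testing, can be sketched as below. The scoring function is a placeholder: in practice it would dock each compound and score the resulting pose, whereas here the scores are made-up numbers and the compound names are hypothetical. Lower scores are assumed to be better.

```python
import heapq

def select_candidates(library, score_fn, n_keep=3):
    """Return the `n_keep` best (score, compound) pairs, lowest score first.

    The generator avoids holding every score in memory, which matters when
    the library contains millions of compounds.
    """
    scored = ((score_fn(compound), compound) for compound in library)
    return heapq.nsmallest(n_keep, scored)

# Placeholder docking scores for a tiny hypothetical library.
toy_scores = {"cmpd_a": -7.2, "cmpd_b": -9.1, "cmpd_c": -3.4, "cmpd_d": -8.0}
top_two = select_candidates(toy_scores, toy_scores.get, n_keep=2)
```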

A major problem in small-molecule docking and screening is protein flexibility and conformational rearrangements of the binding pocket upon ligand binding. This past year we presented several scenarios for incorporating protein flexibility into docking calculations. In some instances, these protocols can be used to simultaneously predict the ligand-binding pose and the pocket rearrangements.
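One simple scenario of this kind is to generate alternative receptor conformations by displacing the pocket atoms along a collective mode and to keep the conformation that scores best with the ligand. The sketch below illustrates only that general idea. The mode vector, the amplitudes, and the scoring function are hypothetical stand-ins, not the protocols used in this work.

```python
import math

def displace(coords, mode, amplitude):
    """Move every atom along its component of the mode, scaled by `amplitude`."""
    return [
        tuple(c + amplitude * m for c, m in zip(atom, vec))
        for atom, vec in zip(coords, mode)
    ]

def best_conformer(coords, mode, amplitudes, score_fn):
    """Score each displaced conformation; return (best_score, amplitude)."""
    scored = [(score_fn(displace(coords, mode, a)), a) for a in amplitudes]
    return min(scored)

# Two pocket-lining atoms and a mode that opens the gap between them.
pocket = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
opening_mode = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

def toy_score(conformation):
    """Placeholder score: penalize deviation from a 6-unit gap the ligand needs."""
    return abs(math.dist(conformation[0], conformation[1]) - 6.0)

result = best_conformer(pocket, opening_mode, (0.0, 0.5, 1.0, 1.5), toy_score)
```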

Publications

Abagyan, R. Problems in computational structural proteomics. In: Structural Proteomics. Sundstrom, M., Norin, M., Edwards, A. (Eds.). CRC Press, Boca Raton, FL, in press.

An, J., Totrov, M., Abagyan, R. Comprehensive identification of “druggable” protein ligand binding sites. Genome Inform. Ser. Workshop Genome Inform. 15:31, 2004.

An, J., Totrov, M., Abagyan, R. Pocketome via comprehensive identification and classification of ligand binding envelopes. Mol. Cell. Proteomics 4:752, 2005.

Bordner, A.J., Abagyan, R. REVCOM: a robust Bayesian method for evolutionary rate estimation. Bioinformatics 21:2315, 2005.

Bordner, A.J., Abagyan, R. Statistical analysis and prediction of protein-protein interfaces. Proteins 60:353, 2005.

Bordner, A.J., Abagyan, R.A. Large-scale prediction of protein geometry and stability changes for arbitrary single point mutations. Proteins 57:400, 2004.

Cavasotto, C.N., Kovacs, J.A., Abagyan, R.A. Representing receptor flexibility in ligand docking through relevant normal modes. J. Am. Chem. Soc. 127:9632, 2005.

Cavasotto, C.N., Liu, G., James, S.Y., Hobbs, P.D., Peterson, V.J., Bhattacharya, A.A., Kolluri, S.K., Zhang, X.K., Leid, M., Abagyan, R., Liddington, R.C., Dawson, M.I. Determinants of retinoid X receptor transcriptional antagonism. J. Med. Chem. 47:4360, 2004.

Cavasotto, C.N., Orry, A.J.W., Abagyan, R.A. The challenge of considering receptor flexibility in ligand docking and virtual screening. Curr. Comput. Aided Drug Des., in press.

Cavasotto, C.N., Orry, A.J.W., Abagyan, R. Receptor flexibility in ligand docking. In: Handbook of Theoretical and Computational Nanotechnology. Rieth, M., Schommers, W. (Eds.). American Scientific Publishers, Stevenson Ranch, CA, in press.

Fernandez-Recio, J., Abagyan, R., Totrov, M. Improving CAPRI predictions: optimized desolvation for rigid-body docking. Proteins 60:308, 2005.

Fernandez-Recio, J., Totrov, M., Skorodumov, C., Abagyan, R. Optimal docking area: a new method for predicting protein-protein interaction sites. Proteins 58:134, 2005.

Hill, T.A., Odell, L.R., Quan, A., Abagyan, R., Ferguson, G., Robinson, P.J., McCluskey, A. Long chain amines and long chain ammonium salts as novel inhibitors of dynamin GTPase activity. Bioorg. Med. Chem. Lett. 14:3275, 2004.

Kovacs, J.A., Cavasotto, C.N., Abagyan, R.A. Conformational sampling of protein flexibility in generalized coordinates: application to ligand docking. J. Comput. Theor. Nanosci., in press.

Marsden, B., Abagyan, R. SAD—a normalized structural alignment database: improving sequence-structure alignments. Bioinformatics 20:2333, 2004.

 

Ruben Abagyan, Ph.D.

Professor


