ABSTRACT
The focus on important diseases of our time has prompted many experimental labs to resolve and deposit functional structures of disease-causing or disease-participating proteins. As a result, structural databases now hold many functional structures of wildtype and disease-involved variants of a protein. The objective for computational approaches is to use this information to discover features of the underlying energy landscape on which the functional structures reside. Important questions, such as which subset of structures is most thermodynamically stable, remain unanswered. The challenge is how to transform an essentially discrete problem into one where continuous optimization is suitable and effective. In this paper, we present such a transformation, which allows evolution strategies to be adapted and applied to explore an underlying continuous variable space and locate the global optimum of a multimodal fitness landscape. The paper presents results on wildtype and mutant sequences of proteins implicated in human disorders, such as cancer and amyotrophic lateral sclerosis. More generally, the paper offers a methodology for transforming a discrete problem into a continuous optimization problem, which may help address outstanding discrete problems in the evolutionary computation community.
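To make the idea of an evolution strategy searching a continuous, multimodal landscape concrete, here is a minimal sketch. It is not the method of the paper: the abstract does not specify the discrete-to-continuous transformation, the energy function, or the strategy variant, so the sketch assumes a plain elitist (mu + lambda) strategy with log-normal step-size self-adaptation and uses the Rastrigin test function as a stand-in fitness landscape. The names `rastrigin` and `evolve`, and all parameter values, are illustrative assumptions.

```python
import math
import random


def rastrigin(x):
    """Stand-in multimodal fitness (assumption); global minimum 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def evolve(fitness, dim, mu=5, lam=20, sigma=1.0, generations=200, seed=0):
    """Elitist (mu + lambda) evolution strategy with self-adapted step sizes.

    Returns (best_point, best_fitness, history), where history holds the
    best fitness seen after initialization and after each generation.
    """
    rng = random.Random(seed)
    tau = 1.0 / math.sqrt(2.0 * dim)  # learning rate for step-size mutation

    # Each individual is a tuple (fitness, point, step_size).
    parents = []
    for _ in range(mu):
        x = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        parents.append((fitness(x), x, sigma))
    parents.sort(key=lambda ind: ind[0])
    history = [parents[0][0]]

    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            _, x, s = rng.choice(parents)
            # Mutate the step size first, then use it to perturb the point.
            s_new = s * math.exp(tau * rng.gauss(0.0, 1.0))
            y = [xi + s_new * rng.gauss(0.0, 1.0) for xi in x]
            offspring.append((fitness(y), y, s_new))
        # Plus selection: keep the best mu among parents and offspring.
        parents = sorted(parents + offspring, key=lambda ind: ind[0])[:mu]
        history.append(parents[0][0])

    best_f, best_x, _ = parents[0]
    return best_x, best_f, history


if __name__ == "__main__":
    point, value, _ = evolve(rastrigin, dim=2)
    print("best point:", point)
    print("best fitness:", value)
```

Because selection is elitist, the best fitness never gets worse from one generation to the next, but nothing guarantees that the global optimum is reached on a multimodal landscape. Restarts or covariance adaptation are common ways to improve on this baseline.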
Index Terms
- Evolution Strategies for Exploring Protein Energy Landscapes