skip to main content
research-article

Automated reaction mapping

Published: 23 February 2009 Publication History

Abstract

Automated reaction mapping is a fundamental first step in the analysis of chemical reactions and opens the door to the development of sophisticated chemical kinetic tools. This article formulates the reaction mapping problem as an optimization problem. The problem is shown to be NP-Complete for general graphs. Five algorithms based on canonical graph naming and enumerative combinatoric techniques are developed to solve the problem. Unlike previous formulations based on limited configurations or classifications, our algorithms are uniquely capable of mapping any reaction that can be represented as a set of chemical graphs optimally. This is due to the direct use of Graph Isomorphism as the basis for these algorithms as opposed to the more commonly used Maximum Common Subgraph. Experimental results on chemical and biological reaction databases demonstrate the efficiency of our algorithms.

References

[1]
Aho, A., Hopcroft, J., and Ullman, J. 1974. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, MA.
[2]
Akutsu, T. 2004. Efficient extraction of mapping rules of atoms from enzymatic reaction data. Journal of Computational Biology 11, 449--462.
[3]
Apostolakis, J., Sacher, O., Korner, R., and Gasteiger, J. 2008. Automatic determination of reaction mappings and reaction center information. 2. validation on a biochemical reaction database. Journal of Chemical Information and Modeling 48, 6, 1190--1198.
[4]
Babai, L. and Luks, E. 1983. Canonical labeling of graphs. In Proceedings of the 15th Annual ACM Symposium on Theory of Computing (STOC'83). ACM Press, New York, 171--183.
[5]
Balaz, V., Kvasnicka, V., and Pospichal, J. 1992. Two metrics in a graph theory modeling of organic chemistry. Discrete Applied Mathematics 35, 1, 1--19.
[6]
Blower, P. E. and Dana, R. C. 1986. Creation of a chemical reaction database from the primary literature. In Modern Approaches to Chemical Reaction Searching, P. Willett, Ed. Gower, 146--164.
[7]
Chen, W., Huang, J., and Gilson, M. K. 2004. Identification of symmetries in molecules and complexes. Journal of Chemical Information and Modeling 44, 4, 1301--1313.
[8]
Cieplak, T. and Wisniewski, J. 2001. A new effective algorithm for the unambiguous identification of the stereochemical characteristics of compounds during their registration in databases. Molecules 6, 915--926.
[9]
Conte, D., Foggia, P., Sansone, C., and Vento, M. 2004. Thirty years of graph matching in pattern recognition. International Journal of Pattern Recognition and Artificial Intelligence, 18, 3, 265--298.
[10]
Cook, S. 1971. The complexity of theorem-proving procedures. In Proceedings of the 3rd Annual ACM Symposium on Theory of Computing (STOC'71). ACM Press, New York, 151--158.
[11]
Curran, H. J., Gaffuri, P., Pitz, W. J., and Westbrook, C. K. 1998. A comprehensive modeling study of n-heptane oxidation. Combustion and Flame 114, 1-2 (July), 149--177.
[12]
Faulon, J.-L. 1998. Isomorphism, automorphism partitioning, and canonical labeling can be solved in polynomial-time for molecular graphs. Journal of Chemical Information and Modeling 38, 3, 432--444.
[13]
Faulon, J.-L. 2007a. http://www.cs.sandia.gov/~jfaulon/QSAR/downloads.html.
[14]
Faulon, J.-L. 2007b. Private email conversation with regarding time complexity of author's algorithms.
[15]
Faulon, J.-L., Collins, M. J., and Carr, R. D. 2004. The signature molecular descriptor. 4. canonizing molecules using extended valence sequences. Journal of Chemical Information and Modeling 44, 2, 427--436.
[16]
Felix, L. and Valiente, G. 2005. Efficient validation of metabolic pathway databases. In Proc. 6th Int. Symp. Computational Biology and Genome Informatics. 1209--1212.
[17]
Garey, M. R. and Johnson, D. S. 1990. Computers and Intractability; A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York.
[18]
Gasteiger, J. and Engel, T., Eds. 2003. Chemoinformatics: A Textbook. Wiley.
[19]
Goto, S., Nishioka, T., and Kanehisa, M. 1998. Ligand: chemical database for enzyme reactions. Bioinformatics 14, 7, 591--599.
[20]
Hoffmann, C. M. 1982. Group-Theoretic Algorithms and Graph Isomorphism. Springer-Verlag Berlin and Heidelberg GmbH and Co. KG.
[21]
Hopcroft, J. E. and Wong, J. K. 1974. Linear time algorithm for isomorphism of planar graphs (preliminary report). In Proceedings of the 6th Annual ACM Symposium on Theory of Computing (STOC'74). ACM Press, New York, 172--184.
[22]
Institute, G. R. 2006. Gri-mech 3.0. http://www.me.berkeley.edu/gri-mech/.
[23]
Jochum, C., Gasteiger, J., and Ugi, I. 1980. The principle of minimal chemical distance (pmcd). Angew. Chem., Int. Ed. Engl. 19, 495--505.
[24]
Jochum, C., Gasteiger, J., Ugi, I., and Dugundji, J. 1982. The principle of minimal chemical distance and the principle of minimum structure change. Zeitschrift fuer Naturforschung 37B, 9, 1205--1215.
[25]
Knuth, D. E. 2005. The Art of Computer Programming, Volume 4, Fascicle 3: Generating All Combinations and Partitions. Addison-Wesley Professional.
[26]
Korner, R. and Apostolakis, J. 2008. Automatic determination of reaction mappings and reaction center information. 1. the imaginary transition state energy approach. Journal of Chemical Information and Modeling 48, 6, 1181--1189.
[27]
Kvasnicka, V. 1984. Mathematical model of organic history. iv. classification scheme of chemical reactions. Collection of Czechoslovak Chemical Communications 49, 5, 1090--1097.
[28]
Kvasnicka, V., Kratochvil, M., and Koca, J. 1983. Mathematical model of organic history. iii. reactions graphs. Collection of Czechoslovak Chemical Communications 48, 8, 2284--2304.
[29]
Kvasnicka, V. and Pospichal, J. 1991. Chemical and reaction metrics for graph-theoretical model of organic chemistry. THEOCHEM 73, 17--42.
[30]
Luks, E. 1982. Isomorphism of graphs of bounded valence can be tested in polynomial time. Journal of Computer and System Sciences 25, 1 (August).
[31]
Lynch, M. F. and Willett, P. 1978. The automatic detection of chemical reaction sites. J. Chem. Inf. Compu. Sci. 18, 3, 154--159.
[32]
McGregor, J. J. 1982. Backtrack search algorithms and the maximal common subgraph problem. Software Practice and Experience 12, 12, 23--34.
[33]
McGregor, J. J. and Willett, P. 1981. Use of a maximal common subgraph algorithm in the automatic identification of the ostensible bond changes occurring in chemical reactions. J. Chem. Inf. Compu. Sci. 21, 3, 137--140.
[34]
McKay, B. 1981. Practical graph isomorphism. Congressus Numerantium 30, 45--87.
[35]
McKay, B. 2004. No automorphisms, yes? http://cs.anu.edu.au/~bdm/nauty/.
[36]
Miyazaki, T. 1997. The complexity of McKay's canonical labeling algorithm. Groups and Computation II, DIMACS Ser. Discrete Math. Theoret. Comput. Sci. 28, 239--256.
[37]
Moock, T. E., Nourse, J. G., Grier, D., and Hounshell, W. D. 1988. The implementation of atom-atom mapping and related features in the reaction access system (reaccs). In Chemical structures, W. A. Warr, Ed. Springer Verlag, 303--313.
[38]
Morgan, H. L. 1965. The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. J. Chem. Doc. 5, 2, 107--113.
[39]
Pemmaraju, S. and Skiena, S. 2003. Computational Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Cambridge University Press, New York.
[40]
Raymond, J. W., Gardiner, E. J., and Willett, P. 2002. Rascal: Calculation of graph similarity using maximum common edge subgraphs. The Computer Journal 45, 6, 631--644.
[41]
Raymond, J. W. and Willett, P. 2002. Maximum common subgraph isomorphism algorithms for the matching of chemical structures. Journal of Computer-Aided Molecular Design 16, 7 (July), 521--533.
[42]
Ugi, I., Bauer, J., Blomberger, C., Brandt, J., Dietz, A., Fontain, E., Gruber, B., Scholley-Pfab, A., Senff, A., and Stein, N. 1994. Models, concepts, theories, and formal languages in chemistry and their use as a basis for computer assistance in chemistry. J. Chem. Inf. Compu. Sci. 34, 1, 3--16.
[43]
Vleduts, G. E. 1963. Concerning one system of classification and codification of organic reactions. Information Storage and Retrieval, 117--146.

Cited By

View all
  • (2024)Partial Imaginary Transition State (ITS) Graphs: A Formal Framework for Research and Analysis of Atom-to-Atom Maps of Unbalanced Chemical Reactions and Their CompletionsSymmetry10.3390/sym1609121716:9(1217)Online publication date: 16-Sep-2024
  • (2024)Benchmarking machine-readable vectors of chemical reactions on computed activation barriersDigital Discovery10.1039/D3DD00175J3:5(932-943)Online publication date: 2024
  • (2024)Precise atom-to-atom mapping for organic reactions via human-in-the-loop machine learningNature Communications10.1038/s41467-024-46364-y15:1Online publication date: 13-Mar-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Journal of Experimental Algorithmics
ACM Journal of Experimental Algorithmics  Volume 13, Issue
2009
482 pages
ISSN:1084-6654
EISSN:1084-6654
DOI:10.1145/1412228
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 February 2009
Accepted: 01 November 2008
Received: 01 May 2008
Published in JEA Volume 13

Author Tags

  1. Cheminformatics
  2. mechanisms

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)20
  • Downloads (Last 6 weeks)2
Reflects downloads up to 18 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Partial Imaginary Transition State (ITS) Graphs: A Formal Framework for Research and Analysis of Atom-to-Atom Maps of Unbalanced Chemical Reactions and Their CompletionsSymmetry10.3390/sym1609121716:9(1217)Online publication date: 16-Sep-2024
  • (2024)Benchmarking machine-readable vectors of chemical reactions on computed activation barriersDigital Discovery10.1039/D3DD00175J3:5(932-943)Online publication date: 2024
  • (2024)Precise atom-to-atom mapping for organic reactions via human-in-the-loop machine learningNature Communications10.1038/s41467-024-46364-y15:1Online publication date: 13-Mar-2024
  • (2020)autodE: Automated Calculation of Reaction Energy Profiles— Application to Organic and Organometallic ReactionsAngewandte Chemie International Edition10.1002/anie.20201194160:8(4266-4274)Online publication date: 22-Dec-2020
  • (2020)autodE: Automated Calculation of Reaction Energy Profiles— Application to Organic and Organometallic ReactionsAngewandte Chemie10.1002/ange.202011941133:8(4312-4320)Online publication date: 22-Dec-2020
  • (2019)Automatic mapping of atoms across both simple and complex chemical reactionsNature Communications10.1038/s41467-019-09440-210:1Online publication date: 29-Mar-2019
  • (2017)Automated Transition State Search and Its Application to Diverse Types of Organic ReactionsJournal of Chemical Theory and Computation10.1021/acs.jctc.7b0076413:11(5780-5797)Online publication date: 17-Oct-2017
  • (2016)Chemical Kinetics: A CS PerspectiveComputing in Science & Engineering10.1109/MCSE.2016.1918:5(48-55)Online publication date: Sep-2016
  • (2015)Random Models and Analyses for Chemical GraphsInternational Journal of Foundations of Computer Science10.1142/s012905411550016126:02(269-291)Online publication date: Feb-2015
  • (2014)“Social” Network of Isomers Based on Bond Count Distance: AlgorithmsJournal of Chemical Information and Modeling10.1021/ci400517354:1(57-68)Online publication date: 13-Jan-2014
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media