skip to main content
article

Seeded Tree Alignment

Published: 01 October 2008 Publication History

Abstract

The optimal transformation of one tree into another by means of elementary edit operations is an important algorithmic problem that has several interesting applications to computational biology. Here we introduce a constrained form of this problem in which a partial mapping of a set of nodes (the "seeds") in one tree to a corresponding set of nodes in the other tree is given, and present efficient algorithms for both ordered and unordered trees. Whereas ordered tree matching based on seeded nodes has applications in pattern matching of RNA structures, unordered tree matching based on seeded nodes has applications in co-speciation and phylogeny reconciliation. The latter involves the solution of the planar tanglegram layout problem, for which a polynomial-time algorithm is given here.

References

[1]
M.J. Chung, "O(n2.5) Time Algorithms for the Subgraph Homeomorphism Problem on Trees," J. Algorithms, vol. 8, pp. 106-112, 1987.
[2]
B. DasGupta, X. He, T. Jiang, M. Li, J. Tromp, and L. Zhang, "On Distances between Phylogenetic Trees," Proc. Eighth Ann. ACM-SIAM Symp. Discrete Algorithms (SODA '97), pp. 427-436, 1997.
[3]
J.-F. Dufayard, L. Duret, S. Penel, M. Gouy, F. Rechenmann, and G. Perrière, "Tree Pattern Matching in Phylogenetic Trees: Automatic Search for Orthologs or Paralogs in Homologous Gene Sequence Databases," Bioinformatics, vol. 21, no. 11, pp. 2596-2603, 2005.
[4]
J. Felsenstein, "Phylip--Phylogeny Inference Package (Version 3.2)," Cladistics, vol. 5, no. 1, pp. 164-166, 1989.
[5]
U. Fößmeier and M. Kaufmann, "Nice Drawings for Planar Bipartite Graphs," Proc. Third Italian Conf. Algorithms and Complexity (CIAC '97), vol. 1203, pp. 122-134, 1997.
[6]
J.A. Gallian, "A Dynamic Survey of Graph Labeling," Electronic J. Combinatorics, no. DS7, http://www.combinatorics.org/Surveys/, 2007.
[7]
K.J. Gardiner, T.L. Marsh, and N.R. Pace, "Ion Dependence of the Bacillus Subtilis RNase P Reaction," J. Biological Chemistry, vol. 260, no. 9, pp. 5415-5419, 1985.
[8]
E.S. Haas and J.W. Brown, "Evolutionary Variation in Bacterial RNase P RNAs," Nucleic Acids Research, vol. 26, no. 18, pp. 4093-4099, 1998.
[9]
J.K. Harris, E.S. Haas, D. Williams, D.N. Frank, and J.W. Brown, "New Insight into RNase P RNA Structure from Comparative Analysis of the Archaeal RNA," RNA, vol. 7, no. 2, pp. 220-232, 2001.
[10]
P. Hugenholtz, "Exploring Prokaryotic Diversity in the Genomic Era," Genome Biology, vol. 3, no. 2, pp. reviews0003.1- reviews0003.8, 2002.
[11]
B.D. James, G.J. Olsen, J. Liu, and N.R. Pace, "The Secondary Structure of Ribonuclease P RNA, the Catalytic Element of a Ribonucleoprotein Enzyme," Cell, vol. 52, no. 1, pp. 19-26, 1988.
[12]
J. Jansson, N.T. Hieu, and W.-K. Sung, "Local Gapped Subforest Alignment and Its Application in Finding RNA Structural Motifs," J. Computational Biology, vol. 13, no. 3, pp. 702-718, 2006.
[13]
S.-Y. Le, R. Nussinov, and J.V. Maizel, "Tree Graphs of RNA Secondary Structures and Their Comparisons," Computers in Biomedical Research, vol. 22, no. 5, pp. 461-473, 1989.
[14]
A. Lozano, R. Pinter, O. Rokhlenko, G. Valiente, and M. Ziv-Ukelson, "Seeded Tree Alignment and Planar Tanglegram Layout," Proc. Seventh Workshop Algorithms in Bioinformatics (WABI '07), vol. 4645, pp. 98-110, 2007.
[15]
B.L. Maidak, J.R. Cole, T.G. Lilburn, C.T. Parker, P.R. Saxman, J.M. Stredwick, G.M. Garrity, B. Li, G.J. Olsen, S. Pramanik, T.M. Schmidt, and J.M. Tiedje, "The RDP (Ribosomal Database Project) Continues," Nucleic Acids Research, vol. 28, pp. 173-174, 2000.
[16]
D.W. Matula, "An Algorithm for Subtree Identification," SIAM Rev., vol. 10, pp. 273-274, 1968.
[17]
D. Matula, "Subtree Isomorphism in O(n5/2)," Annals Discrete Math., vol. 2, pp. 91-106, 1978.
[18]
T.M. Nye, P. Lio, and W.R. Gilks, "A Novel Algorithm and Web-Based Tool for Comparing Two Alternative Phylogenetic Trees," Bioinformatics, vol. 22, no. 1, pp. 117-119, 2006.
[19]
N.R. Pace and J.W. Brown, "Evolutionary Perspective on the Structure and Function of Ribonuclease P, A Ribozyme," J. Bacteriology, vol. 177, no. 8, pp. 1919-1928, 1995.
[20]
R.D.M. Page, ed., Tangled Trees: Phylogeny, Cospeciation, and Coevolution. The Univ. of Chicago Press, 2002.
[21]
R.D.M. Page and G. Valiente, "An Edit Script for Taxonomic Classifications," BMC Bioinformatics, vol. 6, p. 208, 2005.
[22]
R.Y. Pinter, O. Rokhlenko, D. Tsur, and M. Ziv-Ukelson, "Approximate Labelled Subtree Homeomorphism," Proc. 15th Ann. Symp. Combinatorial Pattern Matching (CPM '04), vol. 3109, pp. 59-73, 2004.
[23]
S.W. Reyner, "An Analysis of a Good Algorithm for the Subtree Problem," SIAM J. Computing, vol. 6, no. 4, pp. 730-732, 1977.
[24]
R. Shamir and D. Tsur, "Faster Subtree Isomorphism," J. Algorithms, vol. 33, no. 2, pp. 267-280, 1999.
[25]
H. Shan, K.G. Herbert, W.H. Piel, D. Shasha, and J.T.L. Wang, "A Structure-Based Search Engine for Phylogenetic Databases," Proc. 14th Int'l Conf. Scientific and Statistical Database Management (SSDBM '02), pp. 7-10, 2002.
[26]
B.A. Shapiro and K. Zhang, "Comparing Multiple RNA Secondary Structures Using Tree Comparisons," Computer Applications in the Biosciences, vol. 6, no. 4, pp. 309-318, 1990.
[27]
D. Shasha, J.T.-L. Wang, and R. Giugno, "Algorithmics and Applications of Tree and Graph Searching," Proc. 21st ACM SIGACT-SIGMOD-SIGART Symp. Principles of Database Systems (PODS '02), pp. 39-52, 2002.
[28]
G. Valiente, Algorithms on Trees and Graphs. Springer, 2002.
[29]
G. Valiente, "Constrained Tree Inclusion," J. Discrete Algorithms, vol. 3, nos. 2-4, pp. 431-447, 2005.
[30]
G. Valiente, "A Fast Algorithmic Technique for Comparing Large Phylogenetic Trees," Proc. 12th Int'l Symp. String Processing and Information Retrieval (SPIRE '05), vol. 3772, pp. 371-376, 2005.
[31]
C.R. Woese and N.R. Pace, "Probing RNA Structure, Function, and History by Comparative Analysis," The RNA World, R.F. Gesteland and J.F. Atkins, eds., pp. 91-117, Cold Spring Harbor Laboratory Press, 1993.
[32]
W.N.W. Zainon and P. Calder, "Visualizing Phylogenetic Trees," Proc. Seventh Australasian User Interface Conf. (AUIC '06), pp. 145-152, 2006.
[33]
K. Zhang, L. Wang, and B. Ma, "Computing Similarity between RNA Structures," Proc. 10th Ann. Symp. Combinatorial Pattern Matching (CPM '99), vol. 1645, pp. 281-293, 1999.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image IEEE/ACM Transactions on Computational Biology and Bioinformatics
IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 5, Issue 4
October 2008
158 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 01 October 2008
Published in TCBB Volume 5, Issue 4

Author Tags

  1. Biology and genetics
  2. Computer Applications
  3. Discrete Mathematics
  4. Graph Theory
  5. Graph algorithms
  6. Life and Medical Sciences
  7. Mathematics of Computing
  8. Trees

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)6
  • Downloads (Last 6 weeks)0
Reflects downloads up to 01 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2018)TanglegramsIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2016.261304015:1(343-349)Online publication date: 1-Jan-2018
  • (2014)Algorithms for topology-free and alignment network queriesJournal of Discrete Algorithms10.1016/j.jda.2014.03.00227(29-53)Online publication date: 1-Jul-2014
  • (2013)Neighborhood-Preserving mapping between treesProceedings of the 13th international conference on Algorithms and Data Structures10.1007/978-3-642-40104-6_37(427-438)Online publication date: 12-Aug-2013
  • (2012)Efficient chaining of seeds in ordered treesJournal of Discrete Algorithms10.1016/j.jda.2011.12.01314(107-118)Online publication date: 1-Jul-2012
  • (2011)Hardness of approximation and integer programming frameworks for searching for caterpillar treesProceedings of the Seventeenth Computing on The Australasian Theory Symposium - Volume 11910.5555/2483191.2483209(145-150)Online publication date: 17-Jan-2011
  • (2011)Hardness of approximation and integer programming frameworks for searching for caterpillar treesProceedings of the Seventeenth Computing: The Australasian Theory Symposium - Volume 11910.5555/2461196.2461214(145-150)Online publication date: 17-Jan-2011
  • (2011)Forest alignment with affine gaps and anchorsProceedings of the 22nd annual conference on Combinatorial pattern matching10.5555/2018243.2018256(104-117)Online publication date: 27-Jun-2011
  • (2010)Efficient chaining of seeds in ordered treesProceedings of the 21st international conference on Combinatorial algorithms10.5555/1987042.1987069(260-273)Online publication date: 26-Jul-2010

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media