skip to main content
article

An Ω(n^2/ log n) Speed-Up of TBR Heuristics for the Gene-Duplication Problem

Published: 01 October 2008 Publication History

Abstract

The gene-duplication problem is to infer a species supertree from gene trees that are confounded by complex histories of gene duplications. This problem is NP-hard and thus requires efficient and effective heuristics. Existing heuristics perform a stepwise search of the tree space, where each step is guided by an exact solution to an instance of a local search problem. We improve on the time complexity of the local search problem by a factor of n2= log n, where n is the size of the resulting species supertree. Typically, several thousand instances of the local search problem are solved throughout a stepwise heuristic search. Hence, our improvement makes the gene-duplication problem much more tractable for large-scale phylogenetic analyses.

References

[1]
R. Guigó, I. Muchnik, and T.F. Smith, "Reconstruction of Ancient Molecular Phylogeny," Molecular Phylogenetics and Evolution, vol. 6, no. 2, pp. 189-213, 1996.
[2]
B. Ma, M. Li, and L. Zhang, "On Reconstructing Species Trees from Gene Trees in Term of Duplications and Losses," Proc. Second Ann. Int'l Conf. Research in Computational Molecular Biology (RECOMB '98), S. Istrail, P. Pevzner, and M. Waterman, eds., pp. 182-191, 1998.
[3]
R.D.M. Page, "GeneTree: Comparing Gene and Species Phylogenies Using Reconciled Trees," Bioinformatics, vol. 14, no. 9, pp. 819-820, 1998.
[4]
J.B. Slowinski, A. Knight, and A.P. Rooney, "Inferring Species Trees from Gene Trees: A Phylogenetic Analysis of the Elapidae (Serpentes) Based on the Amino Acid Sequences of Venom Proteins," Molecular Phylogenetics and Evolution, vol. 8, no. 3, pp. 349-362, 1997.
[5]
R.D.M. Page, "Extracting Species Trees from Complex Gene Trees: Reconciled Trees and Vertebrate Phylogeny," Molecular Phylogenetics and Evolution, vol. 14, no. 1, pp. 89-106, 2000.
[6]
R.D.M. Page and J. Cotton, "Vertebrate Phylogenomics: Reconciled Trees and Gene Duplications," Proc. Seventh Pacific Symp. Biocomputing (PSB '02), R.B. Altman, A.K. Dunker, L. Hunter, K. Lauderdale and T.E. Klein, eds., pp. 536-547, Jan. 2002.
[7]
J.A. Cotton and R.D.M. Page, "Tangled Tales from Multiple Markers: Reconciling Conflict between Phylogenies to Build Molecular Supertrees," Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, O.R.P. Bininda-Emonds, ed., pp. 107- 125, Springer, 2004.
[8]
M.J. Sanderson and M.M. McMahon, "Inferring Angiosperm Phylogeny from EST Data with Widespread Gene Duplication," BMC Evolutionary Biology, vol. 7, supplementary 1:S3, 2007.
[9]
D.L. Swofford, G.J. Olsen, P.J. Waddel, and D.M. Hillis, "Phylogenetic Inference," Molecular Systematics, D.M. Hillis, C. Moritz, and B.K. Mable, eds., chapter 11, pp. 407-509, Sinauer Assoc., 1996.
[10]
B.L. Allen and M. Steel, "Subtree Transfer Operations and Their Induced Metrics on Evolutionary Trees," Annals of Combinatorics, vol. 5, pp. 1-13, 2001.
[11]
D. Chen, O. Eulenstein, D. Fernández-Baca, and J.G. Burleigh, "Improved Heuristics for Minimum-Flip Supertree Construction," Evolutionary Bioinformatics, vol. 2, 2006.
[12]
M. Goodman, J. Czelusniak, G.W. Moore, A.E. Romero-Herrera, and G. Matsuda, "Fitting the Gene Lineage into Its Species Lineage. A Parsimony Strategy Illustrated by Cladograms Constructed from Globin Sequences," Systematic Zoology, vol. 28, pp. 132-163, 1979.
[13]
R.D.M. Page, "Maps between Trees and Cladistic Analysis of Historical Associations among Genes, Organisms, and Areas," Systematic Biology, vol. 43, no. 1, pp. 58-77, 1994.
[14]
B. Mirkin, I. Muchnik, and T.F. Smith, "A Biologically Consistent Model for Comparing Molecular Phylogenies," J. Computational Biology, vol. 2, no. 4, pp. 493-507, 1995.
[15]
O. Eulenstein, "Predictions of Gene-Duplications and Their Phylogenetic Development," PhD dissertation, Univ. of Bonn, Germany. gMD Research Series No. 20/1998, ISSN: 1435-2699, 1998.
[16]
L. Zhang, "On a Mirkin-Muchnik-Smith Conjecture for Comparing Molecular Phylogenies," J. Computational Biology, vol. 4, no. 2, pp. 177-187, 1997.
[17]
K. Chen, D. Durand, and M. Farach-Colton, "NOTUNG: A Program for Dating Gene Duplications and Optimizing Gene Family Trees," J. Computational Biology, vol. 7, no. 3/4, 2000.
[18]
P. Bonizzoni, G.D. Vedova, and R. Dondi, "Reconciling a Gene Tree to a Species Tree under the Duplication Cost Model," Theoretical Computer Science, vol. 347, nos. 1-2, pp. 36-53, 2005.
[19]
P. Górecki and J. Tiuryn, "On the Structure of Reconciliations," Proc. Second RECOMB Comparative Genomics Satellite Workshop, J. Lagergren, ed., pp. 42-54, 2004.
[20]
M.A. Bender and M. Farach-Colton, "The LCA Problem Revisited," Proc. Fourth Latin Am. Symp. Theoretical Informatics (LATIN '00), G.H. Gonnet, D. Panario, and A. Viola, eds., pp. 88-94, 2000.
[21]
D. Harel and R.E. Tarjan, "Fast Algorithms for Finding Nearest Common Ancestors," SIAM J. Computing, vol. 13, no. 2, pp. 338-355, 1984.
[22]
M.R. Fellows, M.T. Hallett, C. Korostensky, and U. Stege, "Analogs and Duals of the MAST Problem for Sequences and Trees," Proc. Sixth Ann. European Symp. Algorithms (ESA '98), G. Bilardi, G.F. Italiano, A. Pietracaprina, and G. Pucci, eds., pp. 103-114, 1998.
[23]
U. Stege, "Gene Trees and Species Trees: The Gene-Duplication Problem Is Fixed-Parameter Tractable," Proc. Sixth Int'l Workshop Algorithms and Data Structures (WADS '99), F.K.H.A. Dehne, A. Gupta, J.-R. Sack, and R. Tamassia, eds., pp. 288-293, 1999.
[24]
M.T. Hallett and J. Lagergren, "New Algorithms for the Duplication-Loss Model," Proc. Fourth Ann. Int'l Conf. Research in Computational Molecular Biology (RECOMB '00), R. Shamir, S. Miyano, S. Istrail, P. Pevzner, and M. Waterman, eds., pp. 138-146, 2000.
[25]
M. Bordewich and C. Semple, "On the Computational Complexity of the Rooted Subtree Prune and Regraft Distance," Annals of Combinatorics, vol. 8, pp. 409-423, 2004.
[26]
M.S. Bansal, J.G. Burleigh, O. Eulenstein, and A. Wehe, "Heuristics for the Gene-Duplication Problem: A ¿(n) Speed-Up for the Local Search," Proc. 11th Ann. Int'l Conf. Research in Computational Molecular Biology (RECOMB '07), T.P. Speed and H. Huang, eds., pp. 238-252, 2007.
[27]
M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf, Computational Geometry: Algorithms and Applications. Springer, 2000.

Cited By

View all
  • (2013)On the Neighborhoods of TreesIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2013.6610:3(721-728)Online publication date: 1-May-2013
  • (2013)Efficient Algorithms for Knowledge-Enhanced Supertree and Supermatrix Phylogenetic ProblemsIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2012.16210:6(1432-1441)Online publication date: 1-Nov-2013
  • (2012)An Efficient Method for Exploring the Space of Gene Tree/Species Tree Reconciliations in a Probabilistic FrameworkIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2011.649:1(26-39)Online publication date: 1-Jan-2012
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image IEEE/ACM Transactions on Computational Biology and Bioinformatics
IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 5, Issue 4
October 2008
158 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 01 October 2008
Published in TCBB Volume 5, Issue 4

Author Tags

  1. Algorithms
  2. Computational Biology
  3. Gene Duplication
  4. Phylogenetics
  5. Supertrees

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)2
  • Downloads (Last 6 weeks)0
Reflects downloads up to 01 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2013)On the Neighborhoods of TreesIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2013.6610:3(721-728)Online publication date: 1-May-2013
  • (2013)Efficient Algorithms for Knowledge-Enhanced Supertree and Supermatrix Phylogenetic ProblemsIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2012.16210:6(1432-1441)Online publication date: 1-Nov-2013
  • (2012)An Efficient Method for Exploring the Space of Gene Tree/Species Tree Reconciliations in a Probabilistic FrameworkIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2011.649:1(26-39)Online publication date: 1-Jan-2012
  • (2012)GTP supertrees from unrooted gene treesProceedings of the 8th international conference on Bioinformatics Research and Applications10.1007/978-3-642-30191-9_11(102-114)Online publication date: 21-May-2012
  • (2011)Linear-Time Algorithms for the Multiple Gene Duplication ProblemsIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2009.528:1(260-265)Online publication date: 1-Jan-2011
  • (2009)The Gene-Duplication ProblemIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2009.76:2(221-231)Online publication date: 1-Apr-2009

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media