skip to main content
article

Computing Phylogenetic Diversity for Split Systems

Published: 01 April 2008 Publication History

Abstract

In conservation biology it is a central problem to measure, predict, and preserve biodiversity as species face extinction. In 1992 Faith proposed measuring the diversity of a collection of species in terms of their relationships on a phylogenetic tree, and to use this information to identify collections of species with high diversity. Here we are interested in some variants of the resulting optimization problem that arise when considering species whose evolution is better represented by a network rather than a tree. More specifically, we consider the problem of computing phylogenetic diversity relative to a split system on a collection of species of size $n$. We show that for general split systems this problem is NP-hard. In addition we provide some efficient algorithms for some special classes of split systems, in particular presenting an optimal $O(n)$ time algorithm for phylogenetic trees and an $O(n\log n + n k)$ time algorithm for choosing an optimal subset of size $k$ relative to a circular split system.

References

[1]
D.P. Faith, "Conservation Evaluation and Phylogenetic Diversity," Biological Conservation, vol. 61, pp. 1-10, 1992.
[2]
G.M. Barker, "Phylogenetic Diversity: A Quantitative Framework for Measurement of Priority and Achievement in Biodiversity Conservation," Biological J. Linnean Soc., vol. 76, pp. 165-194, 2002.
[3]
D.P. Faith and A.M. Baker, "Phylogenetic Diversity (PD) and Biodiversity Conservation: Some Bioinformatics Challenges," Evolutionary Bioinformatics Online, vol. 2, pp. 70-77, 2006.
[4]
K. Hartmann and M. Steel, "Phylogenetic Diversity: From Combinatorics to Ecology," New Math. Models in Evolution, O. Gascuel and M. Steel, eds., Oxford Univ. Press, 2006.
[5]
F. Pardi and N. Goldman, "Species Choice for Comparative Genomics: Being Greedy Works," PLoS Genetics, vol. 1, no. 6, 2005.
[6]
D. Faith and K. Williams, "Phylogenetic Diversity and Biodiversity Planning," McGraw-Hill Yearbook of Science and Technology, pp. 233-235, McGraw-Hill Professional, 2006.
[7]
M. Steel, "Phylogenetic Diversity and the Greedy Algorithm," Systematic Biology, vol. 54, pp. 527-529, 2005.
[8]
B.Q. Minh, S. Klaere, and A. von Haeseler, "Phylogenetic Diversity within Seconds," Systematic Biology, vol. 55, pp. 769- 773, 2006.
[9]
M.L. Weitzmann, "The Noah's Ark Problem," Econometrica, vol. 66, pp. 1279-1298, 1998.
[10]
K. Hartmann and M. Steel, "Maximizing Phylogenetic Diversity in Biodiversity Conservation: Greedy Solutions to the Noah's Ark Problem," Systematic Biology, vol. 55, pp. 644-651, 2006.
[11]
F. Pardi and N. Goldman, "Resource-Aware Taxon Selection for Maximizing Phylogenetic Diversity," Systematic Biology, vol. 56, pp. 431-444, 2007.
[12]
V. Moulton, C. Semple, and M. Steel, "Optimizing Phylogenetic Diversity under Constraints," J. Theoretical Biology, vol. 246, pp. 186-194, 2007.
[13]
D. Bryant and D. Huson, "Application of Phylogenetic Networks in Evolutionary Studies," Molecular Biology and Evolution, vol. 23, pp. 254-267, 2006.
[14]
M. Kotetishvili, O. Stine, A. Kreger, J. Morris, and A. Sulakvelidze, "Multilocus Sequence Typing for Characterization of Clinical and Environmental Salmonella Strains," J. Clinical Microbiology, vol. 51, pp. 1626-1635, 2002.
[15]
D. Bryant and V. Moulton, "Neighbornet: An Agglomerative Method for the Construction of Phylogenetic Networks," Molecular Biology and Evolution, vol. 21, pp. 255-265, 2004.
[16]
B.Q. Minh, S. Klaere, and A. von Haeseler, "Phylogenetic Diversity Algorithm," http://www.cibiv.at/software/pda/, Aug. 2007.
[17]
B.Q. Minh, S. Klaere, and A. von Haeseler, "Phylogenetic Diversity on Split Networks," unpublished manuscript, 2007.
[18]
B. Holland, K.T. Huber, V. Moulton, and P. Lockhart, "Using Consensus Networks to Visualize Contradictory Evidence for Species Phylogeny," Molecular Biology and Evolution, vol. 21, pp. 1459-1461, 2004.
[19]
P. Buneman, "The Recovery of Trees from Measures of Dissimilarity," Math. in the Archeological and Historical Sciences, F.R. Hodson, D.G. Kendall, and P. Tautu, eds., pp. 387-395, Edinburgh Univ. Press, 1971.
[20]
M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman, 1979.
[21]
D. Hochbaum, "Approximating Covering and Packing Problems: Set Cover, Vertex Cover, Independent Set, and Related Problems," Approximation Algorithms for NP-Hard Problems, D. Hochbaum, ed., PWS Publishing, 1997.
[22]
M. Bordewich and C. Semple, "Nature Reserve Selection Problem: A Tight Approximation Algorithm," Technical Report UCDMS2007/1, NZ, Univ. of Canterbury, 2007.
[23]
P. Alimonti and V. Kann, "Hardness of Approximating Problems on Cubic Graphs," Proc. Italian Conf. Algorithms and Complexity, pp. 288-298, 1997.
[24]
C. Semple and M. Steel, Phylogenetics. Oxford Univ. Press, 2003.
[25]
A. Shioura and T. Uno, "A Linear Time Algorithm for Finding a k-Tree Core," J. Algorithms, vol. 23, pp. 281-290, 1997.
[26]
M. Blum, R.W. Floyd, V.R. Pratt, R.L. Rivest, and R.E. Tarjan, "Time Bounds for Selection," J. Computer and System Sciences, vol. 7, pp. 448-461, 1973.
[27]
R. Uehara and Y. Uno, "Efficient Algorithms for the Longest Path Problem," Proc. Int'l Symp. Algorithms and Computation, pp. 871- 883, 2004.
[28]
D. Bryant and A. Dress, "Linearly Independent Split Systems," European J. Combinatorics, vol. 28, pp. 1814-1831, 2007.
[29]
D. Bryant, V. Moulton, and A. Spillner, "Computing Planar Split Graphs," The Ann. New Zealand Phylogenetics Meeting, 2007.
[30]
D. Eppstein, M.H. Overmars, G. Rote, and G.J. Woeginger, "Finding Minimum Area k-Gons," Discrete and Computational Geometry, vol. 7, pp. 45-58, 1992.
[31]
H. Edelsbrunner, Algorithms in Combinatorial Geometry. Springer, 1987.
[32]
J.E. Boyce, D.P. Dobkin, R.L. Drysdale, and L.J. Guibas, "Finding Extremal Polygons," SIAM J. Computing, vol. 14, pp. 134-147, 1985.
[33]
H.J. Bandelt and A. Dress, "A Canonical Split Decomposition Theory for Metrics on a Finite Set," Advances in Math., vol. 92, pp. 47-105, 1992.
[34]
K. Kalmanson, "Edgeconvex Circuits and the Travelling Salesman Problem," Canadian J. Math., vol. 27, pp. 1000-1010, 1975.
[35]
V. Chepoi and B. Fichet, "A Note on Circular Decomposable Metrics," Geometriae Dedicata, vol. 69, pp. 237-240, 1998.
[36]
A. Aggarwal, M.M. Klawe, S. Moran, P. Shor, and R. Wilber, "Geometric Applications of a Matrix-Searching Algorithm," Algorithmica, vol. 2, pp. 195-208, 1987.
[37]
H. Schwöbbermeyer and J.T. Kim, "A Comparative Analysis of Biodiversity Measures," Proc. European Conf. Advances in Artificial Life, pp. 119-128, 1999.
[38]
A.S.L. Rodrigues and K.J. Gaston, "Maximising Phylogenetic Diversity in the Selection of Networks of Conservation Areas," Biological Conservation, vol. 105, pp. 103-111, 2002.
[39]
L. Lewis and P. Lewis, "Unearthing the Molecular Phylodiversity of Desert Soil Green Algae (Chlorophyta)," Systematic Biology, vol. 54, pp. 936-947, 2005.
[40]
A. Blum, P. Chalasani, D. Coppersmith, W.R. Pulleyblank, P. Raghavan, and M. Sudan, "The Minimum Latency Problem," Proc. ACM Symp. Theory of Computing, pp. 163-171, 1994.
[41]
B.R. Holland, "Evolutionary Analysis of Large Data Sets: Trees and Beyond," PhD dissertation, Massey Univ., 2001.
[42]
S.S. Ravi, D.J. Rosenkrantz, and G.K. Tayi, "Heuristic and Special Case Algorithms for Dispersion Problems," Operations Research, vol. 42, pp. 299-310, 1994.
[43]
B. Chandra and M. Halldórsson, "Approximation Algorithms for Dispersion Problems," J. Algorithms, vol. 38, pp. 438-465, 2001.
[44]
R. Chandrasekaran and A. Daughety, "Location on Tree Networks: p-Centre and n-Dispersion Problems," Math. Operations Research, vol. 6, pp. 50-57, 1981.
[45]
G.A. Waterson, "On the Number of Segregating Sites in Genetical Models without Recombination," Theoretical Population Biology, vol. 7, pp. 256-276, 1975.
[46]
G.M. Ziegler, Lectures on Polytopes. Springer, 1995.

Cited By

View all
  • (2011)An Approximation Algorithm for the Noah's Ark Problem with Random Feature LossIEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)10.1109/TCBB.2010.378:2(551-556)Online publication date: 1-Mar-2011
  • (2009)Budgeted Phylogenetic Diversity on Circular Split SystemsIEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)10.1109/TCBB.2008.546:1(22-29)Online publication date: 1-Jan-2009

Recommendations

Comments

Information & Contributors

Information

Published In

cover image IEEE/ACM Transactions on Computational Biology and Bioinformatics
IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 5, Issue 2
April 2008
158 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 01 April 2008
Published in TCBB Volume 5, Issue 2

Author Tags

  1. Biology and genetics
  2. Life and Medical Sciences

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)4
  • Downloads (Last 6 weeks)0
Reflects downloads up to 01 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2011)An Approximation Algorithm for the Noah's Ark Problem with Random Feature LossIEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)10.1109/TCBB.2010.378:2(551-556)Online publication date: 1-Mar-2011
  • (2009)Budgeted Phylogenetic Diversity on Circular Split SystemsIEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)10.1109/TCBB.2008.546:1(22-29)Online publication date: 1-Jan-2009

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media