DOI: 10.1145/1854776.1854806
research-article

Inferring species trees from gene duplication episodes

Published: 02 August 2010

Abstract

Gene tree parsimony, which infers a species tree that implies the fewest gene duplications across a collection of gene trees, is a method for inferring phylogenetic trees from paralogous genes. However, it assumes that all duplications are independent, and therefore, it does not account for large-scale gene duplication events like whole genome duplications. We describe two methods to infer species trees based on gene duplication events that may involve multiple genes. First, gene episode parsimony seeks the species tree that implies the fewest possible gene duplication episodes. Second, adjusted gene tree parsimony corrects the number of gene duplications at each node in the species tree by treating the largest possible gene duplication episode as a single duplication. We test both new methods, as well as gene tree parsimony, using 7,091 gene trees representing 7 plant taxa. Gene tree parsimony and adjusted gene tree parsimony both perform well, returning the species tree after an exhaustive search of the tree space. By contrast, gene episode parsimony fails to rank the true species tree within the top third of all possible topologies. Furthermore, gene trees with randomly permuted leaf labels can imply fewer duplication episodes than gene trees with the correct leaf labels. Adjusted gene tree parsimony reflects a potentially more realistic and, at least for small data sets, computationally feasible model for counting gene duplication events than treating each duplication independently or minimizing the number of possible duplication episodes.
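As a rough illustration of the quantity that gene tree parsimony minimizes, the sketch below counts the duplications a candidate species tree implies for a set of rooted gene trees, using the standard LCA mapping: a gene tree node is scored as a duplication when it maps to the same species tree node as one of its children. The nested-tuple tree encoding, the function names, and the toy trees are assumptions made for this example and are not taken from the paper. Gene episode parsimony and the adjusted count are not implemented here. The last function gives the size of the search space under the further assumption that candidate species trees are rooted and binary.

```python
def leaf_set(tree):
    """Return the frozenset of species labels below a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaf_set(left) | leaf_set(right)


def species_clusters(species_tree):
    """List the leaf set (cluster) of every node in the species tree."""
    if isinstance(species_tree, str):
        return [frozenset([species_tree])]
    left, right = species_tree
    return ([leaf_set(species_tree)]
            + species_clusters(left)
            + species_clusters(right))


def lca_map(gene_node, clusters):
    """Map a gene node to the smallest species cluster containing its species."""
    species = leaf_set(gene_node)
    return min((c for c in clusters if species <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene nodes that share their LCA mapping with a child."""
    clusters = species_clusters(species_tree)

    def walk(node):
        if isinstance(node, str):
            return 0
        here = lca_map(node, clusters)
        is_dup = any(lca_map(child, clusters) == here for child in node)
        return int(is_dup) + sum(walk(child) for child in node)

    return walk(gene_tree)


def gtp_score(gene_trees, species_tree):
    """Gene tree parsimony score: total duplications over all gene trees."""
    return sum(count_duplications(g, species_tree) for g in gene_trees)


def num_rooted_topologies(n_taxa):
    """Number of rooted binary trees on n_taxa labeled leaves: (2n - 3)!!."""
    count = 1
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count


# Toy data: gene tree leaves are labeled with the species they come from.
gene_trees = [((("A", "B"), "C"), ("A", "C"))]
candidate_1 = (("A", "B"), "C")
candidate_2 = (("A", "C"), "B")

score_1 = gtp_score(gene_trees, candidate_1)
score_2 = gtp_score(gene_trees, candidate_2)
space_7 = num_rooted_topologies(7)
```

On this toy input the first candidate implies one duplication and the second implies two, so gene tree parsimony would prefer the first. With 7 taxa the rooted binary tree space holds 10,395 topologies, which is small enough to score exhaustively.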

Published In

BCB '10: Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology
August 2010
705 pages
ISBN: 9781450304382
DOI: 10.1145/1854776

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. duplication episode
2. gene duplication
3. gene tree parsimony
4. gene tree/species tree reconciliation
5. phylogeny
6. whole genome duplication

Acceptance Rates

Overall Acceptance Rate: 254 of 885 submissions, 29%

Cited By

• (2019) Polynomial-Time Algorithms for Phylogenetic Inference Problems involving duplication and reticulation. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1-1. DOI: 10.1109/TCBB.2019.2934957. Online publication date: 2019.
• (2018) Inferring duplication episodes from unrooted gene trees. BMC Genomics 19:S5. DOI: 10.1186/s12864-018-4623-z. Online publication date: 8-May-2018.
• (2018) Efficient Algorithms for Genomic Duplication Models. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1-1. DOI: 10.1109/TCBB.2017.2706679. Online publication date: 2018.
• (2018) Polynomial-Time Algorithms for Phylogenetic Inference Problems. Algorithms for Computational Biology, 37-49. DOI: 10.1007/978-3-319-91938-6_4. Online publication date: 17-May-2018.
• (2017) New Algorithms for the Genomic Duplication Problem. Comparative Genomics, 101-115. DOI: 10.1007/978-3-319-67979-2_6. Online publication date: 15-Sep-2017.
• (2012) Identifying the Phylogenetic Context of Whole-Genome Duplications in Plants. Polyploidy and Genome Evolution, 77-92. DOI: 10.1007/978-3-642-31442-1_5. Online publication date: 3-Oct-2012.