|
ABSTRACT
In this paper, we study the problem of learning phylogenies and hidden Markov models. We call a Markov model nonsingular if all transition matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov models without the nonsingularity condition is at least as hard as learning parity with noise. On the other hand, we give a polynomial-time algorithm for learning nonsingular phylogenies and hidden Markov models.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
L. Addario-Berry, B. Chor, M. Hallett, J. Lagergren, A. Panconesi and Todd Wareham, Ancestral Maximum Likelihood of Evolutionary Trees Is Hard, Journal of Bioinformatics and Computational Biology, 2, 257--271, 2004.
|
| |
3
|
|
| |
4
|
J. Cavender, Taxonomy with confidence, Mathematical Biosciences, 40, 271--280, 1978.
|
| |
5
|
J. T. Chang, Full Reconstruction of Markov Models on Evolutionary Trees: Identifiability and Consistency, Mathematical Biosciences, 137, 51--73, 1996.
|
| |
6
|
Robert G. Cowell , Steffen L. Lauritzen , A. Philip David , David J. Spiegelhalter , V. Nair , J. Lawless , M. Jordan , David J. Spiegelhater, Probabilistic Networks and Expert Systems, Springer-Verlag New York, Inc., Secaucus, NJ, 1999
|
| |
7
|
|
| |
8
|
R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998.
|
| |
9
|
R. Durrett, Probability: Theory and Examples, Duxbury, 1996.
|
| |
10
|
|
| |
11
|
|
| |
12
|
J. Felsenstein. Inferring Phylogenies. Sinauer, New York, New York, 2004.
|
 |
13
|
|
| |
14
|
J. Feldman and R. O'Donnell and R. Servedio. Learning mixtures of product distributions, preprint, 2004.
|
| |
15
|
J. Felsenstein, Cases in which parsimony or compatibility methods will be positively misleading, Systematic Zoology, 27, 410--410, 1978.
|
| |
16
|
G. H. Golub and C. H. Van Loan, Matrix Computations, Johns Hopkins University Press, 1996.
|
| |
17
|
R. L. Graham and L. R. Foulds, Unlikelihood that minimal phylogenies for a realistic biological study can be constructed in reasonable computational time, Mathematical Biosciences, 60, 133--142, 1982.
|
| |
18
|
|
 |
19
|
|
 |
20
|
Michael Kearns , Yishay Mansour , Dana Ron , Ronitt Rubinfeld , Robert E. Schapire , Linda Sellie, On the learnability of discrete distributions, Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, p.273-282, May 23-25, 1994, Montreal, Quebec, Canada
[doi> 10.1145/195058.195155]
|
| |
21
|
|
| |
22
|
E. Mossel, Distorted metrics on trees and phylogenetic forests, preprint, 2004 (available at http://arxiv.org/math.CO/0403508).
|
| |
23
|
E. Mossel and S.Roch, Learning Nonsingular Phylogenies and Hidden Markov Models, preprint, 2005 (available at http://arxiv.org/cs.LG/0502076).
|
| |
24
|
L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, 77(2), 257--286, 1989.
|
| |
25
|
|
| |
26
|
|
| |
27
|
C. Semple and M. Steel. Phylogenetics, volume 22 of Mathematics and its Applications series. Oxford University Press, 2003.
|
| |
28
|
M. Steel, Recovering a tree from the leaf colourations it generates under a Markov model, Appl. Math. Lett. 7(2), 19--24, 1994.
|
| |
29
|
M. A. Steel, L. A. Szekely, and M. D. Hendy, Reconstructing trees when sequence sites evolve at variable rates, J. Comput. Biol. 1(2), 153--63, 1994.
|
 |
30
|
|
|