skip to main content
article

Computational Problems in Perfect Phylogeny Haplotyping: Typing without Calling the Allele

Published: 01 January 2008 Publication History

Abstract

A haplotype is an m-long binary vector. The xor-genotype of two haplotypes is the m-vector of their coordinate-wise xor. We study the following problem: Given a set of xor-genotypes, reconstruct their haplotypes so that the set of resulting haplotypes can be mapped onto a perfect phylogeny tree. The question is motivated by studying population evolution in human genetics, and is a variant of the perfect phylogeny haplotyping problem that has received intensive attention recently. Unlike the latter problem, in which the input is "full" genotypes, here we assume less informative input, and so may be more economical to obtain experimentally.Building on ideas of Gusfield, we show how to solve the problem in polynomial time, by a reduction to the graph realization problem. The actual haplotypes are not uniquely determined by that tree they map onto, and the tree itself may or may not be unique. We show that tree uniqueness implies uniquely determined haplotypes, up to inherent degrees of freedom, and give a sufficient condition for the uniqueness. To actually determine the haplotypes given the tree, additional information is necessary. We show that two or three full genotypes suffice to reconstruct all the haplotypes, and present a linear algorithm for identifying those genotypes.

References

[1]
E.M. Arkin and R. Hassin, "Multiple-Choice Minimum Diameter Problems," unpublished manuscript, 1992.
[2]
V. Bafna, D. Gusfield, G. Lancia, and S. Yooseph, "Haplotyping as Perfect Phylogeny: A Direct Approach," Technical Report UC Davis CSE-2002-21, 2002.
[3]
T. Barzuza, J.S. Beckmann, R. Shamir, and I. Pe'er, "Computational Problems in Perfect Phylogeny Haplotyping: XOR-Genotypes and Tag SNPs," Proc. Ann. Symp. Combinatorial Pattern Matching (CPM '04), pp. 14-31, 2004.
[4]
T. Barzuza, J.S. Beckmann, R. Shamir, and I. Pe'er, "Typing without Calling the Allele: A Strategy for Inferring SNP Haplotypes," The European J. Human Genetics, vol. 13, no. 8, pp. 898-901, 2005.
[5]
R.E. Bixby and D.K. Wagner, "An Almost Linear-Time Algorithm for Graph Realization," Math. Operations Research, vol. 13, pp. 99- 123, 1988.
[6]
A.G. Clark, "Inference of Haplotypes from PCR-Amplified Samples of Diploid Populations," Molecular Biology and Evolution, vol. 7, no. 2, pp. 111-122, 1990.
[7]
M.J. Daly, J.D. Rioux, S.F. Schaffner, T.J. Hudson, and E.S. Lander, "High Resolution Haplotype Structure in the Human Genome," Nature Genetics, vol. 29, no. 2, pp. 229-232, 2001.
[8]
Z. Ding, V. Filkov, and D. Gusfield, "A Linear-Time Algorithm for Perfect Phylogeny Haplotyping," Proc. Ninth Ann. Int'l Conf. Research in Computational Molecular Biology (RECOMB '05), 2005.
[9]
G. Eastbrook, C. Johnson, and F. McMorris, "An Idealized Concept of the True Cladistic Character," Math. Biosciences, vol. 23, pp. 263-272, 1975.
[10]
E. Eskin, E. Halperin, and R.M. Karp, "Efficient Reconstruction of Haplotype Structure via Perfect Phylogeny," J. Bioinformatics and Computational Biology, vol. 1, no. 1, Apr. 2003.
[11]
L. Excoffier and M. Slatkin, "Maximum-Likelihood Estimation of Molecular Haplotype Frequencies in a Diploid Population," Molecular Biology and Evolution, vol. 12, no. 5, pp. 921-927, 1995.
[12]
S.B. Gabriel et al., "The Structure of Haplotype Blocks in Human Genome," Science, vol. 296, pp. 2225-2229, 2002.
[13]
F. Gavril and R. Tamari, "An Algorithm for Constructing Edge-Trees from Hypergraphs," Networks, vol. 13, pp. 377-388, 1983.
[14]
D. Gusfield, Algorithms on Strings, Trees, and Sequences--Computer Science and Computational Biology. Cambridge Univ. Press, 1997.
[15]
D. Gusfield, "Haplotyping as Perfect Phylogeny: Conceptual Framework and Efficient Solutions," Proc. Sixth Ann. Int'l Conf. Research in Computational Molecular Biology (RECOMB '02), pp. 166- 175, 2002.
[16]
J.E. Hopcroft and R.E. Tarjan, "Dividing a Graph into y Triconnected Components," SIAM J. Computing, vol. 2, pp. 135- 157, 1973.
[17]
A.J. Jeffreys, L. Kauppi, and R. Neumann, "Intensely Punctate Meiotic Recombination in the Class II Region of the Major Histocompatibility Complex," Nature Genetics, vol. 29, pp. 217- 222, 2001.
[18]
P.Y. Kwok, "Genetic Assoc. by Whole-Genome Analysis," Science, vol. 294, no. 5547, pp. 1669-1670, 2001.
[19]
S. Myers, L. Bottolo, C. Freeman, G. McVean, and P. Donnelly, "A Fine-Scale Map of Recombination Rates and Hotspots Across the Human Genome," Science, vol. 310, no. 5746, pp. 321-324, 2005.
[20]
M.W. Nachman and S.L. Crowell, "Estimate of the Mutation Rate Per Nucleotide in Humans," Genetics, vol. 156, pp. 297-304, 2000.
[21]
N. Patil et al., "Blocks of Limited Haplotype Diversity Revealed by High Resolution Scanning of Human Chromosome 21," Science, vol. 294, no. 5547, pp. 1719-1723, 2001.
[22]
I. Pe'er and J.S. Beckmann, "Resolution of Haplotypes and Haplotype Frequencies from SNP Genotypes of Pooled Samples," Proc. Seventh Ann. Int'l Conf. Research in Computational Molecular Biology (RECOMB '03), pp. 237-246, 2003.
[23]
R. Sachidanandam et al., "A Map of Human Genome Sequence Variation Containing 1.42 Million Single Nucleotide Polymorphisms," Nature, vol. 409, no. 6822, pp. 928-933, 2001.
[24]
W.T. Tutte, "An Algorithm for Determining whether a Given Binary Matroid Is Graphic," Proc. Am. Math. Soc., vol. 11, pp. 905- 917, 1960.
[25]
W.T. Whitney, "2-Isomorphic Graphs," Am. J. Math., vol. 55, pp. 245-254, 1933.
[26]
W. Xiao and P.J. Oefner, "Denaturing High-Performance Liquid Chromatography: A Review," Human Mutation, vol. 17, no. 6, pp. 439-474, 2001.

Cited By

View all
  • (2010)Xor perfect phylogeny haplotyping in pedigreesProceedings of the Advanced intelligent computing theories and applications, and 6th international conference on Intelligent computing10.5555/1881227.1881233(27-34)Online publication date: 18-Aug-2010
  • (2010)Pure Parsimony Xor HaplotypingIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2010.527:4(598-610)Online publication date: 1-Oct-2010
  • (2010)Algorithm for haplotype resolution and block partitioning for partial XOR-genotype dataJournal of Biomedical Informatics10.1016/j.jbi.2009.08.00943:1(51-59)Online publication date: 1-Feb-2010

Recommendations

Comments

Information & Contributors

Information

Published In

cover image IEEE/ACM Transactions on Computational Biology and Bioinformatics
IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 5, Issue 1
January 2008
159 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 01 January 2008
Published in TCBB Volume 5, Issue 1

Author Tags

  1. Graph Realization
  2. Haplotypes
  3. Perfect Phylogeny
  4. XOR-genotypes

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)3
  • Downloads (Last 6 weeks)0
Reflects downloads up to 01 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2010)Xor perfect phylogeny haplotyping in pedigreesProceedings of the Advanced intelligent computing theories and applications, and 6th international conference on Intelligent computing10.5555/1881227.1881233(27-34)Online publication date: 18-Aug-2010
  • (2010)Pure Parsimony Xor HaplotypingIEEE/ACM Transactions on Computational Biology and Bioinformatics10.1109/TCBB.2010.527:4(598-610)Online publication date: 1-Oct-2010
  • (2010)Algorithm for haplotype resolution and block partitioning for partial XOR-genotype dataJournal of Biomedical Informatics10.1016/j.jbi.2009.08.00943:1(51-59)Online publication date: 1-Feb-2010

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media