skip to main content
10.5555/1347082.1347156acmconferencesArticle/Chapter ViewAbstractPublication PagessodaConference Proceedingsconference-collections
research-article

Approximation algorithms for labeling hierarchical taxonomies

Published: 20 January 2008 Publication History

Abstract

We consider the following taxonomy labeling problem. Each node of an n-node tree has to be labeled with the values of k attributes. A partial labeling is given as part of the input. The goal is to complete this labeling, minimizing the maximum variation in labeling along an edge. A special case of this problem (which we call the label extension problem), where every node is either completely labeled or not labeled at all, has been considered previously.
We present an O(log2 k)-approximation algorithm based on a natural linear programming relaxation. Our results reduce the taxonomy labeling problem to another problem we introduce, called the multicut packing problem (on trees): given k multicommodity flow instances, find a multicut for each instance so as to minimize the maximum number of multicuts that use any single edge. Our algorithm yields an O(log2 k)-approximation algorithm for this more general problem. We show that the integrality gap of our relaxation is Ω(logk), even when applied to the taxonomy labeling problem with 0-1 labels.
For the label extension problem, we considerably improve the previous O(log n) approximation guarantee and give the first constant-factor approximation algorithm for this problem. Our work relies on relating the label extension problem to questions on Lipschitz extensions of functions into Banach spaces. In particular, our approximation algorithm builds upon Matoušek's tree metrics extension theorem. Our algorithm also works for other metrics on the label-set, such as edit distance with unit-cost operations, and more generally any shortest path metric induced by an unweighted graph.

References

[1]
A. Ben-Dor, G. Lancia, J. Perone, and R. Ravi. Banishing bias from consensus sequences. In Proceedings of the 8th CPM, pages 247--261, 1997.
[2]
G. Calinescu, H. Karloff, and Y. Rabani. An improved approximation algorithm for multiway cut. Journal of Computer and System Sciences, 60(3):564--574, 2000.
[3]
A. Caprara, A. Panconesi, and R. Rizzi. Packing cuts in undirected graphs. Networks, 44(1):1--11, 2004.
[4]
G. Calinescu, H. Karloff, and Y. Rabani. Approximation algorithms for the 0-extension problem. SIAM Journal on Computing, 34(2):358--372, 2004.
[5]
R. Carr and S. Vempala. Randomized metarounding. Rand. Struc. and Algorithms, 20(3):343--352, 2002.
[6]
C. Chekuri, S. Khanna, J. Naor, and L. Zosin. A linear programming formulation and approximation Algorithms for the metric labeling problem. SIAM J. on Discrete Mathematics, 18(3):608--625, 2004.
[7]
S. Deerwester, S. Dumais, T. Landauer, G. Furnas, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6):391--407, 1990.
[8]
M. Frances and A. Litman. On covering problems of codes. Theory of Comput. Syst., 30:113--119, 1997.
[9]
N. Garg, V. Vazirani, and M. Yannakakis. Approximate max-flow min-(multi)cut theorems and their applications. SIAM Journal on Computing, 25(2):235--251, 1996.
[10]
W. Johnson and J. Lindenstrauss. Extensions of Lipschitz maps into a Hilbert space. Contemporary Mathematics, 26:189--206, 1984.
[11]
W. Johnson, J. Lindenstrauss, and G. Schechtman. Extensions of Lipschitz maps into Banach spaces. Israel Journal of Mathematics, 54:129--138, 1986.
[12]
A. Karzanov. Minimum 0-extensions of graph metrics. European J. of Combinatorics, 19(1):71--101, 1998.
[13]
J. Kleinberg and É. Tardos. Approximation algorithms for classification problems with pairwise relationships: metric labeling and Markov random fields. Journal of the ACM, 49(5):616--639, 2002.
[14]
J. Lee and A. Naor. Extending Lipschitz functions via random metric partitions. Inventiones Mathematicae, 160(1):59--95, 2005.
[15]
M. Li, B. Ma, and L. Wang. On the closest string and substring problems. Journal of the ACM, 49(2):157--171, 2002.
[16]
J. Matoušek. Extension of Lipschitz mappings on metric trees. Commentationes Mathematicae Universitatis Carolinae, 31(1):99--104, 1990.
[17]
C. Papadimitriou, P. Raghavan, H. Tamaki, and S. Vempala. Latent semantic indexing: a probabilistic analysis. Journal of Computer and System Sciences, 61(2):217--235, 2000.
[18]
R. Ravi and J. Kececioglu. Approximation algorithms for multiple sequence alignment under a fixed evolutionary tree. Disc. App. Mathematics, 88:355--366, 1998.
[19]
L. Wang and D. Gusfield. Improved approximation algorithms for tree alignment. Journal of Algorithms, 25(2):255--273, 1997.
[20]
L. Wang and T. Jiang. On the complexity of multiple sequence alignment. Journal of Computational Biology, 1(4):337--348, 1994.
[21]
L. Wang, T. Jiang, and D. Gusfield. A more efficient approximation scheme for tree alignment. SIAM Journal on Computing, 30(1):283--299, 2000.
[22]
L. Wang, T. Jiang, and E. Lawler. Approximation algorithms for tree alignment with a given phylogeny. Algorithmica, 16(3):302--315, 1996.

Cited By

View all
  • (2009)Packing multiway cuts in capacitated graphsProceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms10.5555/1496770.1496884(1048-1057)Online publication date: 4-Jan-2009

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
SODA '08: Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
January 2008
1289 pages
  • Program Chair:
  • Shang-Hua Teng

Sponsors

Publisher

Society for Industrial and Applied Mathematics

United States

Publication History

Published: 20 January 2008

Check for updates

Qualifiers

  • Research-article

Conference

SODA08
Sponsor:
SODA08: 19th ACM-SIAM Symposium on Discrete Algorithms
January 20 - 22, 2008
California, San Francisco

Acceptance Rates

Overall Acceptance Rate 411 of 1,322 submissions, 31%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 27 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2009)Packing multiway cuts in capacitated graphsProceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms10.5555/1496770.1496884(1048-1057)Online publication date: 4-Jan-2009

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media