APPENDICES and SUPPLEMENTS
|
|
The software suite accompanying the article; this is a small Unix tar file, which includes the source code, a Makefile, and the test files used in the article.
|
ABSTRACT
We develop data structures for dynamic closest pair problems with arbitrary distance functions, that do not necessarily come from any geometric structure on the objects. Based on a technique previously used by the author for Euclidean closest pairs, we show how to insert and delete objects from an <i>n</i>-object set, maintaining the closest pair, in <i>O</i>(<i>n</i> log<sup>2</sup> <i>n</i>) time per update and <i>O</i>(<i>n</i>) space. With quadratic space, we can instead use a quadtree-like structure to achieve an optimal time bound, <i>O</i>(<i>n</i>) per update. We apply these data structures to hierarchical clustering, greedy matching, and TSP heuristics, and discuss other potential applications in machine learning, Gröbner bases, and local improvement algorithms for partition and placement problems. Experiments show our new methods to be faster in practice than previously used heuristics.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
ADAMS, W. W. AND LOUSTAUNAU, P. 1994. <i>An Introduction to Gröbner Bases</i>. Number 3 in Graduate Studes in Mathematics. AMS.
|
| |
2
|
|
| |
3
|
AICHHOLZER, O. 1997. <i>Combinatorial & Computational Properties of the Hypercube - New Results on Covering, Slicing, Clustering and Searching on the Hypercube</i>. Ph. D. thesis, Tech. Univ. Graz.
|
| |
4
|
ANDERBERG, M. R. 1973. <i>Cluster Analysis for Applications</i>. Number 19 in Probability and Mathematical Statistics. Academic Press, New York.
|
| |
5
|
BALL, G. H. AND HALL, D.J. 1965. ISODATA, a novel method of data analysis and pattern classification. Technical report, Stanford Research Inst.
|
| |
6
|
|
| |
7
|
|
| |
8
|
CHENG, X. AND WALLACE, J. M. 1993. Cluster analysis of the northern hemisphere wintertime 500-hPa height field: spatial patterns. <i>J. Atmospheric Sciences 50</i>, 16 (August), 2674-2696.
|
 |
9
|
Matthew Clegg , Jeffery Edmonds , Russell Impagliazzo, Using the Groebner basis algorithm to find proofs of unsatisfiability, Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, p.174-183, May 22-24, 1996, Philadelphia, Pennsylvania, United States
[doi> 10.1145/237814.237860]
|
| |
10
|
CORPET, F. 1988. Multiple sequence alignment with hierarchical clustering. <i>Nucleic Acids Research 16</i>, 22, 10881-10890.
|
 |
11
|
|
 |
12
|
|
| |
13
|
DURAN, B. S. AND ODELL, P.L. 1974. <i>Cluster Analysis: A Survey</i>. Number 100 in Lecture Notes in Economics and Mathematical Systems. Springer-Verlag, Berlin.
|
| |
14
|
EPPSTEIN, D. 1995. Dynamic Euclidean minimum spanning trees and extrema of binary functions. <i>Discrete & Computational Geometry 13</i>, 1 (January), 111-122.
|
| |
15
|
EPPSTEIN, D. AND ERICKSON, J. 1999. Raising roofs, crashing cycles, and playing pool: applications of a data structure for finding pairwise interactions. <i>Discrete & Computational Geometry 22</i>, 4, 569-592.
|
| |
16
|
FELSENSTEIN, J. 1995. PHYLIP (Phylogeny Inference Package) version 3.572c. Distributed by the author.
|
| |
17
|
|
 |
18
|
Alessandro Giovini , Teo Mora , Gianfranco Niesi , Lorenzo Robbiano , Carlo Traverso, “One sugar cube, please” or selection strategies in the Buchberger algorithm, Proceedings of the 1991 international symposium on Symbolic and algebraic computation, p.49-54, July 15-17, 1991, Bonn, West Germany
[doi> 10.1145/120694.120701]
|
| |
19
|
|
| |
20
|
GOTOH, O. 1994. Further improvements in methods of group-to-group sequence alignment with generalized profile operations. <i>CABIOS 10</i>, 4, 379-387.
|
| |
21
|
|
| |
22
|
KRZNARIC, D. AND LEVCOPOULOS, C. 1998. Fast algorithms for complete linkage clustering. <i>Discrete & Computational Geometry 19</i>, 1 (January), 131-145.
|
| |
23
|
MATIAS, Y. 1993. Semi-dynamic closest-pair algorithms. In <i>Proc. 5th Canad. Conf. Computational Geometry</i> (August 1993), pp. 264-271.
|
| |
24
|
MICHENER, C. D. AND SOKAL, R.R. 1957. A quantitative approach to a problem in classification. <i>Evolution 11</i>, 130-162.
|
| |
25
|
PAZZANI, M.J. 1997. Constructive induction of Cartesian product attributes. Manuscript.
|
| |
26
|
REINGOLD, E. M. AND TARJAN, R.E. 1981. On a greedy heuristic for complete matching. <i>SIAM J. Computing 10</i>, 4 (November), 676-681.
|
| |
27
|
ROSENKRANTZ, D. H., STEARNS, R. E., AND LEWIS, P. M., II. 1977. An analysis of several heuristics for the traveling salesman problem. <i>SIAM J. Computing 6</i>, 3 (September), 563- 581.
|
| |
28
|
SCHWARZ, C., SMID, M., AND SNOEYINK, J. 1994. An optimal algorithm for the on-line closest pair problem. <i>Algorithmica 12</i>, 1 (July), 18-29.
|
| |
29
|
|
| |
30
|
STURMFELS, B. 1996. Gröbner Bases and Convex Polytopes. Number 8 in University Lecture Ser. AMS.
|
| |
31
|
|
| |
32
|
THOMPSON, J. D., HIGGINS, D. G., AND GIBSON, T. J. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. <i>Nucleic Acids Research 22</i>, 4673- 4680.
|
| |
33
|
WARD, J. 1963. Hierarchical grouping to optimize an objective function. <i>J. Amer. Statistical Assoc. 58</i>, 301, 236-244.
|
| |
34
|
|
| |
35
|
ZUPAN, J. 1982. <i>Clustering of Large Data Sets</i>. Chemometrics Research Studies Ser. Research Studies Press, Chichester.
|
Peer to Peer - Readers of this Article have also read:
-
Data structures for quadtree approximation and compression
Communications of the ACM
28, 9
Hanan Samet
-
A hierarchical single-key-lock access control using the Chinese remainder theorem
Proceedings of the 1992 ACM/SIGAPP Symposium on Applied computing
Kim S. Lee
, Huizhu Lu
, D. D. Fisher
-
The GemStone object database management system
Communications of the ACM
34, 10
Paul Butterworth
, Allen Otis
, Jacob Stein
-
An intelligent component database for behavioral synthesis
Proceedings of the 27th ACM/IEEE conference on Design automation
Gwo-Dong Chen
, Daniel D. Gajski
-
Putting innovation to work: adoption strategies for multimedia communication systems
Communications of the ACM
34, 12
Ellen Francik
, Susan Ehrlich Rudman
, Donna Cooper
, Stephen Levine
|