ACM Home Page
Please provide us with feedback. Feedback
Indexing large metric spaces for similarity search queries
Full text PdfPdf (282 KB)
Source ACM Transactions on Database Systems (TODS) archive
Volume 24 ,  Issue 3  (September 1999) table of contents
Pages: 361 - 404  
Year of Publication: 1999
ISSN:0362-5915
Authors
Tolga Bozkaya  Oracle Corp., Redwood Shores, CA
Meral Ozsoyoglu  Case Western Reserve Univ., Cleveland, OH
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 18,   Downloads (12 Months): 113,   Citation Count: 23
Additional Information:

abstract   references   cited by   index terms   review   collaborative colleagues   peer to peer  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/328939.328959
What is a DOI?

ABSTRACT

One of the common queries in many database applications is finding approximate matches to a given query item from a collection of data items. For example, given an image database, one may want to retrieve all images that are similar to a given query image. Distance-based index structures are proposed for applications where the distance computations between objects of the data domain are expensive (such as high-dimensional data) and the distance function is metric. In this paper we consider using distance-based index structures for similarity queries on large metric spaces. We elaborate on the approach that uses reference points (vantage points) to partition the data space into spherical shell-like regions in a hierarchical manner. We introduce the multivantage point tree structure (mvp-tree) that uses more than one vantage point to partiton the space into spherical cuts at each level. In answering similarity-based queries, the mvp-tree also utilizes the precomputed (at construction time) distances between the data points and the vantage points. We summarize the experiments comparing mvp-trees to vp-trees that have a similar partitioning strategy, but use only one vantage point at each level and do not make use of the precomputed distances. Empirical studies show that the mvp-tree outperforms the vp-tree by 20% to 80% for varying query ranges and different distance distributions. Next, we generalize the idea of using multiple vantage points and discuss the results of experiments we have made to see how varying the number of vantage points in a node affects affects performance and how much is gained in performance by making use of precomputed distances. The results show that, after all, it may be best to use a large number of vantage points in an internal node in order to end up with a single directory node and keep as many of the precomputed distances as possible to provide more efficient filtering during search operations. Finally, we provide some experimental results that compare mvp-trees with M-trees, which is a dynamic distance-based index structure for metric domains.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
3
4
 
5
6
 
7
 
8
 
9
CIACCIA,P.AND PATELLA, M. 1998a. Bulk loading the M-tree. In Proceedings of the Australasian Conference on Databases (ADC, Perth, Australia)
 
10
 
11
12
 
13
14
15
 
16
 
17
OTTERMAN, M. 1992. Approximate matching with high dimensionality R-trees. Department of Computer Science, University of Maryland, College Park, MD.
18
 
19
 
20
21
 
22
UHLMANN, J. K. 1991. Satisfying general proximity/similarity queries with metric trees. Inf. Process. Lett. 40, 4 (Nov.), 175-179.
 
23
 
24

CITED BY  23
 
 
 
 
 
 
 
 
 
 
 
 


REVIEW

"Yannis P. Manolopoulos : Reviewer"

A method generalizing the existing vp-trees for indexing in metric spaces is presented. The superiority of the new method over vp-trees, and over another dynamic indexing method—M-trees—is examined experimentally. The p  more...

Collaborative Colleagues:
Tolga Bozkaya: colleagues
Meral Ozsoyoglu: colleagues

Peer to Peer - Readers of this Article have also read: