ABSTRACT
Maintaining strict static score order of inverted lists is a heuristic used by search engines to improve the quality of query results when the entire inverted lists cannot be processed. This heuristic, however, increases the cost of index generation and requires complex index build algorithms. In this paper, we study a new index organization based on static score bucketing. We show that this new technique significantly improves in index build performance while having minimal impact on the quality of search results.
- S. Brin and L. Page. The anatomy of a large-scale hypertextual (web) search engine. Computer Networks and ISDN Systems, 30(1-7):107--117, 1998. Google ScholarDigital Library
- R. Fagin, R. Kumar, and D. Sivakumar. Comparing top k lists. SIAM Journal on Discrete Mathematics, 17(1):134--160, 2003. Google ScholarDigital Library
- M. Fontoura, A. Neumann, S. Rajagopalan, E. Shekita, and J. Zien. High performance index build algorithms for intranet search engines. In VLDB' 2004. Google ScholarDigital Library
- T. Haveliwala. Efficient encoding for document ranking vectors. In Proc. of 4th Int. Conference on Internet Computing, 2003.Google Scholar
- N. Lester, J. Zobel, and H. E. Williams. In-place versus re-build versus re-merge: index maintenance strategies for text retrieval systems. In CRPIT '2004. Google ScholarDigital Library
- X. Long and T. Suel. Optimized query execution in large search engines with global page ordering. In Proc. of the 29th Int. Conf. on Very Large Databases, 2003. Google ScholarDigital Library
Index Terms
- Static score bucketing in inverted indexes
Recommendations
Inverted index maintenance strategy for flashSSDs
An inverted index is a core data structure of Information Retrieval systems, especially in search engines. Since the search environments have become more dynamic, many on-line index maintenance strategies have been proposed. Previous strategies were ...
Inverted indexes vs. bitmap indexes in decision support systems
CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge managementBitmap indexes are widely used in Decision Support Systems (DSSs) to improve query performance. In this paper, we evaluate the use of compressed inverted indexes with adapted query processing strategies from Information Retrieval as an alternative. In a ...
Improving Web search efficiency via a locality based static pruning method
WWW '05: Proceedings of the 14th international conference on World Wide WebThe unarguably fast, and continuous, growth of the volume of indexed (and indexable) documents on the Web poses a great challenge for search engines. This is true regarding not only search effectiveness but also time and space efficiency. In this paper ...
Comments