ABSTRACT
Prior research into search system scalability has primarily addressed query processing efficiency [1, 2, 3] or indexing efficiency [3], or has presented some arbitrary system architecture [4]. Little work has introduced any formal theoretical framework for evaluating architectures with regard to specific operational requirements, or for comparing architectures beyond simple timings [5] or basic simulations [6, 7]. In this paper, we present a framework based upon queuing network theory for analyzing search systems in terms of operational requirements. We use response time, throughput, and utilization as the key operational characteristics for evaluating performance. Within this framework, we present a scalability strategy that combines index partitioning and index replication to satisfy a given set of requirements.
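As a rough illustration of how response time, throughput, and utilization interact under partitioning and replication, the sketch below models each index server as a single M/M/1 queue. This is not the paper's model: the M/M/1 choice, the function names, and the idealized scaling rules (partitioning divides service time evenly, replication divides query load evenly, no merge or fork-join overhead) are all assumptions made only for this example.

```python
def utilization(arrival_rate: float, service_time: float) -> float:
    """Utilization law: U = X * S (throughput times mean service time)."""
    return arrival_rate * service_time


def mm1_response_time(arrival_rate: float, service_time: float) -> float:
    """Mean response time of an M/M/1 queue: R = S / (1 - U)."""
    u = utilization(arrival_rate, service_time)
    if u >= 1.0:
        # The server is saturated, so the queue grows without bound.
        return float("inf")
    return service_time / (1.0 - u)


def search_response_time(query_rate: float, base_service_time: float,
                         partitions: int = 1, replicas: int = 1) -> float:
    """Hypothetical helper: response time of one server in the scaled system.

    Assumption: splitting the index into `partitions` pieces divides the
    per-query service time evenly (merge overhead is ignored).
    Assumption: `replicas` identical copies share the query load evenly.
    """
    service_time = base_service_time / partitions
    per_server_rate = query_rate / replicas
    return mm1_response_time(per_server_rate, service_time)


if __name__ == "__main__":
    # 8 queries/s against a server that needs 0.1 s per query.
    for p, r in [(1, 1), (1, 2), (2, 1), (2, 2)]:
        rt = search_response_time(8.0, 0.1, partitions=p, replicas=r)
        print(f"partitions={p} replicas={r} response_time={rt:.4f}s")
```

Under these assumptions, replication lowers utilization by spreading queries across servers, while partitioning lowers both utilization and the service time itself. That is one way to see why the two might be combined to meet a response-time target at a given throughput.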
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
[5] A. Arvind, C. Junghoo, G. Hector, P. Andreas, R. Sriram, "Searching the Web", Stanford Technical Report, Dec. 2000, http://dbpubs.stanford.edu/pub/2000-37.
[6] B. Cahoon, K. McKinley, "Evaluating the Performance of Distributed Architectures for Information Retrieval using a Variety of Workloads", ACM Transactions on Information Systems, 1997.
[9] N. Goharian, T. El-Ghazawi, D. Grossman, A. Chowdhury, "On the Enhancements of a Sparse Matrix Information Retrieval Approach", PDPTA, Las Vegas, Nevada, 2000.
[10] H. Williams, J. Zobel, P. Anderson, "What's Next? Index Structures for Efficient Phrase Querying", Australasian Database Conference, 1999, pp. 141-152.
[11] D. Bahle, H. Williams, J. Zobel, "Compaction Techniques for Nextword Indexes", SPIRE, 2001, pp. 33-45.
[14] H. Williams, J. Zobel, "Compressing Integers for Fast File Access", The Computer Journal, 42(3), 1999, pp. 193-201.
[16] C. Stanfill, R. Thau, D. Waltz, "A parallel indexed algorithm for information retrieval", Proceedings of the 12th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 88-97, June 25-28, 1989, Cambridge, Massachusetts, United States.
[19] Peter B. Danzig, Jongsuk Ahn, John Noll, Katia Obraczka, "Distributed indexing: a scalable mechanism for distributed information retrieval", Proceedings of the 14th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 220-229, October 13-16, 1991, Chicago, Illinois, United States. doi:10.1145/122860.122883
[22] A. Moffat, J. Zobel, "Information Retrieval Systems for Large Document Collections", TREC, 1994.
[23] Luis Gravano, Héctor García-Molina, Anthony Tomasic, "The effectiveness of GlOSS for the text database discovery problem", Proceedings of the 1994 ACM SIGMOD international conference on Management of data, pp. 126-137, May 24-27, 1994, Minneapolis, Minnesota, United States.
[26] James P. Callan, Zhihong Lu, W. Bruce Croft, "Searching distributed collections with inference networks", Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval, pp. 21-28, July 09-13, 1995, Seattle, Washington, United States. doi:10.1145/215206.215328
[27] T. R. Couvreur, R. N. Benzel, S. F. Miller, D. N. Zeitler, D. L. Lee, M. Singhal, N. Shivaratri, W. Y. P. Wong, "An analysis of performance and cost factors in searching large text databases using parallel search systems", Journal of the American Society for Information Science, 45(7), pp. 443-464, Aug. 1994. doi:10.1002/(SICI)1097-4571(199408)45:7<443::AID-ASI1>3.0.CO;2-O
[31] Peter Bailey, David Hawking, "A Parallel Architecture for Query Processing Over a Terabyte of Text", Department of Computer Science, Technical Report TR-CS-96-04, June 1996.
[33] B. Cahoon, K. McKinley, "Evaluating the Performance of Distributed Architectures for Information Retrieval using a Variety of Workloads", ACM Transactions on Information Systems, 1997.
[39] R. Jain, "The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling", Wiley-Interscience, New York, NY, April 1991.
CITED BY 7
Steven M. Beitzel, Eric C. Jensen, Abdur Chowdhury, David Grossman, Ophir Frieder, "Hourly analysis of a very large topically categorized web query log", Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, July 25-29, 2004, Sheffield, United Kingdom.
Steven M. Beitzel, Eric C. Jensen, Ophir Frieder, Abdur Chowdhury, Greg Pass, "Surrogate scoring for improved metasearch precision", Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil.
Claudine Badue, Ricardo Baeza-Yates, Berthier Ribeiro-Neto, Artur Ziviani, Nivio Ziviani, "Modeling performance-driven workload characterization of web search systems", Proceedings of the 15th ACM international conference on Information and knowledge management, November 06-11, 2006, Arlington, Virginia, USA.
C. S. Badue, R. Baeza-Yates, B. Ribeiro-Neto, A. Ziviani, N. Ziviani, "Analyzing imbalance among homogeneous index servers in a web search system", Information Processing and Management: an International Journal, 43(3), pp. 592-608, May 2007.