Robustness of adaptive filtering methods in a cross-benchmark evaluation
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Salvador, Brazil
SESSION: Filtering
Pages: 98 - 105
Year of Publication: 2005
ISBN: 1-59593-034-5
ABSTRACT
This paper reports a cross-benchmark evaluation of regularized logistic regression (LR) and incremental Rocchio for adaptive filtering. Using four corpora from the Topic Detection and Tracking (TDT) forum and the Text Retrieval Conferences (TREC), we evaluated these methods with non-stationary topics at various granularity levels, and measured performance with different utility settings. We found that LR performs strongly and robustly in optimizing T11SU (a TREC utility function), while Rocchio is better for optimizing Ctrk (the TDT tracking cost), a high-recall oriented objective function. Using systematic cross-corpus parameter optimization with both methods, we obtained the best results ever reported on TDT5, TREC10 and TREC11. Relevance feedback on a small portion (0.05% to 0.2%) of the TDT5 test documents yielded significant performance improvements: up to a 54% reduction in Ctrk and a 20.9% increase in T11SU (with b=0.1), compared to the results of the top-performing system in TDT2004, which used no relevance feedback information.
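The abstract names T11SU without defining it. As a rough illustration only, the sketch below assumes the scaled linear utility commonly given for the TREC-11 filtering track: a raw utility of 2 per relevant document delivered minus 1 per non-relevant document delivered, normalized by the maximum achievable utility, floored at -0.5, and rescaled to the range 0 to 1. The function name `t11su` and its parameters are hypothetical, and the `b` setting mentioned in the abstract is not modeled because the abstract does not define it.

```python
def t11su(relevant_delivered, nonrelevant_delivered, total_relevant, min_nu=-0.5):
    """Scaled utility for one topic, under the assumed TREC-11 style definition.

    relevant_delivered    -- relevant documents the filter delivered
    nonrelevant_delivered -- non-relevant documents the filter delivered
    total_relevant        -- all relevant documents for the topic (must be > 0)
    min_nu                -- assumed lower bound on the normalized utility
    """
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    raw_utility = 2 * relevant_delivered - nonrelevant_delivered
    max_utility = 2 * total_relevant
    normalized = raw_utility / max_utility
    # Clip very poor topics at min_nu, then rescale so the result lies in [0, 1].
    return (max(normalized, min_nu) - min_nu) / (1 - min_nu)


if __name__ == "__main__":
    # A filter that delivers nothing, a perfect filter, and a very noisy filter.
    print(t11su(0, 0, 10))
    print(t11su(10, 0, 10))
    print(t11su(0, 100, 10))
```

Under this assumed definition a system that delivers nothing scores above zero, so a filter has to deliver more useful than harmful documents before it beats silence.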
CITED BY 4
Yiming Yang, Abhimanyu Lad, Ni Lao, Abhay Harpale, Bryan Kisiel, Monica Rogati, Utility-based information distillation over temporally sequenced documents, Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, July 23-27, 2007, Amsterdam, The Netherlands
Daqing He, Peter Brusilovsky, Jaewook Ahn, Jonathan Grady, Rosta Farzan, Yefei Peng, Yiming Yang, Monica Rogati, An evaluation of adaptive filtering in the context of realistic task-based information exploration, Information Processing and Management: an International Journal, v.44 n.2, p.511-533, March, 2008