Robustness of adaptive filtering methods in a cross-benchmark evaluation
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Salvador, Brazil
SESSION: Filtering
Pages: 98-105
Year of Publication: 2005
ISBN: 1-59593-034-5
Authors
Yiming Yang  Carnegie Mellon University, Pittsburgh, PA
Shinjae Yoo  Carnegie Mellon University, Pittsburgh, PA
Jian Zhang  Carnegie Mellon University, Pittsburgh, PA
Bryan Kisiel  Carnegie Mellon University, Pittsburgh, PA
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1076034.1076054

ABSTRACT

This paper reports a cross-benchmark evaluation of regularized logistic regression (LR) and incremental Rocchio for adaptive filtering. Using four corpora from the Topic Detection and Tracking (TDT) forum and the Text Retrieval Conferences (TREC), we evaluated these methods with non-stationary topics at various granularity levels and measured performance under different utility settings. We found that LR performs strongly and robustly in optimizing T11SU (a TREC utility function), while Rocchio is better for optimizing Ctrk (the TDT tracking cost), a high-recall-oriented objective function. Using systematic cross-corpus parameter optimization with both methods, we obtained the best results ever reported on TDT5, TREC10 and TREC11. Relevance feedback on a small portion (0.05% to 0.2%) of the TDT5 test documents yielded significant performance improvements, with up to a 54% reduction in Ctrk and a 20.9% increase in T11SU (with b=0.1), compared to the results of the top-performing system in TDT2004 without relevance feedback information.
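
The abstract names T11SU and Ctrk without defining them. As a rough illustration only, the Python sketch below computes both measures in the form commonly given for the TREC 2002 filtering track and the TDT tracking task. The function names, argument names, and default constants (`min_nu=-0.5`, `c_miss=1.0`, `c_fa=0.1`, `p_target=0.02`) are assumptions taken from those commonly cited definitions, not from this page. The abstract's `b` parameter is not defined here, so the sketch does not model it.

```python
def t11su(rel_retrieved, nonrel_retrieved, total_relevant, min_nu=-0.5):
    """Scaled linear utility (assumed TREC-2002-style definition).

    rel_retrieved    -- relevant documents the filter delivered
    nonrel_retrieved -- non-relevant documents the filter delivered
    total_relevant   -- all relevant documents for the topic
    """
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    raw = 2 * rel_retrieved - nonrel_retrieved   # +2 per hit, -1 per false alarm
    normalized = raw / (2 * total_relevant)      # divide by the best possible raw score
    # Clip at min_nu, then rescale so the result lies in [0, 1].
    return (max(normalized, min_nu) - min_nu) / (1 - min_nu)


def ctrk(misses, targets, false_alarms, non_targets,
         c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Tracking cost (assumed TDT-style definition, unnormalized)."""
    if targets <= 0 or non_targets <= 0:
        raise ValueError("targets and non_targets must be positive")
    p_miss = misses / targets
    p_fa = false_alarms / non_targets
    return c_miss * p_miss * p_target + c_fa * p_fa * (1 - p_target)
```

Under these assumed constants, a filter that delivers nothing scores `t11su(0, 0, n) = 1/3` for any positive `n`. With the default costs, each percentage point of miss rate adds about twice as much to the cost as a percentage point of false-alarm rate, and false-alarm rates on a mostly off-topic stream are typically small, which is consistent with the abstract's description of Ctrk as recall-oriented.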

