ABSTRACT
Information retrieval test collections are traditionally built by judging a pool of documents retrieved by a combination of automatic and manual runs. The quality of the resulting judgments depends on the diversity of the submitted runs and on the depth of the pool. In this work, we explore fully automated approaches to pool construction. By combining a simple voting scheme over the automatic runs with machine learning from the documents those runs retrieve, we identify a large portion of the relevant documents that would normally be found only through manual runs. These initial results are promising and can be extended in future studies to help test collection curators maintain adequate judgment coverage across complete document collections.
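To make the approach concrete, the following is a minimal sketch of a voting-plus-classifier pooling pipeline of the kind the abstract describes. It is an illustration under stated assumptions, not the paper's actual implementation: the function names (`vote_pool`, `extend_pool`), the TF-IDF features, and the vote threshold are hypothetical, and scikit-learn's `LinearSVC` (a wrapper around LIBLINEAR) stands in for whatever classifier the full paper uses.

```python
# Hypothetical sketch: pool documents by voting across automatic runs, then
# train a linear classifier on the judged pool to surface additional likely
# relevant documents that no automatic run ranked highly.
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC


def vote_pool(runs, depth=100, min_votes=3):
    """Pool documents that at least `min_votes` automatic runs rank in
    their top `depth` results. `runs` maps run_id -> ranked doc_id list."""
    votes = Counter()
    for ranking in runs.values():
        votes.update(ranking[:depth])
    return {doc for doc, n in votes.items() if n >= min_votes}


def extend_pool(judged, texts, candidates, top_n=200):
    """Train a linear classifier on judged pool documents and return the
    highest-scoring unjudged candidates as additions to the pool.

    judged:     dict doc_id -> 1 (relevant) / 0 (non-relevant)
    texts:      dict doc_id -> raw document text
    candidates: list of unjudged doc_ids retrieved by automatic runs
    """
    vec = TfidfVectorizer(stop_words="english")
    X_train = vec.fit_transform([texts[d] for d in judged])
    y_train = [judged[d] for d in judged]

    clf = LinearSVC()  # linear SVM, LIBLINEAR-style
    clf.fit(X_train, y_train)

    # Rank candidates by classifier score and keep the top candidates.
    scores = clf.decision_function(vec.transform([texts[d] for d in candidates]))
    ranked = sorted(zip(candidates, scores), key=lambda p: -p[1])
    return [doc for doc, _ in ranked[:top_n]]
```

In this sketch the voting step plays the role of the traditional depth-k pool over automatic runs, while the classifier step approximates the extra coverage that manual runs would otherwise contribute.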