User performance versus precision measures for simple search tasks
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Seattle, Washington, USA
Session: User behavior and modeling
Pages: 11 - 18  
Year of Publication: 2006
ISBN:1-59593-369-7
Authors
Andrew Turpin  RMIT University, Melbourne, Australia
Falk Scholer  RMIT University, Melbourne, Australia
Sponsors
SIGIR: ACM Special Interest Group on Information Retrieval
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1148170.1148176

ABSTRACT

Several recent studies have demonstrated that the type of improvements in information retrieval system effectiveness reported in forums such as SIGIR and TREC do not translate into a benefit for users. Two of the studies used an instance recall task, and a third used a question answering task, so perhaps it is unsurprising that the precision-based measures of IR system effectiveness on one-shot query evaluation do not correlate with user performance on these tasks. In this study, we evaluate two different information retrieval tasks on TREC Web-track data: a precision-based user task, measured by the length of time that users need to find a single document that is relevant to a TREC topic; and a simple recall-based task, represented by the total number of relevant documents that users can identify within five minutes. Users employ search engines with controlled mean average precision (MAP) of between 55% and 95%. Our results show that there is no significant relationship between system effectiveness measured by MAP and the precision-based task. A significant but weak relationship is present for the precision-at-one-document-returned metric. A weak relationship is present between MAP and the simple recall-based task.
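
For readers unfamiliar with the two system measures named in the abstract, the following is a minimal sketch of how mean average precision (MAP) and precision at one document returned are commonly computed from binary relevance judgments. It follows the standard textbook definitions and is not taken from the paper; the function names and the sample rankings are hypothetical, chosen only for illustration.

```python
from typing import Sequence, Tuple


def precision_at_k(rels: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked results that are relevant.

    Ranks beyond the end of the list count as non-relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(rels[:k]) / k


def average_precision(rels: Sequence[bool], num_relevant: int) -> float:
    """Average of precision values at each rank holding a relevant result.

    num_relevant is the total number of relevant documents for the topic,
    so relevant documents that were never retrieved lower the score.
    """
    if num_relevant <= 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / num_relevant


def mean_average_precision(runs: Sequence[Tuple[Sequence[bool], int]]) -> float:
    """Mean of average precision over a set of topics."""
    if not runs:
        raise ValueError("at least one topic is required")
    return sum(average_precision(r, n) for r, n in runs) / len(runs)


# Hypothetical rankings for two topics (True = relevant result).
topic_a = ([True, False, True, False], 2)
topic_b = ([False, True], 1)

p_at_1 = precision_at_k(topic_a[0], 1)
map_score = mean_average_precision([topic_a, topic_b])
```

In this sketch, precision at one depends only on the top-ranked result, whereas MAP aggregates over every rank at which a relevant document appears, which is one reason the two measures can relate differently to user performance.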


