User performance versus precision measures for simple search tasks
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Seattle, Washington, USA
SESSION: User behavior and modeling
Pages: 11-18
Year of Publication: 2006
ISBN: 1-59593-369-7
ABSTRACT
Several recent studies have demonstrated that the type of improvements in information retrieval system effectiveness reported in forums such as SIGIR and TREC do not translate into a benefit for users. Two of the studies used an instance recall task, and a third used a question answering task, so perhaps it is unsurprising that the precision-based measures of IR system effectiveness on one-shot query evaluation do not correlate with user performance on these tasks. In this study, we evaluate two different information retrieval tasks on TREC Web-track data: a precision-based user task, measured by the length of time that users need to find a single document that is relevant to a TREC topic; and a simple recall-based task, represented by the total number of relevant documents that users can identify within five minutes. Users employ search engines with controlled mean average precision (MAP) of between 55% and 95%. Our results show that there is no significant relationship between system effectiveness measured by MAP and the precision-based task. A significant but weak relationship is present for the precision-at-one-document-returned metric. A weak relationship is present between MAP and the simple recall-based task.
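The abstract names two system-effectiveness measures, MAP and precision at one document returned, without defining them. As a reading aid, the sketch below follows the standard textbook definitions of precision at rank k, average precision, and mean average precision. The function names and the toy relevance lists are hypothetical and are not taken from the paper; how the study itself computed these measures is not described on this page.

```python
from typing import Sequence, Tuple


def precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant.

    relevance[i] is True when the document at rank i + 1 is relevant.
    Ranks beyond the end of the list count as non-relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevance[:k]) / k


def average_precision(relevance: Sequence[bool], num_relevant: int) -> float:
    """Sum of precision at each rank holding a relevant document,
    divided by the total number of relevant documents for the topic."""
    if num_relevant <= 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / num_relevant


def mean_average_precision(topics: Sequence[Tuple[Sequence[bool], int]]) -> float:
    """Mean of average precision over topics.

    Each topic is a (ranked relevance list, number of relevant documents) pair.
    """
    if not topics:
        return 0.0
    return sum(average_precision(r, n) for r, n in topics) / len(topics)


# Toy data (assumed for illustration): two topics, two relevant documents each.
topic_a = ([True, False, True, False], 2)
topic_b = ([False, True, True, False], 2)

p1_a = precision_at_k(topic_a[0], 1)
p1_b = precision_at_k(topic_b[0], 1)
map_score = mean_average_precision([topic_a, topic_b])
```

In this toy data the two rankings contain the same number of relevant documents in the top four, yet they differ sharply in precision at one: the first ranking places a relevant document at rank 1 and the second does not. That difference is what a precision-at-one measure captures and what MAP averages over with other ranks.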