Research article. DOI: 10.1145/2637002.2637010

A user defined taxonomy of factors that divide online information retrieval sessions

Published: 26 August 2014

Abstract

Although research is increasingly interested in session-based retrieval, comparatively little work has focused on how best to divide web histories into sessions. Most automated attempts have divided web logs using simplistic rules, such as user identifiers and specific time gaps. This research instead aims to understand the full range of factors that affect the division of sessions, so that we can begin to go beyond current naive techniques such as fixed periods of inactivity. To investigate these factors, 10,000 log items were manually divided by their owners into 847 naturally occurring web sessions. During interviews, participants reviewed their own web histories to identify these sessions and described the causes of the divisions between them. This paper contributes a taxonomy of six factors that can be used to better model the divisions between sessions, along with initial insights into how the divided sessions manifested in web logs. The factors in our taxonomy provide focus for future work, including our own, on finding practical ways to divide and identify sessions more intelligently for improved session-based retrieval.
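
For readers unfamiliar with the naive baseline the abstract refers to, the rule of splitting on user identifier plus a fixed period of inactivity can be sketched as below. This is only an illustration of that baseline, not the authors' method or their taxonomy. The `LogItem` fields, the `split_sessions` name, the sample log, and the 30-minute threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import groupby
from operator import attrgetter


@dataclass(frozen=True)
class LogItem:
    """One web history entry (hypothetical schema for illustration)."""
    user_id: str
    timestamp: datetime
    url: str


def split_sessions(items, max_gap=timedelta(minutes=30)):
    """Split log items into sessions per user, starting a new session
    whenever the gap between consecutive items exceeds max_gap.

    The 30-minute default is an assumed value, not one taken from the paper.
    """
    # Sort so that each user's items are contiguous and in time order.
    ordered = sorted(items, key=attrgetter("user_id", "timestamp"))
    sessions = []
    for _, user_items in groupby(ordered, key=attrgetter("user_id")):
        current = []
        for item in user_items:
            # A gap longer than max_gap closes the current session.
            if current and item.timestamp - current[-1].timestamp > max_gap:
                sessions.append(current)
                current = []
            current.append(item)
        if current:
            sessions.append(current)
    return sessions


# Made-up sample log: two users, one long pause for user u1.
log = [
    LogItem("u1", datetime(2014, 8, 26, 9, 0), "https://example.org/a"),
    LogItem("u1", datetime(2014, 8, 26, 9, 10), "https://example.org/b"),
    LogItem("u1", datetime(2014, 8, 26, 10, 0), "https://example.org/c"),
    LogItem("u2", datetime(2014, 8, 26, 9, 5), "https://example.org/d"),
]
sessions = split_sessions(log)
session_sizes = [len(s) for s in sessions]
```

A rule like this looks only at timestamps and user identifiers, so it cannot tell a long pause within one task from a genuine change of activity. That limitation is what motivates a taxonomy built from the reasons users themselves give.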


Published In

IIiX '14: Proceedings of the 5th Information Interaction in Context Symposium
August 2014
368 pages
ISBN: 9781450329767
DOI: 10.1145/2637002

Sponsors

University of Regensburg

Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. log analysis
2. qualitative
3. sessions
4. web history

Acceptance Rates

IIiX '14 paper acceptance rate: 21 of 45 submissions (47%)
