DOI: 10.1145/1772690.1772748
research-article

A characterization of online browsing behavior

Published: 26 April 2010

Abstract

In this paper, we undertake a large-scale study of online user behavior based on search and toolbar logs. We propose a new CCS taxonomy of pageviews consisting of Content (news, portals, games, verticals, multimedia), Communication (email, social networking, forums, blogs, chat), and Search (Web search, item search, multimedia search). We show that roughly half of all pageviews online are content, one-third are communications, and the remaining one-sixth are search. We then give further breakdowns to characterize the pageviews within each high-level category.
We then study the extent to which pages of certain types are revisited by the same user over time, and the mechanisms by which users move from page to page, within and across hosts, and within and across page types. We consider robust schemes for assigning responsibility for a pageview to ancestors along the chain of referrals. We show that mail, news, and social networking pageviews are insular in nature, appearing primarily in homogeneous sessions of one type. Search pageviews, on the other hand, appear on the path to a disproportionate number of pageviews, but cannot be viewed as the principal mechanism by which those pageviews were reached.
Finally, we study the burstiness of pageviews associated with a URL, and show that by and large, online browsing behavior is not significantly affected by "breaking" material with non-uniform visit frequency.
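
The abstract mentions assigning responsibility for a pageview to ancestors along the chain of referrals, but it does not spell out the schemes the paper uses. The following is only a minimal sketch of one plausible scheme, under stated assumptions: each pageview has at most one referrer, credit decays geometrically with distance from the pageview, and credit is aggregated by page type. The names `referral_chain`, `attribute_credit`, `referrer_of`, `page_type`, and the `decay` parameter are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def referral_chain(pageview_id, referrer_of):
    """Return the ancestors of a pageview, nearest referrer first.

    referrer_of maps a pageview id to the id of the pageview that
    referred to it (hypothetical data layout, not from the paper).
    """
    chain = []
    seen = {pageview_id}
    current = referrer_of.get(pageview_id)
    # Stop at the start of the trail, or if the log contains a cycle.
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = referrer_of.get(current)
    return chain


def attribute_credit(pageview_id, referrer_of, page_type, decay=0.5):
    """Split one unit of credit for a pageview among its ancestors' types.

    Assumed scheme: the ancestor d steps back gets weight decay**(d - 1),
    and weights are normalized so that they sum to 1.
    """
    chain = referral_chain(pageview_id, referrer_of)
    if not chain:
        return {}
    weights = [decay ** d for d in range(len(chain))]
    total = sum(weights)
    credit = defaultdict(float)
    for ancestor, weight in zip(chain, weights):
        credit[page_type[ancestor]] += weight / total
    return dict(credit)


# Toy trail: search -> content -> content -> communication
example_referrers = {"p4": "p3", "p3": "p2", "p2": "p1"}
example_types = {
    "p1": "search",
    "p2": "content",
    "p3": "content",
    "p4": "communication",
}
example_credit = attribute_credit("p4", example_referrers, example_types)
```

In this toy trail the search pageview lies on the path to `p4` but receives only a small share of the credit. That is in the spirit of the abstract's observation that search appears on many paths without being the principal mechanism by which pages are reached. The actual weighting used in the paper may differ.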

      Published In

      WWW '10: Proceedings of the 19th international conference on World wide web
      April 2010
      1407 pages
      ISBN:9781605587998
      DOI:10.1145/1772690

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. browsing
      2. pageviews
      3. toolbar analysis

      Qualifiers

      • Research-article

      Conference

      WWW '10
      WWW '10: The 19th International World Wide Web Conference
      April 26 - 30, 2010
      Raleigh, North Carolina, USA

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
