DOI: 10.5555/1351542.1351950
research-article

Using Monte-Carlo simulation for automatic new topic identification of search engine transaction logs

Published: 09 December 2007

Abstract

Content-based behavior is one of the most important dimensions of search engine users' information-seeking behavior and of search engine research, yet limited research has focused on it. This study presents a simulation application in information science: automatic new topic identification in search engine transaction logs using Monte Carlo simulation. Sample data logs from FAST and Excite are used in the study. Findings show that Monte Carlo simulation yields satisfactory results in identifying topic continuations; however, the performance measures for topic shifts should be improved.


Cited By

  • (2009) A survey on session detection methods in query logs and a proposal for future evaluation. Information Sciences: an International Journal 179:12 (1822-1843). DOI: 10.1016/j.ins.2009.01.026. Online publication date: 1-May-2009

Published In

WSC '07: Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come
December 2007
2659 pages
ISBN: 1424413060

Sponsors

  • IIE: Institute of Industrial Engineers
  • INFORMS-SIM: Institute for Operations Research and the Management Sciences: Simulation Society
  • ASA: American Statistical Association
  • IEEE/SMC: Institute of Electrical and Electronics Engineers: Systems, Man, and Cybernetics Society
  • SIGSIM: ACM Special Interest Group on Simulation and Modeling
  • NIST: National Institute of Standards and Technology
  • (SCS): The Society for Modeling and Simulation International

Publisher

IEEE Press

Qualifiers

  • Research-article

Conference

WSC07: Winter Simulation Conference
December 9 - 12, 2007
Washington D.C.

Acceptance Rates

WSC '07 Paper Acceptance Rate: 152 of 244 submissions, 62%
Overall Acceptance Rate: 3,413 of 5,075 submissions, 67%
