DOI: 10.1145/1076034.1076175
Article

Scalable hierarchical topic detection: exploring a sample based approach

Published: 15 August 2005

Abstract

Hierarchical topic detection (HTD) is a new task in the TDT 2004 evaluation program. It aims to organize an unstructured news collection into a directed acyclic graph (DAG) that reflects the topics discussed. We present a scalable architecture for HTD and compare several alternative choices for agglomerative clustering and DAG optimization in order to minimize the HTD cost metric.

Published In

SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
August 2005
708 pages
ISBN: 1595930345
DOI: 10.1145/1076034

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. TDT
  2. hierarchical topic detection
  3. information retrieval

Qualifiers

  • Article

Conference

SIGIR05

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

  • (2023) Topic Detection and Tracking in Social Media Platforms. Pervasive Knowledge and Collective Intelligence on Web and Social Media, pages 41-56. DOI: 10.1007/978-3-031-31469-8_3. Online publication date: 28-Apr-2023.
  • (2021) Multimodal Topic Detection in Social Networks with Graph Fusion. Web Information Systems and Applications, pages 28-38. DOI: 10.1007/978-3-030-87571-8_3. Online publication date: 17-Sep-2021.
  • (2017) Exploring open information via event network. Natural Language Engineering, 24(2), pages 199-220. DOI: 10.1017/S1351324917000390. Online publication date: 26-Oct-2017.
  • (2017) Understanding-Oriented Multimedia News Summarization. Understanding-Oriented Multimedia Content Analysis, pages 131-153. DOI: 10.1007/978-981-10-3689-7_6. Online publication date: 27-May-2017.
  • (2016) Multimedia News Summarization in Search. ACM Transactions on Intelligent Systems and Technology, 7(3), pages 1-20. DOI: 10.1145/2822907. Online publication date: 1-Feb-2016.
  • (2015) o-HETM: An Online Hierarchical Entity Topic Model for News Streams. Advances in Knowledge Discovery and Data Mining, pages 696-707. DOI: 10.1007/978-3-319-18038-0_54. Online publication date: 17-Apr-2015.
  • (2014) Hierarchy Topic Detection and Hot Topic Identification. Applied Mechanics and Materials, 701-702, pages 180-186. DOI: 10.4028/www.scientific.net/AMM.701-702.180. Online publication date: Dec-2014.
  • (2012) Fine-grained topic detection in news search results. Proceedings of the 27th Annual ACM Symposium on Applied Computing, pages 912-917. DOI: 10.1145/2245276.2245454. Online publication date: 26-Mar-2012.
  • (2012) Microblog Topic Detection Based on LDA Model and Single-Pass Clustering. Rough Sets and Current Trends in Computing, pages 166-171. DOI: 10.1007/978-3-642-32115-3_19. Online publication date: 2012.
  • (2011) Implementing a News Browsing Support System based on Interactive Event Tracking. Transactions of the Japanese Society for Artificial Intelligence, 26, pages 228-236. DOI: 10.1527/tjsai.26.228. Online publication date: 2011.
