DOI: 10.1145/1390334.1390423
research-article

Real-time automatic tag recommendation

Published: 20 July 2008

Abstract

Tags are user-generated labels for entities. Existing research on tag recommendation focuses either on improving accuracy or on automating the process, while ignoring efficiency. We propose a novel, highly automated framework for real-time tag recommendation. The tagged training documents are treated as triplets of (words, docs, tags) and represented in two bipartite graphs, which are partitioned into clusters by Spectral Recursive Embedding (SRE). Tags in each topical cluster are ranked by our novel ranking algorithm. A two-way Poisson Mixture Model (PMM) is proposed to model the document distribution as mixture components within each cluster and, simultaneously, to aggregate words into word clusters. A new document is classified by the mixture model based on its posterior probabilities, and tags are then recommended according to their ranks. Experiments on large-scale tagging datasets of scientific documents (CiteULike) and web pages (del.icio.us) indicate that our framework makes tag recommendations efficiently and effectively. The average tagging time for a test document is around 1 second, with over 88% of test documents correctly labeled with the top nine tags we suggested.
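The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain (one-way) Poisson mixture with one component per topical cluster, skips SRE partitioning and word clustering, and uses made-up priors, rates, and tag rankings. All names (`PRIORS`, `RATES`, `RANKED_TAGS`, `recommend_tags`) are hypothetical.

```python
import math

# Hypothetical toy model: two components over a three-word vocabulary.
PRIORS = [0.5, 0.5]                      # mixing weights (assumed)
RATES = [[5.0, 1.0, 0.5],                # Poisson rate per word, component 0
         [0.5, 1.0, 4.0]]                # Poisson rate per word, component 1
RANKED_TAGS = [["clustering", "graphs", "spectral"],
               ["statistics", "mixture-models", "em"]]  # pre-ranked tags (assumed)


def poisson_log_pmf(count, rate):
    """Log of the Poisson probability mass function (rate must be > 0)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)


def posteriors(doc_counts, priors, rates):
    """Posterior probability of each mixture component given word counts."""
    log_joint = [
        math.log(prior)
        + sum(poisson_log_pmf(c, r) for c, r in zip(doc_counts, comp_rates))
        for prior, comp_rates in zip(priors, rates)
    ]
    # Normalize with the log-sum-exp trick to avoid underflow.
    peak = max(log_joint)
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


def recommend_tags(doc_counts, priors, rates, ranked_tags, k=3):
    """Return the top-k ranked tags of the most probable component."""
    post = posteriors(doc_counts, priors, rates)
    best = max(range(len(post)), key=post.__getitem__)
    return ranked_tags[best][:k]
```

Calling `recommend_tags([4, 1, 0], PRIORS, RATES, RANKED_TAGS, k=2)` picks the component whose rates best explain the word counts and returns the first two tags of its ranked list. Because the model is fixed at query time, recommendation reduces to one posterior evaluation per component, which is consistent with the efficiency emphasis of the abstract.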


Published In

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
July 2008
934 pages
ISBN:9781605581644
DOI:10.1145/1390334
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph partitioning
  2. mixture model
  3. tagging system

Qualifiers

  • Research-article

Conference

SIGIR '08
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 22
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 06 Jan 2025


Cited By

  • (2024) Mapping APIs in Dynamic-typed Programs by Leveraging Transfer Learning. ACM Transactions on Software Engineering and Methodology, 33:4 (1-29). DOI: 10.1145/3641848. Online publication date: 18-Apr-2024
  • (2024) GPT4Rec: Graph Prompt Tuning for Streaming Recommendation. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (1774-1784). DOI: 10.1145/3626772.3657720. Online publication date: 10-Jul-2024
  • (2023) Dynamic Embedding Size Search with Minimum Regret for Streaming Recommender System. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (741-750). DOI: 10.1145/3583780.3615135. Online publication date: 21-Oct-2023
  • (2023) Dynamically Expandable Graph Convolution for Streaming Recommendation. Proceedings of the ACM Web Conference 2023 (1457-1467). DOI: 10.1145/3543507.3583237. Online publication date: 30-Apr-2023
  • (2023) Textual tag recommendation with multi-tag topical attention. Neurocomputing, 537 (73-84). DOI: 10.1016/j.neucom.2023.03.051. Online publication date: Jun-2023
  • (2021) Learning natural ordering of tags in domain-specific Q&A sites (特定领域问答网站中的标签自然顺序研究). Frontiers of Information Technology & Electronic Engineering, 22:2 (170-184). DOI: 10.1631/FITEE.1900645. Online publication date: 19-Jan-2021
  • (2021) Personalized Social Image Tag Recommendation Algorithm Based on Tensor Decomposition. 2021 2nd International Conference on Smart Electronics and Communication (ICOSEC) (1025-1028). DOI: 10.1109/ICOSEC51865.2021.9591909. Online publication date: 7-Oct-2021
  • (2020) Recommender Systems Using Collaborative Tagging. International Journal of Data Warehousing and Mining, 16:3 (183-200). DOI: 10.4018/IJDWM.2020070110. Online publication date: 1-Jul-2020
  • (2020) Optimal Learning Behavior Prediction System Based on Cognitive Style Using Adaptive Optimization-Based Neural Network. Complexity, 2020. DOI: 10.1155/2020/6097167. Online publication date: 1-Jan-2020
  • (2020) Automatic image annotation via category labels. Multimedia Tools and Applications. DOI: 10.1007/s11042-019-07929-y. Online publication date: 6-Jan-2020
