short-paper

Quality models for microblog retrieval

Authors:
Jaeho Choi, NHN Corporation, Seongnam, South Korea
W. Bruce Croft, University of Massachusetts Amherst, Amherst, MA, USA
Jin Young Kim, University of Massachusetts Amherst, Amherst, MA, USA

CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management, October 2012, Pages 1834–1838. https://doi.org/10.1145/2396761.2398527

Published: 29 October 2012

ABSTRACT

Microblog services typically contain very short documents (e.g., tweets) that comment on the latest news and events. Many of these documents are not informative or have very little content due to their personal and ephemeral nature. Providing effective retrieval in a microblog service will require addressing the challenge of distinguishing the high-quality, informative documents from the others. Recent work has focused on finding features that indicate the quality of microblog documents, but the impact of these quality features on retrieval is not clear. In this paper, we suggest a low-cost quality model using surrogate judgments based on user behavior (i.e., retweets) that can be collected automatically. We analyze the relationship between document informativeness and relevance judgments for microblog retrieval. Then we demonstrate that our behavior-based quality metric has a high correlation with manual judgments. We also perform experiments to study the impact of the quality model on microblog retrieval. The results based on the TREC Microblog track show that the proposed quality model, combined with a variety of retrieval models, can improve retrieval performance and is competitive with a model trained using manual relevance judgments.
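As a rough illustration of the idea in the abstract, the sketch below trains a quality classifier on retweet labels and interpolates its output with a retrieval score. It is not the model from the paper: the feature set in `features`, the logistic-regression learner, the interpolation weight `lam`, and all function names are assumptions made for this example, since the abstract does not specify them.

```python
import math

def features(text):
    """Hypothetical quality features (assumed; the abstract lists none)."""
    tokens = text.split()
    n = max(len(tokens), 1)
    return [
        1.0,                                         # bias term
        min(len(tokens), 30) / 30.0,                 # capped, normalised length
        1.0 if "http" in text else 0.0,              # contains a link
        sum(t.startswith("@") for t in tokens) / n,  # share of @-mentions
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_quality_model(docs, retweeted, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent.

    `retweeted` holds 0/1 surrogate labels (1 = the document was retweeted),
    standing in for manual quality judgments.
    """
    xs = [features(d) for d in docs]
    w = [0.0] * len(xs[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(xs, retweeted):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad)]
    return w

def quality(w, text):
    """Estimated probability that a document is high quality."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features(text))))

def rerank(w, scored_docs, lam=0.7):
    """Re-rank (text, relevance) pairs, relevance assumed to lie in [0, 1].

    The combined score is lam * relevance + (1 - lam) * quality.
    """
    combined = [(lam * s + (1 - lam) * quality(w, t), t) for t, s in scored_docs]
    return [t for _, t in sorted(combined, key=lambda p: p[0], reverse=True)]

# Toy training data (invented for illustration only).
train_docs = [
    "Earthquake update from the agency http://example.org/report",
    "lol",
    "New study on sleep published today http://example.org/s",
    "@bob @amy hi",
]
train_labels = [1, 0, 1, 0]
weights = train_quality_model(train_docs, train_labels)
```

With `lam=1.0` the ranking reduces to the underlying retrieval score, so the quality component can be switched off to compare against a baseline.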

Index Terms

Quality models for microblog retrieval
1. Information systems
  1. Information retrieval

Published in
CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management
October 2012
2840 pages
ISBN: 9781450311564
DOI: 10.1145/2396761
General Chair: Xuewen Chen (Wayne State University, USA)
Program Chairs: Guy Lebanon (Georgia Institute of Technology), Haixun Wang (Microsoft Research Asia), Mohammed J. Zaki (Rensselaer Polytechnic Institute)
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
microblogs
quality model
quality-biased ranking
Qualifiers
- short-paper
Acceptance Rates
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
Article Metrics
- Total Citations: 24
- Total Downloads: 428
- Downloads (Last 12 months): 6
- Downloads (Last 6 weeks): 0