DOI: 10.1145/2009916.2010018
research-article

Ranking related news predictions

Published: 24 July 2011

Abstract

We estimate that nearly one third of news articles contain references to future events. Although this information can be crucial to understanding news stories and how events will develop for a given topic, there is currently no easy way to access it. We propose a new task, which we call ranking related news predictions, to address the problem of retrieving and ranking sentences that mention future events. In this paper, we formally define this task and propose a learning-to-rank approach based on four classes of features: term similarity, entity-based similarity, topic similarity, and temporal similarity. Through extensive evaluations using a corpus of 1.8 million news articles and 6,000 manually judged relevance pairs, we show that our approach is able to retrieve a significant number of relevant predictions related to a given topic.
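
As a rough illustration of the idea in the abstract, the sketch below scores candidate prediction sentences against a query using one toy feature per class (term, entity, topic, temporal) and sorts them by a weighted sum. Everything in it is an assumption made for illustration: the `Item` structure, the specific similarity functions (Jaccard, cosine, inverse year distance), and the hand-set weights are hypothetical and are not taken from the paper, which learns its ranking model from judged pairs.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Item:
    """A query or a candidate prediction sentence (hypothetical structure)."""
    text: str
    entities: frozenset  # named entities mentioned in the text
    topics: tuple        # topic distribution, e.g. from a topic model
    year: int            # year the text refers to


def jaccard(a, b):
    """Set overlap in [0, 1]; 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def features(query, pred):
    """One illustrative feature per class named in the abstract."""
    return [
        jaccard(query.text.lower().split(), pred.text.lower().split()),  # term
        jaccard(query.entities, pred.entities),                          # entity
        cosine(query.topics, pred.topics),                               # topic
        1.0 / (1.0 + abs(query.year - pred.year)),                       # temporal
    ]


def rank(query, preds, weights):
    """Sort predictions by a weighted sum of features, best first.

    The weights are assumed to be given; a learning-to-rank method
    would estimate them from relevance judgments instead.
    """
    def score(pred):
        return sum(w * f for w, f in zip(weights, features(query, pred)))
    return sorted(preds, key=score, reverse=True)
```

For example, calling `rank(query, candidates, [1.0, 1.0, 1.0, 1.0])` weighs the four classes equally; replacing the fixed weights with ones fitted on judged query and prediction pairs is what turns this into a learned ranker.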

Published In

SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
July 2011
1374 pages
ISBN:9781450307574
DOI:10.1145/2009916
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. future events
  2. news predictions
  3. sentence retrieval and ranking

Qualifiers

  • Research-article

Conference

SIGIR '11

Acceptance Rates

Overall acceptance rate: 792 of 3,983 submissions (20%)

Cited By

  • (2024) Future Timelines: Extraction and Visualization of Future-related Content From News Articles. Proceedings of the 17th ACM International Conference on Web Search and Data Mining, pp. 1082-1085. DOI: 10.1145/3616855.3635693. Online publication date: 4-Mar-2024
  • (2022) Early Stage Sparse Retrieval with Entity Linking. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pp. 4464-4469. DOI: 10.1145/3511808.3557588. Online publication date: 17-Oct-2022
  • (2021) Prioritizing Original News on Facebook. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 4046-4054. DOI: 10.1145/3459637.3481943. Online publication date: 26-Oct-2021
  • (2019) Future Prediction with Automatically Extracted Morphosemantic Patterns. Cognitive Systems Research. DOI: 10.1016/j.cogsys.2019.09.004. Online publication date: Sep-2019
  • (2019) Ad hoc retrieval via entity linking and semantic similarity. Knowledge and Information Systems, 58:3, pp. 551-583. DOI: 10.1007/s10115-018-1190-1. Online publication date: 1-Mar-2019
  • (2018) Tempo-HindiWordNet. ACM Transactions on Asian and Low-Resource Language Information Processing, 18:2, pp. 1-22. DOI: 10.1145/3277504. Online publication date: 14-Dec-2018
  • (2017) Towards Exploiting Social Networks for Detecting Epidemic Outbreaks. Global Journal of Flexible Systems Management, 18:1, pp. 61-71. DOI: 10.1007/s40171-016-0148-y. Online publication date: 11-Jan-2017
  • (2017) Identifying top relevant dates for implicit time sensitive queries. Information Retrieval Journal, 20:4, pp. 363-398. DOI: 10.1007/s10791-017-9302-1. Online publication date: 5-May-2017
  • (2016) A Method for Extraction of Future Reference Sentences Based on Semantic Role Labeling. IEICE Transactions on Information and Systems, E99.D:2, pp. 514-524. DOI: 10.1587/transinf.2015EDP7115. Online publication date: 2016
  • (2016) Temporal Information Retrieval. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, pp. 1235-1238. DOI: 10.1145/2911451.2914805. Online publication date: 7-Jul-2016
