Balancing Novelty and Salience: Adaptive Learning to Rank Entities for Timeline Summarization of High-impact Events

Authors:

Claudia Niederee,

Nattiya Kanhabua,

Ujwal Gadiraju,

Avishek Anand

CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management

Pages 1201 - 1210

https://doi.org/10.1145/2806416.2806486

Published: 17 October 2015

Abstract

Long-running, high-impact events such as the Boston Marathon bombing often develop through many stages and involve a large number of entities as they unfold. Timeline summarization of an event by key sentences eases story digestion, but does not distinguish between what a user remembers and what she might want to re-check. In this work, we present a novel approach for timeline summarization of high-impact events, which uses entities instead of sentences to summarize the event at each individual point in time. Such entity summaries can serve as both (1) important memory cues in a retrospective event consideration and (2) pointers for personalized event exploration. To create such summaries automatically, it is crucial to identify the "right" entities for inclusion. We propose to learn a ranking function for entities, with a dynamically adapted trade-off between the in-document salience of entities and the informativeness of entities across documents, i.e., the level of new information associated with an entity for a time point under consideration. Furthermore, to capture collective attention for an entity, we use an innovative soft labeling approach based on Wikipedia. Our experiments on a large real-world news dataset confirm the effectiveness of the proposed methods.
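The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' learned ranking model: it assumes two precomputed scores per entity (a hypothetical in-document `salience` and a hypothetical cross-document `novelty`, both in [0, 1]) and combines them with a fixed linear interpolation weight `trade_off`. In the paper the trade-off is adapted dynamically per time point; here it is simply passed in by hand. All names and numbers are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class EntityScores:
    """Hypothetical per-entity scores for one time point."""
    name: str
    salience: float  # assumed in-document salience score in [0, 1]
    novelty: float   # assumed cross-document informativeness score in [0, 1]


def rank_entities(entities, trade_off):
    """Rank entities by trade_off * salience + (1 - trade_off) * novelty.

    A fixed linear interpolation, used here only to illustrate the idea of
    balancing the two signals; it is an assumption, not the paper's model.
    """
    if not 0.0 <= trade_off <= 1.0:
        raise ValueError("trade_off must lie in [0, 1]")

    def combined(entity):
        return trade_off * entity.salience + (1.0 - trade_off) * entity.novelty

    return sorted(entities, key=combined, reverse=True)


# Made-up candidates for a single time point.
candidates = [
    EntityScores("Entity A", salience=0.9, novelty=0.1),
    EntityScores("Entity B", salience=0.4, novelty=0.8),
    EntityScores("Entity C", salience=0.6, novelty=0.5),
]

# Weighting salience heavily favours entities central to the documents.
salience_heavy = [e.name for e in rank_entities(candidates, trade_off=0.9)]

# Weighting novelty heavily favours entities that carry new information.
novelty_heavy = [e.name for e in rank_entities(candidates, trade_off=0.1)]
```

With these made-up scores, the salience-heavy setting places "Entity A" first, while the novelty-heavy setting places "Entity B" first, which shows why a single fixed weight is unlikely to suit every stage of an evolving event.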

Index Terms

Balancing Novelty and Salience: Adaptive Learning to Rank Entities for Timeline Summarization of High-impact Events
1. Information systems
  1. Information retrieval

Published In

CIKM '15: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management

October 2015

1998 pages

ISBN:9781450337946

DOI:10.1145/2806416

General Chairs: James Bailey (The University of Melbourne), Alistair Moffat (The University of Melbourne)

Program Chairs: Charu C. Aggarwal (IBM), Maarten de Rijke (University of Amsterdam), Ravi Kumar (Google), Vanessa Murdock (Microsoft), Timos Sellis (RMIT University), Jeffrey Xu Yu (Chinese University of Hong Kong)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

ForgetIT EU FP7 Project
ERC Advanced Grant ALEXANDRIA project

Conference

CIKM'15

CIKM'15: 24th ACM International Conference on Information and Knowledge Management

October 18 - 23, 2015

Melbourne, Australia

Acceptance Rates

CIKM '15 Paper Acceptance Rate 165 of 646 submissions, 26%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics

Total Citations: 16
Total Downloads: 455
Downloads (Last 12 months): 6
Downloads (Last 6 weeks): 0

Reflects downloads up to 19 Feb 2025

Cited By

You J, Li D, Kamigaito H, Funakoshi K, Okumura M (2023). Joint Learning-based Heterogeneous Graph Attention Network for Timeline Summarization. Journal of Natural Language Processing, 30:1, pp. 184-214. Online publication date: 2023
https://doi.org/10.5715/jnlp.30.184
Kato M, Imrattanatrai W, Yamamoto T, Ohshima H, Tanaka K (2020). Context-Guided Learning to Rank Entities. Advances in Information Retrieval, pp. 83-96. Online publication date: 8-Apr-2020
https://doi.org/10.1007/978-3-030-45439-5_6
Wu C, Kanoulas E, de Rijke M (2019). It all starts with entities: A Salient entity topic model. Natural Language Engineering, 26:5, pp. 531-549. Online publication date: 22-Nov-2019
https://doi.org/10.1017/S1351324919000585
Duan Y, Jatowt A, Tanaka K (2019). Discovering Latent Threads in Entity Histories. Data Science and Engineering, 4:4, pp. 336-351. Online publication date: 15-Nov-2019
https://doi.org/10.1007/s41019-019-00108-x
Yuan C, Bao Z, Sanderson M, Tang Y (2019). Incorporating word attention with convolutional neural networks for abstractive summarization. World Wide Web, 23:1, pp. 267-287. Online publication date: 6-Aug-2019
https://doi.org/10.1007/s11280-019-00709-6
Lu W, Ma P, Yu J, Zhou Y, Wei B (2019). Metro maps for efficient knowledge learning by summarizing massive electronic textbooks. International Journal on Document Analysis and Recognition, 22:2, pp. 99-111. Online publication date: 1-Jun-2019
https://dl.acm.org/doi/10.1007/s10032-019-00319-y
Cucchiarelli A, Morbidoni C, Stilo G, Velardi P, Haddad H, Wainwright R, Chbeir R (2018). What to write and why. Proceedings of the 33rd Annual ACM Symposium on Applied Computing, pp. 1321-1330. Online publication date: 9-Apr-2018
https://dl.acm.org/doi/10.1145/3167132.3167274
McCreadie R, Santos R, Macdonald C, Ounis I (2018). Explicit Diversification of Event Aspects for Temporal Summarization. ACM Transactions on Information Systems, 36:3, pp. 1-31. Online publication date: 2-Feb-2018
https://dl.acm.org/doi/10.1145/3158671
Fafalios P, Iosifidis V, Stefanidis K, Ntoutsi E (2018). Tracking the history and evolution of entities: entity-centric temporal analysis of large social media archives. International Journal on Digital Libraries, 21:1, pp. 5-17. Online publication date: 26-Oct-2018
https://dl.acm.org/doi/10.1007/s00799-018-0257-7
Duan Y, Jatowt A, Tanaka K, Dolog P, Vojtas P, Bonchi F, Helic D (2017). Discovering Typical Histories of Entities by Multi-Timeline Summarization. Proceedings of the 28th ACM Conference on Hypertext and Social Media, pp. 105-114. Online publication date: 4-Jul-2017
https://dl.acm.org/doi/10.1145/3078714.3078725