ABSTRACT
Many daily activities present information as a stream of text, and people can often benefit from additional information on the topic being discussed. TV broadcast news can be treated as one such stream of text; in this paper we discuss finding news articles on the web that are relevant to the news currently being broadcast.

We evaluated a variety of algorithms for this problem, looking at the impact of inverse document frequency, stemming, compounds, history, and query length on the relevance and coverage of news articles returned in real time during a broadcast. We also evaluated several postprocessing techniques for improving precision, including reranking using additional terms, reranking by document similarity, and filtering on document similarity. For the best algorithm, 84% to 91% of the articles found were relevant, with at least 64% of the articles being on the exact topic of the broadcast. In addition, a relevant article was found for at least 70% of the topics.
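The abstract names inverse document frequency and query length as factors in turning a text stream into search queries but does not give the algorithms themselves. As a rough illustration only, the sketch below picks the highest-scoring terms of one text segment by term frequency times a smoothed idf. The function name `select_query_terms`, the tiny stopword list, the smoothing formula, and the document-frequency numbers are all assumptions made for this example, not details taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list for illustration; a real system would use a larger one.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "for", "at"}


def select_query_terms(segment, doc_freq, num_docs, query_length=2):
    """Return the `query_length` terms of `segment` with the highest tf * idf score."""
    # Lowercase, keep alphabetic tokens only, and drop stopwords.
    tokens = [t for t in re.findall(r"[a-z]+", segment.lower()) if t not in STOPWORDS]
    tf = Counter(tokens)

    def score(term):
        # Smoothed idf (an assumed formula): terms missing from doc_freq
        # are treated as appearing in zero documents and get the largest idf.
        idf = math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))
        return tf[term] * idf

    # Sort by descending score; break ties alphabetically so output is deterministic.
    ranked = sorted(tf, key=lambda t: (-score(t), t))
    return ranked[:query_length]


# Made-up document frequencies over a hypothetical collection of 1000 documents.
doc_freq = {
    "president": 400, "said": 900, "today": 800, "earthquake": 12,
    "seattle": 30, "damaged": 150, "airport": 60,
}
segment = "The president said today the earthquake in Seattle damaged the Seattle airport"
query = select_query_terms(segment, doc_freq, num_docs=1000, query_length=3)
print(" ".join(query))
```

Common words such as "said" and "today" receive a low idf and drop out of the query, while rarer topical words remain. Stemming, compounds, history, and the postprocessing steps mentioned in the abstract are not modeled here.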