Exploring in the weblog space by detecting informative and affective articles
Full text: PDF (474 KB)
Source
International World Wide Web Conference archive
Proceedings of the 16th international conference on World Wide Web
Banff, Alberta, Canada
SESSION: Industrial practice & experience
Pages: 281-290
Year of Publication: 2007
ISBN: 978-1-59593-654-7
Authors
Xiaochuan Ni  Shanghai Jiao-Tong University, Shanghai, China
Gui-Rong Xue  Shanghai Jiao-Tong University, Shanghai, China
Xiao Ling  Shanghai Jiao-Tong University, Shanghai, China
Yong Yu  Shanghai Jiao-Tong University, Shanghai, China
Qiang Yang  Hong Kong University of Science and Technology, Hong Kong
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 31,   Downloads (12 Months): 357,   Citation Count: 1
DOI: http://doi.acm.org/10.1145/1242572.1242611

ABSTRACT

Weblogs have become a prevalent medium for people to express themselves and share information. In general, weblogs contain two genres of content. The first concerns the webloggers' personal feelings, thoughts, or emotions; we call weblog articles of this kind affective articles. The second concerns technologies and various kinds of informative news; we call these informative articles. In this paper, we present a machine learning method for distinguishing informative from affective articles among weblogs, treating the task as a binary classification problem. Our approach achieves about 92% on the information retrieval performance measures precision, recall, and F1. We then present three studies that apply this classification approach in both research and industrial settings. First, it is used to improve the classification of emotions in weblog articles. Second, we develop an intent-driven weblog search engine based on the classification techniques to improve the satisfaction of Web users. Finally, the approach is applied to search for weblogs that contain a large number of informative articles.
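
As an illustration of the binary classification setup and the evaluation measures named in the abstract, the following is a minimal sketch only. The abstract does not say which learning algorithm or features the authors used, so the multinomial Naive Bayes classifier with add-one smoothing is an assumption chosen for brevity, and the training and test sentences are invented toy data. The function names (tokenize, train_naive_bayes, predict, precision_recall_f1) are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokens, lowercased; the paper's features may differ.
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model; returns (log priors, word counts, vocabulary)."""
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    priors = {c: math.log(n / len(labels)) for c, n in class_docs.items()}
    return priors, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior score."""
    priors, word_counts, vocab = model
    scores = {}
    for c in priors:
        total = sum(word_counts[c].values())
        score = priors[c]
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing over the vocabulary.
                score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)


def precision_recall_f1(gold, predicted, positive):
    """Compute precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Invented toy data, for illustration only.
train_docs = [
    "new software release adds search features",
    "company announces new search technology",
    "i feel so happy today",
    "i am sad and tired today",
]
train_labels = ["informative", "informative", "affective", "affective"]

model = train_naive_bayes(train_docs, train_labels)

test_docs = ["new search technology release", "i feel tired"]
test_gold = ["informative", "affective"]
test_pred = [predict(model, d) for d in test_docs]

print(test_pred)
print(precision_recall_f1(test_gold, test_pred, positive="informative"))
```

On real weblog data the three measures would be computed on a held-out labeled set; the toy example is far too small to say anything about the roughly 92% figure reported above.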



