DOI: 10.1145/1298406.1298427
Article

Information acquisition using multiple classifications

Published: 28 October 2007

Abstract

Given a large collection of documents, we often need to extract various aspects of information that may be integrated to form a coherent overall picture. Especially for subjective documents addressing a single topic, traditional summarization techniques are limited in differentiating and clustering similar information. We apply multiple classifications to handle diverse aspects, including subtopic identification, keyword extraction, argument structure analysis, and opinion classification, in order to provide a summarized overview of the collection, complete with distributional information. From this overall summary, system users can effectively obtain more fine-grained information. Our methods for individual modules significantly outperform the baseline and achieve human-level agreement.

Cited By

  • (2023) Making Sense of Citizens’ Input through Artificial Intelligence: A Review of Methods for Computational Text Analysis to Support the Evaluation of Contributions in Public Participation. Digital Government: Research and Practice 5:1 (1-30). DOI: 10.1145/3603254. Online publication date: 3-Jun-2023
  • (2015) Understanding Citizens' Direct Policy Suggestions to the Federal Government. Proceedings of the 2015 48th Hawaii International Conference on System Sciences (2134-2143). DOI: 10.1109/HICSS.2015.257. Online publication date: 5-Jan-2015
  • (2008) A study in rule-specific issue categorization for e-rulemaking. Proceedings of the 2008 international conference on Digital government research (244-253). DOI: 10.5555/1367832.1367874. Online publication date: 18-May-2008
  • (2008) Active learning for e-rulemaking. Proceedings of the 2008 international conference on Digital government research (234-243). DOI: 10.5555/1367832.1367873. Online publication date: 18-May-2008

    Published In

    K-CAP '07: Proceedings of the 4th international conference on Knowledge capture
    October 2007
    216 pages
    ISBN:9781595936431
    DOI:10.1145/1298406

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. argument structure
    2. classification
    3. keyword extraction
    4. opinion classification
    5. topic identification

    Qualifiers

    • Article

    Conference

    K-CAP07
    K-CAP07: International Conference on Knowledge Capture 2007
    October 28 - 31, 2007
    Whistler, BC, Canada

    Acceptance Rates

    Overall Acceptance Rate 55 of 198 submissions, 28%
