DOI: 10.1145/1835449.1835462
research-article

Acquisition of instance attributes via labeled and related instances

Published: 19 July 2010

Abstract

This paper presents a method for increasing the quality of automatically extracted instance attributes by exploiting weakly-supervised and unsupervised instance relatedness data. This data consists of (a) class labels for instances and (b) distributional similarity scores. The method organizes the text-derived data into a graph, and automatically propagates attributes among related instances, through random walks over the graph. Experiments on various graph topologies illustrate the advantage of the method over both the original attribute lists and a per-class attribute extractor, both in terms of the number of attributes extracted per instance and the accuracy of the top-ranked attributes.
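The propagation step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact random-walk formulation, so what follows is a generic weighted propagation with restart, not the authors' algorithm. The function names, the `alpha` and `steps` parameters, the `class:` node naming, and the toy instances and scores are all assumptions made up for illustration.

```python
from collections import defaultdict


def propagate_attributes(edges, seed_attrs, alpha=0.5, steps=10):
    """Spread attribute scores over a weighted undirected graph.

    edges: iterable of (node_a, node_b, weight) tuples
    seed_attrs: dict mapping node -> {attribute: initial score}
    alpha: (assumed) weight kept on a node's own seed attributes
    steps: (assumed) number of propagation rounds
    """
    # Build a symmetric adjacency map.
    neighbors = defaultdict(dict)
    for a, b, w in edges:
        neighbors[a][b] = w
        neighbors[b][a] = w

    nodes = set(neighbors) | set(seed_attrs)
    scores = {n: dict(seed_attrs.get(n, {})) for n in nodes}

    for _ in range(steps):
        updated = {}
        for n in nodes:
            total = sum(neighbors[n].values())
            mixed = defaultdict(float)
            # Restart part: keep a share of the node's own seed scores.
            for attr, s in seed_attrs.get(n, {}).items():
                mixed[attr] += alpha * s
            # Walk part: take neighbors' scores in proportion to edge weight.
            if total > 0:
                for m, w in neighbors[n].items():
                    for attr, s in scores[m].items():
                        mixed[attr] += (1 - alpha) * (w / total) * s
            updated[n] = dict(mixed)
        scores = updated
    return scores


def top_attributes(scores, node, k=3):
    """Return the k highest-scoring attributes of a node."""
    ranked = sorted(scores[node].items(), key=lambda kv: (-kv[1], kv[0]))
    return [attr for attr, _ in ranked[:k]]


# Toy graph (hypothetical data): two instances share a class label node
# and are also linked by a distributional similarity edge.
edges = [
    ("aspirin", "class:drugs", 1.0),
    ("ibuprofen", "class:drugs", 1.0),
    ("aspirin", "ibuprofen", 0.8),
]
seed_attrs = {
    "aspirin": {"side effects": 1.0, "dosage": 0.8},
    "ibuprofen": {"dosage": 1.0},
}

scores = propagate_attributes(edges, seed_attrs)
print(top_attributes(scores, "ibuprofen"))
```

In this toy graph, "ibuprofen" starts without the attribute "side effects" and acquires it from "aspirin" through both the shared class label and the similarity edge, which is the kind of gain in attributes per instance the abstract reports.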

    Published In

    SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
    July 2010
    944 pages
    ISBN:9781450301534
    DOI:10.1145/1835449

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. distributional similarities
    2. information extraction
    3. instance attributes
    4. labeled instances
    5. unstructured text

    Qualifiers

    • Research-article

    Conference

    SIGIR '10

    Acceptance Rates

    SIGIR '10 Paper Acceptance Rate 87 of 520 submissions, 17%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Cited By

    • (2022) A Pattern Driven Graph Ranking Approach to Attribute Extraction for Knowledge Graph. IEEE Transactions on Industrial Informatics 18:2 (1250-1259). DOI: 10.1109/TII.2021.3073726. Online publication date: Feb-2022
    • (2016) A joint model for Entity Set Expansion and Attribute Extraction from web search queries. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (3101-3107). DOI: 10.5555/3016100.3016336. Online publication date: 12-Feb-2016
    • (2016) Unsupervised Extraction of Popular Product Attributes from E-Commerce Web Sites by Considering Customer Reviews. ACM Transactions on Internet Technology 16:2 (1-17). DOI: 10.1145/2857054. Online publication date: 15-Apr-2016
    • (2016) An Analysis of the Relation Between Similarity Positions and Attributes of Concepts by Distance Geometry. Chinese Lexical Semantics (405-415). DOI: 10.1007/978-3-319-49508-8_38. Online publication date: 27-Nov-2016
    • (2015) Fast and Space-Efficient Entity Linking for Queries. Proceedings of the Eighth ACM International Conference on Web Search and Data Mining (179-188). DOI: 10.1145/2684822.2685317. Online publication date: 2-Feb-2015
    • (2015) Context-specific intention awareness through web query in robotic caregiving. 2015 IEEE International Conference on Robotics and Automation (ICRA) (1962-1967). DOI: 10.1109/ICRA.2015.7139455. Online publication date: May-2015
    • (2015) Discovering and understanding word level user intent in Web search queries. Web Semantics: Science, Services and Agents on the World Wide Web 30:C (22-38). DOI: 10.1016/j.websem.2014.07.010. Online publication date: 1-Jan-2015
    • (2014) Structured Information Extraction from Natural Disaster Events on Twitter. Proceedings of the 5th International Workshop on Web-scale Knowledge Representation Retrieval & Reasoning (1-8). DOI: 10.1145/2663792.2663794. Online publication date: 3-Nov-2014
    • (2014) A study of age gaps between online friends. Proceedings of the 25th ACM conference on Hypertext and social media (98-106). DOI: 10.1145/2631775.2631800. Online publication date: 1-Sep-2014
    • (2014) Aggregated search. ACM Computing Surveys 46:3 (1-31). DOI: 10.1145/2523817. Online publication date: 1-Jan-2014
