Adaptive web information extraction

Published: 01 May 2006

Abstract

The Amorphic system works to extract Web information for use in business intelligence applications.

Published In

Communications of the ACM  Volume 49, Issue 5
Two decades of the language-action perspective
May 2006
125 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/1125944
Publisher

Association for Computing Machinery

New York, NY, United States
