DOI:10.1145/1296951.1296969
Article

Connecting wikis and natural language processing systems

Published: 21 October 2007

Abstract

We investigate the integration of Wiki systems with automated natural language processing (NLP) techniques. The vision is that of a "self-aware" Wiki system reading, understanding, transforming, and writing its own content, as well as supporting its users in information analysis and content development. We provide a number of practical application examples, including index generation, question answering, and automatic summarization, which demonstrate the practicability and usefulness of this idea. A system architecture providing the integration is presented, as well as first results from an initial implementation based on the GATE framework for NLP and the MediaWiki system.

Published In

WikiSym '07: Proceedings of the 2007 international symposium on Wikis
October 2007
190 pages
ISBN:9781595938619
DOI:10.1145/1296951
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Wiki/NLP integration
  2. self-aware wiki system

Qualifiers

  • Article

Conference

WikiSym07: International Symposium on Wikis
October 21 - 25, 2007
Montreal, Quebec, Canada

Acceptance Rates

Overall Acceptance Rate 69 of 145 submissions, 48%

Cited By

  • (2012) Supporting wiki users with natural language processing. Proceedings of the Eighth Annual International Symposium on Wikis and Open Collaboration, pages 1-4. DOI:10.1145/2462932.2462976. Online publication date: 27-Aug-2012
  • (2012) Natural language processing for MediaWiki. Proceedings of the Eighth Annual International Symposium on Wikis and Open Collaboration, pages 1-10. DOI:10.1145/2462932.2462946. Online publication date: 27-Aug-2012
  • (2011) Wikulu. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Systems Demonstrations, pages 74-79. DOI:10.5555/2002440.2002453. Online publication date: 21-Jun-2011
  • (2011) Ontology Extraction for Knowledge Reuse. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 41:4, pages 798-809. DOI:10.1109/TSMCA.2011.2132713. Online publication date: 1-Jul-2011
  • (2011) Integrating Wiki Systems, Natural Language Processing, and Semantic Technologies for Cultural Heritage Data Management. Language Technology for Cultural Heritage, pages 213-230. DOI:10.1007/978-3-642-20227-8_12. Online publication date: 26-Apr-2011
  • (2010) Improving Cross-Language Information Retrieval by Harnessing the Social Web. Handbook of Research on Web 2.0, 3.0, and X.0, pages 277-295. DOI:10.4018/978-1-60566-384-5.ch016. Online publication date: 2010
  • (2010) The openFuXML Wiki Engine. Proceedings of the 2010 Second International Conference on Mobile, Hybrid, and On-Line Learning, pages 93-98. DOI:10.1109/eLmL.2010.19. Online publication date: 10-Feb-2010
  • (2009) An architecture to support intelligent user interfaces for Wikis by means of Natural Language Processing. Proceedings of the 5th International Symposium on Wikis and Open Collaboration, pages 1-10. DOI:10.1145/1641309.1641328. Online publication date: 25-Oct-2009
  • (2009) On the problem of Wiki texts indexing. Journal of Computer and Systems Sciences International, 48:4, pages 616-624. DOI:10.1134/S1064230709040157. Online publication date: 21-Aug-2009
  • (2009) SmartWiki: Support for high-quality requirements engineering in a collaborative setting. 2009 ICSE Workshop on Wikis for Software Engineering, pages 25-35. DOI:10.1109/WIKIS4SE.2009.5069994. Online publication date: May-2009
