ABSTRACT
The goal of giving a well-defined meaning to information is currently shared by endeavors such as the Semantic Web as well as by current trends within Knowledge Management. They all depend on the large-scale formalization of knowledge and on the availability of formal metadata about information resources. However, the question of how to provide the necessary formal metadata effectively and efficiently has not yet been solved to a satisfactory extent. Certainly, the most effective way to provide such metadata as well as formalized knowledge is to let humans encode them directly into the system, but this is neither efficient nor feasible. Furthermore, as current social studies show, individual knowledge is often less powerful than the collective knowledge of a certain community.

As a potential way out of the knowledge acquisition bottleneck, we present a novel methodology that acquires collective knowledge from the World Wide Web using the Google™ API. In particular, we present PANKOW, a concrete instantiation of this methodology, which is evaluated in two experiments: one with the aim of classifying novel instances with regard to an existing ontology and one with the aim of learning sub-/superconcept relations.
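The abstract does not spell out how web-derived evidence is turned into a classification decision. As a rough illustration only, the sketch below assumes that candidate concepts are scored by summing search hit counts for a few phrase templates that combine an instance with a concept, and that the highest-scoring concept wins. The templates, the `classify_instance` function, and the `hit_count` callback are hypothetical names introduced here; the callback stands in for a real search API call and is replaced by a toy lookup table so the example runs offline. It should not be read as the paper's exact procedure.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical phrase templates (an assumption, not taken from the abstract).
# Pluralization is done naively by appending "s".
PATTERNS: List[str] = [
    "{concept}s such as {instance}",
    "{instance} is a {concept}",
    "{instance} and other {concept}s",
]


def classify_instance(
    instance: str,
    concepts: List[str],
    hit_count: Callable[[str], int],
) -> Tuple[str, Dict[str, int]]:
    """Score each candidate concept by summed hit counts and return the best one."""
    scores: Dict[str, int] = {}
    for concept in concepts:
        scores[concept] = sum(
            hit_count(pattern.format(concept=concept, instance=instance))
            for pattern in PATTERNS
        )
    best = max(scores, key=lambda c: scores[c])
    return best, scores


def fake_hit_count(phrase: str) -> int:
    """Toy stand-in for a web search hit count (made-up numbers)."""
    toy_counts = {
        "hotels such as Ritz": 120,
        "Ritz is a hotel": 40,
        "Ritz is a city": 3,
    }
    return toy_counts.get(phrase, 0)


if __name__ == "__main__":
    best, scores = classify_instance("Ritz", ["hotel", "city"], fake_hit_count)
    print(best, scores)
```

Passing the hit-count source in as a callback keeps the scoring logic independent of any particular search service, so the same function can be exercised with canned counts or a live backend.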