Estimating required recall for successful knowledge acquisition from the web
Source: International World Wide Web Conference, Proceedings of the 15th international conference on World Wide Web, Edinburgh, Scotland
Poster session: Browsers and UI, web engineering, hypermedia & multimedia, security, and accessibility
Pages: 969-970
Year of publication: 2006
ISBN: 1-59593-323-9
ABSTRACT
Information on the Web is not only abundant but also redundant. This redundancy has an important consequence for the relation between the recall of an information gathering system and its capacity to harvest the core information of a given domain of knowledge. This paper proposes a new idea for estimating the Web coverage that a knowledge acquisition system needs in order to achieve a desired coverage of the core information the Web contains.