DOI: 10.1145/1378889.1378916

Harvana: harvesting community tags to enrich collection metadata

Published: 16 June 2008

Abstract

Collaborative, social tagging and annotation systems have exploded on the Internet as part of the Web 2.0 phenomenon. Systems such as Flickr, Del.icio.us, Technorati, Connotea and LibraryThing provide a community-driven approach to classifying information and resources on the Web, so that they can be browsed, discovered and re-used. Although social tagging sites provide simple, user-relevant tags, there are issues associated with the quality of the metadata and the scalability compared with conventional indexing systems. In this paper we propose a hybrid approach that enables authoritative metadata generated by traditional cataloguing methods to be merged with community annotations and tags. The HarvANA (Harvesting and Aggregating Networked Annotations) system uses a standardized but extensible RDF model for representing the annotations/tags and OAI-PMH to harvest the annotations/tags from distributed community servers. The harvested annotations are aggregated with the authoritative metadata in a centralized metadata store. This streamlined, interoperable, scalable approach enables libraries, archives and repositories to leverage community enthusiasm for tagging and annotation, augment their metadata and enhance their discovery services. This paper describes the HarvANA system and its evaluation through a collaborative testbed with the National Library of Australia using architectural images from PictureAustralia.
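
The harvest-then-aggregate flow described in the abstract can be illustrated with a minimal sketch. It is not the HarvANA implementation: the tag vocabulary namespace (http://example.org/tags#), the property names annotates and tag, the sample record, and all function names are assumptions made for illustration. Only the OAI-PMH ListRecords verb, its metadataPrefix and from arguments, and the OAI-PMH and RDF namespaces come from the published standards.

```python
from collections import defaultdict
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TAG_NS = "http://example.org/tags#"  # hypothetical vocabulary, not HarvANA's schema


def list_records_url(base_url, metadata_prefix, from_date=None):
    """Build an OAI-PMH ListRecords request URL for a community server."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)


def parse_tags(oai_xml):
    """Return (resource_uri, tag) pairs found in an OAI-PMH response."""
    root = ET.fromstring(oai_xml)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        for desc in record.iter(f"{{{RDF_NS}}}Description"):
            target = desc.find(f"{{{TAG_NS}}}annotates")
            tag = desc.find(f"{{{TAG_NS}}}tag")
            if target is None or tag is None or not tag.text:
                continue
            uri = target.get(f"{{{RDF_NS}}}resource")
            if uri:
                # Lower-case tags so trivial spelling variants collapse.
                pairs.append((uri, tag.text.strip().lower()))
    return pairs


def aggregate(authoritative, harvested_pairs):
    """Attach harvested tags to catalogue records without altering them."""
    tags = defaultdict(set)
    for uri, tag in harvested_pairs:
        tags[uri].add(tag)
    return {
        uri: {"authoritative": record, "community_tags": sorted(tags.get(uri, ()))}
        for uri, record in authoritative.items()
    }


# Invented sample response; a real harvester would fetch it over HTTP.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:anno-1</identifier></header>
      <metadata>
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:t="http://example.org/tags#">
          <rdf:Description rdf:about="http://example.org/anno/1">
            <t:annotates rdf:resource="http://example.org/image/42"/>
            <t:tag>Art Deco</t:tag>
          </rdf:Description>
        </rdf:RDF>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

catalogue = {"http://example.org/image/42": {"title": "Town Hall"}}
merged = aggregate(catalogue, parse_tags(SAMPLE))
```

Keeping the community tags in a separate field from the catalogue record mirrors the hybrid idea in the abstract: authoritative metadata stays untouched while the harvested tags augment it for discovery.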

Published In

JCDL '08: Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries
June 2008
490 pages
ISBN: 9781595939982
DOI: 10.1145/1378889

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. annotation
  2. digital collections
  3. folksonomy
  4. harvesting
  5. metadata
  6. ontology
  7. social tagging

Qualifiers

  • Research-article

Conference

JCDL08: Joint Conference on Digital Libraries
June 16 - 20, 2008
Pittsburgh, PA, USA

Acceptance Rates

JCDL '08 Paper Acceptance Rate 33 of 117 submissions, 28%;
Overall Acceptance Rate 415 of 1,482 submissions, 28%

Article Metrics

  • Downloads (Last 12 months): 2
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 05 Mar 2025

Cited By

  • (2019) TOMS: A Linked Open Data System for Collaboration and Distribution of Cultural Heritage Artifact Collections of National Museums in Thailand. New Generation Computing 37:4 (479-498). DOI: 10.1007/s00354-019-00063-1. Online publication date: 10-Aug-2019
  • (2017) Linked Data Annotation Without the Pointy Brackets: Introducing Recogito 2. Journal of Map & Geography Libraries 13:1 (111-132). DOI: 10.1080/15420353.2017.1307303. Online publication date: 11-May-2017
  • (2015) A conceptual framework for middle-up-down semantic annotation of online 3D scenes. Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015) (464-469). DOI: 10.1109/ICOSC.2015.7050853. Online publication date: Feb-2015
  • (2014) Virtual Communities as Contributors for Digital Objects Metadata Generation. Cyber Behavior (1182-1198). DOI: 10.4018/978-1-4666-5942-1.ch061. Online publication date: 2014
  • (2013) Harvesting of semantic metadata from distributed 3D web content. 2013 6th International Conference on Human System Interactions (HSI) (193-200). DOI: 10.1109/HSI.2013.6577822. Online publication date: Jun-2013
  • (2013) Validating OntoElect Methodology in Refining ICTERI Scope Ontology. Information Systems: Methods, Models, and Applications (128-139). DOI: 10.1007/978-3-642-38370-0_12. Online publication date: 2013
  • (2013) Quantifying Ontology Fitness in OntoElect Using Saturation- and Vote-Based Metrics. Information and Communication Technologies in Education, Research, and Industrial Applications (136-162). DOI: 10.1007/978-3-319-03998-5_8. Online publication date: 2013
  • (2012) A Service Component Model and Implementation for Institutional Repositories. Advanced Design Approaches to Emerging Software Systems (61-81). DOI: 10.4018/978-1-60960-735-7.ch004. Online publication date: 2012
  • (2012) A Service Component Model and Implementation for Institutional Repositories. Grid and Cloud Computing (466-486). DOI: 10.4018/978-1-4666-0879-5.ch208. Online publication date: 2012
  • (2012) A Service Component Model and Implementation for Institutional Repositories. Grid and Cloud Computing (466-486). DOI: 10.4018/978-1-4666-0879-5.ch2.8. Online publication date: 30-Apr-2012
