DOI: 10.1145/1998076.1998145
research-article

A research agenda for data curation cyberinfrastructure

Published: 13 June 2011

ABSTRACT

In 2008, the National Science Foundation released the DataNet solicitation, which presents an ambitious vision for a comprehensive data curation cyberinfrastructure in support of fourth paradigm science. The program subsequently funded two projects, DataONE and the Data Conservancy. The authors put forth an uncertainty framework for understanding the larger socio-cultural issues that influence the progress of DataNet projects and cyberinfrastructure projects in general. This framework highlights the key technical, organizational, scientific, and institutional contexts that the projects must consider as they mature.

Published in

JCDL '11: Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries
June 2011
500 pages
ISBN: 9781450307444
DOI: 10.1145/1998076

Copyright © 2011 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 415 of 1,482 submissions (28%)
