DOI: 10.1145/2232817.2232844
research-article

Content-based layouts for exploratory metadata search in scientific research data

Published: 10 June 2012

ABSTRACT

Today's digital libraries (DLs) archive vast amounts of information in the form of text, videos, images, data measurements, etc. User access to DL content can rely on similarity between metadata elements or on similarity between the data itself (content-based similarity). We consider the problem of exploratory search in large DLs of time-oriented data. We propose a novel approach for overview-first exploration of data collections based on user-selected metadata properties. In a 2D layout, entities of the selected property are arranged according to their similarity with respect to the underlying data content. The display is enhanced by compact summarizations of the underlying data elements and forms the basis for users' exploratory navigation of the data space. The approach is proposed as an interface for visual exploration that leads the user to discover interesting relationships between data items, relying on content-based similarity between data items and their respective metadata labels. We apply the method to real data sets from the earth observation community, showing its applicability and usefulness.
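The idea of placing metadata entities in 2D by content-based similarity can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a descriptor, a distance measure, or a layout algorithm, so the mean/standard-deviation descriptor, the Euclidean distance, the gradient-descent stress layout, and all function names and toy data below are assumptions chosen only for illustration.

```python
import math


def series_descriptor(series):
    """Summarize one time series as (mean, standard deviation). Assumed descriptor."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / n
    return (mean, math.sqrt(variance))


def entity_descriptor(series_list):
    """Average the descriptors of all series that share one metadata value."""
    descs = [series_descriptor(s) for s in series_list]
    return tuple(sum(d[k] for d in descs) / len(descs) for k in range(2))


def distance_matrix(descriptors):
    """Pairwise Euclidean distances, scaled so the largest distance is 1."""
    n = len(descriptors)
    dist = [[math.dist(descriptors[i], descriptors[j]) for j in range(n)]
            for i in range(n)]
    largest = max(max(row) for row in dist) or 1.0
    return [[d / largest for d in row] for row in dist]


def layout_2d(dist, iterations=2000, step=0.05):
    """Move points so that 2D distances approximate dist (gradient descent on stress)."""
    n = len(dist)
    # Deterministic start: points evenly spaced on the unit circle.
    pos = [[math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i in range(n)]
    for _ in range(iterations):
        for i in range(n):
            gx = gy = 0.0
            for j in range(n):
                if i == j:
                    continue
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = math.hypot(dx, dy) or 1e-9
                err = d - dist[i][j]  # positive: the pair is too far apart in 2D
                gx += err * dx / d
                gy += err * dy / d
            pos[i][0] -= step * gx
            pos[i][1] -= step * gy
    return pos


# Hypothetical toy data: metadata value -> time series carrying that value.
collection = {
    "station_a": [[1, 2, 3, 4], [2, 3, 4, 5]],
    "station_b": [[1, 2, 3, 5]],
    "station_c": [[20, 30, 40, 50]],
}
names = list(collection)
descriptors = [entity_descriptor(collection[name]) for name in names]
positions = layout_2d(distance_matrix(descriptors))
for name, (x, y) in zip(names, positions):
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

In this toy collection, `station_a` and `station_b` hold similar measurements and should land close together, while `station_c` is placed far from both. A real system would likely use richer descriptors and a more scalable projection technique.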


          • Published in

            JCDL '12: Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries
            June 2012
            458 pages
            ISBN: 9781450311540
            DOI: 10.1145/2232817

            Copyright © 2012 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Acceptance Rates

            Overall Acceptance Rate: 415 of 1,482 submissions, 28%
