ABSTRACT
Today's digital libraries (DLs) archive vast amounts of information in the form of text, videos, images, data measurements, etc. User access to DL content can rely on similarity between metadata elements or on similarity between the data itself (content-based similarity). We consider the problem of exploratory search in large DLs of time-oriented data. We propose a novel approach for overview-first exploration of data collections based on user-selected metadata properties. In a 2D layout, entities of the selected property are arranged according to their similarity with respect to the underlying data content. The display is enhanced by compact summarizations of the underlying data elements and forms the basis for exploratory navigation of the data space. The approach is proposed as an interface for visual exploration that leads the user to discover interesting relationships between data items, relying on content-based similarity between the data items and their respective metadata labels. We apply the method to real data sets from the earth observation community, showing its applicability and usefulness.
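The idea of a content-based layout of metadata entities can be illustrated with a minimal sketch. The abstract does not name the descriptor, distance measure, or projection technique, so everything below is an assumption made for illustration: each entity is summarized by the mean of its time series, entities are compared by Euclidean distance, and classical multidimensional scaling (MDS) stands in for the 2D layout step. The function names `entity_descriptors`, `classical_mds`, and `layout` are hypothetical.

```python
import numpy as np


def entity_descriptors(series, labels):
    """Summarize each metadata entity by the mean of its time series (assumed descriptor)."""
    series = np.asarray(series, dtype=float)
    labels = np.asarray(labels)
    names = sorted(set(labels.tolist()))
    desc = np.vstack([series[labels == name].mean(axis=0) for name in names])
    return names, desc


def classical_mds(dist, dims=2):
    """Project a symmetric distance matrix to `dims` coordinates via classical MDS."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centering turns squared distances into a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    # Clip tiny negative eigenvalues caused by rounding.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def layout(series, labels):
    """Return a 2D position per metadata entity, based on content similarity."""
    names, desc = entity_descriptors(series, labels)
    diff = desc[:, None, :] - desc[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # assumed Euclidean distance
    coords = classical_mds(dist, dims=2)
    return dict(zip(names, coords))


# Toy data: six short series tagged with a hypothetical "station" property.
toy_series = [
    [0, 0, 0, 0], [0, 0, 0, 2],
    [1, 1, 1, 1], [1, 1, 1, 3],
    [10, 10, 10, 10], [10, 10, 10, 12],
]
toy_labels = ["a", "a", "b", "b", "c", "c"]
positions = layout(toy_series, toy_labels)
```

In this toy setup, entities whose measurements are alike end up close together in the plane, which is the property an overview-first display relies on. A real system would likely use richer time series descriptors and a layout method suited to large collections.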
Index Terms
- Content-based layouts for exploratory metadata search in scientific research data