skip to main content
research-article

Towards a theory of search queries

Published:12 October 2010Publication History
Skip Abstract Section

Abstract

The need to manage diverse information sources has triggered the rise of very loosely structured data models, known as dataspace models. Such information management systems must allow querying in simple ways, mostly by a form of searching. Motivated by these developments, we propose a theory of search queries in a general model of dataspaces. In this model, a dataspace is a collection of data objects, where each data object is a collection of data items. Basic search queries are expressed using filters on data items, following the basic model of Boolean search in information retrieval. We characterize semantically the class of queries that can be expressed by searching. We apply our theory to classical relational databases, where we connect search queries to the known class of fully generic queries, and to dataspaces where data items are formed by attribute-value pairs. We also extend our theory to a more powerful, associative form of searching, where one can ask for objects that are similar to objects satisfying given search conditions. Such associative search queries are shown to correspond to a very limited kind of joins. We show that the basic search language extended with associative search can exactly define the queries definable in a restricted fragment of the semijoin algebra working on an explicit relational representation of the dataspace.

References

  1. Abiteboul, S., Hull, R., and Vianu, V. 1995. Foundations of Databases. Addison-Wesley. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Agrawal, R., Somani, A., and Xu, Y. 2001. Storage and querying of e-commerce data. In Proceedings of the 27th International Conference on Very Large Data Bases. 149--158. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Aho, A., and Ullman, J. 1979. Universality of data retrieval languages. In Conference Record of the 6th ACM Symposium on Principles of Programming Languages. 110--120. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Alkhateeb, F., Baget, J.-F., and Euzenat, J. 2009. Extending SPARQL with regular expression patterns (for querying RDF). J. Web Sem. 7, 2, 57--73. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Beeri, C., Milo, T., and Ta-Shma, P. 1996. On genericity and parametricity. In Proceedings of the 15th ACM Symposium on Principles of Database Systems. 104--116. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Beeri, C., Milo, T., and Ta-Shma, P. 1997. Towards a language for the fully generic queries. In Database Programming Languages, S. Cluet and R. Hull, Eds. Lecture Notes in Computer Science. Springer, 239--259. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Blackburn, P., de Rijke, M., and Venema, Y. 2001. Modal Logic. Cambridge University Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Blackburn, P., van Benthem, J., and Wolter, F., Eds. 2007. Handbook of Modal Logic. Elsevier.Google ScholarGoogle Scholar
  9. Chandra, A., and Harel, D. 1980. Computable queries for relational data bases. J. Comput. Syst. Sci. 21, 2, 156--178.Google ScholarGoogle ScholarCross RefCross Ref
  10. DeCandia, G., Hastorun, D. Jampani, M., Kakulapati, G., Lakshman, A., Pilahin, A., Sivasubramanian, S., Vosshall, P., and Vogels, W. 2007. Dynamo: Amazon's highly available key-value store. SIGOPS Oper. Syst. Rev. 41, 6, 205--220. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Dittrich, J.-P., and Vaz Salles, M. 2006. iDM: A unified and versatile data model for personal dataspace management. In Proceedings of the 32nd International Conference on Very Large Data Bases. 367--378. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Dong, X., and Halevy, A. 2007. Indexing dataspaces. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 43--54. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Fletcher, G. H. L. 2008. An algebra for basic graph patterns. Logic in Databases (LID). Informal Proceedings.Google ScholarGoogle Scholar
  14. Fletcher, G. H. L., Van den Bussche, J., Van Gucht, D., and Vansummeren, S. 2009. Towards a theory of search queries. In Proceedings of the 12th International Conference on Database Theory. 201--211. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Golenberg, K., Kimelfeld, B., and Sagiv, Y. 2008. Keyword proximity search in complex data graphs. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 927--940. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Goranko, V., and Otto, M. 2007. Model theory of modal logic. in Handbook of Hoda Logic, P. Blackburn, J. van Benthem, and F. Wolter, Ed., Chapter 5.Google ScholarGoogle Scholar
  17. Gutierrez, C., Hurtado, C., and Mendelzon, A. 2004. Foundations of semantic Web databases. In Proceedings of the 23rd ACM Symposium on Principles of Database Systems. 95--106. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Halevy, A., Franklin, M., and Maier, D. 2006. Principles of dataspace systems. In Proceedings of the 25th ACM Symposium on Principles of Database Systems. 1--9. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Jagadish, H., Chapman A., Elkiss, A., Jayapandian, M., Li, Y., Nandi, A., and Yu, C. 2007. Making database systems usable. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 13--24. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Jain, M., Mendhekar, A., and Van Gucht, D. 1995. A uniform data model for relational data and meta-data query processing. In Advances in Data Management '95, Tata McGraw-Hill, 146--165.Google ScholarGoogle Scholar
  21. Leinders, D., Marx, M., Tyszkiewicz, J., and Van den Bussche, J. 2005. The semijoin algebra and the guarded fragment. J. Logic Lang. Inform. 14, 331--343. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Leinders, D., and Van den Bussche, J. 2007. On the complexity of division and set joins in the relational algebra. J. Comput. Syst. Sci. 73, 4, 538--549. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Litwin, W., Ketabchi, M., and Krishnamurthy, R. 1991. First order normal form for relational databases and multidatabases. SIGMOD Rec. 20, 4, 74--76. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Manning, C., Raghavan, P., and Shütze, H. 2008. Introduction to Information Retrieval. Cambridge University Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Olston, C., Reed, B., Srivastava, U., Kumar, R., and Tomkins, A. 2008. Pig latin: a not-so-foreign language for data processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 1099--1110. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Pérez, J., Arenas, M., and Gutierrez, C. 2008. nSPARQL: A navigational language for RDF. In Proceedings of the International Semantic Web Conference. 66--81. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Pérez, J., Arenas, M., and Gutierrez, C. 2009. Semantics and complexity of SPARQL. ACM Trans. Datab. Syst. 34, 3. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Qin, L., Yu, J.X., and Chang, L. 2009. Keyword search in databases: the power of RDBMS. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 681--694. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Ross, K., and Janevski, A. 2005. Querying faceted databases. In Proceedings of the 2nd International Workshop on Semantic Web and Databases. Lecture Notes in Computer Science, vol. 3372, Springer, 199--218. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Sarawagi, S., and Kirpal, A. 2004. Efficient set joins on similarity predicates. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 743--754. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Ullman, J. 1989. Principles of Database and Knowledge-Base Systems. Vol. II. Computer Science Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. W3C. 2004. RDF primer. W3C Recommendation.Google ScholarGoogle Scholar
  33. W3C. 2008. SPARQL query language for RDF. W3C Recommendation.Google ScholarGoogle Scholar

Index Terms

  1. Towards a theory of search queries

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Published in

            cover image ACM Transactions on Database Systems
            ACM Transactions on Database Systems  Volume 35, Issue 4
            November 2010
            230 pages
            ISSN:0362-5915
            EISSN:1557-4644
            DOI:10.1145/1862919
            Issue’s Table of Contents

            Copyright © 2010 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 12 October 2010
            • Accepted: 1 April 2010
            • Revised: 1 March 2010
            • Received: 1 October 2009
            Published in tods Volume 35, Issue 4

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article
            • Research
            • Refereed

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader