skip to main content
research-article

On the Complexity of Query Result Diversification

Authors Info & Claims
Published:26 May 2014Publication History
Skip Abstract Section

Abstract

Query result diversification is a bi-criteria optimization problem for ranking query results. Given a database D, a query Q, and a positive integer k, it is to find a set of k tuples from Q(D) such that the tuples are as relevant as possible to the query, and at the same time, as diverse as possible to each other. Subsets of Q(D) are ranked by an objective function defined in terms of relevance and diversity. Query result diversification has found a variety of applications in databases, information retrieval, and operations research.

This article investigates the complexity of result diversification for relational queries. (1) We identify three problems in connection with query result diversification, to determine whether there exists a set of k tuples that is ranked above a bound with respect to relevance and diversity, to assess the rank of a given k-element set, and to count how many k-element sets are ranked above a given bound based on an objective function. (2) We study these problems for a variety of query languages and for the three objective functions proposed in Gollapudi and Sharma [2009]. We establish the upper and lower bounds of these problems, all matching, for both combined complexity and data complexity. (3) We also investigate several special settings of these problems, identifying tractable cases. Moreover, (4) we reinvestigate these problems in the presence of compatibility constraints commonly found in practice, and provide their complexity in all these settings.

Skip Supplemental Material Section

Supplemental Material

References

  1. S. Abiteboul, R. Hull, and V. Vianu. 1995. Foundations of Databases. Addison-Wesley. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. G. Adomavicius and A. Tuzhilin. 2005. Towards the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Engin. 17, 6, 734--749. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. R. Agrawal, S. Gollapudi, A. Halverson, and S. Leong. 2009. Diversifying search results. In Proceedings of the International Conference on Web Search and Web Data Mining (WSDM'09). 5--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. S. Amer-Yahia. 2011. Recommendation projects at yahoo! IEEE Data Engin. Bull. 34, 2, 69--77.Google ScholarGoogle Scholar
  5. S. Amer-Yahia, F. Bonchi, C. Castillo, E. Feuerstein, I. Mendez-Diaz, and P. Zabala. 2013. Complexity and algorithms for composite retrieval. In Proceedings of the International Conference on World Wide Web (WWW'13). 79--80. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. G. Berbeglia and G. Hahn. 2010. Counting feasible solutions of the traveling salesman problem with pickups and deliveries is #P-complete. Discr. Appl. Math. 157, 11, 2541--2547. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. A. Borodin, H. C. Lee, and Y. Ye. 2012. Max-sum diversification, monotone submodular functions and dynamic updates. In Proceedings of the ACM SIGACT-SIGMOD Symposium on Principles of Database Systems (PODS'12). 155--166. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. G. Capannini, F. M. Nardini, R. Perego, and F. Silvestri. 2011. Efficient diversification of web search results. Proc. VLDB Endow. 4, 7, 451--459. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Z. Chen and T. Li. 2007. Addressing diverse user preferences in SQL-query-resul navigation. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'07). 641--652. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. E. Demidova, P. Fankhauser, X. Zhou, and W. Nejdl. 2010. DivQ: Diversification for keyword search over structured databases. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'10). 331--338. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. T. Deng and W. Fan. 2013. On the complexity of query result diversification. Proc. VLDB Endow. 6, 8, 577--588. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. T. Deng, W. Fan, and F. Geerts. 2012. On the complexity of package recommendation problems. In Proceedings of the ACM SIGACT-SIGMOD Symposium on Principles of Database Systems (PODS'12). 261--272. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. M. Drosou and E. Pitoura. 2009. Diversity over continuous data. IEEE Data Engin. Bull. 32, 4.Google ScholarGoogle Scholar
  14. M. Drosou and E. Pitoura. 2010. Search result diversification. SIGMOD Rec. 39, 1, 41--47. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. A. Durand, M. Hermann, and P. G. Kolaitis. 2005. Subtractive reductions and complete problems for counting complexity classes. Theor. Comput. Sci. 340, 3, 496--513. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. Fagin, A. Lotem, and M. Naor. 2003. Optimal aggregation algorithms for middleware. J. Comput. Syst. Sci. 66, 4, 614--656. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. E. Feuerstein, P. A. Heiber, J. Martinez-Viademonte, and R. A. Baeza-Yates. 2007. New stochastic algorithms for scheduling ads in sponsored search. In Proceedings of the 5th Latin American Web Congress (LA-WEB'07). 22--31. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. Fraternali, D. Martinenghi, and M. Tagliasacchi. 2012. Top-k bounded diversification. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'12). 421--432. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Gollapudi and A. Sharma. 2009. An axiomatic approach for result diversification. In Proceedings of the 18th International Conference on World Wide Web (WWW'09). 381--390. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. L. A. Hemaspaandra and H. Vollmer. 1995. The satanic notations: Counting classes beyond #P and other definitional adventures. SIGACT News 26, 1, 2--13. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. I. F. Ilyas, G. Beskales, and M. A. Soliman. 2008. A survey of top-k query processing techniques in relational database systems. ACM Comput. Surv. 40, 4, 11:1--11:58. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. W. Jin and J. M. Patel. 2011. Efficient and generic evaluation of ranked queries. In Proceedings of the ACM SIGMOD International Conference on Management of data (SIGMOD'11). 601--612. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. G. Koutrika, B. Bercovitz, and H. Garcia-Molina. 2009. FlexRecs: Expressing and combining flexible recommendations. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'09). 745--758. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. R. E. Ladner. 1989. Polynomial space counting problems. SIAM J. Comput. 18, 6, 1087--1097. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. T. Lappas, K. Liu, and E. Terzi. 2009. Finding a team of experts in social networks. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'09). 467--476. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. C. Li, M. A. Soliman, K. C.-C. Chang, and I. F. Ilyas. 2005. RankSQL: Supporting ranking queries in relational database management systems. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB'05). 1342--1345. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Z. Liu, P. Sun, and Y. Chen. 2009. Structured search result differentiation. Proc. VLDB Endow. 2, 1, 313--324. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. E. Minack, G. Demartini, and W. Nejdl. 2009. Current approaches to search result diversification. In Proceedings of the 1st International Workshop on Living Web.Google ScholarGoogle Scholar
  29. C. H. Papadimitriou. 1994. Computational Complexity. Addison-Wesley.Google ScholarGoogle Scholar
  30. A. G. Parameswaran, H. Garcia-Molina, and J. D. Ullman. 2010. Evaluating, combining and generalizing recommendations with prerequisites. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM'10). 919--928. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. A. G. Parameswaran, P. Venetis, and H. Garcia-Molina. 2011. Recommendation systems with complex constraints: A course recommendation perspective. ACM Trans. Inf. Syst. 29, 4. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. O. A. Prokopyev, N. Kong, and D. L. Martinez-Torres. 2009. The equitable dispersion problem. Euro. J. Oper. Res. 197, 1, 59--67.Google ScholarGoogle ScholarCross RefCross Ref
  33. K. Schnaitter and N. Polyzotis. 2008. Evaluating rank joins with optimal cost. In Proceedings of the 27th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS'08). 43--52. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. K. Stefanidis, M. Drosou, and E. Pitoura. 2010. Perk: Personalized keyword search in relational databases through preferences. In Proceedings of the 13th International Conference on Extending Database Technology (EDBT'10). 585--596 Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. L. Valiant. 1979. The complexity of computing the permanent. Theor. Comput. Sci. 8, 2, 189--201.Google ScholarGoogle ScholarCross RefCross Ref
  36. M. Y. Vardi. 1982. The complexity of relational query languages. In Proceedings of the 14th Annual ACM Symposium on Theory of Computing (STOC'82). 137--146. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. E. Vee, U. Srivastava, J. Shanmugasundaram, P. Bhat, and S. A. Yahia. 2008. Efficient computation of diverse query results. In Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE'08). 228--236. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. M. R. Vieira, H. L. Razente, M. C. N. Barioni, M. Hadjieleftheriou, D. Srivastava, Traina., and V. J. Tsotras. 2011. On query result diversification. In Proceedings of the 27th IEEE International Conference on Data Engineering (ICDE'11). 1163--1174. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. M. Xie, L. V. S. Lakshmanan, and P. T. Wood. 2012. Composite recommendations: From items to packages. Frontiers Comput. Sci. 6, 3, 264--277.Google ScholarGoogle ScholarCross RefCross Ref
  40. C. Yu, L. Lakshmanan, and S. Amer-Yahia. 2009a. It takes variety to make a world: Diversification in recommender systems. In Proceedings of the 12th International Conference on Extending Database Technology (EDBT'09). 368--378. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. C. Yu, L. V. Lakshmanan, and S. Amer-Yahia. 2009b. Recommendation diversification using explanations. In Proceedings of the IEEE International Conference on Data Engineering (ICDE'09). 1299--1302. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. M. Zhang and N. Hurley. 2008. Avoiding monotony: Improving the diversity of recommendation lists. In Proceedings of the ACM Conference on Recommender Systems (RecSys'08). 123--130. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. C.-N. Ziegler, S. M. Mcnee, J. A. Konstan, and G. Lausen. 2005. Improving recommendation lists through topic diversification. In Proceedings of the 14th International Conference on World Wide Web (WWW'05). 22--32. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. On the Complexity of Query Result Diversification

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM Transactions on Database Systems
        ACM Transactions on Database Systems  Volume 39, Issue 2
        May 2014
        336 pages
        ISSN:0362-5915
        EISSN:1557-4644
        DOI:10.1145/2627748
        Issue’s Table of Contents

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 26 May 2014
        • Accepted: 1 March 2014
        • Revised: 1 January 2014
        • Received: 1 May 2013
        Published in tods Volume 39, Issue 2

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article
        • Research
        • Refereed

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader