skip to main content
10.1145/1055558.1055580acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
Article

Computational complexity of itemset frequency satisfiability

Published: 14 June 2004 Publication History

Abstract

Computing frequent itemsets is one of the most prominent problems in data mining. We introduce a new, related problem, called FREQSAT: given some itemset-interval pairs, does there exist a database such that for every pair the frequency of the itemset falls in the interval? It is shown in this paper that FREQSAT is not finitely axiomatizable and that it is NP-complete. We also study cases in which other characteristics of the database are given as well. These characteristics can complicate FREQSAT even more. For example, when the maximal number of duplicates of a transaction is known, FREQSAT becomes PP-hard. We describe applications of FREQSAT in frequent itemset mining algorithms and privacy in data mining.

References

[1]
R. Agrawal, T. Imilienski, and A. Swami. Mining association rules between sets of items in large databases. In Proc. ACM SIGMOD, pages 207--216, 1993.
[2]
R. Agrawal, and R. Srikant. Privacy-preserving data mining. In Proc. ACM SIGMOD, pages 439--450, 2000.
[3]
Y. Bastide, R. Taouil, N. Pasquier, G. Stumme, and L. Lakhal. Mining frequent patterns with counting inference. SIGKDD Explorations, 2(2):66--75, 2000.
[4]
R. J. Bayardo, Efficiently mining long patterns from databases. In Proc. ACM SIGMOD, pages 85--93, 1998.
[5]
T. Calders, Deducing bounds on the support of itemsets. In Database Support for Data Mining Applications, LNCS 2682, Springer, to appear 2004.
[6]
T. Calders and B. Goethals. Mining all non-derivable frequent itemsets. In Proc. PKDD Int. Conf. Principles of Data Mining and Knowledge Discovery, pages 74--85. Springer, 2002.
[7]
V. Chvátal. Recognizing intersection patterns. Annals of Discrete Mathematics - Combinatorics 79, 8(I):249--251, 1980.
[8]
M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-completeness. Freeman, New York, 1979.
[9]
G. Georgakopoulos, D. Kavvadias, and C. H. Papadimitriou. Probabilistic satisfiability. Journal of Complexity, 4:1--11, 1988.
[10]
J. Hipp, U. Güntzer, and G. Nakhaeizadeh. Algorithms for association rule mining - a general survey and comparison. SIGKDD Explorations, 2(1):58--64, 2000.
[11]
T. Lukasiewicz. Probabilistic logic programming with conditional constraints. ACM Transactions on Computational Logic, 2(3):289--339, 2001.
[12]
H. Mannila and H. Toivonen. Multiple uses of frequent sets and condensed representations. In Proc. KDD Int. Conf. Knowledge Discovery in Databases, 1996.
[13]
T. Mielikäinen. On inverse frequent set mining. In Workshop on Privacy Preserving Data Mining, 2003.
[14]
N. Nilsson. Probabilistic logic. Artificial Intelligence, 28:71--87, 1986.
[15]
C. H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.

Cited By

View all
  • (2022)Machine learning methods for generating high dimensional discrete datasetsWIREs Data Mining and Knowledge Discovery10.1002/widm.145012:2Online publication date: 18-Jan-2022
  • (2019)Extending inverse frequent itemsets mining to generate realistic datasets: complexity, accuracy and emerging applicationsData Mining and Knowledge Discovery10.1007/s10618-019-00643-133:6(1736-1774)Online publication date: 1-Nov-2019
  • (2019)Items2Data: Generating Synthetic Boolean Datasets from ItemsetsDatabases Theory and Applications10.1007/978-3-030-12079-5_6(79-90)Online publication date: 23-Jan-2019
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
PODS '04: Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2004
350 pages
ISBN:158113858X
DOI:10.1145/1055558
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 June 2004

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

SIGMOD/PODS04

Acceptance Rates

Overall Acceptance Rate 642 of 2,707 submissions, 24%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 30 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2022)Machine learning methods for generating high dimensional discrete datasetsWIREs Data Mining and Knowledge Discovery10.1002/widm.145012:2Online publication date: 18-Jan-2022
  • (2019)Extending inverse frequent itemsets mining to generate realistic datasets: complexity, accuracy and emerging applicationsData Mining and Knowledge Discovery10.1007/s10618-019-00643-133:6(1736-1774)Online publication date: 1-Nov-2019
  • (2019)Items2Data: Generating Synthetic Boolean Datasets from ItemsetsDatabases Theory and Applications10.1007/978-3-030-12079-5_6(79-90)Online publication date: 23-Jan-2019
  • (2016)A Contrary Recurrent Item Set Hierarchy in PPDDMNational Academy Science Letters10.1007/s40009-016-0460-239:4(283-286)Online publication date: 15-Apr-2016
  • (2016)A transversal hypergraph approach for the frequent itemset hiding problemKnowledge and Information Systems10.1007/s10115-015-0862-347:3(625-645)Online publication date: 1-Jun-2016
  • (2016)Interactive knowledge discovery from hidden data through sampling of frequent patternsStatistical Analysis and Data Mining10.1002/sam.113229:4(205-229)Online publication date: 1-Aug-2016
  • (2013)Solving inverse frequent itemset mining with infrequency constraints via large-scale linear programsACM Transactions on Knowledge Discovery from Data10.1145/2541268.25412717:4(1-39)Online publication date: 25-Dec-2013
  • (2013)Association rule hiding methodsWiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery10.1002/widm.10823:1(28-36)Online publication date: 1-Jan-2013
  • (2012)Tractable reasoning problems with fully-characterized association rulesProceedings of the 16th East European conference on Advances in Databases and Information Systems10.1007/978-3-642-33074-2_21(282-295)Online publication date: 18-Sep-2012
  • (2012)Count constraints and the inverse OLAP problemProceedings of the 7th international conference on Foundations of Information and Knowledge Systems10.1007/978-3-642-28472-4_20(352-369)Online publication date: 5-Mar-2012
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media