skip to main content
10.1145/2505515.2505577acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

Boolean satisfiability for sequence mining

Published: 27 October 2013 Publication History

Abstract

In this paper, we propose a SAT-based encoding for the problem of discovering frequent, closed and maximal patterns in a sequence of items and a sequence of itemsets. Our encoding can be seen as an improvement of the approach proposed in [8] for the sequences of items. In this case, we show experimentally on real world data that our encoding is significantly better. Then we introduce a new extension of the problem to enumerate patterns in a sequence of itemsets. Thanks to the flexibility and to the declarative aspects of our SAT-based approach, an encoding for the sequences of itemsets is obtained by a very slight modification of that for the sequences of items.

References

[1]
R. Agrawal and R. Srikant. Mining sequential patterns. In A. L. P. C. Philip S. Yu, editor, Proceedings of the Eleventh International Conference on Data Engineering (ICDE'1995), pages 3--14. IEEE Computer Society, 1995.
[2]
H. Arimura and T. Uno. An efficient polynomial space and polynomial delay algorithm for enumeration of maximal motifs in a sequence. Journal of Combinatorial Optimization, 13, 2007.
[3]
R. Asin, R. Nieuwenhuis, A. Oliveras, and E. Rodriguez-Carbonell. Cardinality networks: a theoretical and empirical study. Constraints, 16(2):195--221, 2011.
[4]
O. Bailleux and Y. Boufkhad. Efficient CNF Encoding of Boolean Cardinality Constraints. In 9th International Conference on Principles and Practice of Constraint Programming - CP 2003, pages 108--122, 2003.
[5]
O. Bailleux, Y. Boufkhad, and O. Roussel. A translation of pseudo boolean constraints to sat. Journal on Satisfiability, Boolean Modeling and Computation (JSAT), 2(1-4), 2006.
[6]
A. Biere, M. J. H. Heule, H. van Maaren, and T. Walsh, editors. Handbook of Satisfiability, volume 185 of Frontiers in AI and Applications. IOS Press, 2009.
[7]
E. Coquery, S. Jabbour, and L. Sais. A constraint programming approach for enumerating motifs in a sequence. In M. Spiliopoulou, H. Wang, D. J. Cook, J. Pei, W. Wang, O. R. Zaïane, and X. Wu, editors, ICDM Workshops, pages 1091--1097. IEEE, 2011.
[8]
E. Coquery, S. Jabbour, L. Saïs, and Y. Salhi. A SAT-Based Approach for Discovering Frequent, Closed and Maximal Patterns in a Sequence. In 20th European Conference on Artificial Intelligence ECAI, pages 258--263, 2012.
[9]
M. Davis, G. Logemann, and D. W. Loveland. A machine program for theorem-proving. Communications of the ACM, 5(7):394--397, 1962.
[10]
L. De Raedt, T. Guns, and S. Nijssen. Constraint Programming for Itemset Mining. In Proceedings of the 14th International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD'2008), pages 204--212, Las Vegas, Nevada, USA, August 24-27 2008.
[11]
M. Gebser, B. Kaufmann, A. Neumann, and T. Schaub. clasp : A conict-driven answer set solver. In 9th International Conference Logic Programming and Nonmonotonic Reasoning (LPNMR'2007), volume 4483 of Lecture Notes in Computer Science, pages 260--265. Springer, 2007.
[12]
K. Gouda, M. Hassaan, and M. J. Zaki. Prism: A primal-encoding approach for frequent sequence mining. In Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), pages 487--492, Omaha, Nebraska, USA, October 28-31 2007. IEEE Computer Society.
[13]
O. Grumberg, A. Schuster, and A. Yadgar. Memory efficient all-solutions sat solver and its application for reachability analysis. In In Proceedings of the 5th International Conference on Formal Methods in Computer-Aided Design (FMCAD), pages 275--289. Springer, 2004.
[14]
T. Guns, S. Nijssen, and L. D. Raedt. Itemset mining: A constraint programming perspective. Artificial Intelligence, 175(12-13):1951--1983, 2011.
[15]
Y. Hamadi, S. Jabbour, and L. Sais. Learning from conicts in propositional satis ability. 4OR, 10(1):15--32, 2012.
[16]
S. Jabbour, L. Sais, and Y. Salhi. A pigeon-hole based encoding of cardinality constraints. In In proceedings of the 29th International Conference on Logic Programming (ICLP'2013), August 24-29 2013.
[17]
J. P. Marques-Silva and K. A. Sakallah. GRASP - A New Search Algorithm for Satis ability. In Proceedings of IEEE/ACM CAD, pages 220--227, 1996.
[18]
L. Parida, I. Rigoutsos, A. Floratos, D. Platt, and Y. Gao. Pattern Discovery on Character Sets and Real-valued Data: Linear Bound on Irredundant Motifs and an Efficient Polynomial Time Algorithm. In ACM-SIAM Symposium on Discrete Algorithms, pages 297--308, 2000.
[19]
L. Parida, I. Rigoutsos, and D. Platt. An output-sensitive exible pattern discovery algorithm. In proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching (CPM'2001), volume 2089 of Lecture Notes in Computer Science, pages 131--142. Springer, 2001.
[20]
N. Pisanti, M. Crochemore, R. Grossi, and M.-F. Sagot. Bases of motifs for generating repeated patterns with wild cards. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB'2005), 2(1):40--50, 2005.
[21]
C. Sinz. Towards an Optimal CNF Encoding of Boolean Cardinality Constraints. In 11th International Conference on Principles and Practice of Constraint Programming - CP 2005, pages 827--831, 2005.
[22]
G. Tseitin. On the complexity of derivations in the propositional calculus. In H. Slesenko, editor, Structures in Constructives Mathematics and Mathematical Logic, Part II, pages 115--125, 1968.
[23]
J. P. Warners. A linear-time transformation of linear inequalities into conjunctive normal form. Information Processing Letters, 68(2):63--69, 1998.
[24]
L. Zhang, C. F. Madigan, M. W. Moskewicz, and S. Malik. Efficient conict driven learning in Boolean satisfiability solver. In IEEE/ACM CAD'2001, pages 279--285, 2001.
[25]
W. Zhao and W. Wu. Asig: An all-solution sat solver for cnf formulas. In 11th International Conference on Computer-Aided Design and Computer Graphics, CAD/Graphics 2009, Huangshan, China, August 19-21 (CAD/Graphics), pages 508--513. IEEE, 2009.

Cited By

View all
  • (2023)A symbolic approach to computing disjunctive association rules from dataProceedings of the Thirty-Second International Joint Conference on Artificial Intelligence10.24963/ijcai.2023/237(2133-2141)Online publication date: 19-Aug-2023
  • (2023)HardSATGEN: Understanding the Difficulty of Hard SAT Formula Generation and A Strong Structure-Hardness-Aware BaselineProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3580305.3599837(4414-4425)Online publication date: 6-Aug-2023
  • (2021)Mining Closed High Utility Itemsets based on Propositional SatisfiabilityData & Knowledge Engineering10.1016/j.datak.2021.101927136:COnline publication date: 1-Nov-2021
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
October 2013
2612 pages
ISBN:9781450322638
DOI:10.1145/2505515
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 October 2013

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. data mining
  2. propositional satisfiability and modeling

Qualifiers

  • Research-article

Conference

CIKM'13
Sponsor:
CIKM'13: 22nd ACM International Conference on Information and Knowledge Management
October 27 - November 1, 2013
California, San Francisco, USA

Acceptance Rates

CIKM '13 Paper Acceptance Rate 143 of 848 submissions, 17%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Upcoming Conference

CIKM '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)0
Reflects downloads up to 22 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2023)A symbolic approach to computing disjunctive association rules from dataProceedings of the Thirty-Second International Joint Conference on Artificial Intelligence10.24963/ijcai.2023/237(2133-2141)Online publication date: 19-Aug-2023
  • (2023)HardSATGEN: Understanding the Difficulty of Hard SAT Formula Generation and A Strong Structure-Hardness-Aware BaselineProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3580305.3599837(4414-4425)Online publication date: 6-Aug-2023
  • (2021)Mining Closed High Utility Itemsets based on Propositional SatisfiabilityData & Knowledge Engineering10.1016/j.datak.2021.101927136:COnline publication date: 1-Nov-2021
  • (2021)Towards a Compact SAT-Based Encoding of Itemset Mining TasksIntegration of Constraint Programming, Artificial Intelligence, and Operations Research10.1007/978-3-030-78230-6_11(163-178)Online publication date: 5-Jul-2021
  • (2020)A SAT-Based Approach for Mining High Utility Itemsets from Transaction DatabasesBig Data Analytics and Knowledge Discovery10.1007/978-3-030-59065-9_8(91-106)Online publication date: 14-Sep-2020
  • (2019)Bee swarm optimization for solving the MAXSAT problem using prior knowledgeSoft Computing - A Fusion of Foundations, Methodologies and Applications10.1007/s00500-017-2956-123:9(3095-3112)Online publication date: 1-May-2019
  • (2018)On Maximal Frequent Itemsets Mining with ConstraintsPrinciples and Practice of Constraint Programming10.1007/978-3-319-98334-9_36(554-569)Online publication date: 23-Aug-2018
  • (2018)On Maximal Frequent Itemsets EnumerationProceedings of the Ninth International Conference on Soft Computing and Pattern Recognition (SoCPaR 2017)10.1007/978-3-319-76357-6_15(151-160)Online publication date: 10-Mar-2018
  • (2017)Local and global symmetry breaking in itemset miningAnnals of Mathematics and Artificial Intelligence10.1007/s10472-016-9528-480:1(91-112)Online publication date: 1-May-2017
  • (2017)Hybrid ASP-Based Approach to Pattern MiningRules and Reasoning10.1007/978-3-319-61252-2_14(199-214)Online publication date: 14-Jun-2017
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media